<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">


<chapter id="mc-manual" xreflabel="Memcheck: a memory error detector">
<title>Memcheck: a memory error detector</title>

<para>To use this tool, you may specify <option>--tool=memcheck</option>
on the Valgrind command line. You don't have to, though, since Memcheck
is the default tool.</para>


<sect1 id="mc-manual.overview" xreflabel="Overview">
<title>Overview</title>

<para>Memcheck is a memory error detector. It can detect the following
problems that are common in C and C++ programs.</para>

<itemizedlist>
  <listitem>
    <para>Accessing memory you shouldn't, e.g. overrunning and underrunning
    heap blocks, overrunning the top of the stack, and accessing memory after
    it has been freed.</para>
  </listitem>

  <listitem>
    <para>Using undefined values, i.e. values that have not been initialised,
    or that have been derived from other undefined values.</para>
  </listitem>

  <listitem>
    <para>Incorrect freeing of heap memory, such as double-freeing heap
    blocks, or mismatched use of
    <function>malloc</function>/<computeroutput>new</computeroutput>/<computeroutput>new[]</computeroutput>
    versus
    <function>free</function>/<computeroutput>delete</computeroutput>/<computeroutput>delete[]</computeroutput>.</para>
  </listitem>

  <listitem>
    <para>Overlapping <computeroutput>src</computeroutput> and
    <computeroutput>dst</computeroutput> pointers in
    <computeroutput>memcpy</computeroutput> and related
    functions.</para>
  </listitem>

  <listitem>
    <para>Memory leaks.</para>
  </listitem>
</itemizedlist>

<para>Problems like these can be difficult to find by other means,
often remaining undetected for long periods, then causing occasional,
difficult-to-diagnose crashes.</para>

</sect1>



<sect1 id="mc-manual.errormsgs"
       xreflabel="Explanation of error messages from Memcheck">
<title>Explanation of error messages from Memcheck</title>

<para>Memcheck issues a range of error messages. This section presents a
quick summary of what error messages mean. The precise behaviour of the
error-checking machinery is described in <xref
linkend="mc-manual.machine"/>.</para>


<sect2 id="mc-manual.badrw"
       xreflabel="Illegal read / Illegal write errors">
<title>Illegal read / Illegal write errors</title>

<para>For example:</para>
<programlisting><![CDATA[
Invalid read of size 4
   at 0x40F6BBCC: (within /usr/lib/libpng.so.2.1.0.9)
   by 0x40F6B804: (within /usr/lib/libpng.so.2.1.0.9)
   by 0x40B07FF4: read_png_image(QImageIO *) (kernel/qpngio.cpp:326)
   by 0x40AC751B: QImageIO::read() (kernel/qimage.cpp:3621)
 Address 0xBFFFF0E0 is not stack'd, malloc'd or free'd
]]></programlisting>

<para>This happens when your program reads or writes memory at a place
which Memcheck reckons it shouldn't. In this example, the program did a
4-byte read at address 0xBFFFF0E0, somewhere within the system-supplied
library libpng.so.2.1.0.9, which was called from somewhere else in the
same library, called from line 326 of <filename>qpngio.cpp</filename>,
and so on.</para>

<para>Memcheck tries to establish what the illegal address might relate
to, since that's often useful. So, if it points into a block of memory
which has already been freed, you'll be informed of this, and also where
the block was freed. Likewise, if it should turn out to be just off
the end of a heap block, a common result of off-by-one errors in
array subscripting, you'll be informed of this fact, and also where the
block was allocated.
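As a rough illustration of the off-by-one case just described, the
following sketch fills a heap block. The function name
<function>fill_squares</function> and the
<computeroutput>OFF_BY_ONE</computeroutput> macro are invented for this
example and are not part of Valgrind. Built normally, the loop stays inside
the block; built with <computeroutput>-DOFF_BY_ONE</computeroutput>, it
writes one element past the end.
<programlisting><![CDATA[
#include <stdlib.h>

/* Hypothetical helper: returns a heap block holding 0*0, 1*1, ...,
   (n-1)*(n-1), or NULL if n is 0 or the allocation fails.
   The caller is responsible for freeing the block. */
int *fill_squares(size_t n)
{
    if (n == 0)
        return NULL;
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;
#ifdef OFF_BY_ONE
    /* Deliberate bug: when i == n this writes just past the block. */
    for (size_t i = 0; i <= n; i++)
#else
    for (size_t i = 0; i < n; i++)
#endif
        a[i] = (int)(i * i);
    return a;
}
]]></programlisting>
With the deliberate bug enabled, a program that calls this function under
Memcheck would be expected to produce an invalid write report whose address
description places the access just after the block and names the
<function>malloc</function> call that allocated it.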
If you use the <option><xref
linkend="opt.read-var-info"/></option> option, Memcheck will run more slowly
but may give a more detailed description of any illegal address.</para>

<para>In this example, Memcheck can't identify the address. Actually
the address is on the stack, but, for some reason, this is not a valid
stack address -- it is below the stack pointer and that isn't allowed.
In this particular case it's probably caused by GCC generating invalid
code, a known bug in some ancient versions of GCC.</para>

<para>Note that Memcheck only tells you that your program is about to
access memory at an illegal address. It can't stop the access from
happening. So, if your program makes an access which normally would
result in a segmentation fault, your program will still suffer the same
fate -- but you will get a message from Memcheck immediately prior to
this. In this particular example, reading junk on the stack is
non-fatal, and the program stays alive.</para>

</sect2>



<sect2 id="mc-manual.uninitvals"
       xreflabel="Use of uninitialised values">
<title>Use of uninitialised values</title>

<para>For example:</para>
<programlisting><![CDATA[
Conditional jump or move depends on uninitialised value(s)
   at 0x402DFA94: _IO_vfprintf (_itoa.h:49)
   by 0x402E8476: _IO_printf (printf.c:36)
   by 0x8048472: main (tests/manuel1.c:8)
]]></programlisting>

<para>An uninitialised-value use error is reported when your program
uses a value which hasn't been initialised -- in other words, is
undefined. Here, the undefined value is used somewhere inside the
<function>printf</function> machinery of the C library.
This error was
reported when running the following small program:</para>
<programlisting><![CDATA[
#include <stdio.h>

int main()
{
  int x;                    /* never initialised */
  printf ("x = %d\n", x);
}]]></programlisting>

<para>It is important to understand that your program can copy around
junk (uninitialised) data as much as it likes. Memcheck observes this
and keeps track of the data, but does not complain. A complaint is
issued only when your program attempts to make use of uninitialised
data in a way that might affect your program's externally-visible behaviour.
In this example, <varname>x</varname> is uninitialised. Memcheck observes
the value being passed to <function>_IO_printf</function> and thence to
<function>_IO_vfprintf</function>, but makes no comment. However,
<function>_IO_vfprintf</function> has to examine the value of
<varname>x</varname> so it can turn it into the corresponding ASCII string,
and it is at this point that Memcheck complains.</para>

<para>Sources of uninitialised data tend to be:</para>
<itemizedlist>
  <listitem>
    <para>Local variables in procedures which have not been initialised,
    as in the example above.</para>
  </listitem>
  <listitem>
    <para>The contents of heap blocks (allocated with
    <function>malloc</function>, <function>new</function>, or a similar
    function) before you (or a constructor) write something there.
    </para>
  </listitem>
</itemizedlist>

<para>To see information on the sources of uninitialised data in your
program, use the <option>--track-origins=yes</option> option.
This
makes Memcheck run more slowly, but can make it much easier to track down
the root causes of uninitialised value errors.</para>

</sect2>



<sect2 id="mc-manual.bad-syscall-args"
       xreflabel="Use of uninitialised or unaddressable values in system calls">
<title>Use of uninitialised or unaddressable values in system
calls</title>

<para>Memcheck checks all parameters to system calls:
<itemizedlist>
  <listitem>
    <para>It checks that all the direct parameters themselves are
    initialised.</para>
  </listitem>
  <listitem>
    <para>Also, if a system call needs to read from a buffer provided by
    your program, Memcheck checks that the entire buffer is addressable
    and its contents are initialised.</para>
  </listitem>
  <listitem>
    <para>Also, if the system call needs to write to a user-supplied
    buffer, Memcheck checks that the buffer is addressable.</para>
  </listitem>
</itemizedlist>
</para>

<para>After the system call, Memcheck updates its tracked information to
precisely reflect any changes in memory state caused by the system
call.</para>

<para>Here's an example of two system calls with invalid parameters:</para>
<programlisting><![CDATA[
  #include <stdlib.h>
  #include <unistd.h>
  int main( void )
  {
    char* arr  = malloc(10);
    int*  arr2 = malloc(sizeof(int));
    write( 1 /* stdout */, arr, 10 );
    exit(arr2[0]);
  }
]]></programlisting>

<para>You get these complaints ...</para>
<programlisting><![CDATA[
  Syscall param write(buf) points to uninitialised byte(s)
     at 0x25A48723: __write_nocancel (in /lib/tls/libc-2.3.3.so)
     by 0x259AFAD3: __libc_start_main (in /lib/tls/libc-2.3.3.so)
     by 0x8048348: (within /auto/homes/njn25/grind/head4/a.out)
   Address 0x25AB8028 is 0 bytes inside a block of size 10 alloc'd
     at 0x259852B0: malloc (vg_replace_malloc.c:130)
     by 0x80483F1: main (a.c:5)

  Syscall param exit(error_code) contains uninitialised byte(s)
     at 0x25A21B44: __GI__exit (in /lib/tls/libc-2.3.3.so)
     by 0x8048426: main (a.c:8)
]]></programlisting>

<para>... because the program has (a) written uninitialised junk
from the heap block to the standard output, and (b) passed an
uninitialised value to <function>exit</function>. Note that the first
error refers to the memory pointed to by
<computeroutput>buf</computeroutput> (not
<computeroutput>buf</computeroutput> itself), but the second error
refers directly to <computeroutput>exit</computeroutput>'s argument
<computeroutput>arr2[0]</computeroutput>.</para>

</sect2>


<sect2 id="mc-manual.badfrees" xreflabel="Illegal frees">
<title>Illegal frees</title>

<para>For example:</para>
<programlisting><![CDATA[
Invalid free()
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
 Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
]]></programlisting>

<para>Memcheck keeps track of the blocks allocated by your program
with <function>malloc</function>/<computeroutput>new</computeroutput>,
so it knows exactly whether the argument to
<function>free</function>/<computeroutput>delete</computeroutput> is
legitimate. Here, this test program has freed the same block
twice. As with the illegal read/write errors, Memcheck attempts to
make sense of the address freed. If, as here, the address is one
which has previously been freed, you will be told that -- making
duplicate frees of the same block easy to spot.
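A duplicate free of this kind can be sketched as follows. The function
name <function>measure_copy</function> and the
<computeroutput>DOUBLE_FREE</computeroutput> macro are invented for this
illustration; this is not the <filename>doublefree.c</filename> test
program from the trace above. Built normally, the block is released
exactly once.
<programlisting><![CDATA[
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies s into a fresh heap block, measures the
   copy, and releases the block again.  Returns the length of s, or 0
   if the allocation fails. */
size_t measure_copy(const char *s)
{
    size_t n = strlen(s);
    char *copy = malloc(n + 1);
    if (copy == NULL)
        return 0;
    memcpy(copy, s, n + 1);     /* includes the terminating NUL */
    size_t len = strlen(copy);
    free(copy);
#ifdef DOUBLE_FREE
    /* Deliberate bug: copy was already released on the line above. */
    free(copy);
#endif
    return len;
}
]]></programlisting>
Built with <computeroutput>-DDOUBLE_FREE</computeroutput> and run under
Memcheck, the second call would be expected to be reported as
<computeroutput>Invalid free()</computeroutput>, together with the place
where the block was first freed.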
You will also get this
message if you try to free a pointer that doesn't point to the start of a
heap block.</para>

</sect2>


<sect2 id="mc-manual.rudefn"
       xreflabel="When a heap block is freed with an inappropriate deallocation function">
<title>When a heap block is freed with an inappropriate deallocation
function</title>

<para>In the following example, a block allocated with
<function>new[]</function> has wrongly been deallocated with
<function>free</function>:</para>
<programlisting><![CDATA[
Mismatched free() / delete / delete []
   at 0x40043249: free (vg_clientfuncs.c:171)
   by 0x4102BB4E: QGArray::~QGArray(void) (tools/qgarray.cpp:149)
   by 0x4C261C41: PptDoc::~PptDoc(void) (include/qmemarray.h:60)
   by 0x4C261F0E: PptXml::~PptXml(void) (pptxml.cc:44)
 Address 0x4BB292A8 is 0 bytes inside a block of size 64 alloc'd
   at 0x4004318C: operator new[](unsigned int) (vg_clientfuncs.c:152)
   by 0x4C21BC15: KLaola::readSBStream(int) const (klaola.cc:314)
   by 0x4C21C155: KLaola::stream(KLaola::OLENode const *) (klaola.cc:416)
   by 0x4C21788F: OLEFilter::convert(QCString const &) (olefilter.cc:272)
]]></programlisting>

<para>In <literal>C++</literal> it's important to deallocate memory in a
way compatible with how it was allocated.
The deal is:</para>
<itemizedlist>
  <listitem>
    <para>If allocated with
    <function>malloc</function>,
    <function>calloc</function>,
    <function>realloc</function>,
    <function>valloc</function> or
    <function>memalign</function>, you must
    deallocate with <function>free</function>.</para>
  </listitem>
  <listitem>
    <para>If allocated with <function>new</function>, you must deallocate
    with <function>delete</function>.</para>
  </listitem>
  <listitem>
    <para>If allocated with <function>new[]</function>, you must
    deallocate with <function>delete[]</function>.</para>
  </listitem>
</itemizedlist>

<para>The worst thing is that on Linux apparently it doesn't matter if
you do mix these up, but the same program may then crash on a
different platform, Solaris for example. So it's best to fix it
properly. According to the KDE folks "it's amazing how many C++
programmers don't know this".</para>

<para>The reason behind the requirement is as follows. In some C++
implementations, <function>delete[]</function> must be used for
objects allocated by <function>new[]</function> because the compiler
stores the size of the array and the pointer-to-member to the
destructor of the array's content just before the pointer actually
returned. <function>delete</function> doesn't account for this and will get
confused, possibly corrupting the heap.</para>

</sect2>



<sect2 id="mc-manual.overlap"
       xreflabel="Overlapping source and destination blocks">
<title>Overlapping source and destination blocks</title>

<para>The following C library functions copy some data from one
memory block to another (or something similar):
<function>memcpy</function>,
<function>strcpy</function>,
<function>strncpy</function>,
<function>strcat</function>,
<function>strncat</function>.
The blocks pointed to by their <computeroutput>src</computeroutput> and
<computeroutput>dst</computeroutput> pointers aren't allowed to overlap.
The POSIX standards have wording along the lines of "If copying takes place
between objects that overlap, the behavior is undefined." Therefore,
Memcheck checks for this.
</para>

<para>For example:</para>
<programlisting><![CDATA[
==27492== Source and destination overlap in memcpy(0xbffff294, 0xbffff280, 21)
==27492==    at 0x40026CDC: memcpy (mc_replace_strmem.c:71)
==27492==    by 0x804865A: main (overlap.c:40)
]]></programlisting>

<para>You don't want the two blocks to overlap because one of them could
get partially overwritten by the copying.</para>

<para>You might think that Memcheck is being overly pedantic reporting
this in the case where <computeroutput>dst</computeroutput> is less than
<computeroutput>src</computeroutput>. For example, the obvious way to
implement <function>memcpy</function> is by copying from the first
byte to the last. However, the optimisation guides of some
architectures recommend copying from the last byte down to the first.
Also, some implementations of <function>memcpy</function> zero
<computeroutput>dst</computeroutput> before copying, because zeroing the
destination's cache line(s) can improve performance.</para>

<para>The moral of the story is: if you want to write truly portable
code, don't make any assumptions about the language
implementation.</para>

</sect2>


<sect2 id="mc-manual.leaks" xreflabel="Memory leak detection">
<title>Memory leak detection</title>

<para>Memcheck keeps track of all heap blocks issued in response to
calls to
<function>malloc</function>/<function>new</function> et al.
So when the program exits, it knows which blocks have not been freed.
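As a small sketch of what such unfreed blocks can look like, the code
below builds a two-node list and then drops its only pointer to it. The
names <function>make_pair</function> and
<function>sum_and_forget</function> are invented for this illustration
and are not taken from Valgrind's own test programs.
<programlisting><![CDATA[
#include <stdlib.h>

struct node {
    struct node *next;
    int value;
};

/* Hypothetical helper: builds a two-node list and returns its head,
   or NULL if either allocation fails. */
struct node *make_pair(int first, int second)
{
    struct node *head = malloc(sizeof *head);
    if (head == NULL)
        return NULL;
    head->value = first;
    head->next = malloc(sizeof *head);
    if (head->next == NULL) {
        free(head);
        return NULL;
    }
    head->next->value = second;
    head->next->next = NULL;
    return head;
}

/* Hypothetical helper: reads the list, then discards the only pointer
   to it without calling free(), so both blocks outlive the call. */
int sum_and_forget(int first, int second)
{
    struct node *list = make_pair(first, second);
    if (list == NULL)
        return 0;
    int sum = list->value + list->next->value;
    list = NULL;    /* deliberate leak: the head is no longer referenced */
    return sum;
}
]]></programlisting>
If a program calls <function>sum_and_forget</function> and then exits,
neither block has been freed: nothing in the program points to the first
block any more, and the second is pointed to only by the first. A run with
<option>--leak-check=full</option> would be expected to list them, although
exactly how they are classified can depend on the compiler and on stale
pointer values left in registers or on the stack.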
</para>

<para>If <option>--leak-check</option> is set appropriately, for each
remaining block, Memcheck determines if the block is reachable from pointers
within the root-set. The root-set consists of (a) general purpose registers
of all threads, and (b) initialised, aligned, pointer-sized data words in
accessible client memory, including stacks.</para>

<para>There are two ways a block can be reached. The first is with a
"start-pointer", i.e. a pointer to the start of the block. The second is with
an "interior-pointer", i.e. a pointer to the middle of the block. There are
three ways we know of that an interior-pointer can occur:</para>

<itemizedlist>
  <listitem>
    <para>The pointer might have originally been a start-pointer and have been
    moved along deliberately (or not deliberately) by the program. In
    particular, this can happen if your program uses tagged pointers, i.e.
    if it uses the bottom one, two or three bits of a pointer, which are
    normally always zero due to alignment, in order to store extra
    information.</para>
  </listitem>

  <listitem>
    <para>It might be a random junk value in memory, entirely unrelated, just
    a coincidence.</para>
  </listitem>

  <listitem>
    <para>It might be a pointer to an array of C++ objects (which possess
    destructors) allocated with <computeroutput>new[]</computeroutput>. In
    this case, some compilers store a "magic cookie" containing the array
    length at the start of the allocated block, and return a pointer to just
    past that magic cookie, i.e. an interior-pointer.
    See <ulink url="http://theory.uwinnipeg.ca/gnu/gcc/gxxint_14.html">this
    page</ulink> for more information.</para>
  </listitem>
</itemizedlist>

<para>With that in mind, consider the nine possible cases described by the
following figure.</para>

<programlisting><![CDATA[
     Pointer chain            AAA Category    BBB Category
     -------------            ------------    ------------
(1)  RRR ------------> BBB                    DR
(2)  RRR ---> AAA ---> BBB    DR              IR
(3)  RRR               BBB                    DL
(4)  RRR      AAA ---> BBB    DL              IL
(5)  RRR ------?-----> BBB                    (y)DR, (n)DL
(6)  RRR ---> AAA -?-> BBB    DR              (y)IR, (n)DL
(7)  RRR -?-> AAA ---> BBB    (y)DR, (n)DL    (y)IR, (n)IL
(8)  RRR -?-> AAA -?-> BBB    (y)DR, (n)DL    (y,y)IR, (n,y)IL, (_,n)DL
(9)  RRR      AAA -?-> BBB    DL              (y)IL, (n)DL

Pointer chain legend:
- RRR: a root set node or DR block
- AAA, BBB: heap blocks
- --->: a start-pointer
- -?->: an interior-pointer

Category legend:
- DR: Directly reachable
- IR: Indirectly reachable
- DL: Directly lost
- IL: Indirectly lost
- (y)XY: it's XY if the interior-pointer is a real pointer
- (n)XY: it's XY if the interior-pointer is not a real pointer
- (_)XY: it's XY in either case
]]></programlisting>

<para>Every possible case can be reduced to one of the above nine. Memcheck
merges some of these cases in its output, resulting in the following four
categories.</para>


<itemizedlist>

  <listitem>
    <para>"Still reachable". This covers cases 1 and 2 (for the BBB blocks)
    above. A start-pointer or chain of start-pointers to the block is
    found. Since the block is still pointed at, the programmer could, at
    least in principle, have freed it before program exit.
    Because these
    are very common and arguably not a problem, Memcheck won't report such
    blocks individually unless <option>--show-reachable=yes</option> is
    specified.</para>
  </listitem>

  <listitem>
    <para>"Definitely lost". This covers case 3 (for the BBB blocks) above.
    This means that no pointer to the block can be found. The block is
    classified as "lost", because the programmer could not possibly have
    freed it at program exit, since no pointer to it exists. This is likely
    a symptom of having lost the pointer at some earlier point in the
    program. Such cases should be fixed by the programmer.</para>
  </listitem>

  <listitem>
    <para>"Indirectly lost". This covers cases 4 and 9 (for the BBB blocks)
    above. This means that the block is lost, not because there are no
    pointers to it, but rather because all the blocks that point to it are
    themselves lost. For example, if you have a binary tree and the root
    node is lost, all its children nodes will be indirectly lost. Because
    the problem will disappear if the definitely lost block that caused the
    indirect leak is fixed, Memcheck won't report such blocks individually
    unless <option>--show-reachable=yes</option> is specified.</para>
  </listitem>

  <listitem>
    <para>"Possibly lost". This covers cases 5--8 (for the BBB blocks)
    above. This means that a chain of one or more pointers to the block has
    been found, but at least one of the pointers is an interior-pointer.
    This could just be a random value in memory that happens to point into a
    block, and so you shouldn't consider this ok unless you know you have
    interior-pointers.</para>
  </listitem>

</itemizedlist>

<para>(Note: This mapping of the nine possible cases onto four categories is
not necessarily the best way that leaks could be reported; in particular,
interior-pointers are treated inconsistently.
It is possible the
categorisation may be improved in the future.)</para>

<para>Furthermore, if a suppression exists for a block, it will be reported
as "suppressed" no matter which of the above four categories it belongs
to.</para>


<para>The following is an example leak summary.</para>

<programlisting><![CDATA[
LEAK SUMMARY:
   definitely lost: 48 bytes in 3 blocks.
   indirectly lost: 32 bytes in 2 blocks.
     possibly lost: 96 bytes in 6 blocks.
   still reachable: 64 bytes in 4 blocks.
        suppressed: 0 bytes in 0 blocks.
]]></programlisting>

<para>If <option>--leak-check=full</option> is specified,
Memcheck will give details for each definitely lost or possibly lost block,
including where it was allocated. (Actually, it merges results for all
blocks that have the same category and sufficiently similar stack traces
into a single "loss record". The
<option>--leak-resolution</option> option lets you control the
meaning of "sufficiently similar".) It cannot tell you when or how or why
the pointer to a leaked block was lost; you have to work that out for
yourself. In general, you should attempt to ensure your programs do not
have any definitely lost or possibly lost blocks at exit.</para>

<para>For example:</para>
<programlisting><![CDATA[
8 bytes in 1 blocks are definitely lost in loss record 1 of 14
   at 0x........: malloc (vg_replace_malloc.c:...)
   by 0x........: mk (leak-tree.c:11)
   by 0x........: main (leak-tree.c:39)

88 (8 direct, 80 indirect) bytes in 1 blocks are definitely lost in loss record 13 of 14
   at 0x........: malloc (vg_replace_malloc.c:...)
   by 0x........: mk (leak-tree.c:11)
   by 0x........: main (leak-tree.c:25)
]]></programlisting>

<para>The first message describes a simple case of a single 8 byte block
that has been definitely lost.
The second case mentions another 8 byte
block that has been definitely lost; the difference is that a further 80
bytes in other blocks are indirectly lost because of this lost block.
The loss records are not presented in any notable order, so the loss record
numbers aren't particularly meaningful.</para>

<para>If you specify <option>--show-reachable=yes</option>,
reachable and indirectly lost blocks will also be shown, as the following
two examples show.</para>

<programlisting><![CDATA[
64 bytes in 4 blocks are still reachable in loss record 2 of 4
   at 0x........: malloc (vg_replace_malloc.c:177)
   by 0x........: mk (leak-cases.c:52)
   by 0x........: main (leak-cases.c:74)

32 bytes in 2 blocks are indirectly lost in loss record 1 of 4
   at 0x........: malloc (vg_replace_malloc.c:177)
   by 0x........: mk (leak-cases.c:52)
   by 0x........: main (leak-cases.c:80)
]]></programlisting>

<para>Because there are different kinds of leaks with different severities, an
interesting question is this: which leaks should be counted as true "errors"
and which should not? The answer to this question affects the numbers printed
in the <computeroutput>ERROR SUMMARY</computeroutput> line, and also the effect
of the <option>--error-exitcode</option> option. Memcheck uses the following
criteria:</para>

<itemizedlist>
  <listitem>
    <para>First, a leak is only counted as a true "error" if
    <option>--leak-check=full</option> is specified. In other words, an
    unprinted leak is not considered a true "error". If this were not the
    case, it would be possible to get a high error count but not have any
    errors printed, which would be confusing.</para>
  </listitem>

  <listitem>
    <para>After that, definitely lost and possibly lost blocks are counted as
    true "errors".
    Indirectly lost and still reachable blocks are not counted
    as true "errors", even if <option>--show-reachable=yes</option> is
    specified and they are printed; this is because such blocks don't need
    direct fixing by the programmer.
    </para>
  </listitem>
</itemizedlist>

</sect2>

</sect1>



<sect1 id="mc-manual.options"
       xreflabel="Memcheck Command-Line Options">
<title>Memcheck Command-Line Options</title>

<!-- start of xi:include in the manpage -->
<variablelist id="mc.opts.list">

  <varlistentry id="opt.leak-check" xreflabel="--leak-check">
    <term>
      <option><![CDATA[--leak-check=<no|summary|yes|full> [default: summary] ]]></option>
    </term>
    <listitem>
      <para>When enabled, search for memory leaks when the client
      program finishes. If set to <varname>summary</varname>, it says how
      many leaks occurred. If set to <varname>full</varname> or
      <varname>yes</varname>, it also gives details of each individual
      leak.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.show-possibly-lost" xreflabel="--show-possibly-lost">
    <term>
      <option><![CDATA[--show-possibly-lost=<yes|no> [default: yes] ]]></option>
    </term>
    <listitem>
      <para>When disabled, the memory leak detector will not show "possibly lost" blocks.
      </para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.leak-resolution" xreflabel="--leak-resolution">
    <term>
      <option><![CDATA[--leak-resolution=<low|med|high> [default: high] ]]></option>
    </term>
    <listitem>
      <para>When doing leak checking, determines how willing
      Memcheck is to consider different backtraces to
      be the same for the purposes of merging multiple leaks into a single
      leak report. When set to <varname>low</varname>, only the first
      two entries need match. When <varname>med</varname>, four entries
      have to match.
      When <varname>high</varname>, all entries need to
      match.</para>

      <para>For hardcore leak debugging, you probably want to use
      <option>--leak-resolution=high</option> together with
      <option>--num-callers=40</option> or some such large number.
      </para>

      <para>Note that the <option>--leak-resolution</option> setting
      does not affect Memcheck's ability to find
      leaks. It only changes how the results are presented.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.show-reachable" xreflabel="--show-reachable">
    <term>
      <option><![CDATA[--show-reachable=<yes|no> [default: no] ]]></option>
    </term>
    <listitem>
      <para>When disabled, the memory leak detector only shows "definitely
      lost" and "possibly lost" blocks. When enabled, the leak detector also
      shows "reachable" and "indirectly lost" blocks. (In other words, it
      shows all blocks, except suppressed ones, so
      <option>--show-all</option> would be a better name for
      it.)</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.undef-value-errors" xreflabel="--undef-value-errors">
    <term>
      <option><![CDATA[--undef-value-errors=<yes|no> [default: yes] ]]></option>
    </term>
    <listitem>
      <para>Controls whether Memcheck reports
      errors for uses of undefined values. Set this to
      <varname>no</varname> if you don't want to see undefined value
      errors. It also has the side effect of speeding up
      Memcheck somewhat.
      </para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.track-origins" xreflabel="--track-origins">
    <term>
      <option><![CDATA[--track-origins=<yes|no> [default: no] ]]></option>
    </term>
    <listitem>
      <para>Controls whether Memcheck tracks
      the origin of uninitialised values.
      By default, it does not,
      which means that although it can tell you that an
      uninitialised value is being used in a dangerous way, it
      cannot tell you where the uninitialised value came from. This
      often makes it difficult to track down the root problem.
      </para>
      <para>When set
      to <varname>yes</varname>, Memcheck keeps
      track of the origins of all uninitialised values. Then, when
      an uninitialised value error is
      reported, Memcheck will try to show the
      origin of the value. An origin can be one of the following
      four places: a heap block, a stack allocation, a client
      request, or miscellaneous other sources (e.g. a call
      to <varname>brk</varname>).
      </para>
      <para>For uninitialised values originating from a heap
      block, Memcheck shows where the block was
      allocated. For uninitialised values originating from a stack
      allocation, Memcheck can tell you which
      function allocated the value, but no more than that -- typically
      it shows you the source location of the opening brace of the
      function. So you should carefully check that all of the
      function's local variables are initialised properly.
      </para>
      <para>Performance overhead: origin tracking is expensive. It
      halves Memcheck's speed and increases
      memory use by a minimum of 100MB, and possibly more.
      Nevertheless it can drastically reduce the effort required to
      identify the root cause of uninitialised value errors, and so
      is often a programmer productivity win, despite running
      more slowly.
      </para>
      <para>Accuracy: Memcheck tracks origins
      quite accurately. To avoid very large space and time
      overheads, some approximations are made. It is possible,
      although unlikely, that Memcheck will report an incorrect origin, or
      not be able to identify any origin.
      </para>
      <para>Note that the combination
      <option>--track-origins=yes</option>
      and <option>--undef-value-errors=no</option> is
      nonsensical. Memcheck checks for and
      rejects this combination at startup.
      </para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.partial-loads-ok" xreflabel="--partial-loads-ok">
    <term>
      <option><![CDATA[--partial-loads-ok=<yes|no> [default: no] ]]></option>
    </term>
    <listitem>
      <para>Controls how Memcheck handles word-sized,
      word-aligned loads from addresses for which some bytes are
      addressable and others are not. When <varname>yes</varname>, such
      loads do not produce an address error. Instead, loaded bytes
      originating from illegal addresses are marked as uninitialised, and
      those corresponding to legal addresses are handled in the normal
      way.</para>

      <para>When <varname>no</varname>, loads from partially invalid
      addresses are treated the same as loads from completely invalid
      addresses: an illegal-address error is issued, and the resulting
      bytes are marked as initialised.</para>

      <para>Note that code that behaves in this way is in violation of
      the ISO C/C++ standards, and should be considered broken. If
      at all possible, such code should be fixed. This option should be
      used only as a last resort.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.freelist-vol" xreflabel="--freelist-vol">
    <term>
      <option><![CDATA[--freelist-vol=<number> [default: 20000000] ]]></option>
    </term>
    <listitem>
      <para>When the client program releases memory using
      <function>free</function> (in <literal>C</literal>) or
      <computeroutput>delete</computeroutput>
      (<literal>C++</literal>), that memory is not immediately made
      available for re-allocation. Instead, it is marked inaccessible
      and placed in a queue of freed blocks.
The purpose is to defer as
long as possible the point at which freed-up memory comes back
into circulation. This increases the chance that
Memcheck will be able to detect invalid
accesses to blocks for some significant period of time after they
have been freed.</para>

<para>This option specifies the maximum total size, in bytes, of the
blocks in the queue. The default value is twenty million bytes.
Increasing it increases the total amount of memory used by
Memcheck, but may detect invalid uses of freed
blocks which would otherwise go undetected.</para>
</listitem>
</varlistentry>

<varlistentry id="opt.freelist-big-blocks" xreflabel="--freelist-big-blocks">
<term>
<option><![CDATA[--freelist-big-blocks=<number> [default: 1000000] ]]></option>
</term>
<listitem>
<para>When making blocks from the queue of freed blocks available
for re-allocation, Memcheck re-circulates first the blocks
whose size is greater than or equal to <option>--freelist-big-blocks</option>.
This ensures that freeing big blocks (in particular, freeing blocks bigger than
<option>--freelist-vol</option>) does not immediately lead to a re-circulation
of all (or a lot of) the small blocks in the free list. In other words,
this option increases the likelihood of discovering dangling pointers
to the "small" blocks, even when big blocks are freed.</para>
<para>Setting a value of 0 means that all the blocks are re-circulated
in FIFO order.</para>
</listitem>
</varlistentry>

<varlistentry id="opt.workaround-gcc296-bugs" xreflabel="--workaround-gcc296-bugs">
<term>
<option><![CDATA[--workaround-gcc296-bugs=<yes|no> [default: no] ]]></option>
</term>
<listitem>
<para>When enabled, Memcheck assumes that reads and writes some small
distance below the stack pointer are due to bugs in GCC 2.96, and
does not report them.
The "small distance" is 256 bytes by 819 default. Note that GCC 2.96 is the default compiler on some ancient 820 Linux distributions (RedHat 7.X) and so you may need to use this 821 option. Do not use it if you do not have to, as it can cause real 822 errors to be overlooked. A better alternative is to use a more 823 recent GCC in which this bug is fixed.</para> 824 825 <para>You may also need to use this option when working with 826 GCC 3.X or 4.X on 32-bit PowerPC Linux. This is because 827 GCC generates code which occasionally accesses below the 828 stack pointer, particularly for floating-point to/from integer 829 conversions. This is in violation of the 32-bit PowerPC ELF 830 specification, which makes no provision for locations below the 831 stack pointer to be accessible.</para> 832 </listitem> 833 </varlistentry> 834 835 <varlistentry id="opt.ignore-ranges" xreflabel="--ignore-ranges"> 836 <term> 837 <option><![CDATA[--ignore-ranges=0xPP-0xQQ[,0xRR-0xSS] ]]></option> 838 </term> 839 <listitem> 840 <para>Any ranges listed in this option (and multiple ranges can be 841 specified, separated by commas) will be ignored by Memcheck's 842 addressability checking.</para> 843 </listitem> 844 </varlistentry> 845 846 <varlistentry id="opt.malloc-fill" xreflabel="--malloc-fill"> 847 <term> 848 <option><![CDATA[--malloc-fill=<hexnumber> ]]></option> 849 </term> 850 <listitem> 851 <para>Fills blocks allocated 852 by <computeroutput>malloc</computeroutput>, 853 <computeroutput>new</computeroutput>, etc, but not 854 by <computeroutput>calloc</computeroutput>, with the specified 855 byte. This can be useful when trying to shake out obscure 856 memory corruption problems. The allocated area is still 857 regarded by Memcheck as undefined -- this option only affects its 858 contents. Note that <option>--malloc-fill</option> does not 859 affect a block of memory when it is used as argument 860 to client requests VALGRIND_MEMPOOL_ALLOC or 861 VALGRIND_MALLOCLIKE_BLOCK. 
862 </para> 863 </listitem> 864 </varlistentry> 865 866 <varlistentry id="opt.free-fill" xreflabel="--free-fill"> 867 <term> 868 <option><![CDATA[--free-fill=<hexnumber> ]]></option> 869 </term> 870 <listitem> 871 <para>Fills blocks freed 872 by <computeroutput>free</computeroutput>, 873 <computeroutput>delete</computeroutput>, etc, with the 874 specified byte value. This can be useful when trying to shake out 875 obscure memory corruption problems. The freed area is still 876 regarded by Memcheck as not valid for access -- this option only 877 affects its contents. Note that <option>--free-fill</option> does not 878 affect a block of memory when it is used as argument to 879 client requests VALGRIND_MEMPOOL_FREE or VALGRIND_FREELIKE_BLOCK. 880 </para> 881 </listitem> 882 </varlistentry> 883 884 </variablelist> 885 <!-- end of xi:include in the manpage --> 886 887 </sect1> 888 889 890 <sect1 id="mc-manual.suppfiles" xreflabel="Writing suppression files"> 891 <title>Writing suppression files</title> 892 893 <para>The basic suppression format is described in 894 <xref linkend="manual-core.suppress"/>.</para> 895 896 <para>The suppression-type (second) line should have the form:</para> 897 <programlisting><![CDATA[ 898 Memcheck:suppression_type]]></programlisting> 899 900 <para>The Memcheck suppression types are as follows:</para> 901 902 <itemizedlist> 903 <listitem> 904 <para><varname>Value1</varname>, 905 <varname>Value2</varname>, 906 <varname>Value4</varname>, 907 <varname>Value8</varname>, 908 <varname>Value16</varname>, 909 meaning an uninitialised-value error when 910 using a value of 1, 2, 4, 8 or 16 bytes.</para> 911 </listitem> 912 913 <listitem> 914 <para><varname>Cond</varname> (or its old 915 name, <varname>Value0</varname>), meaning use 916 of an uninitialised CPU condition code.</para> 917 </listitem> 918 919 <listitem> 920 <para><varname>Addr1</varname>, 921 <varname>Addr2</varname>, 922 <varname>Addr4</varname>, 923 <varname>Addr8</varname>, 924 
<varname>Addr16</varname>,
meaning an invalid address during a
memory access of 1, 2, 4, 8 or 16 bytes respectively.</para>
</listitem>

<listitem>
<para><varname>Jump</varname>, meaning a
jump to an unaddressable location.</para>
</listitem>

<listitem>
<para><varname>Param</varname>, meaning an
invalid system call parameter error.</para>
</listitem>

<listitem>
<para><varname>Free</varname>, meaning an
invalid or mismatching free.</para>
</listitem>

<listitem>
<para><varname>Overlap</varname>, meaning a
<computeroutput>src</computeroutput> /
<computeroutput>dst</computeroutput> overlap in
<function>memcpy</function> or a similar function.</para>
</listitem>

<listitem>
<para><varname>Leak</varname>, meaning
a memory leak.</para>
</listitem>

</itemizedlist>

<para><computeroutput>Param</computeroutput> errors have an extra
information line at this point, which is the name of the offending
system call parameter. No other error kinds have this extra
line.</para>

<para>The first line of the calling context: for <varname>ValueN</varname>
and <varname>AddrN</varname> errors, it is either the name of the function
in which the error occurred, or, failing that, the full path of the
<filename>.so</filename> file
or executable containing the error location. For <varname>Free</varname> errors, it is the name
of the function doing the freeing (eg, <function>free</function>,
<function>__builtin_vec_delete</function>, etc). For
<varname>Overlap</varname> errors, it is the name of the function with the
overlapping arguments (eg,
<function>memcpy</function>,
<function>strcpy</function>, etc).</para>

<para>Lastly, there's the rest of the calling context.</para>

</sect1>



<sect1 id="mc-manual.machine"
xreflabel="Details of Memcheck's checking machinery">
<title>Details of Memcheck's checking machinery</title>

<para>Read this section if you want to know, in detail, exactly
what Memcheck checks and how it does so.</para>


<sect2 id="mc-manual.value" xreflabel="Valid-value (V) bit">
<title>Valid-value (V) bits</title>

<para>It is simplest to think of Memcheck as implementing a synthetic CPU
which is identical to a real CPU, except for one crucial detail. Every
bit (literally) of data processed, stored and handled by the real CPU
has, in the synthetic CPU, an associated "valid-value" bit, which says
whether or not the accompanying bit has a legitimate value. In the
discussions which follow, this bit is referred to as the V (valid-value)
bit.</para>

<para>Each byte in the system therefore has 8 V bits which follow it
wherever it goes. For example, when the CPU loads a word-size item (4
bytes) from memory, it also loads the corresponding 32 V bits from a
bitmap which stores the V bits for the process' entire address space.
If the CPU should later write the whole or some part of that value to
memory at a different address, the relevant V bits will be stored back
in the V-bit bitmap.</para>

<para>In short, each bit in the system has (conceptually) an associated V
bit, which follows it around everywhere, even inside the CPU. Yes, all the
CPU's registers (integer, floating point, vector and condition registers)
have their own V bit vectors. For this to work, Memcheck uses a great deal
of compression to represent the V bits compactly.</para>

<para>Copying values around does not cause Memcheck to check for, or
report on, errors.
However, when a value is used in a way which might 1015 conceivably affect your program's externally-visible behaviour, 1016 the associated V bits are immediately checked. If any of these indicate 1017 that the value is undefined (even partially), an error is reported.</para> 1018 1019 <para>Here's an (admittedly nonsensical) example:</para> 1020 <programlisting><![CDATA[ 1021 int i, j; 1022 int a[10], b[10]; 1023 for ( i = 0; i < 10; i++ ) { 1024 j = a[i]; 1025 b[i] = j; 1026 }]]></programlisting> 1027 1028 <para>Memcheck emits no complaints about this, since it merely copies 1029 uninitialised values from <varname>a[]</varname> into 1030 <varname>b[]</varname>, and doesn't use them in a way which could 1031 affect the behaviour of the program. However, if 1032 the loop is changed to:</para> 1033 <programlisting><![CDATA[ 1034 for ( i = 0; i < 10; i++ ) { 1035 j += a[i]; 1036 } 1037 if ( j == 77 ) 1038 printf("hello there\n"); 1039 ]]></programlisting> 1040 1041 <para>then Memcheck will complain, at the 1042 <computeroutput>if</computeroutput>, that the condition depends on 1043 uninitialised values. Note that it <command>doesn't</command> complain 1044 at the <varname>j += a[i];</varname>, since at that point the 1045 undefinedness is not "observable". It's only when a decision has to be 1046 made as to whether or not to do the <function>printf</function> -- an 1047 observable action of your program -- that Memcheck complains.</para> 1048 1049 <para>Most low level operations, such as adds, cause Memcheck to use the 1050 V bits for the operands to calculate the V bits for the result. 
Even if
the result is partially or wholly undefined, it does not
complain.</para>

<para>Checks on definedness occur in only three places: when a value is
used to generate a memory address, when a control flow decision needs to
be made, and when a system call is detected, at which point Memcheck
checks the definedness of its parameters as required.</para>

<para>If a check should detect undefinedness, an error message is
issued. The resulting value is subsequently regarded as well-defined.
To do otherwise would give long chains of error messages. In other
words, once Memcheck reports an undefined value error, it tries to
avoid reporting further errors derived from that same undefined
value.</para>

<para>This sounds overcomplicated. Why not just check all reads from
memory, and complain if an undefined value is loaded into a CPU
register? Well, that doesn't work well, because perfectly legitimate C
programs routinely copy uninitialised values around in memory, and we
don't want endless complaints about that. Here's the canonical example.
Consider a struct like this:</para>
<programlisting><![CDATA[
struct S { int x; char c; };
struct S s1, s2;
s1.x = 42;
s1.c = 'z';
s2 = s1;
]]></programlisting>

<para>The question to ask is: how large is <varname>struct S</varname>,
in bytes? An <varname>int</varname> is 4 bytes and a
<varname>char</varname> one byte, so perhaps a <varname>struct
S</varname> occupies 5 bytes? Wrong. All non-toy compilers we know
of will round the size of <varname>struct S</varname> up to a whole
number of words, in this case 8 bytes. Not doing this would force compilers
to generate truly appalling code for accessing arrays of
<varname>struct S</varname>'s on some architectures.</para>

<para>So <varname>s1</varname> occupies 8 bytes, yet only 5 of them will
be initialised.
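</para>

<para>The padding can be observed directly. The following is a minimal
sketch, not part of Memcheck: the helper
<function>padding_bytes</function> is a name chosen here purely for
illustration, and the sizes printed depend on the compiler and ABI
(8 and 3 would be typical where an <varname>int</varname> is 4
bytes).</para>
<programlisting><![CDATA[
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct S { int x; char c; };

/* Hypothetical helper: bytes of struct S covered by neither x nor c. */
static size_t padding_bytes(void)
{
    return sizeof(struct S) - sizeof(int) - sizeof(char);
}

int main(void)
{
    /* c has alignment 1, so in practice it sits immediately after x. */
    assert(offsetof(struct S, c) == sizeof(int));

    printf("sizeof(struct S) = %zu\n", sizeof(struct S));
    printf("padding bytes    = %zu\n", padding_bytes());
    return 0;
}
]]></programlisting>

<para>The member assignments <varname>s1.x = 42</varname> and
<varname>s1.c = 'z'</varname> never write the padding bytes, so the
V bits for those bytes stay undefined.</para>

<para>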
For the assignment <varname>s2 = s1</varname>, GCC 1091 generates code to copy all 8 bytes wholesale into <varname>s2</varname> 1092 without regard for their meaning. If Memcheck simply checked values as 1093 they came out of memory, it would yelp every time a structure assignment 1094 like this happened. So the more complicated behaviour described above 1095 is necessary. This allows GCC to copy 1096 <varname>s1</varname> into <varname>s2</varname> any way it likes, and a 1097 warning will only be emitted if the uninitialised values are later 1098 used.</para> 1099 1100 </sect2> 1101 1102 1103 <sect2 id="mc-manual.vaddress" xreflabel=" Valid-address (A) bits"> 1104 <title>Valid-address (A) bits</title> 1105 1106 <para>Notice that the previous subsection describes how the validity of 1107 values is established and maintained without having to say whether the 1108 program does or does not have the right to access any particular memory 1109 location. We now consider the latter question.</para> 1110 1111 <para>As described above, every bit in memory or in the CPU has an 1112 associated valid-value (V) bit. In addition, all bytes in memory, but 1113 not in the CPU, have an associated valid-address (A) bit. This 1114 indicates whether or not the program can legitimately read or write that 1115 location. It does not give any indication of the validity of the data 1116 at that location -- that's the job of the V bits -- only whether or not 1117 the location may be accessed.</para> 1118 1119 <para>Every time your program reads or writes memory, Memcheck checks 1120 the A bits associated with the address. If any of them indicate an 1121 invalid address, an error is emitted. Note that the reads and writes 1122 themselves do not change the A bits, only consult them.</para> 1123 1124 <para>So how do the A bits get set/cleared? 
Like this:</para> 1125 1126 <itemizedlist> 1127 <listitem> 1128 <para>When the program starts, all the global data areas are 1129 marked as accessible.</para> 1130 </listitem> 1131 1132 <listitem> 1133 <para>When the program does 1134 <function>malloc</function>/<computeroutput>new</computeroutput>, 1135 the A bits for exactly the area allocated, and not a byte more, 1136 are marked as accessible. Upon freeing the area the A bits are 1137 changed to indicate inaccessibility.</para> 1138 </listitem> 1139 1140 <listitem> 1141 <para>When the stack pointer register (<literal>SP</literal>) moves 1142 up or down, A bits are set. The rule is that the area from 1143 <literal>SP</literal> up to the base of the stack is marked as 1144 accessible, and below <literal>SP</literal> is inaccessible. (If 1145 that sounds illogical, bear in mind that the stack grows down, not 1146 up, on almost all Unix systems, including GNU/Linux.) Tracking 1147 <literal>SP</literal> like this has the useful side-effect that the 1148 section of stack used by a function for local variables etc is 1149 automatically marked accessible on function entry and inaccessible 1150 on exit.</para> 1151 </listitem> 1152 1153 <listitem> 1154 <para>When doing system calls, A bits are changed appropriately. 
1155 For example, <literal>mmap</literal> 1156 magically makes files appear in the process' 1157 address space, so the A bits must be updated if <literal>mmap</literal> 1158 succeeds.</para> 1159 </listitem> 1160 1161 <listitem> 1162 <para>Optionally, your program can tell Memcheck about such changes 1163 explicitly, using the client request mechanism described 1164 above.</para> 1165 </listitem> 1166 1167 </itemizedlist> 1168 1169 </sect2> 1170 1171 1172 <sect2 id="mc-manual.together" xreflabel="Putting it all together"> 1173 <title>Putting it all together</title> 1174 1175 <para>Memcheck's checking machinery can be summarised as 1176 follows:</para> 1177 1178 <itemizedlist> 1179 <listitem> 1180 <para>Each byte in memory has 8 associated V (valid-value) bits, 1181 saying whether or not the byte has a defined value, and a single A 1182 (valid-address) bit, saying whether or not the program currently has 1183 the right to read/write that address. As mentioned above, heavy 1184 use of compression means the overhead is typically around 25%.</para> 1185 </listitem> 1186 1187 <listitem> 1188 <para>When memory is read or written, the relevant A bits are 1189 consulted. If they indicate an invalid address, Memcheck emits an 1190 Invalid read or Invalid write error.</para> 1191 </listitem> 1192 1193 <listitem> 1194 <para>When memory is read into the CPU's registers, the relevant V 1195 bits are fetched from memory and stored in the simulated CPU. 
They
are not consulted.</para>
</listitem>

<listitem>
<para>When a register is written out to memory, the V bits for that
register are written back to memory too.</para>
</listitem>

<listitem>
<para>When values in CPU registers are used to generate a memory
address, or to determine the outcome of a conditional branch, the V
bits for those values are checked, and an error emitted if any of
them are undefined.</para>
</listitem>

<listitem>
<para>When values in CPU registers are used for any other purpose,
Memcheck computes the V bits for the result, but does not check
them.</para>
</listitem>

<listitem>
<para>Once the V bits for a value in the CPU have been checked, they
are then set to indicate validity. This avoids long chains of
errors.</para>
</listitem>

<listitem>
<para>When values are loaded from memory, Memcheck checks the A bits
for that location and issues an illegal-address warning if needed.
In that case, the V bits loaded are forced to indicate Valid,
despite the location being invalid.</para>

<para>This apparently strange choice reduces the amount of confusing
information presented to the user. It avoids the unpleasant
phenomenon in which memory is read from a place which is both
unaddressable and contains invalid values, and, as a result, you get
not only an invalid-address (read/write) error, but also a
potentially large set of uninitialised-value errors, one for every
time the value is used.</para>

<para>There is a hazy boundary case to do with multi-byte loads from
addresses which are partially valid and partially invalid. See the
description of the option <option>--partial-loads-ok</option> for details.
1240 </para> 1241 </listitem> 1242 1243 </itemizedlist> 1244 1245 1246 <para>Memcheck intercepts calls to <function>malloc</function>, 1247 <function>calloc</function>, <function>realloc</function>, 1248 <function>valloc</function>, <function>memalign</function>, 1249 <function>free</function>, <computeroutput>new</computeroutput>, 1250 <computeroutput>new[]</computeroutput>, 1251 <computeroutput>delete</computeroutput> and 1252 <computeroutput>delete[]</computeroutput>. The behaviour you get 1253 is:</para> 1254 1255 <itemizedlist> 1256 1257 <listitem> 1258 <para><function>malloc</function>/<function>new</function>/<computeroutput>new[]</computeroutput>: 1259 the returned memory is marked as addressable but not having valid 1260 values. This means you have to write to it before you can read 1261 it.</para> 1262 </listitem> 1263 1264 <listitem> 1265 <para><function>calloc</function>: returned memory is marked both 1266 addressable and valid, since <function>calloc</function> clears 1267 the area to zero.</para> 1268 </listitem> 1269 1270 <listitem> 1271 <para><function>realloc</function>: if the new size is larger than 1272 the old, the new section is addressable but invalid, as with 1273 <function>malloc</function>. If the new size is smaller, the 1274 dropped-off section is marked as unaddressable. You may only pass to 1275 <function>realloc</function> a pointer previously issued to you by 1276 <function>malloc</function>/<function>calloc</function>/<function>realloc</function>.</para> 1277 </listitem> 1278 1279 <listitem> 1280 <para><function>free</function>/<computeroutput>delete</computeroutput>/<computeroutput>delete[]</computeroutput>: 1281 you may only pass to these functions a pointer previously issued 1282 to you by the corresponding allocation function. Otherwise, 1283 Memcheck complains. If the pointer is indeed valid, Memcheck 1284 marks the entire area it points at as unaddressable, and places 1285 the block in the freed-blocks-queue. 
The aim is to defer as long
as possible reallocation of this block. Until that happens, all
attempts to access it will elicit an invalid-address error, as you
would hope.</para>
</listitem>

</itemizedlist>

</sect2>
</sect1>

<sect1 id="mc-manual.monitor-commands" xreflabel="Memcheck Monitor Commands">
<title>Memcheck Monitor Commands</title>
<para>The Memcheck tool provides monitor commands handled by Valgrind's
built-in gdbserver (see <xref linkend="manual-core-adv.gdbserver-commandhandling"/>).
</para>

<itemizedlist>
<listitem>
<para><varname>get_vbits &lt;addr&gt; [&lt;len&gt;]</varname>
shows the definedness (V) bits for &lt;len&gt; (default 1) bytes
starting at &lt;addr&gt;. The definedness of each byte in the
range is given using two hexadecimal digits. These hexadecimal
digits encode the validity of each bit of the corresponding byte,
using 0 if the bit is defined and 1 if the bit is undefined.
If a byte is not addressable, its validity bits are replaced
by <varname>__</varname> (a double underscore).
</para>
<para>
In the following example, <varname>string10</varname> is an array
of 10 characters in which the even-numbered bytes are
undefined, and the byte corresponding
to <varname>string10[5]</varname> is not addressable.
</para>
<programlisting><![CDATA[
(gdb) p &string10
$4 = (char (*)[10]) 0x8049e28
(gdb) monitor get_vbits 0x8049e28 10
ff00ff00 ff__ff00 ff00
(gdb)
]]></programlisting>

<para>The command get_vbits cannot be used with registers. To get
the validity bits of a register, you must start Valgrind with the
option <option>--vgdb-shadow-registers=yes</option>. The validity
bits of a register can then be obtained by printing the
corresponding 'shadow 1' register.
In the x86 example below, the register
eax has all its bits undefined, while the register ebx is fully
defined.
</para>
<programlisting><![CDATA[
(gdb) p /x $eaxs1
$9 = 0xffffffff
(gdb) p /x $ebxs1
$10 = 0x0
(gdb)
]]></programlisting>

</listitem>

<listitem>
<para><varname>make_memory
[noaccess|undefined|defined|Definedifaddressable] &lt;addr&gt;
[&lt;len&gt;]</varname> marks the range of &lt;len&gt; (default 1)
bytes at &lt;addr&gt; as having the given status. Parameter
<varname>noaccess</varname> marks the range as non-accessible, so
Memcheck will report an error on any access to it.
<varname>undefined</varname> or <varname>defined</varname> mark
the area as accessible, but Memcheck regards the bytes in it
respectively as having undefined or defined values.
<varname>Definedifaddressable</varname> marks as defined the bytes in
the range which are already addressable, but makes no change to
the status of bytes in the range which are not addressable. Note
that the first letter of <varname>Definedifaddressable</varname>
is an uppercase D to avoid confusion with <varname>defined</varname>.
</para>

<para>
In the following example, the first byte of
<varname>string10</varname> is marked as defined:
</para>
<programlisting><![CDATA[
(gdb) monitor make_memory defined 0x8049e28 1
(gdb) monitor get_vbits 0x8049e28 10
0000ff00 ff00ff00 ff00
(gdb)
]]></programlisting>
</listitem>

<listitem>
<para><varname>check_memory [addressable|defined] &lt;addr&gt;
[&lt;len&gt;]</varname> checks that the range of &lt;len&gt;
(default 1) bytes at &lt;addr&gt; has the specified accessibility.
It then outputs a description of &lt;addr&gt;.
In the following
example, a detailed description is available because the
option <option>--read-var-info=yes</option> was given at Valgrind
startup:
</para>
<programlisting><![CDATA[
(gdb) monitor check_memory defined 0x8049e28 1
Address 0x8049E28 len 1 defined
==14698== Location 0x8049e28 is 0 bytes inside string10[0],
==14698== declared at prog.c:10, in frame #0 of thread 1
(gdb)
]]></programlisting>
</listitem>

<listitem>
<para><varname>leak_check [full*|summary]
[reachable|possibleleak*|definiteleak]
[increased*|changed|any]
[unlimited*|limited &lt;max_loss_records_output&gt;]
</varname>
performs a leak check. The <varname>*</varname> in the arguments
indicates the default values.</para>

<para>If the first argument is <varname>summary</varname>, only a
summary of the leak search is given; otherwise a full leak report
is produced. A full leak report gives detailed information for
each leak: the stack trace where the leaked blocks were allocated,
the number of blocks leaked and their total size. When a full
report is requested, the next two arguments further specify what
kind of leaks to report. A leak's details are shown if they match
both the second and third argument. A full leak report might
output detailed information for many leaks. The number of leaks for
which information is output can be controlled using
the <varname>limited</varname> argument followed by the maximum number
of leak records to output. If this maximum is reached, the leak
search outputs the records with the biggest number of bytes.
</para>

<para>The second argument controls what kind of blocks are shown for
a <varname>full</varname> leak search. The
value <varname>definiteleak</varname> specifies that only
definitely leaked blocks should be shown.
The
value <varname>possibleleak</varname> will also show possibly
leaked blocks (those for which only an interior pointer was
found). The value
<varname>reachable</varname> will show all block categories
(reachable, possibly leaked, definitely leaked).
</para>

<para>The third argument controls what kinds of changes are shown
for a <varname>full</varname> leak search. The
value <varname>increased</varname> specifies that only block
allocation stacks with an increased number of leaked bytes or
blocks since the previous leak check should be shown. The
value <varname>changed</varname> specifies that allocation stacks
with any change since the previous leak check should be shown.
The value <varname>any</varname> specifies that all leak entries
should be shown, regardless of any increase or decrease.
If <varname>increased</varname> or <varname>changed</varname> is
specified, the leak report entries will show the delta relative to
the previous leak report.
</para>

<para>The following example shows usage of the
<varname>leak_check</varname> monitor command on
the <varname>memcheck/tests/leak-cases.c</varname> regression
test. The first command outputs one entry having an increase in
the leaked bytes. The second command is the same as the first
command, but uses the abbreviated forms accepted by GDB and the
Valgrind gdbserver.
It only outputs the summary information, as 1448 there was no increase since the previous leak search.</para> 1449 <programlisting><![CDATA[ 1450 (gdb) monitor leak_check full possibleleak increased 1451 ==19520== 16 (+16) bytes in 1 (+1) blocks are possibly lost in loss record 9 of 12 1452 ==19520== at 0x40070B4: malloc (vg_replace_malloc.c:263) 1453 ==19520== by 0x80484D5: mk (leak-cases.c:52) 1454 ==19520== by 0x804855F: f (leak-cases.c:81) 1455 ==19520== by 0x80488E0: main (leak-cases.c:107) 1456 ==19520== 1457 ==19520== LEAK SUMMARY: 1458 ==19520== definitely lost: 32 (+0) bytes in 2 (+0) blocks 1459 ==19520== indirectly lost: 16 (+0) bytes in 1 (+0) blocks 1460 ==19520== possibly lost: 32 (+16) bytes in 2 (+1) blocks 1461 ==19520== still reachable: 96 (+16) bytes in 6 (+1) blocks 1462 ==19520== suppressed: 0 (+0) bytes in 0 (+0) blocks 1463 ==19520== Reachable blocks (those to which a pointer was found) are not shown. 1464 ==19520== To see them, add 'reachable any' args to leak_check 1465 ==19520== 1466 (gdb) mo l 1467 ==19520== LEAK SUMMARY: 1468 ==19520== definitely lost: 32 (+0) bytes in 2 (+0) blocks 1469 ==19520== indirectly lost: 16 (+0) bytes in 1 (+0) blocks 1470 ==19520== possibly lost: 32 (+0) bytes in 2 (+0) blocks 1471 ==19520== still reachable: 96 (+0) bytes in 6 (+0) blocks 1472 ==19520== suppressed: 0 (+0) bytes in 0 (+0) blocks 1473 ==19520== Reachable blocks (those to which a pointer was found) are not shown. 1474 ==19520== To see them, add 'reachable any' args to leak_check 1475 ==19520== 1476 (gdb) 1477 ]]></programlisting> 1478 <para>Note that when using Valgrind's gdbserver, it is not 1479 necessary to rerun 1480 with <option>--leak-check=full</option> 1481 <option>--show-reachable=yes</option> to see the reachable 1482 blocks. 
You can obtain the same information without rerunning by
using the GDB command <computeroutput>monitor leak_check full
reachable any</computeroutput> (or, using
abbreviations: <computeroutput>mo l f r a</computeroutput>).
</para>
</listitem>

<listitem>
<para><varname>block_list &lt;loss_record_nr&gt; </varname>
shows the list of blocks belonging to &lt;loss_record_nr&gt;.
</para>

<para>A leak search merges the allocated blocks into loss records:
a loss record groups all blocks having the same state (for
example, Definitely Lost) and the same allocation backtrace.
Each loss record is identified in the leak search result
by a loss record number.
The <varname>block_list</varname> command shows the loss record information
followed by the addresses and sizes of the blocks which have been
merged into the loss record.
</para>

<para>If a directly lost block causes some other blocks to be indirectly
lost, the block_list command will also show these indirectly lost blocks.
The indirectly lost blocks will be indented according to the level of indirection
between the directly lost block and the indirectly lost block(s).
Each indirectly lost block is followed by the reference of its loss record.
</para>

<para>The block_list command can be used on the results of a leak search as long
as no block has been freed after this leak search: as soon as the program frees
a block, a new leak search is needed before block_list can be used again.
</para>

<para>
In the example below, the program leaks a tree structure by losing the pointer to
the block A (top of the tree).
So, the block A is directly lost, causing an indirect
loss of blocks B to G. The first block_list command shows the loss record of A
(a definitely lost block with address 0x4028028, size 16).
The addresses and sizes
of the indirectly lost blocks due to block A are shown below block A.
The second command shows the details of one of the indirect loss records output
by the first command.
</para>
<programlisting><![CDATA[
           A
         /   \
        B     C
       / \   / \
      D   E F   G
]]></programlisting>

<programlisting><![CDATA[
(gdb) bt
#0  main () at leak-tree.c:69
(gdb) monitor leak_check full any
==19552== 112 (16 direct, 96 indirect) bytes in 1 blocks are definitely lost in loss record 7 of 7
==19552==    at 0x40070B4: malloc (vg_replace_malloc.c:263)
==19552==    by 0x80484D5: mk (leak-tree.c:28)
==19552==    by 0x80484FC: f (leak-tree.c:41)
==19552==    by 0x8048856: main (leak-tree.c:63)
==19552== 
==19552== LEAK SUMMARY:
==19552==    definitely lost: 16 bytes in 1 blocks
==19552==    indirectly lost: 96 bytes in 6 blocks
==19552==      possibly lost: 0 bytes in 0 blocks
==19552==    still reachable: 0 bytes in 0 blocks
==19552==         suppressed: 0 bytes in 0 blocks
==19552== 
(gdb) monitor block_list 7
==19552== 112 (16 direct, 96 indirect) bytes in 1 blocks are definitely lost in loss record 7 of 7
==19552==    at 0x40070B4: malloc (vg_replace_malloc.c:263)
==19552==    by 0x80484D5: mk (leak-tree.c:28)
==19552==    by 0x80484FC: f (leak-tree.c:41)
==19552==    by 0x8048856: main (leak-tree.c:63)
==19552== 0x4028028[16]
==19552==   0x4028068[16] indirect loss record 1
==19552==      0x40280E8[16] indirect loss record 3
==19552==      0x4028128[16] indirect loss record 4
==19552==   0x40280A8[16] indirect loss record 2
==19552==      0x4028168[16] indirect loss record 5
==19552==      0x40281A8[16] indirect loss record 6
(gdb) mo b 2
==19552== 16 bytes in 1 blocks are indirectly lost in loss record 2 of 7
==19552==    at 0x40070B4: malloc (vg_replace_malloc.c:263)
==19552==    by 0x80484D5: mk (leak-tree.c:28)
==19552==    by 0x8048519: f (leak-tree.c:43)
==19552==    by 0x8048856: main (leak-tree.c:63)
==19552== 0x40280A8[16]
==19552==   0x4028168[16] indirect loss record 5
==19552==   0x40281A8[16] indirect loss record 6
(gdb) 

]]></programlisting>

</listitem>

<listitem>
<para><varname>who_points_at &lt;addr&gt; [&lt;len&gt;]</varname>
shows all the locations where a pointer to addr is found.
If len is equal to 1, the command only shows the locations pointing
exactly at addr (i.e. the "start pointers" to addr).
If len is &gt; 1, "interior pointers" pointing at the len first bytes
will also be shown.
</para>

<para>The locations searched for are the same as the locations
used in the leak search. So, <varname>who_points_at</varname> can be used,
among other things, to show why the leak search can still reach a block, or to
search for dangling pointers to a freed block.
Each location pointing at addr (or pointing inside addr if interior pointers
are being searched for) will be described.
</para>

<para>In the example below, the pointers to the 'tree block A' (see the example
for the command <varname>block_list</varname>) are shown before the tree was leaked.
The descriptions are detailed because the option <option>--read-var-info=yes</option>
was given at Valgrind startup. The second call shows the pointers (start and interior
pointers) to block G. The block G (0x40281A8) is reachable via block C (0x40280a8)
and register ECX of tid 1 (tid is the Valgrind thread id).
It is "interior reachable" via the register EBX.
</para>

<programlisting><![CDATA[
(gdb) monitor who_points_at 0x4028028
==20852== Searching for pointers to 0x4028028
==20852== *0x8049e20 points at 0x4028028
==20852==  Location 0x8049e20 is 0 bytes inside global var "t"
==20852==  declared at leak-tree.c:35
(gdb) monitor who_points_at 0x40281A8 16
==20852== Searching for pointers pointing in 16 bytes from 0x40281a8
==20852== *0x40280ac points at 0x40281a8
==20852==  Address 0x40280ac is 4 bytes inside a block of size 16 alloc'd
==20852==    at 0x40070B4: malloc (vg_replace_malloc.c:263)
==20852==    by 0x80484D5: mk (leak-tree.c:28)
==20852==    by 0x8048519: f (leak-tree.c:43)
==20852==    by 0x8048856: main (leak-tree.c:63)
==20852== tid 1 register ECX points at 0x40281a8
==20852== tid 1 register EBX interior points at 2 bytes inside 0x40281a8
(gdb)
]]></programlisting>
</listitem>


</itemizedlist>

</sect1>

<sect1 id="mc-manual.clientreqs" xreflabel="Client requests">
<title>Client Requests</title>

<para>The following client requests are defined in
<filename>memcheck.h</filename>.
See <filename>memcheck.h</filename> for exact details of their
arguments.</para>

<itemizedlist>

<listitem>
<para><varname>VALGRIND_MAKE_MEM_NOACCESS</varname>,
<varname>VALGRIND_MAKE_MEM_UNDEFINED</varname> and
<varname>VALGRIND_MAKE_MEM_DEFINED</varname>.
These mark address ranges as completely inaccessible,
accessible but containing undefined data, and accessible and
containing defined data, respectively.</para>
</listitem>

<listitem>
<para><varname>VALGRIND_MAKE_MEM_DEFINED_IF_ADDRESSABLE</varname>.
This is just like <varname>VALGRIND_MAKE_MEM_DEFINED</varname> but only
affects those bytes that are already addressable.</para>
</listitem>

<listitem>
<para><varname>VALGRIND_CHECK_MEM_IS_ADDRESSABLE</varname> and
<varname>VALGRIND_CHECK_MEM_IS_DEFINED</varname>: check immediately
whether or not the given address range has the relevant property,
and if not, print an error message. For the convenience of
the client, they also return zero if the relevant property holds; otherwise,
the returned value is the address of the first byte for which the
property is not true. They always return 0 when not run on
Valgrind.</para>
</listitem>

<listitem>
<para><varname>VALGRIND_CHECK_VALUE_IS_DEFINED</varname>: a quick and easy
way to find out whether Valgrind thinks a particular value
(lvalue, to be precise) is addressable and defined. Prints an error
message if not. It has no return value.</para>
</listitem>

<listitem>
<para><varname>VALGRIND_DO_LEAK_CHECK</varname>: does a full memory leak
check (like <option>--leak-check=full</option>) right now.
This is useful for incrementally checking for leaks between arbitrary
places in the program's execution. It has no return value.</para>
</listitem>

<listitem>
<para><varname>VALGRIND_DO_ADDED_LEAK_CHECK</varname>: same as
<varname>VALGRIND_DO_LEAK_CHECK</varname> but only shows the
entries for which there was an increase in leaked bytes or leaked
number of blocks since the previous leak search. It has no return
value.</para>
</listitem>

<listitem>
<para><varname>VALGRIND_DO_CHANGED_LEAK_CHECK</varname>: same as
<varname>VALGRIND_DO_LEAK_CHECK</varname> but only shows the
entries for which there was an increase or decrease in leaked
bytes or leaked number of blocks since the previous leak search.
It has no return value.</para>
</listitem>

<listitem>
<para><varname>VALGRIND_DO_QUICK_LEAK_CHECK</varname>: like
<varname>VALGRIND_DO_LEAK_CHECK</varname>, except it produces only a leak
summary (like <option>--leak-check=summary</option>).
It has no return value.</para>
</listitem>

<listitem>
<para><varname>VALGRIND_COUNT_LEAKS</varname>: fills in the four
arguments with the number of bytes of memory found by the previous
leak check to be leaked (i.e. the sum of direct leaks and indirect leaks),
dubious, reachable and suppressed. This is useful in test harness code,
after calling <varname>VALGRIND_DO_LEAK_CHECK</varname> or
<varname>VALGRIND_DO_QUICK_LEAK_CHECK</varname>.</para>
</listitem>

<listitem>
<para><varname>VALGRIND_COUNT_LEAK_BLOCKS</varname>: identical to
<varname>VALGRIND_COUNT_LEAKS</varname> except that it returns the
number of blocks rather than the number of bytes in each
category.</para>
</listitem>

<listitem>
<para><varname>VALGRIND_GET_VBITS</varname> and
<varname>VALGRIND_SET_VBITS</varname>: allow you to get and set the
V (validity) bits for an address range. You should probably only
set V bits that you have got with
<varname>VALGRIND_GET_VBITS</varname>. Only for those who really
know what they are doing.</para>
</listitem>

<listitem>
<para><varname>VALGRIND_CREATE_BLOCK</varname> and
<varname>VALGRIND_DISCARD</varname>. <varname>VALGRIND_CREATE_BLOCK</varname>
takes an address, a number of bytes and a character string. The
specified address range is then associated with that string. When
Memcheck reports an invalid access to an address in the range, it
will describe it in terms of this block rather than in terms of
any other block it knows about.
Note that the use of this macro
does not actually change the state of memory in any way -- it
merely gives a name for the range.
</para>

<para>At some point you may want Memcheck to stop reporting errors
in terms of the block named
by <varname>VALGRIND_CREATE_BLOCK</varname>. To make this
possible, <varname>VALGRIND_CREATE_BLOCK</varname> returns a
"block handle", which is a C <varname>int</varname> value. You
can pass this block handle to <varname>VALGRIND_DISCARD</varname>.
After doing so, Valgrind will no longer relate addressing errors
in the specified range to the block. Passing invalid handles to
<varname>VALGRIND_DISCARD</varname> is harmless.
</para>
</listitem>

</itemizedlist>

</sect1>




<sect1 id="mc-manual.mempools" xreflabel="Memory Pools">
<title>Memory Pools: describing and working with custom allocators</title>

<para>Some programs use custom memory allocators, often for performance
reasons. Left to itself, Memcheck is unable to understand the
behaviour of custom allocation schemes as well as it understands the
standard allocators, and so may miss errors and leaks in your program.
This section describes a way to give Memcheck enough of a description of
your custom allocator that it can make at least some sense of what is
happening.</para>

<para>There are many different sorts of custom allocator, so Memcheck
attempts to reason about them using a loose, abstract model. We
use the following terminology when describing custom allocation
systems:</para>

<itemizedlist>
<listitem>
<para>Custom allocation involves a set of independent "memory pools".
</para>
</listitem>
<listitem>
<para>Memcheck's notion of a memory pool consists of a single "anchor
address" and a set of non-overlapping "chunks" associated with the
anchor address.</para>
</listitem>
<listitem>
<para>Typically a pool's anchor address is the address of a
book-keeping "header" structure.</para>
</listitem>
<listitem>
<para>Typically the pool's chunks are drawn from a contiguous
"superblock" acquired through the system
<function>malloc</function> or
<function>mmap</function>.</para>
</listitem>

</itemizedlist>

<para>Keep in mind that the last two points above say "typically": the
Valgrind mempool client request API is intentionally vague about the
exact structure of a mempool. There is no specific mention made of
headers or superblocks. Nevertheless, the following picture may help
elucidate the intention of the terms in the API:</para>

<programlisting><![CDATA[
   "pool"
   (anchor address)
   |
   v
   +--------+---+
   | header | o |
   +--------+-|-+
              |
              v                  superblock
              +------+---+--------------+---+------------------+
              |      |rzB|  allocation  |rzB|                  |
              +------+---+--------------+---+------------------+
                         ^              ^
                         |              |
                       "addr"     "addr"+"size"
]]></programlisting>

<para>
Note that the header and the superblock may be contiguous or
discontiguous, and there may be multiple superblocks associated with a
single header; such variations are opaque to Memcheck.
The API
only requires that your allocation scheme can present sensible values
of "pool", "addr" and "size".</para>

<para>
Typically, before making client requests related to mempools, a client
program will have allocated such a header and superblock for its
mempool, and marked the superblock NOACCESS using the
<varname>VALGRIND_MAKE_MEM_NOACCESS</varname> client request.</para>

<para>
When dealing with mempools, the goal is to maintain a particular
invariant condition: that Memcheck believes the unallocated portions
of the pool's superblock (including redzones) are NOACCESS. To
maintain this invariant, the client program must ensure that the
superblock starts out in that state; Memcheck cannot make it so, since
Memcheck never explicitly learns about the superblock of a pool, only
the allocated chunks within the pool.</para>

<para>
Once the header and superblock for a pool are established and properly
marked, there are a number of client requests programs can use to
inform Memcheck about changes to the state of a mempool:</para>

<itemizedlist>

<listitem>
<para>
<varname>VALGRIND_CREATE_MEMPOOL(pool, rzB, is_zeroed)</varname>:
This request registers the address <varname>pool</varname> as the anchor
address for a memory pool. It also provides a size
<varname>rzB</varname>, specifying how large the redzones placed around
chunks allocated from the pool should be. Finally, it provides an
<varname>is_zeroed</varname> argument that specifies whether the pool's
chunks are zeroed (more precisely: defined) when allocated.
</para>
<para>
Upon completion of this request, no chunks are associated with the
pool. The request simply tells Memcheck that the pool exists, so that
subsequent calls can refer to it as a pool.
</para>
</listitem>

<listitem>
<para><varname>VALGRIND_DESTROY_MEMPOOL(pool)</varname>:
This request tells Memcheck that a pool is being torn down. Memcheck
then removes all records of chunks associated with the pool, as well
as its record of the pool's existence. While destroying its records of
a mempool, Memcheck resets the redzones of any live chunks in the pool
to NOACCESS.
</para>
</listitem>

<listitem>
<para><varname>VALGRIND_MEMPOOL_ALLOC(pool, addr, size)</varname>:
This request informs Memcheck that a <varname>size</varname>-byte chunk
has been allocated at <varname>addr</varname>, and associates the chunk with the
specified
<varname>pool</varname>. If the pool was created with nonzero
<varname>rzB</varname> redzones, Memcheck will mark the
<varname>rzB</varname> bytes before and after the chunk as NOACCESS. If
the pool was created with the <varname>is_zeroed</varname> argument set,
Memcheck will mark the chunk as DEFINED, otherwise Memcheck will mark
the chunk as UNDEFINED.
</para>
</listitem>

<listitem>
<para><varname>VALGRIND_MEMPOOL_FREE(pool, addr)</varname>:
This request informs Memcheck that the chunk at <varname>addr</varname>
should no longer be considered allocated. Memcheck will mark the chunk
associated with <varname>addr</varname> as NOACCESS, and delete its
record of the chunk's existence.
</para>
</listitem>

<listitem>
<para><varname>VALGRIND_MEMPOOL_TRIM(pool, addr, size)</varname>:
This request trims the chunks associated with <varname>pool</varname>.
The request only operates on chunks associated with
<varname>pool</varname>.
Trimming is formally defined as:</para>
<itemizedlist>
<listitem>
<para>All chunks entirely inside the range
<varname>addr..(addr+size-1)</varname> are preserved.</para>
</listitem>
<listitem>
<para>All chunks entirely outside the range
<varname>addr..(addr+size-1)</varname> are discarded, as though
<varname>VALGRIND_MEMPOOL_FREE</varname> had been called on them.</para>
</listitem>
<listitem>
<para>All other chunks must intersect with the range
<varname>addr..(addr+size-1)</varname>; areas outside the
intersection are marked as NOACCESS, as though they had been
independently freed with
<varname>VALGRIND_MEMPOOL_FREE</varname>.</para>
</listitem>
</itemizedlist>
<para>This is a somewhat rare request, but can be useful in
implementing the type of mass-free operations common in custom
LIFO allocators.</para>
</listitem>

<listitem>
<para><varname>VALGRIND_MOVE_MEMPOOL(poolA, poolB)</varname>: This
request informs Memcheck that the pool previously anchored at
address <varname>poolA</varname> has moved to anchor address
<varname>poolB</varname>. This is a rare request, typically only needed
if you <function>realloc</function> the header of a mempool.</para>
<para>No memory-status bits are altered by this request.</para>
</listitem>

<listitem>
<para>
<varname>VALGRIND_MEMPOOL_CHANGE(pool, addrA, addrB,
size)</varname>: This request informs Memcheck that the chunk
previously allocated at address <varname>addrA</varname> within
<varname>pool</varname> has been moved and/or resized, and should be
changed to cover the region <varname>addrB..(addrB+size-1)</varname>. This
is a rare request, typically only needed if you
<function>realloc</function> a superblock or wish to extend a chunk
without changing its memory-status bits.
</para>
<para>No memory-status bits are altered by this request.
</para>
</listitem>

<listitem>
<para><varname>VALGRIND_MEMPOOL_EXISTS(pool)</varname>:
This request informs the caller whether or not Memcheck is currently
tracking a mempool at anchor address <varname>pool</varname>. It
evaluates to 1 when there is a mempool associated with that address, 0
otherwise. This is a rare request, only useful in circumstances when
client code might have lost track of the set of active mempools.
</para>
</listitem>

</itemizedlist>

</sect1>







<sect1 id="mc-manual.mpiwrap" xreflabel="MPI Wrappers">
<title>Debugging MPI Parallel Programs with Valgrind</title>

<para>Memcheck supports debugging of distributed-memory applications
which use the MPI message passing standard. This support consists of a
library of wrapper functions for the
<computeroutput>PMPI_*</computeroutput> interface. When incorporated
into the application's address space, either by direct linking or by
<computeroutput>LD_PRELOAD</computeroutput>, the wrappers intercept
calls to <computeroutput>PMPI_Send</computeroutput>,
<computeroutput>PMPI_Recv</computeroutput>, etc. They then
use client requests to inform Memcheck of memory state changes caused
by the function being wrapped.
This reduces the number of false
positives that Memcheck otherwise typically reports for MPI
applications.</para>

<para>The wrappers also take the opportunity to carefully check
size and definedness of buffers passed as arguments to MPI functions, hence
detecting errors such as passing undefined data to
<computeroutput>PMPI_Send</computeroutput>, or receiving data into a
buffer which is too small.</para>

<para>Unlike most of the rest of Valgrind, the wrapper library is subject to a
BSD-style license, so you can link it into any code base you like.
See the top of <computeroutput>mpi/libmpiwrap.c</computeroutput>
for license details.</para>


<sect2 id="mc-manual.mpiwrap.build" xreflabel="Building MPI Wrappers">
<title>Building and installing the wrappers</title>

<para>The wrapper library will be built automatically if possible.
Valgrind's configure script will look for a suitable
<computeroutput>mpicc</computeroutput> to build it with. This must be
the same <computeroutput>mpicc</computeroutput> you use to build the
MPI application you want to debug. By default, Valgrind tries
<computeroutput>mpicc</computeroutput>, but you can specify a
different one by using the configure-time option
<option>--with-mpicc</option>. Currently the
wrappers are only buildable with
<computeroutput>mpicc</computeroutput>s which are based on GNU
GCC or Intel's C++ Compiler.</para>

<para>Check that the configure script prints a line like this:</para>

<programlisting><![CDATA[
checking for usable MPI2-compliant mpicc and mpi.h... yes, mpicc
]]></programlisting>

<para>If it says <computeroutput>...
no</computeroutput>, your
<computeroutput>mpicc</computeroutput> has failed to compile and link
a test MPI2 program.</para>

<para>If the configure test succeeds, continue in the usual way with
<computeroutput>make</computeroutput> and <computeroutput>make
install</computeroutput>. The final install tree should then contain
<computeroutput>libmpiwrap-&lt;platform&gt;.so</computeroutput>.
</para>

<para>Compile a test MPI program (e.g. an MPI hello-world) and try
this:</para>

<programlisting><![CDATA[
LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-<platform>.so   \
           mpirun [args] $prefix/bin/valgrind ./hello
]]></programlisting>

<para>You should see something similar to the following</para>

<programlisting><![CDATA[
valgrind MPI wrappers 31901: Active for pid 31901
valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options
]]></programlisting>

<para>repeated for every process in the group. If you do not see
these, there is a build/installation problem of some kind.</para>

<para>The MPI functions to be wrapped are assumed to be in an ELF
shared object with soname matching
<computeroutput>libmpi.so*</computeroutput>.
This is known to be
correct at least for Open MPI and Quadrics MPI, and can easily be
changed if required.</para>
</sect2>


<sect2 id="mc-manual.mpiwrap.gettingstarted"
       xreflabel="Getting started with MPI Wrappers">
<title>Getting started</title>

<para>Compile your MPI application as usual, taking care to link it
using the same <computeroutput>mpicc</computeroutput> that your
Valgrind build was configured with.</para>

<para>
Use the following basic scheme to run your application on Valgrind with
the wrappers engaged:</para>

<programlisting><![CDATA[
MPIWRAP_DEBUG=[wrapper-args]                                  \
   LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-<platform>.so   \
   mpirun [mpirun-args]                                       \
   $prefix/bin/valgrind [valgrind-args]                       \
   [application] [app-args]
]]></programlisting>

<para>As an alternative to
<computeroutput>LD_PRELOAD</computeroutput>ing
<computeroutput>libmpiwrap-&lt;platform&gt;.so</computeroutput>, you can
simply link it to your application if desired. This should not disturb
native behaviour of your application in any way.</para>
</sect2>


<sect2 id="mc-manual.mpiwrap.controlling"
       xreflabel="Controlling the MPI Wrappers">
<title>Controlling the wrapper library</title>

<para>Environment variable
<computeroutput>MPIWRAP_DEBUG</computeroutput> is consulted at
startup. The default behaviour is to print a starting banner</para>

<programlisting><![CDATA[
valgrind MPI wrappers 16386: Active for pid 16386
valgrind MPI wrappers 16386: Try MPIWRAP_DEBUG=help for possible options
]]></programlisting>

<para> and then be relatively quiet.</para>

<para>You can give a list of comma-separated options in
<computeroutput>MPIWRAP_DEBUG</computeroutput>.
These are:</para>

<itemizedlist>
<listitem>
<para><computeroutput>verbose</computeroutput>:
show entries/exits of all wrappers. Also show extra
debugging info, such as the status of outstanding
<computeroutput>MPI_Request</computeroutput>s resulting
from uncompleted <computeroutput>MPI_Irecv</computeroutput>s.</para>
</listitem>
<listitem>
<para><computeroutput>quiet</computeroutput>:
opposite of <computeroutput>verbose</computeroutput>, only print
anything when the wrappers want
to report a detected programming error, or in case of catastrophic
failure of the wrappers.</para>
</listitem>
<listitem>
<para><computeroutput>warn</computeroutput>:
by default, functions which lack proper wrappers
are not commented on, just silently
ignored. This option causes a warning to be printed for each unwrapped
function used, up to a maximum of three warnings per function.</para>
</listitem>
<listitem>
<para><computeroutput>strict</computeroutput>:
print an error message and abort the program if
a function lacking a wrapper is used.</para>
</listitem>
</itemizedlist>

<para>If you want to use Valgrind's XML output facility
(<option>--xml=yes</option>), you should pass
<computeroutput>quiet</computeroutput> in
<computeroutput>MPIWRAP_DEBUG</computeroutput> so as to get rid of any
extraneous printing from the wrappers.</para>

</sect2>


<sect2 id="mc-manual.mpiwrap.limitations.functions"
       xreflabel="Functions: Abilities and Limitations">
<title>Functions</title>

<para>All MPI2 functions except
<computeroutput>MPI_Wtick</computeroutput>,
<computeroutput>MPI_Wtime</computeroutput> and
<computeroutput>MPI_Pcontrol</computeroutput> have wrappers.
The
first two are not wrapped because they return a
<computeroutput>double</computeroutput>, which Valgrind's
function-wrap mechanism cannot handle (but it could easily be
extended to do so). <computeroutput>MPI_Pcontrol</computeroutput> cannot be
wrapped as it has variable arity:
<computeroutput>int MPI_Pcontrol(const int level, ...)</computeroutput></para>

<para>Most functions are wrapped with a default wrapper which does
nothing except complain or abort if it is called, depending on
settings in <computeroutput>MPIWRAP_DEBUG</computeroutput> listed
above. The following functions have "real", do-something-useful
wrappers:</para>

<programlisting><![CDATA[
PMPI_Send PMPI_Bsend PMPI_Ssend PMPI_Rsend

PMPI_Recv PMPI_Get_count

PMPI_Isend PMPI_Ibsend PMPI_Issend PMPI_Irsend

PMPI_Irecv
PMPI_Wait PMPI_Waitall
PMPI_Test PMPI_Testall

PMPI_Iprobe PMPI_Probe

PMPI_Cancel

PMPI_Sendrecv

PMPI_Type_commit PMPI_Type_free

PMPI_Pack PMPI_Unpack

PMPI_Bcast PMPI_Gather PMPI_Scatter PMPI_Alltoall
PMPI_Reduce PMPI_Allreduce PMPI_Op_create

PMPI_Comm_create PMPI_Comm_dup PMPI_Comm_free PMPI_Comm_rank PMPI_Comm_size

PMPI_Error_string
PMPI_Init PMPI_Initialized PMPI_Finalize
]]></programlisting>

<para>A few functions such as
<computeroutput>PMPI_Address</computeroutput> are listed as
<computeroutput>HAS_NO_WRAPPER</computeroutput>. They have no wrapper
at all as there is nothing worth checking, and giving a no-op wrapper
would reduce performance for no reason.</para>

<para>Note that the wrapper library can itself generate large
numbers of calls to the MPI implementation, especially when walking
complex types.
The most common functions called are
<computeroutput>PMPI_Extent</computeroutput>,
<computeroutput>PMPI_Type_get_envelope</computeroutput>,
<computeroutput>PMPI_Type_get_contents</computeroutput>, and
<computeroutput>PMPI_Type_free</computeroutput>.</para>
</sect2>

<sect2 id="mc-manual.mpiwrap.limitations.types"
       xreflabel="Types: Abilities and Limitations">
<title>Types</title>

<para>MPI-1.1 structured types are supported, and walked exactly.
The currently supported combiners are
<computeroutput>MPI_COMBINER_NAMED</computeroutput>,
<computeroutput>MPI_COMBINER_CONTIGUOUS</computeroutput>,
<computeroutput>MPI_COMBINER_VECTOR</computeroutput>,
<computeroutput>MPI_COMBINER_HVECTOR</computeroutput>,
<computeroutput>MPI_COMBINER_INDEXED</computeroutput>,
<computeroutput>MPI_COMBINER_HINDEXED</computeroutput> and
<computeroutput>MPI_COMBINER_STRUCT</computeroutput>. This should
cover all MPI-1.1 types. The mechanism (function
<computeroutput>walk_type</computeroutput>) should extend easily to
cover MPI2 combiners.</para>

<para>MPI defines some named structured types
(<computeroutput>MPI_FLOAT_INT</computeroutput>,
<computeroutput>MPI_DOUBLE_INT</computeroutput>,
<computeroutput>MPI_LONG_INT</computeroutput>,
<computeroutput>MPI_2INT</computeroutput>,
<computeroutput>MPI_SHORT_INT</computeroutput>,
<computeroutput>MPI_LONG_DOUBLE_INT</computeroutput>) which are pairs
of some basic type and a C <computeroutput>int</computeroutput>.
Unfortunately the MPI specification makes it impossible to look inside
these types and see where the fields are. Therefore these wrappers
assume the types are laid out as <computeroutput>struct { float val;
int loc; }</computeroutput> (for
<computeroutput>MPI_FLOAT_INT</computeroutput>), etc., and act
accordingly.
This appears to be correct at least for Open MPI 1.0.2
and for Quadrics MPI.</para>

<para>If <computeroutput>strict</computeroutput> is an option specified
in <computeroutput>MPIWRAP_DEBUG</computeroutput>, the application
will abort if an unhandled type is encountered. Otherwise, the
application will print a warning message and continue.</para>

<para>Some effort is made to mark/check memory ranges corresponding to
arrays of values in a single pass. This is important for performance
since asking Valgrind to mark/check any range, no matter how small,
carries quite a large constant cost. This optimisation is applied to
arrays of primitive types (<computeroutput>double</computeroutput>,
<computeroutput>float</computeroutput>,
<computeroutput>int</computeroutput>,
<computeroutput>long</computeroutput>, <computeroutput>long
long</computeroutput>, <computeroutput>short</computeroutput>,
<computeroutput>char</computeroutput>, and <computeroutput>long
double</computeroutput> on platforms where <computeroutput>sizeof(long
double) == 8</computeroutput>). For arrays of all other types, the
wrappers handle each element individually and so there can be a very
large performance cost.</para>

</sect2>


<sect2 id="mc-manual.mpiwrap.writingwrappers"
       xreflabel="Writing new MPI Wrappers">
<title>Writing new wrappers</title>

<para>
For the most part the wrappers are straightforward. The only
significant complexity arises with nonblocking receives.</para>

<para>The issue is that <computeroutput>MPI_Irecv</computeroutput>
states the recv buffer and returns immediately, giving a handle
(<computeroutput>MPI_Request</computeroutput>) for the transaction.
Later the user will have to poll for completion with
<computeroutput>MPI_Wait</computeroutput> or a similar call, and when
the transaction completes successfully, the wrappers have to paint the
receive buffer, that is, mark it as initialised.  But the receive
buffer details are not presented to
<computeroutput>MPI_Wait</computeroutput>; only the handle is.  The
library therefore maintains a shadow table which associates
uncompleted <computeroutput>MPI_Request</computeroutput>s with the
corresponding buffer address/count/type.  When an operation completes,
the table is searched for the associated address/count/type info, and
memory is marked accordingly.</para>

<para>Access to the table is guarded by a (POSIX pthreads) lock, so as
to make the library thread-safe.</para>

<para>The table is allocated with
<computeroutput>malloc</computeroutput> and never
<computeroutput>free</computeroutput>d, so it will show up in leak
checks.</para>

<para>Writing new wrappers should be fairly easy.  The source file is
<computeroutput>mpi/libmpiwrap.c</computeroutput>.  If possible,
find an existing wrapper for a function of similar behaviour to the
one you want to wrap, and use it as a starting point.  The wrappers
are organised in sections in the same order as the MPI-1.1
specification, to aid navigation.  When adding a wrapper, remember to
comment out the definition of the default wrapper in the long list of
defaults at the bottom of the file (do not remove it, just comment it
out).</para>
</sect2>

<sect2 id="mc-manual.mpiwrap.whattoexpect"
       xreflabel="What to expect with MPI Wrappers">
<title>What to expect when using the wrappers</title>

<para>The wrappers should reduce Memcheck's false-error rate on MPI
applications.
Because the wrapping is done at the MPI interface,
there will still potentially be a large number of errors reported in
the MPI implementation below the interface.  The best you can do is
try to suppress them.</para>

<para>You may also find that the input-side (buffer
length/definedness) checks find errors in your MPI use, for example
passing too short a buffer to
<computeroutput>MPI_Recv</computeroutput>.</para>

<para>Functions which are not wrapped may increase the false
error rate.  A possible approach is to run with
<computeroutput>MPIWRAP_DEBUG</computeroutput> containing
<computeroutput>warn</computeroutput>.  This will show you functions
which lack proper wrappers but which are nevertheless used.  You can
then write wrappers for them.
</para>

<para>A known source of potential false errors is the
<computeroutput>PMPI_Reduce</computeroutput> family of functions, when
using a custom (user-defined) reduction function.  In a reduction
operation, each node notionally sends data to a "central point" which
uses the specified reduction function to merge the data items into a
single item.  Hence, in general, data is passed between nodes and fed
to the reduction function, but the wrapper library cannot mark the
transferred data as initialised before it is handed to the reduction
function, because all that happens "inside" the
<computeroutput>PMPI_Reduce</computeroutput> call.  As a result you
may see false positives reported in your reduction function.</para>

</sect2>

</sect1>





</chapter>