<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
[ <!ENTITY % vg-entities SYSTEM "vg-entities.xml"> %vg-entities; ]>


<chapter id="manual-core-adv" xreflabel="Valgrind's core: advanced topics">
<title>Using and understanding the Valgrind core: Advanced Topics</title>

<para>This chapter describes advanced aspects of the Valgrind core
services, which are mostly of interest to power users who wish to
customise and modify Valgrind's default behaviours in certain useful
ways.  The subjects covered are:</para>

<itemizedlist>
  <listitem><para>The "Client Request" mechanism</para></listitem>
  <listitem><para>Debugging your program using Valgrind's gdbserver
      and GDB</para></listitem>
  <listitem><para>Function Wrapping</para></listitem>
</itemizedlist>



<sect1 id="manual-core-adv.clientreq"
       xreflabel="The Client Request mechanism">
<title>The Client Request mechanism</title>

<para>Valgrind has a trapdoor mechanism via which the client
program can pass all manner of requests and queries to Valgrind
and the current tool.  Internally, this is used extensively
to make various things work, although that's not visible from the
outside.</para>

<para>For your convenience, a subset of these so-called client
requests is provided to allow you to tell Valgrind facts about
the behaviour of your program, and also to make queries.
In particular, your program can tell Valgrind about things that it
otherwise would not know, leading to better results.
</para>

<para>Clients need to include a header file to make this work.
Which header file depends on which client requests you use.  Some
client requests are handled by the core, and are defined in the
header file <filename>valgrind/valgrind.h</filename>.  Tool-specific
header files are named after the tool, e.g.
<filename>valgrind/memcheck.h</filename>.  Each tool-specific header file
includes <filename>valgrind/valgrind.h</filename> so you don't need to
include it in your client if you include a tool-specific header.  All header
files can be found in the <literal>include/valgrind</literal> directory of
wherever Valgrind was installed.</para>

<para>The macros in these header files have the magical property
that they generate code in-line which Valgrind can spot.
However, the code does nothing when not run on Valgrind, so you
are not forced to run your program under Valgrind just because you
use the macros in these files.  Also, you are not required to link your
program with any extra supporting libraries.</para>

<para>The code added to your binary has negligible performance impact:
on x86, amd64, ppc32, ppc64 and ARM, the overhead is 6 simple integer
instructions and is probably undetectable except in tight loops.
However, if you really wish to compile out the client requests, you
can compile with <option>-DNVALGRIND</option> (analogous to
<option>-DNDEBUG</option>'s effect on
<function>assert</function>).
</para>

<para>You are encouraged to copy the <filename>valgrind/*.h</filename> headers
into your project's include directory, so your program doesn't have a
compile-time dependency on Valgrind being installed.
The Valgrind headers,
unlike most of the rest of the code, are under a BSD-style license so you may
include them without worrying about license incompatibility.</para>

<para>Here is a brief description of the macros available in
<filename>valgrind.h</filename>, which work with more than one
tool (see the tool-specific documentation for explanations of the
tool-specific macros).</para>

 <variablelist>

  <varlistentry>
   <term><command><computeroutput>RUNNING_ON_VALGRIND</computeroutput></command>:</term>
   <listitem>
    <para>Returns 1 if running on Valgrind, 0 if running on the
    real CPU.  If you are running Valgrind on itself, returns the
    number of layers of Valgrind emulation you're running on.
    </para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><command><computeroutput>VALGRIND_DISCARD_TRANSLATIONS</computeroutput>:</command></term>
   <listitem>
    <para>Discards translations of code in the specified address
    range.  Useful if you are debugging a JIT compiler or some other
    dynamic code generation system.  After this call, attempts to
    execute code in the invalidated address range will cause
    Valgrind to make new translations of that code, which is
    probably the semantics you want.  Note that code invalidations
    are expensive because finding all the relevant translations
    quickly is very difficult, so try not to call it often.
    However, you can be clever about
    this: you only need to call it when an area which previously
    contained code is overwritten with new code.  You can choose
    to write code into fresh memory, and just call this
    occasionally to discard large chunks of old code all at
    once.</para>
    <para>
    Alternatively, for transparent self-modifying-code support,
    use <option>--smc-check=all</option>, or run
    on ppc32/Linux, ppc64/Linux or ARM/Linux.
    </para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><command><computeroutput>VALGRIND_COUNT_ERRORS</computeroutput>:</command></term>
   <listitem>
    <para>Returns the number of errors found so far by Valgrind.  Can be
    useful in test harness code when combined with the
    <option>--log-fd=-1</option> option; this runs Valgrind silently,
    but the client program can detect when errors occur.  Only useful
    for tools that report errors, e.g. it's useful for Memcheck, but for
    Cachegrind it will always return zero because Cachegrind doesn't
    report errors.</para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><command><computeroutput>VALGRIND_MALLOCLIKE_BLOCK</computeroutput>:</command></term>
   <listitem>
    <para>If your program manages its own memory instead of using
    the standard <function>malloc</function> /
    <function>new</function> /
    <function>new[]</function>, tools that track
    information about heap blocks will not do nearly as good a
    job.  For example, Memcheck won't detect nearly as many
    errors, and the error messages won't be as informative.  To
    improve this situation, use this macro just after your custom
    allocator allocates some new memory.  See the comments in
    <filename>valgrind.h</filename> for information on how to use
    it.</para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><command><computeroutput>VALGRIND_FREELIKE_BLOCK</computeroutput>:</command></term>
   <listitem>
    <para>This should be used in conjunction with
    <computeroutput>VALGRIND_MALLOCLIKE_BLOCK</computeroutput>.
    Again, see <filename>valgrind.h</filename> for
    information on how to use it.</para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><command><computeroutput>VALGRIND_RESIZEINPLACE_BLOCK</computeroutput>:</command></term>
   <listitem>
    <para>Informs a Valgrind tool that the size of an allocated block has been
    modified but not its address.  See <filename>valgrind.h</filename> for
    more information on how to use it.</para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term>
   <command><computeroutput>VALGRIND_CREATE_MEMPOOL</computeroutput></command>,
   <command><computeroutput>VALGRIND_DESTROY_MEMPOOL</computeroutput></command>,
   <command><computeroutput>VALGRIND_MEMPOOL_ALLOC</computeroutput></command>,
   <command><computeroutput>VALGRIND_MEMPOOL_FREE</computeroutput></command>,
   <command><computeroutput>VALGRIND_MOVE_MEMPOOL</computeroutput></command>,
   <command><computeroutput>VALGRIND_MEMPOOL_CHANGE</computeroutput></command>,
   <command><computeroutput>VALGRIND_MEMPOOL_EXISTS</computeroutput></command>:
   </term>
   <listitem>
    <para>These are similar to
    <computeroutput>VALGRIND_MALLOCLIKE_BLOCK</computeroutput> and
    <computeroutput>VALGRIND_FREELIKE_BLOCK</computeroutput>
    but are tailored towards code that uses memory pools.  See
    <xref linkend="mc-manual.mempools"/> for a detailed description.</para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><command><computeroutput>VALGRIND_NON_SIMD_CALL[0123]</computeroutput>:</command></term>
   <listitem>
    <para>Executes a function in the client program on the
    <emphasis>real</emphasis> CPU, not the virtual CPU that Valgrind
    normally runs code on.  The function must take an integer (holding a
    thread ID) as the first argument and then 0, 1, 2 or 3 more arguments
    (depending on which client request is used).  These are used in various
    ways internally to Valgrind.
    They might be useful to client
    programs.</para>

    <para><command>Warning:</command> Only use these if you
    <emphasis>really</emphasis> know what you are doing.  They aren't
    entirely reliable, and can cause Valgrind to crash.  See
    <filename>valgrind.h</filename> for more details.
    </para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><command><computeroutput>VALGRIND_PRINTF(format, ...)</computeroutput>:</command></term>
   <listitem>
    <para>Print a printf-style message to the Valgrind log file.  The
    message is prefixed with the PID between a pair of
    <computeroutput>**</computeroutput> markers.  (Like all client requests,
    nothing is output if the client program is not running under Valgrind.)
    Output is not produced until a newline is encountered, or subsequent
    Valgrind output is printed; this allows you to build up a single line of
    output over multiple calls.  Returns the number of characters output,
    excluding the PID prefix.</para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><command><computeroutput>VALGRIND_PRINTF_BACKTRACE(format, ...)</computeroutput>:</command></term>
   <listitem>
    <para>Like <computeroutput>VALGRIND_PRINTF</computeroutput> (in
    particular, the return value is identical), but prints a stack backtrace
    immediately afterwards.</para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><command><computeroutput>VALGRIND_STACK_REGISTER(start, end)</computeroutput>:</command></term>
   <listitem>
    <para>Registers a new stack.  Informs Valgrind that the memory range
    between start and end is a unique stack.  Returns a stack identifier
    that can be used with other
    <computeroutput>VALGRIND_STACK_*</computeroutput> calls.</para>
    <para>Valgrind will use this information to determine if a change to
    the stack pointer is an item pushed onto the stack or a change over
    to a new stack.
    Use this if you're using a user-level thread package
    and are noticing spurious errors from Valgrind about uninitialized
    memory reads.</para>

    <para><command>Warning:</command> Unfortunately, this client request is
    unreliable and best avoided.</para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><command><computeroutput>VALGRIND_STACK_DEREGISTER(id)</computeroutput>:</command></term>
   <listitem>
    <para>Deregisters a previously registered stack.  Informs
    Valgrind that the previously registered memory range with stack id
    <computeroutput>id</computeroutput> is no longer a stack.</para>

    <para><command>Warning:</command> Unfortunately, this client request is
    unreliable and best avoided.</para>
   </listitem>
  </varlistentry>

  <varlistentry>
   <term><command><computeroutput>VALGRIND_STACK_CHANGE(id, start, end)</computeroutput>:</command></term>
   <listitem>
    <para>Changes a previously registered stack.  Informs
    Valgrind that the previously registered stack with stack id
    <computeroutput>id</computeroutput> has changed its start and end
    values.  Use this if your user-level thread package implements
    stack growth.</para>

    <para><command>Warning:</command> Unfortunately, this client request is
    unreliable and best avoided.</para>
   </listitem>
  </varlistentry>

 </variablelist>

</sect1>



<sect1 id="manual-core-adv.gdbserver"
       xreflabel="Debugging your program using Valgrind's gdbserver and GDB">
<title>Debugging your program using Valgrind gdbserver and GDB</title>

<para>A program running under Valgrind is not executed directly by the
CPU.  Instead it runs on a synthetic CPU provided by Valgrind.  This is
why a debugger cannot debug your program when it runs on Valgrind.
</para>
<para>
This section describes how GDB can interact with the
Valgrind gdbserver to provide a fully debuggable program under
Valgrind.  Used in this way, GDB also gives interactive access to
Valgrind core and tool functionality, including incremental leak search
under Memcheck and on-demand Massif snapshot production.
</para>

<sect2 id="manual-core-adv.gdbserver-simple"
       xreflabel="gdbserver simple example">
<title>Quick Start: debugging in 3 steps</title>

<para>The simplest way to get started is to run Valgrind with the
flag <option>--vgdb-error=0</option>.  Then follow the on-screen
directions, which give you the precise commands needed to start GDB
and connect it to your program.</para>

<para>Otherwise, here's a slightly more verbose overview.</para>

<para>If you want to debug a program with GDB when using the Memcheck
tool, start Valgrind like this:
<screen><![CDATA[
valgrind --vgdb=yes --vgdb-error=0 prog
]]></screen></para>

<para>In another shell, start GDB:
<screen><![CDATA[
gdb prog
]]></screen></para>

<para>Then give the following command to GDB:
<screen><![CDATA[
(gdb) target remote | vgdb
]]></screen></para>

<para>You can now debug your program e.g. by inserting a breakpoint
and then using the GDB <computeroutput>continue</computeroutput>
command.</para>

<para>This quick start information is enough for basic usage of the
Valgrind gdbserver.  The sections below describe more advanced
functionality provided by the combination of Valgrind and GDB.  Note
that the command line flag <option>--vgdb=yes</option> can be omitted,
as this is the default value.
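To illustrate the final quick-start step (inserting a breakpoint and
continuing), a minimal session might look like the following sketch.  The
breakpoint location <computeroutput>main</computeroutput> is only an
assumed example; any function or source line in your own program can be
used instead:
<screen><![CDATA[
(gdb) break main
(gdb) continue
]]></screen>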
</para>

</sect2>

<sect2 id="manual-core-adv.gdbserver-concept"
       xreflabel="gdbserver">
<title>Valgrind gdbserver overall organisation</title>
<para>The GNU GDB debugger is typically used to debug a process
running on the same machine.  In this mode, GDB uses system calls to
control and query the program being debugged.  This works well, but
only allows GDB to debug a program running on the same computer.
</para>

<para>GDB can also debug processes running on a different computer.
To achieve this, GDB defines a protocol (that is, a set of query and
reply packets) that facilitates fetching the value of memory or
registers, setting breakpoints, etc.  A gdbserver is an implementation
of this "GDB remote debugging" protocol.  To debug a process running
on a remote computer, a gdbserver (sometimes called a GDB stub)
must run on the remote computer.
</para>

<para>The Valgrind core provides a built-in gdbserver implementation,
which is activated using <option>--vgdb=yes</option>
or <option>--vgdb=full</option>.  This gdbserver allows the process
running on Valgrind's synthetic CPU to be debugged remotely.
GDB sends protocol query packets (such as "get register contents") to
the Valgrind embedded gdbserver.  The gdbserver executes the queries
(for example, it will get the register values of the synthetic CPU)
and gives the results back to GDB.
</para>

<para>GDB can use various kinds of channels (TCP/IP, serial line, etc)
to communicate with the gdbserver.  In the case of Valgrind's
gdbserver, communication is done via a pipe and a small helper program
called <xref linkend="manual-core-adv.vgdb"/>, which acts as an
intermediary.  If no GDB is in use, vgdb can also be
used to send monitor commands to the Valgrind gdbserver from a shell
command line.
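As a sketch of such standalone use, and assuming exactly one Valgrind
process with an active gdbserver is running, the following shell command
asks that process how many errors it has found so far
(<computeroutput>v.info n_errs_found</computeroutput> is one of the Valgrind
core monitor commands; no output is shown here because it depends on your
run):
<screen><![CDATA[
vgdb v.info n_errs_found
]]></screen>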
</para>

</sect2>

<sect2 id="manual-core-adv.gdbserver-gdb"
       xreflabel="Connecting GDB to a Valgrind gdbserver">
<title>Connecting GDB to a Valgrind gdbserver</title>
<para>To debug a program "<filename>prog</filename>" running under
Valgrind, you must ensure that the Valgrind gdbserver is activated by
specifying either <option>--vgdb=yes</option>
or <option>--vgdb=full</option>.  A secondary command line option,
<option>--vgdb-error=number</option>, can be used to tell the gdbserver
only to become active once the specified number of errors have been
reported.  A value of zero will therefore cause
the gdbserver to become active at startup, which allows you to
insert breakpoints before starting the run.  For example:
<screen><![CDATA[
valgrind --tool=memcheck --vgdb=yes --vgdb-error=0 ./prog
]]></screen></para>

<para>The Valgrind gdbserver is invoked at startup
and indicates it is waiting for a connection from a GDB:</para>

<programlisting><![CDATA[
==2418== Memcheck, a memory error detector
==2418== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==2418== Using Valgrind-3.7.0.SVN and LibVEX; rerun with -h for copyright info
==2418== Command: ./prog
==2418==
==2418== (action at startup) vgdb me ...
]]></programlisting>


<para>GDB (in another shell) can then be connected to the Valgrind gdbserver.
For this, GDB must be started on the program <filename>prog</filename>:
<screen><![CDATA[
gdb ./prog
]]></screen></para>


<para>You then indicate to GDB that you want to debug a remote target:
<screen><![CDATA[
(gdb) target remote | vgdb
]]></screen>
GDB then starts a vgdb relay application to communicate with the
Valgrind embedded gdbserver:</para>

<programlisting><![CDATA[
(gdb) target remote | vgdb
Remote debugging using | vgdb
relaying data between gdb and process 2418
Reading symbols from /lib/ld-linux.so.2...done.
Reading symbols from /usr/lib/debug/lib/ld-2.11.2.so.debug...done.
Loaded symbols for /lib/ld-linux.so.2
[Switching to Thread 2418]
0x001f2850 in _start () from /lib/ld-linux.so.2
(gdb)
]]></programlisting>

<para>Note that vgdb is provided as part of the Valgrind
distribution.  You do not need to install it separately.</para>

<para>If vgdb detects that there are multiple Valgrind gdbservers that
can be connected to, it will list all such servers and their PIDs, and
then exit.  You can then reissue the GDB "target" command, but
specifying the PID of the process you want to debug:
</para>

<programlisting><![CDATA[
(gdb) target remote | vgdb
Remote debugging using | vgdb
no --pid= arg given and multiple valgrind pids found:
use --pid=2479 for valgrind --tool=memcheck --vgdb=yes --vgdb-error=0 ./prog
use --pid=2481 for valgrind --tool=memcheck --vgdb=yes --vgdb-error=0 ./prog
use --pid=2483 for valgrind --vgdb=yes --vgdb-error=0 ./another_prog
Remote communication error: Resource temporarily unavailable.
(gdb) target remote | vgdb --pid=2479
Remote debugging using | vgdb --pid=2479
relaying data between gdb and process 2479
Reading symbols from /lib/ld-linux.so.2...done.
Reading symbols from /usr/lib/debug/lib/ld-2.11.2.so.debug...done.
Loaded symbols for /lib/ld-linux.so.2
[Switching to Thread 2479]
0x001f2850 in _start () from /lib/ld-linux.so.2
(gdb)
]]></programlisting>

<para>Once GDB is connected to the Valgrind gdbserver, it can be used
in the same way as if you were debugging the program natively:</para>
<itemizedlist>
  <listitem>
    <para>Breakpoints can be inserted or deleted.</para>
  </listitem>
  <listitem>
    <para>Variables and register values can be examined or modified.
    </para>
  </listitem>
  <listitem>
    <para>Signal handling can be configured (printing, ignoring).
    </para>
  </listitem>
  <listitem>
    <para>Execution can be controlled (continue, step, next, stepi, etc).
    </para>
  </listitem>
  <listitem>
    <para>Program execution can be interrupted using Control-C.</para>
  </listitem>
</itemizedlist>

<para>And so on.  Refer to the GDB user manual for a complete
description of GDB's functionality.
</para>

</sect2>

<sect2 id="manual-core-adv.gdbserver-gdb-android"
       xreflabel="Connecting to an Android gdbserver">
<title>Connecting to an Android gdbserver</title>
<para> When developing applications for Android, you will typically use
a development system (on which the Android NDK is installed) to compile your
application.  An Android target system or emulator will be used to run
the application.
In this setup, Valgrind and vgdb will run on the Android system,
while GDB will run on the development system.  GDB will connect
to the vgdb running on the Android system using the Android
'adb forward' command.
</para>
<para> Example: on the Android system, execute the following:
<screen><![CDATA[
valgrind --vgdb-error=0 prog
# and then in another shell, run:
vgdb --port=1234
]]></screen>
</para>

<para> On the development system, execute the following commands:
<screen><![CDATA[
adb forward tcp:1234 tcp:1234
gdb prog
(gdb) target remote :1234
]]></screen>
GDB will use a local TCP/IP connection to connect to the Android adb forwarder.
Adb will establish a relay connection between the host system and the Android
target system.  Take care to use the GDB delivered with the
Android NDK (typically, arm-linux-androideabi-gdb), as the host
GDB is probably not able to debug Android ARM applications.
Note that the local port number (used by GDB) need not be equal
to the port number used by vgdb: adb can forward TCP/IP between different
port numbers.
</para>

</sect2>

<sect2 id="manual-core-adv.gdbserver-commandhandling"
       xreflabel="Monitor command handling by the Valgrind gdbserver">
<title>Monitor command handling by the Valgrind gdbserver</title>

<para> The Valgrind gdbserver provides additional Valgrind-specific
functionality via "monitor commands".  Such monitor commands can
be sent from the GDB command line or from the shell command line.  See
<xref linkend="manual-core-adv.valgrind-monitor-commands"/> for the list
of the Valgrind core monitor commands.
</para>

<para>Each tool can also provide tool-specific monitor commands.
An example of a tool-specific monitor command is the Memcheck monitor
command <computeroutput>leak_check full
reachable any</computeroutput>.  This requests a full report of the
allocated memory blocks.
To have this leak check executed, use the GDB
command:
<screen><![CDATA[
(gdb) monitor leak_check full reachable any
]]></screen>
</para>

<para>GDB will send the <computeroutput>leak_check</computeroutput>
command to the Valgrind gdbserver.  The Valgrind gdbserver will
execute the monitor command itself, if it recognises it to be a Valgrind core
monitor command.  If it is not recognised as such, it is assumed to
be tool-specific and is handed to the tool for execution.  For example:
</para>
<programlisting><![CDATA[
(gdb) monitor leak_check full reachable any
==2418== 100 bytes in 1 blocks are still reachable in loss record 1 of 1
==2418==    at 0x4006E9E: malloc (vg_replace_malloc.c:236)
==2418==    by 0x804884F: main (prog.c:88)
==2418==
==2418== LEAK SUMMARY:
==2418==    definitely lost: 0 bytes in 0 blocks
==2418==    indirectly lost: 0 bytes in 0 blocks
==2418==      possibly lost: 0 bytes in 0 blocks
==2418==    still reachable: 100 bytes in 1 blocks
==2418==         suppressed: 0 bytes in 0 blocks
==2418==
(gdb)
]]></programlisting>

<para>As with other GDB commands, the Valgrind gdbserver will accept
abbreviated monitor command names and arguments, as long as the given
abbreviation is unambiguous.  For example, the above
<computeroutput>leak_check</computeroutput>
command can also be typed as:
<screen><![CDATA[
(gdb) mo l f r a
]]></screen>

The letters <computeroutput>mo</computeroutput> are recognised by GDB as being
an abbreviation for <computeroutput>monitor</computeroutput>.  So GDB sends the
string <computeroutput>l f r a</computeroutput> to the Valgrind
gdbserver.  The letters provided in this string are unambiguous for the
Valgrind gdbserver.  This therefore gives the same output as the
unabbreviated command and arguments.
If the provided abbreviation is
ambiguous, the Valgrind gdbserver will report the list of commands (or
argument values) that can match:
<programlisting><![CDATA[
(gdb) mo v. n
v. can match v.set v.info v.wait v.kill v.translate
(gdb) mo v.i n
n_errs_found 0 (vgdb-error 0)
(gdb)
]]></programlisting>
</para>

<para>Instead of sending a monitor command from GDB, you can also send
these from a shell command line.  For example, the following command
lines, when given in a shell, will cause the same leak search to be executed
by the process 3145:
<screen><![CDATA[
vgdb --pid=3145 leak_check full reachable any
vgdb --pid=3145 l f r a
]]></screen></para>

<para>Note that the Valgrind gdbserver automatically continues the
execution of the program after a standalone invocation of
vgdb.  Monitor commands sent from GDB do not cause the program to
continue: the program execution is controlled explicitly using GDB
commands such as "continue" or "next".</para>

</sect2>

<sect2 id="manual-core-adv.gdbserver-threads"
       xreflabel="Valgrind gdbserver thread information">
<title>Valgrind gdbserver thread information</title>

<para>Valgrind's gdbserver enriches the output of the
GDB <computeroutput>info threads</computeroutput> command
with Valgrind-specific information.
The operating system's thread number is followed
by Valgrind's internal index for that thread ("tid") and by
the Valgrind scheduler thread state:</para>

<programlisting><![CDATA[
(gdb) info threads
  4 Thread 6239 (tid 4 VgTs_Yielding) 0x001f2832 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
* 3 Thread 6238 (tid 3 VgTs_Runnable) make_error (s=0x8048b76 "called from London") at prog.c:20
  2 Thread 6237 (tid 2 VgTs_WaitSys) 0x001f2832 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
  1 Thread 6234 (tid 1 VgTs_Yielding) main (argc=1, argv=0xbedcc274) at prog.c:105
(gdb)
]]></programlisting>

</sect2>

<sect2 id="manual-core-adv.gdbserver-shadowregisters"
       xreflabel="Examining and modifying Valgrind shadow registers">
<title>Examining and modifying Valgrind shadow registers</title>

<para> When the option <option>--vgdb-shadow-registers=yes</option> is
given, the Valgrind gdbserver will let GDB examine and/or modify
Valgrind's shadow registers.  GDB version 7.1 or later is needed for this
to work.</para>

<para>For each CPU register, the Valgrind core maintains two
shadow register sets.  These shadow registers can be accessed from
GDB by appending the suffix <computeroutput>s1</computeroutput>
or <computeroutput>s2</computeroutput> to the register name, for the first
and second shadow register respectively.  For example, the x86 register
<computeroutput>eax</computeroutput> and its two shadows
can be examined using the following commands:</para>

<programlisting><![CDATA[
(gdb) p $eax
$1 = 0
(gdb) p $eaxs1
$2 = 0
(gdb) p $eaxs2
$3 = 0
(gdb)
]]></programlisting>

</sect2>


<sect2 id="manual-core-adv.gdbserver-limitations"
       xreflabel="Limitations of the Valgrind gdbserver">
<title>Limitations of the Valgrind gdbserver</title>

<para>Debugging with the Valgrind gdbserver is very similar to native
debugging.
Valgrind's gdbserver implementation is quite
complete, and so provides most of the GDB debugging functionality.  There
are however some limitations and peculiarities:</para>
 <itemizedlist>
   <listitem>
     <para>Precision of "stop-at" commands.</para>
     <para>
       GDB commands such as "step", "next", "stepi", breakpoints
       and watchpoints, will stop the execution of the process.  With
       the option <option>--vgdb=yes</option>, the process might not
       stop at the exact requested instruction.  Instead, it might
       continue execution of the current basic block and stop at one
       of the following basic blocks.  This is linked to the fact that
       Valgrind gdbserver has to instrument a block to allow stopping
       at the exact instruction requested.  Currently,
       re-instrumentation of the block currently being executed is not
       supported.  So, if the action requested by GDB (e.g. single
       stepping or inserting a breakpoint) implies re-instrumentation
       of the current block, the GDB action may not be executed
       precisely.
     </para>
     <para>
       This limitation applies when the basic block
       currently being executed has not yet been instrumented for debugging.
       This typically happens when the gdbserver is activated due to the
       tool reporting an error or to a watchpoint.  If the gdbserver
       has been activated following a breakpoint, or if a
       breakpoint has been inserted in the block before its execution,
       then the block has already been instrumented for debugging.
     </para>
     <para>
       If you use the option <option>--vgdb=full</option>, then GDB
       "stop-at" commands will be obeyed precisely.  The
       downside is that this requires each instruction to be
       instrumented with an additional call to a gdbserver helper
       function, which gives considerable overhead compared to
       <option>--vgdb=no</option>.  Option <option>--vgdb=yes</option>
       has negligible overhead compared
       to <option>--vgdb=no</option>.
     </para>
   </listitem>

   <listitem>
     <para>Hardware watchpoint support by the Valgrind
     gdbserver.</para>

     <para> The Valgrind gdbserver can simulate hardware watchpoints
     if the selected tool provides support for it.  Currently,
     only Memcheck provides hardware watchpoint simulation.  The
     hardware watchpoint simulation provided by Memcheck is much
     faster than GDB software watchpoints, which are implemented by
     GDB checking the value of the watched zone(s) after each
     instruction.  Hardware watchpoint simulation also provides read
     watchpoints.  The hardware watchpoint simulation by Memcheck has
     some limitations compared to real hardware
     watchpoints.  However, the number and length of simulated
     watchpoints are not limited.
     </para>
     <para>Typically, the number of (real) hardware watchpoints is
     limited.  For example, the x86 architecture supports a maximum of
     4 hardware watchpoints, each watchpoint watching 1, 2, 4 or 8
     bytes.  The Valgrind gdbserver does not have any limitation on the
     number of simulated hardware watchpoints.  It also has no
     limitation on the length of the memory zone being
     watched.  Using GDB version 7.4 or later allows full use of the
     flexibility of the Valgrind gdbserver's simulated hardware watchpoints.
     Previous GDB versions do not understand that Valgrind gdbserver
     watchpoints have no length limit.
     </para>
     <para>Memcheck implements hardware watchpoint simulation by
     marking the watched address ranges as being unaddressable.  When
     a hardware watchpoint is removed, the range is marked as
     addressable and defined.  Hardware watchpoint simulation of
     addressable-but-undefined memory zones works properly, but has
     the undesirable side effect of marking the zone as defined when
     the watchpoint is removed.
    </para>
    <para>Write watchpoints might not be reported at the
    exact instruction that writes the monitored area,
    unless option <option>--vgdb=full</option> is given. Read watchpoints
    will always be reported at the exact instruction reading the
    watched memory.
    </para>
    <para>It is better to avoid setting hardware watchpoints on memory
    that is not (yet) addressable: in such a case, GDB will fall back to
    extremely slow software watchpoints. Also, if you do not quit GDB
    between two debugging sessions, the hardware watchpoints of the
    previous sessions will be re-inserted as software watchpoints if
    the watched memory zone is not addressable at program startup.
    </para>
  </listitem>

  <listitem>
    <para>Stepping inside shared libraries on ARM.</para>
    <para>For unknown reasons, stepping inside shared
    libraries on ARM may fail. A workaround is to use the
    <computeroutput>ldd</computeroutput> command
    to find the list of shared libraries and their load addresses,
    and to inform GDB of each load address using the GDB command
    "add-symbol-file". Example:
<programlisting><![CDATA[
(gdb) shell ldd ./prog
	libc.so.6 => /lib/libc.so.6 (0x4002c000)
	/lib/ld-linux.so.3 (0x40000000)
(gdb) add-symbol-file /lib/libc.so.6 0x4002c000
add symbol table from file "/lib/libc.so.6" at
	.text_addr = 0x4002c000
(y or n) y
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
(gdb)
]]></programlisting>
    </para>
  </listitem>

  <listitem>
    <para>GDB version needed for ARM and PPC32/64.</para>
    <para>You must use a GDB version which is able to read the XML
    target description sent by a gdbserver. This is the standard setup
    if GDB was configured and built with the "expat"
    library. If your GDB was not configured with XML support, it
    will report an error message when using the "target"
    command.
Debugging will not work because GDB will then not be
    able to fetch the registers from the Valgrind gdbserver.
    For ARM programs using the Thumb instruction set, you must use
    GDB version 7.1 or later, as earlier versions have problems
    with next/step/breakpoints in Thumb code.
    </para>
  </listitem>

  <listitem>
    <para>Stack unwinding on PPC32/PPC64. </para>
    <para>On PPC32/PPC64, stack unwinding for leaf functions
    (functions that do not call any other functions) works properly
    only when you give the option
    <option>--vex-iropt-precise-memory-exns=yes</option>.
    You must also pass this option in order to get a precise stack when
    a signal is trapped by GDB.
    </para>
  </listitem>

  <listitem>
    <para>Breakpoints encountered multiple times.</para>
    <para>Some instructions (e.g. x86 "rep movsb")
    are translated by Valgrind using a loop. If a breakpoint is placed
    on such an instruction, the breakpoint will be encountered
    multiple times, once for each step of the "implicit" loop
    implementing the instruction.
    </para>
  </listitem>

  <listitem>
    <para>Execution of inferior function calls by the Valgrind
    gdbserver.</para>

    <para>GDB allows the user to "call" functions inside the process
    being debugged. Such calls are named "inferior calls" in the GDB
    terminology. A typical use of an inferior call is to execute
    a function that prints a human-readable version of a complex data
    structure. To make an inferior call, use the GDB "print" command
    followed by the function to call and its arguments.
As an
    example, the following GDB command causes an inferior call to the
    libc "printf" function to be executed by the process
    being debugged:
    </para>
<programlisting><![CDATA[
(gdb) p printf("process being debugged has pid %d\n", getpid())
$5 = 36
(gdb)
]]></programlisting>

    <para>The Valgrind gdbserver supports inferior function calls.
    Whilst an inferior call is running, the Valgrind tool will report
    errors as usual. If you do not want such errors to stop the
    execution of the inferior call, you can
    use <computeroutput>v.set vgdb-error</computeroutput> to set a
    big value before the call, then manually reset it to its original
    value when the call is complete.</para>

    <para>To execute inferior calls, GDB changes registers such as
    the program counter, and then continues the execution of the
    program. In a multithreaded program, all threads are continued,
    not just the thread instructed to make the inferior call. If
    another thread reports an error or encounters a breakpoint, the
    evaluation of the inferior call is abandoned.</para>

    <para>Note that inferior function calls are a powerful GDB
    feature, but should be used with caution. For example, if
    the program being debugged is stopped inside the function "printf",
    forcing a recursive call to printf via an inferior call will
    very probably create problems. The Valgrind tool might also add
    another level of complexity to inferior calls, e.g. by reporting
    tool errors during the inferior call or due to the
    instrumentation done.
    </para>

  </listitem>

  <listitem>
    <para>Connecting to or interrupting a Valgrind process blocked in
    a system call.</para>

    <para>Connecting to or interrupting a Valgrind process blocked in
    a system call requires the "ptrace" system call to be usable.
    This may be disabled in your kernel for security reasons.</para>

    <para>When running your program, Valgrind's scheduler
    periodically checks whether there is any work to be handled by
    the gdbserver. Unfortunately this check is only done if at least
    one thread of the process is runnable. If all the threads of the
    process are blocked in a system call, then the checks do not
    happen, and the Valgrind scheduler will not invoke the gdbserver.
    In such a case, the vgdb relay application will "force" the
    gdbserver to be invoked, without the intervention of the Valgrind
    scheduler.
    </para>

    <para>Such forced invocation of the Valgrind gdbserver is
    implemented by vgdb using ptrace system calls. On a properly
    implemented kernel, the ptrace calls done by vgdb will not
    influence the behaviour of the program running under Valgrind.
    If however they do, giving the
    option <option>--max-invoke-ms=0</option> to the vgdb relay
    application will disable the usage of ptrace calls. The
    consequence of disabling ptrace usage in vgdb is that a Valgrind
    process blocked in a system call cannot be woken up or
    interrupted from GDB until it executes enough basic blocks to let
    the Valgrind scheduler's normal checking take effect.
    </para>

    <para>When ptrace is disabled in vgdb, you can increase the
    responsiveness of the Valgrind gdbserver to commands or
    interrupts by giving a lower value to the
    option <option>--vgdb-poll</option>. If your application is
    blocked in system calls most of the time, using a very low value
    for <option>--vgdb-poll</option> will cause the gdbserver to be
    invoked sooner. The gdbserver polling done by Valgrind's
    scheduler is very efficient, so the increased polling frequency
    should not cause significant performance degradation.
    </para>

    <para>When ptrace is disabled in vgdb, a query packet sent by GDB
    may take significant time to be handled by the Valgrind
    gdbserver. In such cases, GDB might encounter a protocol
    timeout. To avoid this,
    you can increase the value of the timeout by using the GDB
    command "set remotetimeout".
    </para>

    <para>Ubuntu versions 10.10 and later may restrict the scope of
    ptrace to the children of the process calling ptrace. As the
    Valgrind process is not a child of vgdb, such restricted scoping
    causes the ptrace calls to fail. To avoid that, when the Valgrind
    gdbserver receives the first packet from a vgdb, it calls
    <computeroutput>prctl(PR_SET_PTRACER, vgdb_pid, 0, 0,
    0)</computeroutput> to ensure vgdb can reliably use ptrace.
    Once <computeroutput>vgdb_pid</computeroutput> has been marked as
    a ptracer, vgdb can then properly force the invocation of the
    Valgrind gdbserver when needed. To ensure that vgdb is set as a
    ptracer before the Valgrind process gets blocked in a system
    call, connect your GDB to the Valgrind gdbserver at startup by
    passing <option>--vgdb-error=0</option> to Valgrind.</para>

    <para>Note that
    this "set ptracer" technique does not solve the problem in the
    case where a standalone vgdb process wants to connect to the
    gdbserver, since the first command to be sent by a standalone
    vgdb must wake up the Valgrind process before the Valgrind gdbserver
    will mark vgdb as a ptracer.
    </para>

    <para>Unblocking processes blocked in system calls is not
    currently implemented on Mac OS X and Android. So you cannot
    connect to or interrupt a process blocked in a system call on Mac
    OS X or Android.
    </para>

  </listitem>

  <listitem>
    <para>Changing register values.</para>
    <para>The Valgrind gdbserver will only modify the values of the
    thread's registers when the thread is in status Runnable or
    Yielding.
In other states (typically, WaitSys), attempts to
    change register values will fail. Amongst other things, this
    means that inferior calls are not executed for a thread which is
    in a system call, since the Valgrind gdbserver does not implement
    system call restart.
    </para>
  </listitem>

  <listitem>
    <para>Unsupported GDB functionality.</para>
    <para>GDB provides a lot of debugging functionality and not all
    of it is supported. Specifically, the following are not
    supported: reversible debugging and tracepoints.
    </para>
  </listitem>

  <listitem>
    <para>Unknown limitations or problems.</para>
    <para>The combination of GDB, Valgrind and the Valgrind gdbserver
    probably has other, unknown limitations and problems. If you
    encounter strange or unexpected behaviour, feel free to report a
    bug. But first please verify that the limitation or problem is
    not inherent to GDB or the GDB remote protocol. You may be able
    to do so by checking the behaviour when using the standard gdbserver
    that is part of the GDB package.
    </para>
  </listitem>

</itemizedlist>

</sect2>

<sect2 id="manual-core-adv.vgdb"
       xreflabel="vgdb">
<title>vgdb command line options</title>
<para> Usage: <computeroutput>vgdb [OPTION]... [[-c] COMMAND]...</computeroutput></para>

<para> vgdb ("Valgrind to GDB") is a small program that is used as an
intermediary between Valgrind and GDB or a shell.
Therefore, it has two usage modes:
</para>
<orderedlist>
  <listitem id="manual-core-adv.vgdb-standalone" xreflabel="vgdb standalone">
    <para>As a standalone utility, it is used from a shell command
    line to send monitor commands to a process running under
    Valgrind. For this usage, the vgdb OPTION(s) must be followed by
    the monitor command to send. To send more than one command,
    separate them with the <option>-c</option> option.
    </para>
  </listitem>

  <listitem id="manual-core-adv.vgdb-relay" xreflabel="vgdb relay">
    <para>In combination with the GDB "target remote |" command, it is
    used as the relay application between GDB and the Valgrind
    gdbserver. For this usage, only OPTION(s) can be given, but no
    COMMAND can be given.
    </para>
  </listitem>

</orderedlist>

<para><computeroutput>vgdb</computeroutput> accepts the following
options:</para>
<itemizedlist>
  <listitem>
    <para><option>--pid=&lt;number&gt;</option>: specifies the PID of
    the process to which vgdb must connect. This option is useful
    in case more than one Valgrind gdbserver can be connected to. If
    the <option>--pid</option> argument is not given and multiple
    Valgrind gdbserver processes are running, vgdb will report the
    list of such processes and then exit.</para>
  </listitem>

  <listitem>
    <para><option>--vgdb-prefix</option> must be given to both
    Valgrind and vgdb if you want to change the default prefix for the
    FIFOs (named pipes) used for communication between the Valgrind
    gdbserver and vgdb. </para>
  </listitem>

  <listitem>
    <para><option>--wait=&lt;number&gt;</option> instructs vgdb to
    search for available Valgrind gdbservers for the specified number
    of seconds. This makes it possible to start a vgdb process
    before starting the Valgrind gdbserver with which you intend the
    vgdb to communicate. This option is useful when used in
    conjunction with a <option>--vgdb-prefix</option> that is
    unique to the process you want to wait for.
    Also, if you use the <option>--wait</option> argument in the GDB
    "target remote" command, you must set the GDB remotetimeout to a
    value bigger than the <option>--wait</option> argument value.
See option
    <option>--max-invoke-ms</option> (just below)
    for an example of setting the remotetimeout value.</para>
  </listitem>

  <listitem>
    <para><option>--max-invoke-ms=&lt;number&gt;</option> gives the
    number of milliseconds after which vgdb will force the invocation
    of the gdbserver embedded in Valgrind. The default value is 100
    milliseconds. A value of 0 disables forced invocation. The forced
    invocation is used when vgdb is connected to a Valgrind gdbserver,
    and the Valgrind process has all its threads blocked in a system
    call.
    </para>

    <para>If you specify a large value, you might need to increase the
    GDB "remotetimeout" value from its default value of 2 seconds.
    You should ensure that the timeout (in seconds) is
    bigger than the <option>--max-invoke-ms</option> value (which is
    in milliseconds). For
    example, for <option>--max-invoke-ms=5000</option>, the following
    GDB command is suitable:
    <screen><![CDATA[
    (gdb) set remotetimeout 6
    ]]></screen>
    </para>
  </listitem>

  <listitem>
    <para><option>--cmd-time-out=&lt;number&gt;</option> instructs a
    standalone vgdb to exit if the Valgrind gdbserver it is connected
    to does not process a command in the specified number of seconds.
    The default value is to never time out.</para>
  </listitem>

  <listitem>
    <para><option>--port=&lt;portnr&gt;</option> instructs vgdb to
    use TCP/IP and listen for GDB on the specified port number rather than
    to use a pipe to communicate with GDB. Using TCP/IP allows
    GDB to run on one computer while debugging a Valgrind process
    running on another target computer.
    Example:
    <screen><![CDATA[
# On the target computer, start your program under valgrind using
valgrind --vgdb-error=0 prog
# and then in another shell, run:
vgdb --port=1234
]]></screen></para>
    <para>On the computer which hosts GDB, execute the command:
    <screen><![CDATA[
gdb prog
(gdb) target remote targetip:1234
]]></screen>
    where targetip is the IP address or hostname of the target computer.
    </para>
  </listitem>

  <listitem>
    <para><option>-c</option> To give more than one command to a
    standalone vgdb, separate the commands by an
    option <option>-c</option>. Example:
    <screen><![CDATA[
vgdb v.set log_output -c leak_check any
]]></screen></para>
  </listitem>

  <listitem>
    <para><option>-l</option> instructs a standalone vgdb to report
    the list of the Valgrind gdbserver processes running and then
    exit.</para>
  </listitem>

  <listitem>
    <para><option>-D</option> instructs a standalone vgdb to show the
    state of the shared memory used by the Valgrind gdbserver. vgdb
    will exit after having shown the Valgrind gdbserver shared memory
    state.</para>
  </listitem>

  <listitem>
    <para><option>-d</option> instructs vgdb to produce debugging
    output. Give multiple <option>-d</option> args to increase the
    verbosity. When giving <option>-d</option> to a relay vgdb, you should
    redirect the standard error (stderr) of vgdb to a file to avoid
    interaction between GDB and vgdb debugging output.</para>
  </listitem>

</itemizedlist>

</sect2>


<sect2 id="manual-core-adv.valgrind-monitor-commands"
       xreflabel="Valgrind monitor commands">
<title>Valgrind monitor commands</title>

<para>The Valgrind monitor commands are available regardless of the
Valgrind tool selected.
They can be sent either from a shell command
line, by using a standalone vgdb, or from GDB, by using GDB's
"monitor" command.</para>

<itemizedlist>
  <listitem>
    <para><varname>help [debug]</varname> instructs Valgrind's gdbserver
    to give the list of all monitor commands of the Valgrind core and
    of the tool. The optional "debug" argument tells it to also give help
    for the monitor commands aimed at debugging Valgrind internals.
    </para>
  </listitem>

  <listitem>
    <para><varname>v.info all_errors</varname> shows all errors found
    so far.</para>
  </listitem>
  <listitem>
    <para><varname>v.info last_error</varname> shows the last error
    found.</para>
  </listitem>

  <listitem>
    <para><varname>v.info n_errs_found</varname> shows the number of
    errors found so far and the current value of the
    <option>--vgdb-error</option>
    argument.</para>
  </listitem>

  <listitem>
    <para><varname>v.set {gdb_output | log_output |
    mixed_output}</varname> allows redirection of the Valgrind output
    (e.g. the errors detected by the tool). The default setting is
    <computeroutput>mixed_output</computeroutput>.</para>

    <para>With <computeroutput>mixed_output</computeroutput>, the
    Valgrind output goes to the Valgrind log (typically stderr) while
    the output of the interactive GDB monitor commands (e.g.
    <computeroutput>v.info last_error</computeroutput>)
    is displayed by GDB.</para>

    <para>With <computeroutput>gdb_output</computeroutput>, both the
    Valgrind output and the interactive GDB monitor commands output are
    displayed by GDB.</para>

    <para>With <computeroutput>log_output</computeroutput>, both the
    Valgrind output and the interactive GDB monitor commands output go
    to the Valgrind log.</para>
  </listitem>

  <listitem>
    <para><varname>v.wait [ms (default 0)]</varname> instructs the
    Valgrind gdbserver to sleep "ms" milliseconds and then
    continue. When sent from a standalone vgdb, if this is the last
    command, the Valgrind process will continue the execution of the
    guest process. The typical usage of this is to use vgdb to send a
    "no-op" command to a Valgrind gdbserver so as to continue the
    execution of the guest process.
    </para>
  </listitem>

  <listitem>
    <para><varname>v.kill</varname> requests the gdbserver to kill
    the process. This can be used from a standalone vgdb to properly
    kill a Valgrind process which is currently expecting a vgdb
    connection.</para>
  </listitem>

  <listitem>
    <para><varname>v.set vgdb-error &lt;errornr&gt;</varname>
    dynamically changes the value of the
    <option>--vgdb-error</option> argument. A
    typical usage of this is to start with
    <option>--vgdb-error=0</option> on the
    command line, then set a few breakpoints, set the vgdb-error value
    to a huge value and continue execution.</para>
  </listitem>

</itemizedlist>

<para>The following Valgrind monitor commands are useful for
investigating the behaviour of Valgrind or its gdbserver in case of
problems or bugs.</para>

<itemizedlist>

  <listitem>
    <para><varname>v.info gdbserver_status</varname> shows the
    gdbserver status. In case of problems (e.g.
of communications),
    this shows the values of some relevant Valgrind gdbserver internal
    variables. Note that the variables related to breakpoints and
    watchpoints (e.g. the number of breakpoint addresses and the number of
    watchpoints) will be zero, as GDB by default removes all
    watchpoints and breakpoints when execution stops, and re-inserts
    them when resuming the execution of the debugged process. You can
    change this GDB behaviour by using the GDB command
    <computeroutput>set breakpoint always-inserted on</computeroutput>.
    </para>
  </listitem>

  <listitem>
    <para><varname>v.info memory</varname> shows the statistics of
    Valgrind's internal heap management. If
    option <option>--profile-heap=yes</option> was given, detailed
    statistics will be output.
    </para>
  </listitem>

  <listitem>
    <para><varname>v.info scheduler</varname> shows the state and
    stack trace for all threads, as known by Valgrind. This allows you to
    compare the stack traces produced by the Valgrind unwinder with
    the stack traces produced by GDB+Valgrind gdbserver. Note
    that GDB and the Valgrind scheduler each have their own thread
    numbering scheme. To make the link between the GDB thread
    number and the corresponding Valgrind scheduler thread number,
    use the GDB command <computeroutput>info
    threads</computeroutput>. The output of this command shows the
    GDB thread number and the Valgrind 'tid'. The 'tid' is the thread number
    output by <computeroutput>v.info scheduler</computeroutput>.
    When using the callgrind tool, the callgrind monitor command
    <computeroutput>status</computeroutput> outputs internal callgrind
    information about the stack/call graph it maintains.
    </para>
  </listitem>

  <listitem>
    <para><varname>v.set debuglog &lt;intvalue&gt;</varname> sets the
    Valgrind debug log level to &lt;intvalue&gt;.
This allows you to
    dynamically change the log level of Valgrind, e.g. when a problem
    is detected.</para>
  </listitem>

  <listitem>
    <para><varname>v.translate &lt;address&gt;
    [&lt;traceflags&gt;]</varname> shows the translation of the block
    containing <computeroutput>address</computeroutput> with the given
    trace flags. The bit patterns of the
    <computeroutput>traceflags</computeroutput> value
    have a similar meaning to those of Valgrind's
    <option>--trace-flags</option> option. The value can be given
    in hexadecimal (e.g. 0x20), in decimal (e.g. 32), or as a binary
    string of 1s and 0s (e.g. 0b00100000). The default value of the traceflags
    is 0b00100000, corresponding to "show after instrumentation".
    The output of this command always goes to the Valgrind
    log.</para>
    <para>The additional bit flag 0b100000000 (bit 8)
    has no equivalent in the <option>--trace-flags</option> option.
    It enables tracing of the gdbserver specific instrumentation. Note
    that this bit 8 can only enable the addition of gdbserver
    instrumentation in the trace. Setting it to 0 will not
    disable the tracing of the gdbserver instrumentation if it is
    active for some other reason, for example because there is a breakpoint at
    this address or because gdbserver is in single stepping
    mode.</para>
  </listitem>

</itemizedlist>

</sect2>

</sect1>





<sect1 id="manual-core-adv.wrapping" xreflabel="Function Wrapping">
<title>Function wrapping</title>

<para>
Valgrind allows calls to some specified functions to be intercepted and
rerouted to a different, user-supplied function. This can do whatever it
likes, typically examining the arguments, calling onwards to the original,
and possibly examining the result. Any number of functions may be
wrapped.</para>

<para>
Function wrapping is useful for instrumenting an API in some way.
For
example, Helgrind wraps functions in the POSIX pthreads API so it can know
about thread status changes, and the core is able to wrap
functions in the MPI (message-passing) API so it can know
of memory status changes associated with message arrival/departure.
Such information is usually passed to Valgrind by using client
requests in the wrapper functions, although the exact mechanism may vary.
</para>

<sect2 id="manual-core-adv.wrapping.example" xreflabel="A Simple Example">
<title>A Simple Example</title>

<para>Suppose we want to wrap some function:</para>

<programlisting><![CDATA[
int foo ( int x, int y ) { return x + y; }]]></programlisting>

<para>A wrapper is a function of identical type, but with a special name
which identifies it as the wrapper for <computeroutput>foo</computeroutput>.
Wrappers need to include
supporting macros from <filename>valgrind.h</filename>.
Here is a simple wrapper which prints the arguments and return value:</para>

<programlisting><![CDATA[
#include <stdio.h>
#include "valgrind.h"
int I_WRAP_SONAME_FNNAME_ZU(NONE,foo)( int x, int y )
{
   int    result;
   OrigFn fn;
   VALGRIND_GET_ORIG_FN(fn);
   printf("foo's wrapper: args %d %d\n", x, y);
   CALL_FN_W_WW(result, fn, x,y);
   printf("foo's wrapper: result %d\n", result);
   return result;
}
]]></programlisting>

<para>To become active, the wrapper merely needs to be present in a text
section somewhere in the same process's address space as the function
it wraps, and for its ELF symbol name to be visible to Valgrind. In
practice, this means either compiling to a
<computeroutput>.o</computeroutput> and linking it in, or
compiling to a <computeroutput>.so</computeroutput> and
<computeroutput>LD_PRELOAD</computeroutput>ing it in.
The latter is more
convenient in that it doesn't require relinking.</para>

<para>All wrappers have approximately the above form. There are three
crucial macros:</para>

<para><computeroutput>I_WRAP_SONAME_FNNAME_ZU</computeroutput>:
this generates the real name of the wrapper.
This is an encoded name which Valgrind notices when reading symbol
table information. What it says is: I am the wrapper for any function
named <computeroutput>foo</computeroutput> which is found in
an ELF shared object with an empty
("<computeroutput>NONE</computeroutput>") soname field. The specification
mechanism is powerful in
that wildcards are allowed for both sonames and function names.
The details are discussed below.</para>

<para><computeroutput>VALGRIND_GET_ORIG_FN</computeroutput>:
once in the wrapper, the first priority is
to get hold of the address of the original (and any other supporting
information needed). This is stored in a value of opaque
type <computeroutput>OrigFn</computeroutput>.
The information is acquired using
<computeroutput>VALGRIND_GET_ORIG_FN</computeroutput>. It is crucial
to make this macro call before calling any other wrapped function
in the same thread.</para>

<para><computeroutput>CALL_FN_W_WW</computeroutput>: eventually we will
want to call the function being
wrapped. Calling it directly does not work, since that just gets us
back to the wrapper and leads to an infinite loop. Instead, the result
lvalue,
<computeroutput>OrigFn</computeroutput> and arguments are
handed to one of a family of macros of the form
<computeroutput>CALL_FN_*</computeroutput>.
These
cause Valgrind to call the original and avoid recursion back to the
wrapper.</para>
</sect2>

<sect2 id="manual-core-adv.wrapping.specs" xreflabel="Wrapping Specifications">
<title>Wrapping Specifications</title>

<para>This scheme has the advantage of being self-contained. A library of
wrappers can be compiled to object code in the normal way, and does
not rely on an external script telling Valgrind which wrappers pertain
to which originals.</para>

<para>Each wrapper has a name which, in the most general case, says: I am the
wrapper for any function whose name matches FNPATT and whose ELF
"soname" matches SOPATT. Both FNPATT and SOPATT may contain wildcards
(asterisks) and other characters (spaces, dots, @, etc.) which are not
generally regarded as valid in C identifier names.</para>

<para>This flexibility is needed to write robust wrappers for POSIX pthread
functions, where typically we are not completely sure of either the
function name or the soname, or alternatively we want to wrap a whole
set of functions at once.</para>

<para>For example, <computeroutput>pthread_create</computeroutput>
in GNU libpthread is usually a
versioned symbol - one whose name ends in, e.g.,
<computeroutput>@GLIBC_2.3</computeroutput>. Hence we
are not sure what its real name is. We also want to cover any soname
of the form <computeroutput>libpthread.so*</computeroutput>.
So the header of the wrapper will be</para>

<programlisting><![CDATA[
int I_WRAP_SONAME_FNNAME_ZZ(libpthreadZdsoZd0,pthreadZucreateZAZa)
  ( ... formals ... )
  { ... body ... }
]]></programlisting>

<para>In order to write unusual characters as valid C function names, a
Z-encoding scheme is used.
Names are written literally, except that
a capital Z acts as an escape character, with the following encoding:</para>

<programlisting><![CDATA[
     Za   encodes    *
     Zp              +
     Zc              :
     Zd              .
     Zu              _
     Zh              -
     Zs              (space)
     ZA              @
     ZZ              Z
     ZL              (       # only in valgrind 3.3.0 and later
     ZR              )       # only in valgrind 3.3.0 and later
]]></programlisting>

<para>Hence <computeroutput>libpthreadZdsoZd0</computeroutput> is an
encoding of the soname <computeroutput>libpthread.so.0</computeroutput>
and <computeroutput>pthreadZucreateZAZa</computeroutput> is an encoding
of the function name <computeroutput>pthread_create@*</computeroutput>.
</para>

<para>The macro <computeroutput>I_WRAP_SONAME_FNNAME_ZZ</computeroutput>
constructs a wrapper name in which
both the soname (first component) and function name (second component)
are Z-encoded. Encoding the function name can be tiresome and is
often unnecessary, so a second macro,
<computeroutput>I_WRAP_SONAME_FNNAME_ZU</computeroutput>, can be
used instead. The <computeroutput>_ZU</computeroutput> variant is
also useful for writing wrappers for
C++ functions, in which the function name is usually already mangled
using some other convention in which Z plays an important role. Having
to encode a second time quickly becomes confusing.</para>

<para>Since the function name field may contain wildcards, it can be
anything, including just <computeroutput>*</computeroutput>.
The same is true for the soname.
However, some ELF objects - specifically, main executables - do not
have sonames.
Any object lacking a soname is treated as if its soname
was <computeroutput>NONE</computeroutput>, which is why the original
example above had a name
<computeroutput>I_WRAP_SONAME_FNNAME_ZU(NONE,foo)</computeroutput>.</para>

<para>Note that the soname of an ELF object is not the same as its
file name, although it is often similar. You can find the soname of
an object <computeroutput>libfoo.so</computeroutput> using the command
<computeroutput>readelf -a libfoo.so | grep soname</computeroutput>.</para>
</sect2>

<sect2 id="manual-core-adv.wrapping.semantics" xreflabel="Wrapping Semantics">
<title>Wrapping Semantics</title>

<para>The ability for a wrapper to replace an infinite family of functions
is powerful but brings complications in situations where ELF objects
appear and disappear (are dlopen'd and dlclose'd) on the fly.
Valgrind tries to maintain sensible behaviour in such situations.</para>

<para>For example, suppose a process has dlopened (an ELF object with
soname) <filename>object1.so</filename>, which contains
<computeroutput>function1</computeroutput>. It starts to use
<computeroutput>function1</computeroutput> immediately.</para>

<para>After a while it dlopens <filename>wrappers.so</filename>,
which contains a wrapper
for <computeroutput>function1</computeroutput> in (soname)
<filename>object1.so</filename>.
All subsequent calls to
<computeroutput>function1</computeroutput> are rerouted to the wrapper.</para>

<para>If <filename>wrappers.so</filename> is
later dlclose'd, calls to <computeroutput>function1</computeroutput> are
naturally routed back to the original.</para>

<para>Alternatively, if <filename>object1.so</filename>
is dlclose'd but <filename>wrappers.so</filename> remains,
then the wrapper exported by <filename>wrappers.so</filename>
becomes inactive, since there
is no way to get to it - there is no original to call any more. However,
Valgrind remembers that the wrapper is still present. If
<filename>object1.so</filename> is
eventually dlopen'd again, the wrapper will become active again.</para>

<para>In short, Valgrind inspects all code loading/unloading events to
ensure that the set of currently active wrappers remains consistent.</para>

<para>A second possible problem is that of conflicting wrappers. It is
easily possible to load two or more wrappers, all of which claim
to be wrappers for some third function. In such cases Valgrind will
complain about conflicting wrappers when the second one appears, and
will honour only the first one.</para>
</sect2>

<sect2 id="manual-core-adv.wrapping.debugging" xreflabel="Debugging">
<title>Debugging</title>

<para>Figuring out what's going on given the dynamic nature of wrapping
can be difficult. The
<option>--trace-redir=yes</option> option makes
this possible
by showing the complete state of the redirection subsystem after
every
<function>mmap</function>/<function>munmap</function>
event affecting code (text).</para>

<para>There are two central concepts:</para>

<itemizedlist>

<listitem><para>A "redirection specification" is a binding of
a (soname pattern, fnname pattern) pair to a code address.
These bindings are created by writing functions with names
made with the
<computeroutput>I_WRAP_SONAME_FNNAME_{ZZ,ZU}</computeroutput>
macros.</para></listitem>

<listitem><para>An "active redirection" is a code-address to
code-address binding currently in effect.</para></listitem>

</itemizedlist>

<para>The state of the wrapping-and-redirection subsystem comprises a set of
specifications and a set of active bindings. The specifications are
acquired/discarded by watching all
<function>mmap</function>/<function>munmap</function>
events on code (text)
sections. The active binding set is (conceptually) recomputed from
the specifications, and all known symbol names, following any change
to the specification set.</para>

<para><option>--trace-redir=yes</option> shows the contents
of both sets following any such event.</para>

<para><option>-v</option> prints a line of text each
time an active specification is used for the first time.</para>

<para>Hence for maximum debugging effectiveness you will need to use both
options.</para>

<para>One final comment. The function-wrapping facility is closely
tied to Valgrind's ability to replace (redirect) specified
functions, for example to redirect calls to
<function>malloc</function> to its
own implementation. Indeed, a replacement function can be
regarded as a wrapper function which does not call the original.
However, to make the implementation more robust, the two kinds
of interception (wrapping vs replacement) are treated differently.
</para>

<para><option>--trace-redir=yes</option> shows
specifications and bindings for both
replacement and wrapper functions.
To differentiate the
two, replacement bindings are printed using
<computeroutput>R-></computeroutput> whereas
wraps are printed using <computeroutput>W-></computeroutput>.
</para>
</sect2>


<sect2 id="manual-core-adv.wrapping.limitations-cf"
xreflabel="Limitations - control flow">
<title>Limitations - control flow</title>

<para>For the most part, the function wrapping implementation is robust.
The only important caveat is: in a wrapper, get hold of
the <computeroutput>OrigFn</computeroutput> information using
<computeroutput>VALGRIND_GET_ORIG_FN</computeroutput> before calling any
other wrapped function. Once you have the
<computeroutput>OrigFn</computeroutput>, arbitrary
calls between, recursion between, and longjumps out of wrappers
should work correctly. There is never any interaction between wrapped
functions and merely replaced functions
(e.g. <function>malloc</function>), so you can call
<function>malloc</function> etc. safely from within wrappers.
</para>

<para>The above comments are true for {x86,amd64,ppc32,arm}-linux. On
ppc64-linux function wrapping is more fragile due to the (arguably
poorly designed) ppc64-linux ABI. This mandates the use of a shadow
stack which tracks entries/exits of both wrapper and replacement
functions. This gives two limitations: firstly, longjumping out of
wrappers will rapidly lead to disaster, since the shadow stack will
not get correctly cleared. Secondly, since the shadow stack has
finite size, recursion between wrapper/replacement functions is only
possible to a limited depth, beyond which Valgrind has to abort the
run. This depth is currently 16 calls.</para>

<para>For all platforms ({x86,amd64,ppc32,ppc64,arm}-linux) all the above
comments apply on a per-thread basis.
In other words, wrapping is
thread-safe: each thread must individually observe the above
restrictions, but there is no need for any kind of inter-thread
cooperation.</para>
</sect2>


<sect2 id="manual-core-adv.wrapping.limitations-sigs"
xreflabel="Limitations - original function signatures">
<title>Limitations - original function signatures</title>

<para>As shown in the above example, to call the original you must use a
macro of the form <computeroutput>CALL_FN_*</computeroutput>.
For technical reasons it is impossible
to create a single macro to deal with all argument types and numbers,
so a family of macros covering the most common cases is supplied. In
what follows, 'W' denotes a machine-word-typed value (a pointer or a
C <computeroutput>long</computeroutput>),
and 'v' denotes C's <computeroutput>void</computeroutput> type.
The currently available macros are:</para>

<programlisting><![CDATA[
CALL_FN_v_v    -- call an original of type  void fn ( void )
CALL_FN_W_v    -- call an original of type  long fn ( void )

CALL_FN_v_W    -- call an original of type  void fn ( long )
CALL_FN_W_W    -- call an original of type  long fn ( long )

CALL_FN_v_WW   -- call an original of type  void fn ( long, long )
CALL_FN_W_WW   -- call an original of type  long fn ( long, long )

CALL_FN_v_WWW  -- call an original of type  void fn ( long, long, long )
CALL_FN_W_WWW  -- call an original of type  long fn ( long, long, long )

CALL_FN_W_WWWW -- call an original of type  long fn ( long, long, long, long )
CALL_FN_W_5W   -- call an original of type  long fn ( long, long, long, long, long )
CALL_FN_W_6W   -- call an original of type  long fn ( long, long, long, long, long, long )
and so on, up to
CALL_FN_W_12W
]]></programlisting>

<para>The set of supported types can be expanded as needed.
It is
regrettable that this limitation exists. Function wrapping has proven
difficult to implement, with a certain apparently unavoidable level of
ickiness. After several implementation attempts, the present
arrangement appears to be the least-worst tradeoff. At least it works
reliably in the presence of dynamic linking and dynamic code
loading/unloading.</para>

<para>You should not attempt to wrap a function of one type signature with a
wrapper of a different type signature. Such trickery will surely lead
to crashes or strange behaviour. This is not a limitation
of the function wrapping implementation, merely a reflection of the
fact that it gives you sweeping powers to shoot yourself in the foot
if you are not careful. Imagine the instant havoc you could wreak by
writing a wrapper which matched any function name in any soname - in
effect, one which claimed to be a wrapper for all functions in the
process.</para>
</sect2>

<sect2 id="manual-core-adv.wrapping.examples" xreflabel="Examples">
<title>Examples</title>

<para>In the source tree,
<filename>memcheck/tests/wrap[1-8].c</filename> provide a series of
examples, ranging from very simple to quite advanced.</para>

<para><filename>mpi/libmpiwrap.c</filename> is an example
of wrapping a big, complex API (the MPI-2 interface). This file defines
almost 300 different wrappers.</para>
</sect2>

</sect1>




</chapter>