<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<link href="style.css" rel="stylesheet" type="text/css" />
<title>Symbolicating with LLDB</title>
</head>

<body>
<div class="www_title">
The <strong>LLDB</strong> Debugger
</div>

<div id="container">
<div id="content">

<!--#include virtual="sidebar.incl"-->

<div id="middle">
<div class="post">
<h1 class="postheader">Manual Symbolication with LLDB</h1>
<div class="postcontent">
<p>LLDB is separated into a shared library that contains the core of the debugger,
and a driver that implements debugging and a command interpreter. LLDB can be
used to symbolicate your crash logs and can often provide more information than
other symbolication programs:
</p>
<ul>
<li>Inlined functions</li>
<li>Variables that are in scope for an address, along with their locations</li>
</ul>
<p>The simplest form of symbolication is to load an executable:</p>
<code><pre><tt><b>(lldb)</b> target create --no-dependents --arch x86_64 /tmp/a.out
</tt></pre></code>
<p>We use the "--no-dependents" flag with the "target create" command so
that we don't load all of the dependent shared libraries from the current
system. When we symbolicate, we are often symbolicating a binary that
was running on another system, and even though the main executable might
reference shared libraries in "/usr/lib", we often don't want to load
the versions on the current computer.</p>
<p>Using the "image list" command will show us a list of all shared libraries
associated with the current target.
As expected, we currently only have a single
binary:
</p>
<code><pre><tt><b>(lldb)</b> image list
[ 0] 73431214-6B76-3489-9557-5075F03E36B4 0x0000000100000000 /tmp/a.out
      /tmp/a.out.dSYM/Contents/Resources/DWARF/a.out
</tt></pre></code>

<p>Now we can look up an address:</p>
<code><pre><tt><b>(lldb)</b> image lookup --address 0x100000aa3
      Address: a.out[0x0000000100000aa3] (a.out.__TEXT.__text + 131)
      Summary: a.out`main + 67 at main.c:13
</tt></pre></code>
<p>Since we haven't specified a slide or any load addresses for individual sections
in the binary, the address that we use here is a <b>file</b> address. A <b>file</b>
address refers to a virtual address as defined by each object file.
</p>
<p>If we hadn't used the "--no-dependents" option with "target create", we would
have loaded all dependent shared libraries:</p>
<code><pre><tt><b>(lldb)</b> image list
[ 0] 73431214-6B76-3489-9557-5075F03E36B4 0x0000000100000000 /tmp/a.out
      /tmp/a.out.dSYM/Contents/Resources/DWARF/a.out
[ 1] 8CBCF9B9-EBB7-365E-A3FF-2F3850763C6B 0x0000000000000000 /usr/lib/system/libsystem_c.dylib
[ 2] 62AA0B84-188A-348B-8F9E-3E2DB08DB93C 0x0000000000000000 /usr/lib/system/libsystem_dnssd.dylib
[ 3] C0535565-35D1-31A7-A744-63D9F10F12A4 0x0000000000000000 /usr/lib/system/libsystem_kernel.dylib
...
</tt></pre></code>


<p>Now if we do a lookup using a <b>file</b> address, this can result in multiple
matches since most shared libraries have a virtual address space that starts at zero:</p>
<code><pre><tt><b>(lldb)</b> image lookup -a 0x1000
      Address: a.out[0x0000000000001000] (a.out.__PAGEZERO + 4096)

      Address: libsystem_c.dylib[0x0000000000001000] (libsystem_c.dylib.__TEXT.__text + 928)
      Summary: libsystem_c.dylib`mcount + 9

      Address: libsystem_dnssd.dylib[0x0000000000001000] (libsystem_dnssd.dylib.__TEXT.__text + 456)
      Summary: libsystem_dnssd.dylib`ConvertHeaderBytes + 38

      Address: libsystem_kernel.dylib[0x0000000000001000] (libsystem_kernel.dylib.__TEXT.__text + 1116)
      Summary: libsystem_kernel.dylib`clock_get_time + 102
...
</tt></pre></code>
<p>To avoid getting multiple file address matches, you can specify the
<b>name</b> of the shared library to limit the search:</p>
<code><pre><tt><b>(lldb)</b> image lookup -a 0x1000 <b>a.out</b>
      Address: a.out[0x0000000000001000] (a.out.__PAGEZERO + 4096)
</tt></pre></code>
</div>
<div class="postfooter"></div>
</div>
<div class="post">
<h1 class="postheader">Defining Load Addresses for Sections</h1>
<div class="postcontent">
<p>When symbolicating your crash logs, it can be tedious if you always have to
convert the addresses in your crash log into file addresses. To avoid having to do any
conversion, you can set the load address for the sections of the modules in your target.
Once you set any section load address, lookups will switch to using
<b>load</b> addresses. You can slide all sections in the executable by the same amount,
or set the <b>load</b> address for individual sections. The
"target modules load --slide" command allows us to set the <b>load</b> address for
all sections.</p>
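<p>To make the relationship between a slide, <b>file</b> addresses and <b>load</b> addresses concrete, the following Python sketch works through the arithmetic with addresses taken from this document's a.out examples. The helper names are invented for illustration and are not part of LLDB:</p>

```python
# Illustrative arithmetic only; these helpers are hypothetical, not LLDB API.

def compute_slide(text_load_addr, text_file_addr):
    """Slide = where __TEXT was loaded minus where the file says it lives."""
    return text_load_addr - text_file_addr

def load_to_file_address(load_addr, slide):
    """Convert a crash log (load) address back into a file address."""
    return load_addr - slide

# __TEXT has file address 0x100000000 and was loaded at 0x100123000.
slide = compute_slide(0x100123000, 0x100000000)

# A backtrace address from the running process maps back to a file address.
file_addr = load_to_file_address(0x100123aa3, slide)

print(hex(slide), hex(file_addr))  # prints: 0x123000 0x100000aa3
```

<p>Setting section load addresses in the target lets LLDB do this conversion for you, so this calculation is only needed when you work with <b>file</b> addresses by hand.</p>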
<p>Below is an example of sliding all sections in <b>a.out</b> by adding 0x123000 to each section's <b>file</b> address:</p>
<code><pre><tt><b>(lldb)</b> target create --no-dependents --arch x86_64 /tmp/a.out
<b>(lldb)</b> target modules load --file a.out --slide 0x123000
</tt></pre></code>
<p>It is often much easier to specify the actual load location of each section by name.
Crash logs on Mac OS X have a <b>Binary Images</b> section that specifies
the address of the __TEXT segment for each binary. Specifying a slide
requires that you first find the original (<b>file</b>) address for the __TEXT
segment and subtract it from the load address.
If you specify the
address of the __TEXT segment with "target modules load <i>section</i> <i>address</i>", you don't need to do any calculations. To specify
the load addresses of sections we can specify one or more section name + address pairs
in the "target modules load" command:</p>
<code><pre><tt><b>(lldb)</b> target create --no-dependents --arch x86_64 /tmp/a.out
<b>(lldb)</b> target modules load --file a.out __TEXT 0x100123000
</tt></pre></code>
<p>We specified that the <b>__TEXT</b> section is loaded at 0x100123000.
Now that we have defined where sections have been loaded in our target,
any lookups we do will use <b>load</b> addresses, so we don't have to
do any math on the addresses in the crash log backtraces; we can just use the
raw addresses:</p>
<code><pre><tt><b>(lldb)</b> image lookup --address 0x100123aa3
      Address: a.out[0x0000000100000aa3] (a.out.__TEXT.__text + 131)
      Summary: a.out`main + 67 at main.c:13
</tt></pre></code>
</div>
<div class="postfooter"></div>
</div>
<div class="post">
<h1 class="postheader">Loading Multiple Executables</h1>
<div class="postcontent">
<p>You often have more than one executable involved when you need to symbolicate
a crash log.
When this happens, you create a target for the main executable
or one of the shared libraries, then add more modules to the target using the
"target modules add" command.</p>
<p>Let's say we have a Darwin crash log that contains the following images:</p>
<code><pre><tt>Binary Images:
    <font color=blue>0x100000000</font> -    0x100000ff7 &lt;A866975B-CA1E-3649-98D0-6C5FAA444ECF&gt; /tmp/a.out
 <font color=green>0x7fff83f32000</font> - 0x7fff83ffefe7 &lt;8CBCF9B9-EBB7-365E-A3FF-2F3850763C6B&gt; /usr/lib/system/libsystem_c.dylib
 <font color=red>0x7fff883db000</font> - 0x7fff883e3ff7 &lt;62AA0B84-188A-348B-8F9E-3E2DB08DB93C&gt; /usr/lib/system/libsystem_dnssd.dylib
 <font color=purple>0x7fff8c0dc000</font> - 0x7fff8c0f7ff7 &lt;C0535565-35D1-31A7-A744-63D9F10F12A4&gt; /usr/lib/system/libsystem_kernel.dylib
</tt></pre></code>

<p>First we create the target using the main executable and then add any extra shared libraries we want:</p>
<code><pre><tt><b>(lldb)</b> target create --no-dependents --arch x86_64 /tmp/a.out
<b>(lldb)</b> target modules add /usr/lib/system/libsystem_c.dylib
<b>(lldb)</b> target modules add /usr/lib/system/libsystem_dnssd.dylib
<b>(lldb)</b> target modules add /usr/lib/system/libsystem_kernel.dylib
</tt></pre></code>
<p>If you have debug symbols in standalone files, such as dSYM files on Mac OS X, you can specify their paths using the <b>--symfile</b> option for the "target create" (recent LLDB releases only) and "target modules add" commands:</p>
<code><pre><tt><b>(lldb)</b> target create --no-dependents --arch x86_64 /tmp/a.out <b>--symfile /tmp/a.out.dSYM</b>
<b>(lldb)</b> target modules add /usr/lib/system/libsystem_c.dylib <b>--symfile /build/server/a/libsystem_c.dylib.dSYM</b>
<b>(lldb)</b> target modules add /usr/lib/system/libsystem_dnssd.dylib <b>--symfile /build/server/b/libsystem_dnssd.dylib.dSYM</b>
<b>(lldb)</b> target modules add /usr/lib/system/libsystem_kernel.dylib <b>--symfile /build/server/c/libsystem_kernel.dylib.dSYM</b>
</tt></pre></code>
<p>Then we set the load address of the __TEXT section of each image (note the colors of the load addresses above and below), using the first address listed for that image in the Binary Images section:</p>
<code><pre><tt><b>(lldb)</b> target modules load --file a.out __TEXT <font color=blue>0x100000000</font>
<b>(lldb)</b> target modules load --file libsystem_c.dylib __TEXT <font color=green>0x7fff83f32000</font>
<b>(lldb)</b> target modules load --file libsystem_dnssd.dylib __TEXT <font color=red>0x7fff883db000</font>
<b>(lldb)</b> target modules load --file libsystem_kernel.dylib __TEXT <font color=purple>0x7fff8c0dc000</font>
</tt></pre></code>
<p>Now any stack backtraces that haven't been symbolicated can be symbolicated using "image lookup"
with the raw backtrace addresses.</p>
<p>Given the following raw backtrace:</p>
<code><pre><tt>Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib    0x00007fff8a1e6d46 __kill + 10
1   libsystem_c.dylib         0x00007fff84597df0 abort + 177
2   libsystem_c.dylib         0x00007fff84598e2a __assert_rtn + 146
3   a.out                     0x0000000100000f46 main + 70
4   libdyld.dylib             0x00007fff8c4197e1 start + 1
</tt></pre></code>
<p>We can now symbolicate the <b>load</b> addresses:</p>
<code><pre><tt><b>(lldb)</b> image lookup -a 0x00007fff8a1e6d46
<b>(lldb)</b> image lookup -a 0x00007fff84597df0
<b>(lldb)</b> image lookup -a 0x00007fff84598e2a
<b>(lldb)</b> image lookup -a 0x0000000100000f46
</tt></pre></code>
</div>
<div class="postfooter"></div>
</div>
<div class="post">
<h1 class="postheader">Getting Variable Information</h1>
<div class="postcontent">
<p>If you add the --verbose flag to the "image lookup --address" command,
you can get verbose information which can often include the locations
of some of your local variables:</p>
<code><pre><tt><b>(lldb)</b> image lookup --address 0x100123aa3 --verbose
      Address: a.out[0x0000000100000aa3] (a.out.__TEXT.__text + 110)
      Summary: a.out`main + 50 at main.c:13
       Module: file = "/tmp/a.out", arch = "x86_64"
  CompileUnit: id = {0x00000000}, file = "/tmp/main.c", language = "ISO C:1999"
     Function: id = {0x0000004f}, name = "main", range = [0x0000000100000bc0-0x0000000100000dc9)
     FuncType: id = {0x0000004f}, decl = main.c:9, clang_type = "int (int, const char **, const char **, const char **)"
       Blocks: id = {0x0000004f}, range = [0x100000bc0-0x100000dc9)
               id = {0x000000ae}, range = [0x100000bf2-0x100000dc4)
    LineEntry: [0x0000000100000bf2-0x0000000100000bfa): /tmp/main.c:13:23
       Symbol: id = {0x00000004}, range = [0x0000000100000bc0-0x0000000100000dc9), name="main"
     Variable: id = {0x000000bf}, name = "path", type= "char [1024]", location = DW_OP_fbreg(-1072), decl = main.c:28
     Variable: id = {0x00000072}, name = "argc", type= "int", <b>location = r13</b>, decl = main.c:8
     Variable: id = {0x00000081}, name = "argv", type= "const char **", <b>location = r12</b>, decl = main.c:8
     Variable: id = {0x00000090}, name = "envp", type= "const char **", <b>location = r15</b>, decl = main.c:8
     Variable: id = {0x0000009f}, name = "aapl", type= "const char **", <b>location = rbx</b>, decl = main.c:8
</tt></pre></code>
<p>The interesting part is the list of variables. These are
the parameters and local variables that are in scope for the address that
was specified. Each variable entry has a location; the register locations are shown in bold
above. Crash logs often have register information for the first frame in each
stack, and being able to reconstruct one or more local variables can often
help you decipher more information from a crash log than you normally would be
able to. Note that this is really only useful for the first frame, and only if
your crash logs have register information for your threads.</p>
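<p>As a rough sketch of how the register locations above could be paired with the register state from a crash log, the Python below parses a register dump and looks up the value of each register-based variable. The register dump text and its values are made up for illustration, and <b>parse_registers</b> is a hypothetical helper, not part of LLDB:</p>

```python
import re

# Matches "name: 0xVALUE" pairs as they commonly appear in a crash log's
# thread state section. The exact layout is an assumption for this sketch.
REGISTER_RE = re.compile(r"\b([a-z][a-z0-9]*):\s+(0x[0-9a-fA-F]+)")

def parse_registers(text):
    """Return a dict mapping register names to integer values."""
    return {name: int(value, 16) for name, value in REGISTER_RE.findall(text)}

# Made-up register state for the crashed thread's first frame.
sample = """
  rbx: 0x00007fff5fbffa60  r12: 0x00007fff5fbffb08
  r13: 0x0000000000000001  r15: 0x00007fff5fbffb18
"""

# Variable locations copied from the verbose "image lookup" output.
variable_locations = {"argc": "r13", "argv": "r12", "envp": "r15", "aapl": "rbx"}

registers = parse_registers(sample)
values = {name: registers[reg] for name, reg in variable_locations.items()}

for name, reg in variable_locations.items():
    print(f"{name} ({reg}) = {values[name]:#x}")
```

<p>Variables whose location is not a plain register, such as "path" with its frame-base offset, would additionally need memory contents that a crash log usually does not provide.</p>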
</div>
<div class="postfooter"></div>
</div>
<div class="post">
<h1 class="postheader">Using Python API to Symbolicate</h1>
<div class="postcontent">
<p>All of the commands above can be done through the Python script bridge. The code below
will recreate the target and add the three shared libraries that we added in the Darwin
crash log example above:</p>
<code><pre><tt><font color=green># Run inside LLDB's embedded script interpreter, where lldb.debugger is set</font>
import lldb

triple = "x86_64-apple-macosx"
platform_name = None
add_dependents = False
target = lldb.debugger.CreateTarget("/tmp/a.out", triple, platform_name, add_dependents, lldb.SBError())
if target:
    <font color=green># Get the executable module</font>
    module = target.GetModuleAtIndex(0)
    target.SetSectionLoadAddress(module.FindSection("__TEXT"), 0x100000000)
    module = target.AddModule("/usr/lib/system/libsystem_c.dylib", triple, None, "/build/server/a/libsystem_c.dylib.dSYM")
    target.SetSectionLoadAddress(module.FindSection("__TEXT"), 0x7fff83f32000)
    module = target.AddModule("/usr/lib/system/libsystem_dnssd.dylib", triple, None, "/build/server/b/libsystem_dnssd.dylib.dSYM")
    target.SetSectionLoadAddress(module.FindSection("__TEXT"), 0x7fff883db000)
    module = target.AddModule("/usr/lib/system/libsystem_kernel.dylib", triple, None, "/build/server/c/libsystem_kernel.dylib.dSYM")
    target.SetSectionLoadAddress(module.FindSection("__TEXT"), 0x7fff8c0dc000)

    load_addr = 0x00007fff8a1e6d46
    <font color=green># so_addr is a section offset address, or a lldb.SBAddress object</font>
    so_addr = target.ResolveLoadAddress(load_addr)
    <font color=green># Get a symbol context for the section offset address which includes
    # a module, compile unit, function, block, line entry, and symbol</font>
    sym_ctx = so_addr.GetSymbolContext(lldb.eSymbolContextEverything)
    print(sym_ctx)

</tt></pre></code>
</div>
<div class="postfooter"></div>
</div>
<div class="post">
<h1 class="postheader">Use
Builtin Python module to Symbolicate</h1>
<div class="postcontent">
<p>LLDB includes a module in the <b>lldb</b> package named <b>lldb.utils.symbolication</b>.
This module simplifies the symbolication process by providing classes that
represent the objects involved in symbolication, such as:</p>
<ul>
<li>lldb.utils.symbolication.Address</li>
<li>lldb.utils.symbolication.Section</li>
<li>lldb.utils.symbolication.Image</li>
<li>lldb.utils.symbolication.Symbolicator</li>
</ul>
<h2>lldb.utils.symbolication.Address</h2>
<p>This class represents an address that will be symbolicated. It will cache any information
that has been looked up: module, compile unit, function, block, line entry, symbol.
It does this by having a lldb.SBSymbolContext as a member variable.
</p>
<h2>lldb.utils.symbolication.Section</h2>
<p>This class represents a section that might get loaded in a <b>lldb.utils.symbolication.Image</b>.
It has helper functions that allow you to set it from text that might have been extracted from
a crash log file.
</p>
<h2>lldb.utils.symbolication.Image</h2>
<p>This class represents a module that might get loaded into the target we use for symbolication.
This class contains the executable path, optional symbol file path, the triple, and the list of sections that will need to be loaded
if we choose to ask the target to load this image. Many of these objects will never be loaded
into the target unless they are needed by symbolication. You often have a crash log that has
100 to 200 different shared libraries loaded, but your crash log stack backtraces only use a few
of these shared libraries. Only the images that contain stack backtrace addresses need to be loaded
in the target in order to symbolicate.
</p>
<p>Subclasses of this class will want to override the <b>locate_module_and_debug_symbols</b> method:</p>
<code><pre><tt>class CustomImage(lldb.utils.symbolication.Image):
    def locate_module_and_debug_symbols(self):
        <font color=green># Locate the module and symbol given the info found in the crash log</font>
        raise NotImplementedError("site-specific lookup goes here")
</tt></pre></code>
<p>Overriding this function allows clients to find the correct executable module and symbol files as they might reside on a build server.</p>
<h2>lldb.utils.symbolication.Symbolicator</h2>
<p>This class coordinates the symbolication process by loading only the <b>lldb.utils.symbolication.Image</b>
instances that need to be loaded in order to symbolicate a supplied address.
</p>
<h2>lldb.macosx.crashlog</h2>
<p><b>lldb.macosx.crashlog</b> is a package, distributed with Mac OS X builds, that subclasses the above classes.
This module parses the information in the Darwin crash logs and creates symbolication objects that
represent the images, the sections and the thread frames for the backtraces. It then uses the functions
in the lldb.utils.symbolication module to symbolicate the crash logs.</p>
<p>
This module installs a new "crashlog" command into the lldb command interpreter so that you can use
it to parse and symbolicate Mac OS X crash logs:</p>
<code><pre><tt><b>(lldb)</b> command script import lldb.macosx.crashlog
"crashlog" and "save_crashlog" command installed, use the "--help" option for detailed help
<b>(lldb)</b> crashlog /tmp/crash.log
...
</tt></pre></code>
<p>The command that is installed has built-in help that shows the
options that can be used when symbolicating:</p>
<code><pre><tt><b>(lldb)</b> crashlog --help
Usage: crashlog [options] &lt;FILE&gt; [FILE ...]

Symbolicate one or more darwin crash log files to provide source file and line
information, inlined stack frames back to the concrete functions, and
disassemble the location of the crash for the first frame of the crashed
thread. If this script is imported into the LLDB command interpreter, a
"crashlog" command will be added to the interpreter for use at the LLDB
command line. After a crash log has been parsed and symbolicated, a target
will have been created that has all of the shared libraries loaded at the load
addresses found in the crash log file. This allows you to explore the program
as if it were stopped at the locations described in the crash log and
functions can be disassembled and lookups can be performed using the
addresses found in the crash log.

Options:
  -h, --help            show this help message and exit
  -v, --verbose         display verbose debug info
  -g, --debug           display verbose debug logging
  -a, --load-all        load all executable images, not just the images found
                        in the crashed stack frames
  --images              show image list
  --debug-delay=NSEC    pause for NSEC seconds for debugger
  -c, --crashed-only    only symbolicate the crashed thread
  -d DISASSEMBLE_DEPTH, --disasm-depth=DISASSEMBLE_DEPTH
                        set the depth in stack frames that should be
                        disassembled (default is 1)
  -D, --disasm-all      enabled disassembly of frames on all threads (not just
                        the crashed thread)
  -B DISASSEMBLE_BEFORE, --disasm-before=DISASSEMBLE_BEFORE
                        the number of instructions to disassemble before the
                        frame PC
  -A DISASSEMBLE_AFTER, --disasm-after=DISASSEMBLE_AFTER
                        the number of instructions to disassemble after the
                        frame PC
  -C NLINES, --source-context=NLINES
                        show NLINES source lines of source context (default =
                        4)
  --source-frames=NFRAMES
                        show source for NFRAMES (default = 4)
  --source-all          show source for all threads, not just the crashed
                        thread
  -i, --interactive     parse all crash logs and enter interactive mode

</tt></pre></code>
<p>The source for the "symbolication" and "crashlog" modules is available in SVN:</p>
<ul>
<li><a href="http://llvm.org/svn/llvm-project/lldb/trunk/examples/python/symbolication.py">symbolication.py</a></li>
<li><a href="http://llvm.org/svn/llvm-project/lldb/trunk/examples/python/crashlog.py">crashlog.py</a></li>
</ul>
</div>
<div class="postfooter"></div>
</div>
</div>
</body>
</html>