1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" 2 "http://www.w3.org/TR/html4/strict.dtd"> 3 <html> 4 <head> 5 <meta http-equiv="Content-type" content="text/html;charset=UTF-8"> 6 <title>LLVM Programmer's Manual</title> 7 <link rel="stylesheet" href="_static/llvm.css" type="text/css"> 8 </head> 9 <body> 10 11 <h1> 12 LLVM Programmer's Manual 13 </h1> 14 15 <ol> 16 <li><a href="#introduction">Introduction</a></li> 17 <li><a href="#general">General Information</a> 18 <ul> 19 <li><a href="#stl">The C++ Standard Template Library</a></li> 20 <!-- 21 <li>The <tt>-time-passes</tt> option</li> 22 <li>How to use the LLVM Makefile system</li> 23 <li>How to write a regression test</li> 24 25 --> 26 </ul> 27 </li> 28 <li><a href="#apis">Important and useful LLVM APIs</a> 29 <ul> 30 <li><a href="#isa">The <tt>isa<></tt>, <tt>cast<></tt> 31 and <tt>dyn_cast<></tt> templates</a> </li> 32 <li><a href="#string_apis">Passing strings (the <tt>StringRef</tt> 33 and <tt>Twine</tt> classes)</a> 34 <ul> 35 <li><a href="#StringRef">The <tt>StringRef</tt> class</a> </li> 36 <li><a href="#Twine">The <tt>Twine</tt> class</a> </li> 37 </ul> 38 </li> 39 <li><a href="#DEBUG">The <tt>DEBUG()</tt> macro and <tt>-debug</tt> 40 option</a> 41 <ul> 42 <li><a href="#DEBUG_TYPE">Fine grained debug info with <tt>DEBUG_TYPE</tt> 43 and the <tt>-debug-only</tt> option</a> </li> 44 </ul> 45 </li> 46 <li><a href="#Statistic">The <tt>Statistic</tt> class & <tt>-stats</tt> 47 option</a></li> 48 <!-- 49 <li>The <tt>InstVisitor</tt> template 50 <li>The general graph API 51 --> 52 <li><a href="#ViewGraph">Viewing graphs while debugging code</a></li> 53 </ul> 54 </li> 55 <li><a href="#datastructure">Picking the Right Data Structure for a Task</a> 56 <ul> 57 <li><a href="#ds_sequential">Sequential Containers (std::vector, std::list, etc)</a> 58 <ul> 59 <li><a href="#dss_arrayref">llvm/ADT/ArrayRef.h</a></li> 60 <li><a href="#dss_fixedarrays">Fixed Size Arrays</a></li> 61 <li><a href="#dss_heaparrays">Heap Allocated Arrays</a></li> 62 <li><a href="#dss_tinyptrvector">"llvm/ADT/TinyPtrVector.h"</a></li> 63 <li><a href="#dss_smallvector">"llvm/ADT/SmallVector.h"</a></li> 64 <li><a href="#dss_vector"><vector></a></li> 65 <li><a href="#dss_deque"><deque></a></li> 66 <li><a href="#dss_list"><list></a></li> 67 <li><a href="#dss_ilist">llvm/ADT/ilist.h</a></li> 68 <li><a href="#dss_packedvector">llvm/ADT/PackedVector.h</a></li> 69 <li><a href="#dss_other">Other Sequential Container Options</a></li> 70 </ul></li> 71 <li><a href="#ds_string">String-like containers</a> 72 <ul> 73 <li><a href="#dss_stringref">llvm/ADT/StringRef.h</a></li> 74 <li><a href="#dss_twine">llvm/ADT/Twine.h</a></li> 75 <li><a href="#dss_smallstring">llvm/ADT/SmallString.h</a></li> 76 <li><a href="#dss_stdstring">std::string</a></li> 77 </ul></li> 78 <li><a href="#ds_set">Set-Like Containers (std::set, SmallSet, SetVector, etc)</a> 79 <ul> 80 <li><a href="#dss_sortedvectorset">A sorted 'vector'</a></li> 81 <li><a href="#dss_smallset">"llvm/ADT/SmallSet.h"</a></li> 82 <li><a href="#dss_smallptrset">"llvm/ADT/SmallPtrSet.h"</a></li> 83 <li><a href="#dss_denseset">"llvm/ADT/DenseSet.h"</a></li> 84 <li><a href="#dss_sparseset">"llvm/ADT/SparseSet.h"</a></li> 85 <li><a href="#dss_FoldingSet">"llvm/ADT/FoldingSet.h"</a></li> 86 <li><a href="#dss_set"><set></a></li> 87 <li><a href="#dss_setvector">"llvm/ADT/SetVector.h"</a></li> 88 <li><a href="#dss_uniquevector">"llvm/ADT/UniqueVector.h"</a></li> 89 <li><a href="#dss_immutableset">"llvm/ADT/ImmutableSet.h"</a></li> 90 <li><a href="#dss_otherset">Other Set-Like Container Options</a></li> 91 </ul></li> 92 <li><a href="#ds_map">Map-Like Containers (std::map, DenseMap, etc)</a> 93 <ul> 94 <li><a href="#dss_sortedvectormap">A sorted 'vector'</a></li> 95 <li><a href="#dss_stringmap">"llvm/ADT/StringMap.h"</a></li> 96 <li><a href="#dss_indexedmap">"llvm/ADT/IndexedMap.h"</a></li> 97 <li><a href="#dss_densemap">"llvm/ADT/DenseMap.h"</a></li> 98 <li><a href="#dss_valuemap">"llvm/ADT/ValueMap.h"</a></li> 99 <li><a href="#dss_intervalmap">"llvm/ADT/IntervalMap.h"</a></li> 100 <li><a href="#dss_map"><map></a></li> 101 <li><a href="#dss_inteqclasses">"llvm/ADT/IntEqClasses.h"</a></li> 102 <li><a href="#dss_immutablemap">"llvm/ADT/ImmutableMap.h"</a></li> 103 <li><a href="#dss_othermap">Other Map-Like Container Options</a></li> 104 </ul></li> 105 <li><a href="#ds_bit">BitVector-like containers</a> 106 <ul> 107 <li><a href="#dss_bitvector">A dense bitvector</a></li> 108 <li><a href="#dss_smallbitvector">A "small" dense bitvector</a></li> 109 <li><a href="#dss_sparsebitvector">A sparse bitvector</a></li> 110 </ul></li> 111 </ul> 112 </li> 113 <li><a href="#common">Helpful Hints for Common Operations</a> 114 <ul> 115 <li><a href="#inspection">Basic Inspection and Traversal Routines</a> 116 <ul> 117 <li><a href="#iterate_function">Iterating over the <tt>BasicBlock</tt>s 118 in a <tt>Function</tt></a> </li> 119 <li><a href="#iterate_basicblock">Iterating over the <tt>Instruction</tt>s 120 in a <tt>BasicBlock</tt></a> </li> 121 <li><a href="#iterate_institer">Iterating over the <tt>Instruction</tt>s 122 in a <tt>Function</tt></a> </li> 123 <li><a href="#iterate_convert">Turning an iterator into a 124 class pointer</a> </li> 125 <li><a href="#iterate_complex">Finding call sites: a more 126 complex example</a> </li> 127 <li><a href="#calls_and_invokes">Treating calls and invokes 128 the same way</a> </li> 129 <li><a href="#iterate_chains">Iterating over def-use & 130 use-def chains</a> </li> 131 <li><a href="#iterate_preds">Iterating over predecessors & 132 successors of blocks</a></li> 133 </ul> 134 </li> 135 <li><a href="#simplechanges">Making simple changes</a> 136 <ul> 137 <li><a href="#schanges_creating">Creating and inserting new 138 <tt>Instruction</tt>s</a> </li> 139 <li><a href="#schanges_deleting">Deleting <tt>Instruction</tt>s</a> </li> 140 <li><a href="#schanges_replacing">Replacing an <tt>Instruction</tt> 141 with another <tt>Value</tt></a> </li> 142 <li><a href="#schanges_deletingGV">Deleting <tt>GlobalVariable</tt>s</a> </li> 143 </ul> 144 </li> 145 <li><a href="#create_types">How to Create Types</a></li> 146 <!-- 147 <li>Working with the Control Flow Graph 148 <ul> 149 <li>Accessing predecessors and successors of a <tt>BasicBlock</tt> 150 <li> 151 <li> 152 </ul> 153 --> 154 </ul> 155 </li> 156 157 <li><a href="#threading">Threads and LLVM</a> 158 <ul> 159 <li><a href="#startmultithreaded">Entering and Exiting Multithreaded Mode 160 </a></li> 161 <li><a href="#shutdown">Ending execution with <tt>llvm_shutdown()</tt></a></li> 162 <li><a href="#managedstatic">Lazy initialization with <tt>ManagedStatic</tt></a></li> 163 <li><a href="#llvmcontext">Achieving Isolation with <tt>LLVMContext</tt></a></li> 164 <li><a href="#jitthreading">Threads and the JIT</a></li> 165 </ul> 166 </li> 167 168 <li><a href="#advanced">Advanced Topics</a> 169 <ul> 170 171 <li><a href="#SymbolTable">The <tt>ValueSymbolTable</tt> class</a></li> 172 <li><a href="#UserLayout">The <tt>User</tt> and owned <tt>Use</tt> classes' memory layout</a></li> 173 </ul></li> 174 175 <li><a href="#coreclasses">The Core LLVM Class Hierarchy Reference</a> 176 <ul> 177 <li><a href="#Type">The <tt>Type</tt> class</a> </li> 178 <li><a href="#Module">The <tt>Module</tt> class</a></li> 179 <li><a href="#Value">The <tt>Value</tt> class</a> 180 <ul> 181 <li><a href="#User">The <tt>User</tt> class</a> 182 <ul> 183 <li><a href="#Instruction">The <tt>Instruction</tt> class</a></li> 184 <li><a href="#Constant">The <tt>Constant</tt> class</a> 185 <ul> 186 <li><a href="#GlobalValue">The <tt>GlobalValue</tt> class</a> 187 <ul> 188 <li><a href="#Function">The <tt>Function</tt> class</a></li> 189 <li><a href="#GlobalVariable">The <tt>GlobalVariable</tt> class</a></li> 190 </ul> 191 </li> 192 </ul> 193 </li> 194 </ul> 195 </li> 196 <li><a href="#BasicBlock">The <tt>BasicBlock</tt> class</a></li> 197 <li><a href="#Argument">The <tt>Argument</tt> class</a></li> 198 </ul> 199 </li> 200 </ul> 201 </li> 202 </ol> 203 204 <div class="doc_author"> 205 <p>Written by <a href="mailto:sabre (a] nondot.org">Chris Lattner</a>, 206 <a href="mailto:dhurjati (a] cs.uiuc.edu">Dinakar Dhurjati</a>, 207 <a href="mailto:ggreif (a] gmail.com">Gabor Greif</a>, 208 <a href="mailto:jstanley (a] cs.uiuc.edu">Joel Stanley</a>, 209 <a href="mailto:rspencer (a] x10sys.com">Reid Spencer</a> and 210 <a href="mailto:owen (a] apple.com">Owen Anderson</a></p> 211 </div> 212 213 <!-- *********************************************************************** --> 214 <h2> 215 <a name="introduction">Introduction </a> 216 </h2> 217 <!-- *********************************************************************** --> 218 219 <div> 220 221 <p>This document is meant to highlight some of the important classes and 222 interfaces available in the LLVM source-base. This manual is not 223 intended to explain what LLVM is, how it works, and what LLVM code looks 224 like. It assumes that you know the basics of LLVM and are interested 225 in writing transformations or otherwise analyzing or manipulating the 226 code.</p> 227 228 <p>This document should get you oriented so that you can find your 229 way in the continuously growing source code that makes up the LLVM 230 infrastructure. Note that this manual is not intended to serve as a 231 replacement for reading the source code, so if you think there should be 232 a method in one of these classes to do something, but it's not listed, 233 check the source. Links to the <a href="/doxygen/">doxygen</a> sources 234 are provided to make this as easy as possible.</p> 235 236 <p>The first section of this document describes general information that is 237 useful to know when working in the LLVM infrastructure, and the second describes 238 the Core LLVM classes. In the future this manual will be extended with 239 information describing how to use extension libraries, such as dominator 240 information, CFG traversal routines, and useful utilities like the <tt><a 241 href="/doxygen/InstVisitor_8h-source.html">InstVisitor</a></tt> template.</p> 242 243 </div> 244 245 <!-- *********************************************************************** --> 246 <h2> 247 <a name="general">General Information</a> 248 </h2> 249 <!-- *********************************************************************** --> 250 251 <div> 252 253 <p>This section contains general information that is useful if you are working 254 in the LLVM source-base, but that isn't specific to any particular API.</p> 255 256 <!-- ======================================================================= --> 257 <h3> 258 <a name="stl">The C++ Standard Template Library</a> 259 </h3> 260 261 <div> 262 263 <p>LLVM makes heavy use of the C++ Standard Template Library (STL), 264 perhaps much more than you are used to, or have seen before. Because of 265 this, you might want to do a little background reading in the 266 techniques used and capabilities of the library. There are many good 267 pages that discuss the STL, and several books on the subject that you 268 can get, so it will not be discussed in this document.</p> 269 270 <p>Here are some useful links:</p> 271 272 <ol> 273 274 <li><a href="http://www.dinkumware.com/manuals/#Standard C++ Library">Dinkumware 275 C++ Library reference</a> - an excellent reference for the STL and other parts 276 of the standard C++ library.</li> 277 278 <li><a href="http://www.tempest-sw.com/cpp/">C++ In a Nutshell</a> - This is an 279 O'Reilly book in the making. It has a decent Standard Library 280 Reference that rivals Dinkumware's, and is unfortunately no longer free since the 281 book has been published.</li> 282 283 <li><a href="http://www.parashift.com/c++-faq-lite/">C++ Frequently Asked 284 Questions</a></li> 285 286 <li><a href="http://www.sgi.com/tech/stl/">SGI's STL Programmer's Guide</a> - 287 Contains a useful <a 288 href="http://www.sgi.com/tech/stl/stl_introduction.html">Introduction to the 289 STL</a>.</li> 290 291 <li><a href="http://www.research.att.com/%7Ebs/C++.html">Bjarne Stroustrup's C++ 292 Page</a></li> 293 294 <li><a href="http://64.78.49.204/"> 295 Bruce Eckel's Thinking in C++, 2nd ed. Volume 2 Revision 4.0 (even better, get 296 the book).</a></li> 297 298 </ol> 299 300 <p>You are also encouraged to take a look at the <a 301 href="CodingStandards.html">LLVM Coding Standards</a> guide which focuses on how 302 to write maintainable code more than where to put your curly braces.</p> 303 304 </div> 305 306 <!-- ======================================================================= --> 307 <h3> 308 <a name="stl">Other useful references</a> 309 </h3> 310 311 <div> 312 313 <ol> 314 <li><a href="http://www.fortran-2000.com/ArnaudRecipes/sharedlib.html">Using 315 static and shared libraries across platforms</a></li> 316 </ol> 317 318 </div> 319 320 </div> 321 322 <!-- *********************************************************************** --> 323 <h2> 324 <a name="apis">Important and useful LLVM APIs</a> 325 </h2> 326 <!-- *********************************************************************** --> 327 328 <div> 329 330 <p>Here we highlight some LLVM APIs that are generally useful and good to 331 know about when writing transformations.</p> 332 333 <!-- ======================================================================= --> 334 <h3> 335 <a name="isa">The <tt>isa<></tt>, <tt>cast<></tt> and 336 <tt>dyn_cast<></tt> templates</a> 337 </h3> 338 339 <div> 340 341 <p>The LLVM source-base makes extensive use of a custom form of RTTI. 342 These templates have many similarities to the C++ <tt>dynamic_cast<></tt> 343 operator, but they don't have some drawbacks (primarily stemming from 344 the fact that <tt>dynamic_cast<></tt> only works on classes that 345 have a v-table). Because they are used so often, you must know what they 346 do and how they work. All of these templates are defined in the <a 347 href="/doxygen/Casting_8h-source.html"><tt>llvm/Support/Casting.h</tt></a> 348 file (note that you very rarely have to include this file directly).</p> 349 350 <dl> 351 <dt><tt>isa<></tt>: </dt> 352 353 <dd><p>The <tt>isa<></tt> operator works exactly like the Java 354 "<tt>instanceof</tt>" operator. It returns true or false depending on whether 355 a reference or pointer points to an instance of the specified class. This can 356 be very useful for constraint checking of various sorts (example below).</p> 357 </dd> 358 359 <dt><tt>cast<></tt>: </dt> 360 361 <dd><p>The <tt>cast<></tt> operator is a "checked cast" operation. It 362 converts a pointer or reference from a base class to a derived class, causing 363 an assertion failure if it is not really an instance of the right type. This 364 should be used in cases where you have some information that makes you believe 365 that something is of the right type. An example of the <tt>isa<></tt> 366 and <tt>cast<></tt> template is:</p> 367 368 <div class="doc_code"> 369 <pre> 370 static bool isLoopInvariant(const <a href="#Value">Value</a> *V, const Loop *L) { 371 if (isa<<a href="#Constant">Constant</a>>(V) || isa<<a href="#Argument">Argument</a>>(V) || isa<<a href="#GlobalValue">GlobalValue</a>>(V)) 372 return true; 373 374 // <i>Otherwise, it must be an instruction...</i> 375 return !L->contains(cast<<a href="#Instruction">Instruction</a>>(V)->getParent()); 376 } 377 </pre> 378 </div> 379 380 <p>Note that you should <b>not</b> use an <tt>isa<></tt> test followed 381 by a <tt>cast<></tt>, for that use the <tt>dyn_cast<></tt> 382 operator.</p> 383 384 </dd> 385 386 <dt><tt>dyn_cast<></tt>:</dt> 387 388 <dd><p>The <tt>dyn_cast<></tt> operator is a "checking cast" operation. 389 It checks to see if the operand is of the specified type, and if so, returns a 390 pointer to it (this operator does not work with references). If the operand is 391 not of the correct type, a null pointer is returned. Thus, this works very 392 much like the <tt>dynamic_cast<></tt> operator in C++, and should be 393 used in the same circumstances. Typically, the <tt>dyn_cast<></tt> 394 operator is used in an <tt>if</tt> statement or some other flow control 395 statement like this:</p> 396 397 <div class="doc_code"> 398 <pre> 399 if (<a href="#AllocationInst">AllocationInst</a> *AI = dyn_cast<<a href="#AllocationInst">AllocationInst</a>>(Val)) { 400 // <i>...</i> 401 } 402 </pre> 403 </div> 404 405 <p>This form of the <tt>if</tt> statement effectively combines together a call 406 to <tt>isa<></tt> and a call to <tt>cast<></tt> into one 407 statement, which is very convenient.</p> 408 409 <p>Note that the <tt>dyn_cast<></tt> operator, like C++'s 410 <tt>dynamic_cast<></tt> or Java's <tt>instanceof</tt> operator, can be 411 abused. In particular, you should not use big chained <tt>if/then/else</tt> 412 blocks to check for lots of different variants of classes. If you find 413 yourself wanting to do this, it is much cleaner and more efficient to use the 414 <tt>InstVisitor</tt> class to dispatch over the instruction type directly.</p> 415 416 </dd> 417 418 <dt><tt>cast_or_null<></tt>: </dt> 419 420 <dd><p>The <tt>cast_or_null<></tt> operator works just like the 421 <tt>cast<></tt> operator, except that it allows for a null pointer as an 422 argument (which it then propagates). This can sometimes be useful, allowing 423 you to combine several null checks into one.</p></dd> 424 425 <dt><tt>dyn_cast_or_null<></tt>: </dt> 426 427 <dd><p>The <tt>dyn_cast_or_null<></tt> operator works just like the 428 <tt>dyn_cast<></tt> operator, except that it allows for a null pointer 429 as an argument (which it then propagates). This can sometimes be useful, 430 allowing you to combine several null checks into one.</p></dd> 431 432 </dl> 433 434 <p>These five templates can be used with any classes, whether they have a 435 v-table or not. To add support for these templates, you simply need to add 436 <tt>classof</tt> static methods to the class you are interested casting 437 to. Describing this is currently outside the scope of this document, but there 438 are lots of examples in the LLVM source base.</p> 439 440 </div> 441 442 443 <!-- ======================================================================= --> 444 <h3> 445 <a name="string_apis">Passing strings (the <tt>StringRef</tt> 446 and <tt>Twine</tt> classes)</a> 447 </h3> 448 449 <div> 450 451 <p>Although LLVM generally does not do much string manipulation, we do have 452 several important APIs which take strings. Two important examples are the 453 Value class -- which has names for instructions, functions, etc. -- and the 454 StringMap class which is used extensively in LLVM and Clang.</p> 455 456 <p>These are generic classes, and they need to be able to accept strings which 457 may have embedded null characters. Therefore, they cannot simply take 458 a <tt>const char *</tt>, and taking a <tt>const std::string&</tt> requires 459 clients to perform a heap allocation which is usually unnecessary. Instead, 460 many LLVM APIs use a <tt>StringRef</tt> or a <tt>const Twine&</tt> for 461 passing strings efficiently.</p> 462 463 <!-- _______________________________________________________________________ --> 464 <h4> 465 <a name="StringRef">The <tt>StringRef</tt> class</a> 466 </h4> 467 468 <div> 469 470 <p>The <tt>StringRef</tt> data type represents a reference to a constant string 471 (a character array and a length) and supports the common operations available 472 on <tt>std:string</tt>, but does not require heap allocation.</p> 473 474 <p>It can be implicitly constructed using a C style null-terminated string, 475 an <tt>std::string</tt>, or explicitly with a character pointer and length. 476 For example, the <tt>StringRef</tt> find function is declared as:</p> 477 478 <pre class="doc_code"> 479 iterator find(StringRef Key); 480 </pre> 481 482 <p>and clients can call it using any one of:</p> 483 484 <pre class="doc_code"> 485 Map.find("foo"); <i>// Lookup "foo"</i> 486 Map.find(std::string("bar")); <i>// Lookup "bar"</i> 487 Map.find(StringRef("\0baz", 4)); <i>// Lookup "\0baz"</i> 488 </pre> 489 490 <p>Similarly, APIs which need to return a string may return a <tt>StringRef</tt> 491 instance, which can be used directly or converted to an <tt>std::string</tt> 492 using the <tt>str</tt> member function. See 493 "<tt><a href="/doxygen/classllvm_1_1StringRef_8h-source.html">llvm/ADT/StringRef.h</a></tt>" 494 for more information.</p> 495 496 <p>You should rarely use the <tt>StringRef</tt> class directly, because it contains 497 pointers to external memory it is not generally safe to store an instance of the 498 class (unless you know that the external storage will not be freed). StringRef is 499 small and pervasive enough in LLVM that it should always be passed by value.</p> 500 501 </div> 502 503 <!-- _______________________________________________________________________ --> 504 <h4> 505 <a name="Twine">The <tt>Twine</tt> class</a> 506 </h4> 507 508 <div> 509 510 <p>The <tt><a href="/doxygen/classllvm_1_1Twine.html">Twine</a></tt> class is an 511 efficient way for APIs to accept concatenated strings. For example, a common 512 LLVM paradigm is to name one instruction based on 513 the name of another instruction with a suffix, for example:</p> 514 515 <div class="doc_code"> 516 <pre> 517 New = CmpInst::Create(<i>...</i>, SO->getName() + ".cmp"); 518 </pre> 519 </div> 520 521 <p>The <tt>Twine</tt> class is effectively a lightweight 522 <a href="http://en.wikipedia.org/wiki/Rope_(computer_science)">rope</a> 523 which points to temporary (stack allocated) objects. Twines can be implicitly 524 constructed as the result of the plus operator applied to strings (i.e., a C 525 strings, an <tt>std::string</tt>, or a <tt>StringRef</tt>). The twine delays 526 the actual concatenation of strings until it is actually required, at which 527 point it can be efficiently rendered directly into a character array. This 528 avoids unnecessary heap allocation involved in constructing the temporary 529 results of string concatenation. See 530 "<tt><a href="/doxygen/Twine_8h_source.html">llvm/ADT/Twine.h</a></tt>" 531 and <a href="#dss_twine">here</a> for more information.</p> 532 533 <p>As with a <tt>StringRef</tt>, <tt>Twine</tt> objects point to external memory 534 and should almost never be stored or mentioned directly. They are intended 535 solely for use when defining a function which should be able to efficiently 536 accept concatenated strings.</p> 537 538 </div> 539 540 </div> 541 542 <!-- ======================================================================= --> 543 <h3> 544 <a name="DEBUG">The <tt>DEBUG()</tt> macro and <tt>-debug</tt> option</a> 545 </h3> 546 547 <div> 548 549 <p>Often when working on your pass you will put a bunch of debugging printouts 550 and other code into your pass. After you get it working, you want to remove 551 it, but you may need it again in the future (to work out new bugs that you run 552 across).</p> 553 554 <p> Naturally, because of this, you don't want to delete the debug printouts, 555 but you don't want them to always be noisy. A standard compromise is to comment 556 them out, allowing you to enable them if you need them in the future.</p> 557 558 <p>The "<tt><a href="/doxygen/Debug_8h-source.html">llvm/Support/Debug.h</a></tt>" 559 file provides a macro named <tt>DEBUG()</tt> that is a much nicer solution to 560 this problem. Basically, you can put arbitrary code into the argument of the 561 <tt>DEBUG</tt> macro, and it is only executed if '<tt>opt</tt>' (or any other 562 tool) is run with the '<tt>-debug</tt>' command line argument:</p> 563 564 <div class="doc_code"> 565 <pre> 566 DEBUG(errs() << "I am here!\n"); 567 </pre> 568 </div> 569 570 <p>Then you can run your pass like this:</p> 571 572 <div class="doc_code"> 573 <pre> 574 $ opt < a.bc > /dev/null -mypass 575 <i><no output></i> 576 $ opt < a.bc > /dev/null -mypass -debug 577 I am here! 578 </pre> 579 </div> 580 581 <p>Using the <tt>DEBUG()</tt> macro instead of a home-brewed solution allows you 582 to not have to create "yet another" command line option for the debug output for 583 your pass. Note that <tt>DEBUG()</tt> macros are disabled for optimized builds, 584 so they do not cause a performance impact at all (for the same reason, they 585 should also not contain side-effects!).</p> 586 587 <p>One additional nice thing about the <tt>DEBUG()</tt> macro is that you can 588 enable or disable it directly in gdb. Just use "<tt>set DebugFlag=0</tt>" or 589 "<tt>set DebugFlag=1</tt>" from the gdb if the program is running. If the 590 program hasn't been started yet, you can always just run it with 591 <tt>-debug</tt>.</p> 592 593 <!-- _______________________________________________________________________ --> 594 <h4> 595 <a name="DEBUG_TYPE">Fine grained debug info with <tt>DEBUG_TYPE</tt> and 596 the <tt>-debug-only</tt> option</a> 597 </h4> 598 599 <div> 600 601 <p>Sometimes you may find yourself in a situation where enabling <tt>-debug</tt> 602 just turns on <b>too much</b> information (such as when working on the code 603 generator). If you want to enable debug information with more fine-grained 604 control, you define the <tt>DEBUG_TYPE</tt> macro and the <tt>-debug</tt> only 605 option as follows:</p> 606 607 <div class="doc_code"> 608 <pre> 609 #undef DEBUG_TYPE 610 DEBUG(errs() << "No debug type\n"); 611 #define DEBUG_TYPE "foo" 612 DEBUG(errs() << "'foo' debug type\n"); 613 #undef DEBUG_TYPE 614 #define DEBUG_TYPE "bar" 615 DEBUG(errs() << "'bar' debug type\n")); 616 #undef DEBUG_TYPE 617 #define DEBUG_TYPE "" 618 DEBUG(errs() << "No debug type (2)\n"); 619 </pre> 620 </div> 621 622 <p>Then you can run your pass like this:</p> 623 624 <div class="doc_code"> 625 <pre> 626 $ opt < a.bc > /dev/null -mypass 627 <i><no output></i> 628 $ opt < a.bc > /dev/null -mypass -debug 629 No debug type 630 'foo' debug type 631 'bar' debug type 632 No debug type (2) 633 $ opt < a.bc > /dev/null -mypass -debug-only=foo 634 'foo' debug type 635 $ opt < a.bc > /dev/null -mypass -debug-only=bar 636 'bar' debug type 637 </pre> 638 </div> 639 640 <p>Of course, in practice, you should only set <tt>DEBUG_TYPE</tt> at the top of 641 a file, to specify the debug type for the entire module (if you do this before 642 you <tt>#include "llvm/Support/Debug.h"</tt>, you don't have to insert the ugly 643 <tt>#undef</tt>'s). Also, you should use names more meaningful than "foo" and 644 "bar", because there is no system in place to ensure that names do not 645 conflict. If two different modules use the same string, they will all be turned 646 on when the name is specified. This allows, for example, all debug information 647 for instruction scheduling to be enabled with <tt>-debug-type=InstrSched</tt>, 648 even if the source lives in multiple files.</p> 649 650 <p>The <tt>DEBUG_WITH_TYPE</tt> macro is also available for situations where you 651 would like to set <tt>DEBUG_TYPE</tt>, but only for one specific <tt>DEBUG</tt> 652 statement. It takes an additional first parameter, which is the type to use. For 653 example, the preceding example could be written as:</p> 654 655 656 <div class="doc_code"> 657 <pre> 658 DEBUG_WITH_TYPE("", errs() << "No debug type\n"); 659 DEBUG_WITH_TYPE("foo", errs() << "'foo' debug type\n"); 660 DEBUG_WITH_TYPE("bar", errs() << "'bar' debug type\n")); 661 DEBUG_WITH_TYPE("", errs() << "No debug type (2)\n"); 662 </pre> 663 </div> 664 665 </div> 666 667 </div> 668 669 <!-- ======================================================================= --> 670 <h3> 671 <a name="Statistic">The <tt>Statistic</tt> class & <tt>-stats</tt> 672 option</a> 673 </h3> 674 675 <div> 676 677 <p>The "<tt><a 678 href="/doxygen/Statistic_8h-source.html">llvm/ADT/Statistic.h</a></tt>" file 679 provides a class named <tt>Statistic</tt> that is used as a unified way to 680 keep track of what the LLVM compiler is doing and how effective various 681 optimizations are. It is useful to see what optimizations are contributing to 682 making a particular program run faster.</p> 683 684 <p>Often you may run your pass on some big program, and you're interested to see 685 how many times it makes a certain transformation. Although you can do this with 686 hand inspection, or some ad-hoc method, this is a real pain and not very useful 687 for big programs. Using the <tt>Statistic</tt> class makes it very easy to 688 keep track of this information, and the calculated information is presented in a 689 uniform manner with the rest of the passes being executed.</p> 690 691 <p>There are many examples of <tt>Statistic</tt> uses, but the basics of using 692 it are as follows:</p> 693 694 <ol> 695 <li><p>Define your statistic like this:</p> 696 697 <div class="doc_code"> 698 <pre> 699 #define <a href="#DEBUG_TYPE">DEBUG_TYPE</a> "mypassname" <i>// This goes before any #includes.</i> 700 STATISTIC(NumXForms, "The # of times I did stuff"); 701 </pre> 702 </div> 703 704 <p>The <tt>STATISTIC</tt> macro defines a static variable, whose name is 705 specified by the first argument. The pass name is taken from the DEBUG_TYPE 706 macro, and the description is taken from the second argument. The variable 707 defined ("NumXForms" in this case) acts like an unsigned integer.</p></li> 708 709 <li><p>Whenever you make a transformation, bump the counter:</p> 710 711 <div class="doc_code"> 712 <pre> 713 ++NumXForms; // <i>I did stuff!</i> 714 </pre> 715 </div> 716 717 </li> 718 </ol> 719 720 <p>That's all you have to do. To get '<tt>opt</tt>' to print out the 721 statistics gathered, use the '<tt>-stats</tt>' option:</p> 722 723 <div class="doc_code"> 724 <pre> 725 $ opt -stats -mypassname < program.bc > /dev/null 726 <i>... statistics output ...</i> 727 </pre> 728 </div> 729 730 <p> When running <tt>opt</tt> on a C file from the SPEC benchmark 731 suite, it gives a report that looks like this:</p> 732 733 <div class="doc_code"> 734 <pre> 735 7646 bitcodewriter - Number of normal instructions 736 725 bitcodewriter - Number of oversized instructions 737 129996 bitcodewriter - Number of bitcode bytes written 738 2817 raise - Number of insts DCEd or constprop'd 739 3213 raise - Number of cast-of-self removed 740 5046 raise - Number of expression trees converted 741 75 raise - Number of other getelementptr's formed 742 138 raise - Number of load/store peepholes 743 42 deadtypeelim - Number of unused typenames removed from symtab 744 392 funcresolve - Number of varargs functions resolved 745 27 globaldce - Number of global variables removed 746 2 adce - Number of basic blocks removed 747 134 cee - Number of branches revectored 748 49 cee - Number of setcc instruction eliminated 749 532 gcse - Number of loads removed 750 2919 gcse - Number of instructions removed 751 86 indvars - Number of canonical indvars added 752 87 indvars - Number of aux indvars removed 753 25 instcombine - Number of dead inst eliminate 754 434 instcombine - Number of insts combined 755 248 licm - Number of load insts hoisted 756 1298 licm - Number of insts hoisted to a loop pre-header 757 3 licm - Number of insts hoisted to multiple loop preds (bad, no loop pre-header) 758 75 mem2reg - Number of alloca's promoted 759 1444 cfgsimplify - Number of blocks simplified 760 </pre> 761 </div> 762 763 <p>Obviously, with so many optimizations, having a unified framework for this 764 stuff is very nice. Making your pass fit well into the framework makes it more 765 maintainable and useful.</p> 766 767 </div> 768 769 <!-- ======================================================================= --> 770 <h3> 771 <a name="ViewGraph">Viewing graphs while debugging code</a> 772 </h3> 773 774 <div> 775 776 <p>Several of the important data structures in LLVM are graphs: for example 777 CFGs made out of LLVM <a href="#BasicBlock">BasicBlock</a>s, CFGs made out of 778 LLVM <a href="CodeGenerator.html#machinebasicblock">MachineBasicBlock</a>s, and 779 <a href="CodeGenerator.html#selectiondag_intro">Instruction Selection 780 DAGs</a>. In many cases, while debugging various parts of the compiler, it is 781 nice to instantly visualize these graphs.</p> 782 783 <p>LLVM provides several callbacks that are available in a debug build to do 784 exactly that. If you call the <tt>Function::viewCFG()</tt> method, for example, 785 the current LLVM tool will pop up a window containing the CFG for the function 786 where each basic block is a node in the graph, and each node contains the 787 instructions in the block. Similarly, there also exists 788 <tt>Function::viewCFGOnly()</tt> (does not include the instructions), the 789 <tt>MachineFunction::viewCFG()</tt> and <tt>MachineFunction::viewCFGOnly()</tt>, 790 and the <tt>SelectionDAG::viewGraph()</tt> methods. Within GDB, for example, 791 you can usually use something like <tt>call DAG.viewGraph()</tt> to pop 792 up a window. Alternatively, you can sprinkle calls to these functions in your 793 code in places you want to debug.</p> 794 795 <p>Getting this to work requires a small amount of configuration. On Unix 796 systems with X11, install the <a href="http://www.graphviz.org">graphviz</a> 797 toolkit, and make sure 'dot' and 'gv' are in your path. If you are running on 798 Mac OS/X, download and install the Mac OS/X <a 799 href="http://www.pixelglow.com/graphviz/">Graphviz program</a>, and add 800 <tt>/Applications/Graphviz.app/Contents/MacOS/</tt> (or wherever you install 801 it) to your path. Once in your system and path are set up, rerun the LLVM 802 configure script and rebuild LLVM to enable this functionality.</p> 803 804 <p><tt>SelectionDAG</tt> has been extended to make it easier to locate 805 <i>interesting</i> nodes in large complex graphs. From gdb, if you 806 <tt>call DAG.setGraphColor(<i>node</i>, "<i>color</i>")</tt>, then the 807 next <tt>call DAG.viewGraph()</tt> would highlight the node in the 808 specified color (choices of colors can be found at <a 809 href="http://www.graphviz.org/doc/info/colors.html">colors</a>.) More 810 complex node attributes can be provided with <tt>call 811 DAG.setGraphAttrs(<i>node</i>, "<i>attributes</i>")</tt> (choices can be 812 found at <a href="http://www.graphviz.org/doc/info/attrs.html">Graph 813 Attributes</a>.) If you want to restart and clear all the current graph 814 attributes, then you can <tt>call DAG.clearGraphAttrs()</tt>. </p> 815 816 <p>Note that graph visualization features are compiled out of Release builds 817 to reduce file size. This means that you need a Debug+Asserts or 818 Release+Asserts build to use these features.</p> 819 820 </div> 821 822 </div> 823 824 <!-- *********************************************************************** --> 825 <h2> 826 <a name="datastructure">Picking the Right Data Structure for a Task</a> 827 </h2> 828 <!-- *********************************************************************** --> 829 830 <div> 831 832 <p>LLVM has a plethora of data structures in the <tt>llvm/ADT/</tt> directory, 833 and we commonly use STL data structures. This section describes the trade-offs 834 you should consider when you pick one.</p> 835 836 <p> 837 The first step is a choose your own adventure: do you want a sequential 838 container, a set-like container, or a map-like container? The most important 839 thing when choosing a container is the algorithmic properties of how you plan to 840 access the container. Based on that, you should use:</p> 841 842 <ul> 843 <li>a <a href="#ds_map">map-like</a> container if you need efficient look-up 844 of an value based on another value. Map-like containers also support 845 efficient queries for containment (whether a key is in the map). Map-like 846 containers generally do not support efficient reverse mapping (values to 847 keys). If you need that, use two maps. Some map-like containers also 848 support efficient iteration through the keys in sorted order. Map-like 849 containers are the most expensive sort, only use them if you need one of 850 these capabilities.</li> 851 852 <li>a <a href="#ds_set">set-like</a> container if you need to put a bunch of 853 stuff into a container that automatically eliminates duplicates. Some 854 set-like containers support efficient iteration through the elements in 855 sorted order. Set-like containers are more expensive than sequential 856 containers. 857 </li> 858 859 <li>a <a href="#ds_sequential">sequential</a> container provides 860 the most efficient way to add elements and keeps track of the order they are 861 added to the collection. They permit duplicates and support efficient 862 iteration, but do not support efficient look-up based on a key. 863 </li> 864 865 <li>a <a href="#ds_string">string</a> container is a specialized sequential 866 container or reference structure that is used for character or byte 867 arrays.</li> 868 869 <li>a <a href="#ds_bit">bit</a> container provides an efficient way to store and 870 perform set operations on sets of numeric id's, while automatically 871 eliminating duplicates. Bit containers require a maximum of 1 bit for each 872 identifier you want to store. 873 </li> 874 </ul> 875 876 <p> 877 Once the proper category of container is determined, you can fine tune the 878 memory use, constant factors, and cache behaviors of access by intelligently 879 picking a member of the category. Note that constant factors and cache behavior 880 can be a big deal. If you have a vector that usually only contains a few 881 elements (but could contain many), for example, it's much better to use 882 <a href="#dss_smallvector">SmallVector</a> than <a href="#dss_vector">vector</a> 883 . Doing so avoids (relatively) expensive malloc/free calls, which dwarf the 884 cost of adding the elements to the container. </p> 885 886 <!-- ======================================================================= --> 887 <h3> 888 <a name="ds_sequential">Sequential Containers (std::vector, std::list, etc)</a> 889 </h3> 890 891 <div> 892 There are a variety of sequential containers available for you, based on your 893 needs. Pick the first in this section that will do what you want. 894 895 <!-- _______________________________________________________________________ --> 896 <h4> 897 <a name="dss_arrayref">llvm/ADT/ArrayRef.h</a> 898 </h4> 899 900 <div> 901 <p>The llvm::ArrayRef class is the preferred class to use in an interface that 902 accepts a sequential list of elements in memory and just reads from them. By 903 taking an ArrayRef, the API can be passed a fixed size array, an std::vector, 904 an llvm::SmallVector and anything else that is contiguous in memory. 905 </p> 906 </div> 907 908 909 910 <!-- _______________________________________________________________________ --> 911 <h4> 912 <a name="dss_fixedarrays">Fixed Size Arrays</a> 913 </h4> 914 915 <div> 916 <p>Fixed size arrays are very simple and very fast. They are good if you know 917 exactly how many elements you have, or you have a (low) upper bound on how many 918 you have.</p> 919 </div> 920 921 <!-- _______________________________________________________________________ --> 922 <h4> 923 <a name="dss_heaparrays">Heap Allocated Arrays</a> 924 </h4> 925 926 <div> 927 <p>Heap allocated arrays (new[] + delete[]) are also simple. They are good if 928 the number of elements is variable, if you know how many elements you will need 929 before the array is allocated, and if the array is usually large (if not, 930 consider a <a href="#dss_smallvector">SmallVector</a>). The cost of a heap 931 allocated array is the cost of the new/delete (aka malloc/free). Also note that 932 if you are allocating an array of a type with a constructor, the constructor and 933 destructors will be run for every element in the array (re-sizable vectors only 934 construct those elements actually used).</p> 935 </div> 936 937 <!-- _______________________________________________________________________ --> 938 <h4> 939 <a name="dss_tinyptrvector">"llvm/ADT/TinyPtrVector.h"</a> 940 </h4> 941 942 943 <div> 944 <p><tt>TinyPtrVector<Type></tt> is a highly specialized collection class 945 that is optimized to avoid allocation in the case when a vector has zero or one 946 elements. It has two major restrictions: 1) it can only hold values of pointer 947 type, and 2) it cannot hold a null pointer.</p> 948 949 <p>Since this container is highly specialized, it is rarely used.</p> 950 951 </div> 952 953 <!-- _______________________________________________________________________ --> 954 <h4> 955 <a name="dss_smallvector">"llvm/ADT/SmallVector.h"</a> 956 </h4> 957 958 <div> 959 <p><tt>SmallVector<Type, N></tt> is a simple class that looks and smells 960 just like <tt>vector<Type></tt>: 961 it supports efficient iteration, lays out elements in memory order (so you can 962 do pointer arithmetic between elements), supports efficient push_back/pop_back 963 operations, supports efficient random access to its elements, etc.</p> 964 965 <p>The advantage of SmallVector is that it allocates space for 966 some number of elements (N) <b>in the object itself</b>. Because of this, if 967 the SmallVector is dynamically smaller than N, no malloc is performed. This can 968 be a big win in cases where the malloc/free call is far more expensive than the 969 code that fiddles around with the elements.</p> 970 971 <p>This is good for vectors that are "usually small" (e.g. the number of 972 predecessors/successors of a block is usually less than 8). On the other hand, 973 this makes the size of the SmallVector itself large, so you don't want to 974 allocate lots of them (doing so will waste a lot of space). As such, 975 SmallVectors are most useful when on the stack.</p> 976 977 <p>SmallVector also provides a nice portable and efficient replacement for 978 <tt>alloca</tt>.</p> 979 980 </div> 981 982 <!-- _______________________________________________________________________ --> 983 <h4> 984 <a name="dss_vector"><vector></a> 985 </h4> 986 987 <div> 988 <p> 989 std::vector is well loved and respected. It is useful when SmallVector isn't: 990 when the size of the vector is often large (thus the small optimization will 991 rarely be a benefit) or if you will be allocating many instances of the vector 992 itself (which would waste space for elements that aren't in the container). 993 vector is also useful when interfacing with code that expects vectors :). 994 </p> 995 996 <p>One worthwhile note about std::vector: avoid code like this:</p> 997 998 <div class="doc_code"> 999 <pre> 1000 for ( ... ) { 1001 std::vector<foo> V; 1002 // make use of V. 1003 } 1004 </pre> 1005 </div> 1006 1007 <p>Instead, write this as:</p> 1008 1009 <div class="doc_code"> 1010 <pre> 1011 std::vector<foo> V; 1012 for ( ... ) { 1013 // make use of V. 1014 V.clear(); 1015 } 1016 </pre> 1017 </div> 1018 1019 <p>Doing so will save (at least) one heap allocation and free per iteration of 1020 the loop.</p> 1021 1022 </div> 1023 1024 <!-- _______________________________________________________________________ --> 1025 <h4> 1026 <a name="dss_deque"><deque></a> 1027 </h4> 1028 1029 <div> 1030 <p>std::deque is, in some senses, a generalized version of std::vector. Like 1031 std::vector, it provides constant time random access and other similar 1032 properties, but it also provides efficient access to the front of the list. It 1033 does not guarantee continuity of elements within memory.</p> 1034 1035 <p>In exchange for this extra flexibility, std::deque has significantly higher 1036 constant factor costs than std::vector. If possible, use std::vector or 1037 something cheaper.</p> 1038 </div> 1039 1040 <!-- _______________________________________________________________________ --> 1041 <h4> 1042 <a name="dss_list"><list></a> 1043 </h4> 1044 1045 <div> 1046 <p>std::list is an extremely inefficient class that is rarely useful. 1047 It performs a heap allocation for every element inserted into it, thus having an 1048 extremely high constant factor, particularly for small data types. std::list 1049 also only supports bidirectional iteration, not random access iteration.</p> 1050 1051 <p>In exchange for this high cost, std::list supports efficient access to both 1052 ends of the list (like std::deque, but unlike std::vector or SmallVector). In 1053 addition, the iterator invalidation characteristics of std::list are stronger 1054 than that of a vector class: inserting or removing an element into the list does 1055 not invalidate iterator or pointers to other elements in the list.</p> 1056 </div> 1057 1058 <!-- _______________________________________________________________________ --> 1059 <h4> 1060 <a name="dss_ilist">llvm/ADT/ilist.h</a> 1061 </h4> 1062 1063 <div> 1064 <p><tt>ilist<T></tt> implements an 'intrusive' doubly-linked list. It is 1065 intrusive, because it requires the element to store and provide access to the 1066 prev/next pointers for the list.</p> 1067 1068 <p><tt>ilist</tt> has the same drawbacks as <tt>std::list</tt>, and additionally 1069 requires an <tt>ilist_traits</tt> implementation for the element type, but it 1070 provides some novel characteristics. In particular, it can efficiently store 1071 polymorphic objects, the traits class is informed when an element is inserted or 1072 removed from the list, and <tt>ilist</tt>s are guaranteed to support a 1073 constant-time splice operation.</p> 1074 1075 <p>These properties are exactly what we want for things like 1076 <tt>Instruction</tt>s and basic blocks, which is why these are implemented with 1077 <tt>ilist</tt>s.</p> 1078 1079 Related classes of interest are explained in the following subsections: 1080 <ul> 1081 <li><a href="#dss_ilist_traits">ilist_traits</a></li> 1082 <li><a href="#dss_iplist">iplist</a></li> 1083 <li><a href="#dss_ilist_node">llvm/ADT/ilist_node.h</a></li> 1084 <li><a href="#dss_ilist_sentinel">Sentinels</a></li> 1085 </ul> 1086 </div> 1087 1088 <!-- _______________________________________________________________________ --> 1089 <h4> 1090 <a name="dss_packedvector">llvm/ADT/PackedVector.h</a> 1091 </h4> 1092 1093 <div> 1094 <p> 1095 Useful for storing a vector of values using only a few number of bits for each 1096 value. Apart from the standard operations of a vector-like container, it can 1097 also perform an 'or' set operation. 1098 </p> 1099 1100 <p>For example:</p> 1101 1102 <div class="doc_code"> 1103 <pre> 1104 enum State { 1105 None = 0x0, 1106 FirstCondition = 0x1, 1107 SecondCondition = 0x2, 1108 Both = 0x3 1109 }; 1110 1111 State get() { 1112 PackedVector<State, 2> Vec1; 1113 Vec1.push_back(FirstCondition); 1114 1115 PackedVector<State, 2> Vec2; 1116 Vec2.push_back(SecondCondition); 1117 1118 Vec1 |= Vec2; 1119 return Vec1[0]; // returns 'Both'. 1120 } 1121 </pre> 1122 </div> 1123 1124 </div> 1125 1126 <!-- _______________________________________________________________________ --> 1127 <h4> 1128 <a name="dss_ilist_traits">ilist_traits</a> 1129 </h4> 1130 1131 <div> 1132 <p><tt>ilist_traits<T></tt> is <tt>ilist<T></tt>'s customization 1133 mechanism. <tt>iplist<T></tt> (and consequently <tt>ilist<T></tt>) 1134 publicly derive from this traits class.</p> 1135 </div> 1136 1137 <!-- _______________________________________________________________________ --> 1138 <h4> 1139 <a name="dss_iplist">iplist</a> 1140 </h4> 1141 1142 <div> 1143 <p><tt>iplist<T></tt> is <tt>ilist<T></tt>'s base and as such 1144 supports a slightly narrower interface. Notably, inserters from 1145 <tt>T&</tt> are absent.</p> 1146 1147 <p><tt>ilist_traits<T></tt> is a public base of this class and can be 1148 used for a wide variety of customizations.</p> 1149 </div> 1150 1151 <!-- _______________________________________________________________________ --> 1152 <h4> 1153 <a name="dss_ilist_node">llvm/ADT/ilist_node.h</a> 1154 </h4> 1155 1156 <div> 1157 <p><tt>ilist_node<T></tt> implements a the forward and backward links 1158 that are expected by the <tt>ilist<T></tt> (and analogous containers) 1159 in the default manner.</p> 1160 1161 <p><tt>ilist_node<T></tt>s are meant to be embedded in the node type 1162 <tt>T</tt>, usually <tt>T</tt> publicly derives from 1163 <tt>ilist_node<T></tt>.</p> 1164 </div> 1165 1166 <!-- _______________________________________________________________________ --> 1167 <h4> 1168 <a name="dss_ilist_sentinel">Sentinels</a> 1169 </h4> 1170 1171 <div> 1172 <p><tt>ilist</tt>s have another specialty that must be considered. To be a good 1173 citizen in the C++ ecosystem, it needs to support the standard container 1174 operations, such as <tt>begin</tt> and <tt>end</tt> iterators, etc. Also, the 1175 <tt>operator--</tt> must work correctly on the <tt>end</tt> iterator in the 1176 case of non-empty <tt>ilist</tt>s.</p> 1177 1178 <p>The only sensible solution to this problem is to allocate a so-called 1179 <i>sentinel</i> along with the intrusive list, which serves as the <tt>end</tt> 1180 iterator, providing the back-link to the last element. However conforming to the 1181 C++ convention it is illegal to <tt>operator++</tt> beyond the sentinel and it 1182 also must not be dereferenced.</p> 1183 1184 <p>These constraints allow for some implementation freedom to the <tt>ilist</tt> 1185 how to allocate and store the sentinel. The corresponding policy is dictated 1186 by <tt>ilist_traits<T></tt>. By default a <tt>T</tt> gets heap-allocated 1187 whenever the need for a sentinel arises.</p> 1188 1189 <p>While the default policy is sufficient in most cases, it may break down when 1190 <tt>T</tt> does not provide a default constructor. Also, in the case of many 1191 instances of <tt>ilist</tt>s, the memory overhead of the associated sentinels 1192 is wasted. To alleviate the situation with numerous and voluminous 1193 <tt>T</tt>-sentinels, sometimes a trick is employed, leading to <i>ghostly 1194 sentinels</i>.</p> 1195 1196 <p>Ghostly sentinels are obtained by specially-crafted <tt>ilist_traits<T></tt> 1197 which superpose the sentinel with the <tt>ilist</tt> instance in memory. Pointer 1198 arithmetic is used to obtain the sentinel, which is relative to the 1199 <tt>ilist</tt>'s <tt>this</tt> pointer. The <tt>ilist</tt> is augmented by an 1200 extra pointer, which serves as the back-link of the sentinel. This is the only 1201 field in the ghostly sentinel which can be legally accessed.</p> 1202 </div> 1203 1204 <!-- _______________________________________________________________________ --> 1205 <h4> 1206 <a name="dss_other">Other Sequential Container options</a> 1207 </h4> 1208 1209 <div> 1210 <p>Other STL containers are available, such as std::string.</p> 1211 1212 <p>There are also various STL adapter classes such as std::queue, 1213 std::priority_queue, std::stack, etc. These provide simplified access to an 1214 underlying container but don't affect the cost of the container itself.</p> 1215 1216 </div> 1217 </div> 1218 1219 <!-- ======================================================================= --> 1220 <h3> 1221 <a name="ds_string">String-like containers</a> 1222 </h3> 1223 1224 <div> 1225 1226 <p> 1227 There are a variety of ways to pass around and use strings in C and C++, and 1228 LLVM adds a few new options to choose from. Pick the first option on this list 1229 that will do what you need, they are ordered according to their relative cost. 1230 </p> 1231 <p> 1232 Note that is is generally preferred to <em>not</em> pass strings around as 1233 "<tt>const char*</tt>"'s. These have a number of problems, including the fact 1234 that they cannot represent embedded nul ("\0") characters, and do not have a 1235 length available efficiently. The general replacement for '<tt>const 1236 char*</tt>' is StringRef. 1237 </p> 1238 1239 <p>For more information on choosing string containers for APIs, please see 1240 <a href="#string_apis">Passing strings</a>.</p> 1241 1242 1243 <!-- _______________________________________________________________________ --> 1244 <h4> 1245 <a name="dss_stringref">llvm/ADT/StringRef.h</a> 1246 </h4> 1247 1248 <div> 1249 <p> 1250 The StringRef class is a simple value class that contains a pointer to a 1251 character and a length, and is quite related to the <a 1252 href="#dss_arrayref">ArrayRef</a> class (but specialized for arrays of 1253 characters). Because StringRef carries a length with it, it safely handles 1254 strings with embedded nul characters in it, getting the length does not require 1255 a strlen call, and it even has very convenient APIs for slicing and dicing the 1256 character range that it represents. 1257 </p> 1258 1259 <p> 1260 StringRef is ideal for passing simple strings around that are known to be live, 1261 either because they are C string literals, std::string, a C array, or a 1262 SmallVector. Each of these cases has an efficient implicit conversion to 1263 StringRef, which doesn't result in a dynamic strlen being executed. 1264 </p> 1265 1266 <p>StringRef has a few major limitations which make more powerful string 1267 containers useful:</p> 1268 1269 <ol> 1270 <li>You cannot directly convert a StringRef to a 'const char*' because there is 1271 no way to add a trailing nul (unlike the .c_str() method on various stronger 1272 classes).</li> 1273 1274 1275 <li>StringRef doesn't own or keep alive the underlying string bytes. 1276 As such it can easily lead to dangling pointers, and is not suitable for 1277 embedding in datastructures in most cases (instead, use an std::string or 1278 something like that).</li> 1279 1280 <li>For the same reason, StringRef cannot be used as the return value of a 1281 method if the method "computes" the result string. Instead, use 1282 std::string.</li> 1283 1284 <li>StringRef's do not allow you to mutate the pointed-to string bytes and it 1285 doesn't allow you to insert or remove bytes from the range. For editing 1286 operations like this, it interoperates with the <a 1287 href="#dss_twine">Twine</a> class.</li> 1288 </ol> 1289 1290 <p>Because of its strengths and limitations, it is very common for a function to 1291 take a StringRef and for a method on an object to return a StringRef that 1292 points into some string that it owns.</p> 1293 1294 </div> 1295 1296 <!-- _______________________________________________________________________ --> 1297 <h4> 1298 <a name="dss_twine">llvm/ADT/Twine.h</a> 1299 </h4> 1300 1301 <div> 1302 <p> 1303 The Twine class is used as an intermediary datatype for APIs that want to take 1304 a string that can be constructed inline with a series of concatenations. 1305 Twine works by forming recursive instances of the Twine datatype (a simple 1306 value object) on the stack as temporary objects, linking them together into a 1307 tree which is then linearized when the Twine is consumed. Twine is only safe 1308 to use as the argument to a function, and should always be a const reference, 1309 e.g.: 1310 </p> 1311 1312 <pre> 1313 void foo(const Twine &T); 1314 ... 1315 StringRef X = ... 1316 unsigned i = ... 1317 foo(X + "." + Twine(i)); 1318 </pre> 1319 1320 <p>This example forms a string like "blarg.42" by concatenating the values 1321 together, and does not form intermediate strings containing "blarg" or 1322 "blarg.". 1323 </p> 1324 1325 <p>Because Twine is constructed with temporary objects on the stack, and 1326 because these instances are destroyed at the end of the current statement, 1327 it is an inherently dangerous API. For example, this simple variant contains 1328 undefined behavior and will probably crash:</p> 1329 1330 <pre> 1331 void foo(const Twine &T); 1332 ... 1333 StringRef X = ... 1334 unsigned i = ... 1335 const Twine &Tmp = X + "." + Twine(i); 1336 foo(Tmp); 1337 </pre> 1338 1339 <p>... because the temporaries are destroyed before the call. That said, 1340 Twine's are much more efficient than intermediate std::string temporaries, and 1341 they work really well with StringRef. Just be aware of their limitations.</p> 1342 1343 </div> 1344 1345 1346 <!-- _______________________________________________________________________ --> 1347 <h4> 1348 <a name="dss_smallstring">llvm/ADT/SmallString.h</a> 1349 </h4> 1350 1351 <div> 1352 1353 <p>SmallString is a subclass of <a href="#dss_smallvector">SmallVector</a> that 1354 adds some convenience APIs like += that takes StringRef's. SmallString avoids 1355 allocating memory in the case when the preallocated space is enough to hold its 1356 data, and it calls back to general heap allocation when required. Since it owns 1357 its data, it is very safe to use and supports full mutation of the string.</p> 1358 1359 <p>Like SmallVector's, the big downside to SmallString is their sizeof. While 1360 they are optimized for small strings, they themselves are not particularly 1361 small. This means that they work great for temporary scratch buffers on the 1362 stack, but should not generally be put into the heap: it is very rare to 1363 see a SmallString as the member of a frequently-allocated heap data structure 1364 or returned by-value. 1365 </p> 1366 1367 </div> 1368 1369 <!-- _______________________________________________________________________ --> 1370 <h4> 1371 <a name="dss_stdstring">std::string</a> 1372 </h4> 1373 1374 <div> 1375 1376 <p>The standard C++ std::string class is a very general class that (like 1377 SmallString) owns its underlying data. sizeof(std::string) is very reasonable 1378 so it can be embedded into heap data structures and returned by-value. 1379 On the other hand, std::string is highly inefficient for inline editing (e.g. 1380 concatenating a bunch of stuff together) and because it is provided by the 1381 standard library, its performance characteristics depend a lot of the host 1382 standard library (e.g. libc++ and MSVC provide a highly optimized string 1383 class, GCC contains a really slow implementation). 1384 </p> 1385 1386 <p>The major disadvantage of std::string is that almost every operation that 1387 makes them larger can allocate memory, which is slow. As such, it is better 1388 to use SmallVector or Twine as a scratch buffer, but then use std::string to 1389 persist the result.</p> 1390 1391 1392 </div> 1393 1394 <!-- end of strings --> 1395 </div> 1396 1397 1398 <!-- ======================================================================= --> 1399 <h3> 1400 <a name="ds_set">Set-Like Containers (std::set, SmallSet, SetVector, etc)</a> 1401 </h3> 1402 1403 <div> 1404 1405 <p>Set-like containers are useful when you need to canonicalize multiple values 1406 into a single representation. There are several different choices for how to do 1407 this, providing various trade-offs.</p> 1408 1409 <!-- _______________________________________________________________________ --> 1410 <h4> 1411 <a name="dss_sortedvectorset">A sorted 'vector'</a> 1412 </h4> 1413 1414 <div> 1415 1416 <p>If you intend to insert a lot of elements, then do a lot of queries, a 1417 great approach is to use a vector (or other sequential container) with 1418 std::sort+std::unique to remove duplicates. This approach works really well if 1419 your usage pattern has these two distinct phases (insert then query), and can be 1420 coupled with a good choice of <a href="#ds_sequential">sequential container</a>. 1421 </p> 1422 1423 <p> 1424 This combination provides the several nice properties: the result data is 1425 contiguous in memory (good for cache locality), has few allocations, is easy to 1426 address (iterators in the final vector are just indices or pointers), and can be 1427 efficiently queried with a standard binary or radix search.</p> 1428 1429 </div> 1430 1431 <!-- _______________________________________________________________________ --> 1432 <h4> 1433 <a name="dss_smallset">"llvm/ADT/SmallSet.h"</a> 1434 </h4> 1435 1436 <div> 1437 1438 <p>If you have a set-like data structure that is usually small and whose elements 1439 are reasonably small, a <tt>SmallSet<Type, N></tt> is a good choice. This set 1440 has space for N elements in place (thus, if the set is dynamically smaller than 1441 N, no malloc traffic is required) and accesses them with a simple linear search. 1442 When the set grows beyond 'N' elements, it allocates a more expensive representation that 1443 guarantees efficient access (for most types, it falls back to std::set, but for 1444 pointers it uses something far better, <a 1445 href="#dss_smallptrset">SmallPtrSet</a>).</p> 1446 1447 <p>The magic of this class is that it handles small sets extremely efficiently, 1448 but gracefully handles extremely large sets without loss of efficiency. The 1449 drawback is that the interface is quite small: it supports insertion, queries 1450 and erasing, but does not support iteration.</p> 1451 1452 </div> 1453 1454 <!-- _______________________________________________________________________ --> 1455 <h4> 1456 <a name="dss_smallptrset">"llvm/ADT/SmallPtrSet.h"</a> 1457 </h4> 1458 1459 <div> 1460 1461 <p>SmallPtrSet has all the advantages of <tt>SmallSet</tt> (and a <tt>SmallSet</tt> of pointers is 1462 transparently implemented with a <tt>SmallPtrSet</tt>), but also supports iterators. If 1463 more than 'N' insertions are performed, a single quadratically 1464 probed hash table is allocated and grows as needed, providing extremely 1465 efficient access (constant time insertion/deleting/queries with low constant 1466 factors) and is very stingy with malloc traffic.</p> 1467 1468 <p>Note that, unlike <tt>std::set</tt>, the iterators of <tt>SmallPtrSet</tt> are invalidated 1469 whenever an insertion occurs. Also, the values visited by the iterators are not 1470 visited in sorted order.</p> 1471 1472 </div> 1473 1474 <!-- _______________________________________________________________________ --> 1475 <h4> 1476 <a name="dss_denseset">"llvm/ADT/DenseSet.h"</a> 1477 </h4> 1478 1479 <div> 1480 1481 <p> 1482 DenseSet is a simple quadratically probed hash table. It excels at supporting 1483 small values: it uses a single allocation to hold all of the pairs that 1484 are currently inserted in the set. DenseSet is a great way to unique small 1485 values that are not simple pointers (use <a 1486 href="#dss_smallptrset">SmallPtrSet</a> for pointers). Note that DenseSet has 1487 the same requirements for the value type that <a 1488 href="#dss_densemap">DenseMap</a> has. 1489 </p> 1490 1491 </div> 1492 1493 <!-- _______________________________________________________________________ --> 1494 <h4> 1495 <a name="dss_sparseset">"llvm/ADT/SparseSet.h"</a> 1496 </h4> 1497 1498 <div> 1499 1500 <p>SparseSet holds a small number of objects identified by unsigned keys of 1501 moderate size. It uses a lot of memory, but provides operations that are 1502 almost as fast as a vector. Typical keys are physical registers, virtual 1503 registers, or numbered basic blocks.</p> 1504 1505 <p>SparseSet is useful for algorithms that need very fast clear/find/insert/erase 1506 and fast iteration over small sets. It is not intended for building composite 1507 data structures.</p> 1508 1509 </div> 1510 1511 <!-- _______________________________________________________________________ --> 1512 <h4> 1513 <a name="dss_FoldingSet">"llvm/ADT/FoldingSet.h"</a> 1514 </h4> 1515 1516 <div> 1517 1518 <p> 1519 FoldingSet is an aggregate class that is really good at uniquing 1520 expensive-to-create or polymorphic objects. It is a combination of a chained 1521 hash table with intrusive links (uniqued objects are required to inherit from 1522 FoldingSetNode) that uses <a href="#dss_smallvector">SmallVector</a> as part of 1523 its ID process.</p> 1524 1525 <p>Consider a case where you want to implement a "getOrCreateFoo" method for 1526 a complex object (for example, a node in the code generator). The client has a 1527 description of *what* it wants to generate (it knows the opcode and all the 1528 operands), but we don't want to 'new' a node, then try inserting it into a set 1529 only to find out it already exists, at which point we would have to delete it 1530 and return the node that already exists. 1531 </p> 1532 1533 <p>To support this style of client, FoldingSet perform a query with a 1534 FoldingSetNodeID (which wraps SmallVector) that can be used to describe the 1535 element that we want to query for. The query either returns the element 1536 matching the ID or it returns an opaque ID that indicates where insertion should 1537 take place. Construction of the ID usually does not require heap traffic.</p> 1538 1539 <p>Because FoldingSet uses intrusive links, it can support polymorphic objects 1540 in the set (for example, you can have SDNode instances mixed with LoadSDNodes). 1541 Because the elements are individually allocated, pointers to the elements are 1542 stable: inserting or removing elements does not invalidate any pointers to other 1543 elements. 1544 </p> 1545 1546 </div> 1547 1548 <!-- _______________________________________________________________________ --> 1549 <h4> 1550 <a name="dss_set"><set></a> 1551 </h4> 1552 1553 <div> 1554 1555 <p><tt>std::set</tt> is a reasonable all-around set class, which is decent at 1556 many things but great at nothing. std::set allocates memory for each element 1557 inserted (thus it is very malloc intensive) and typically stores three pointers 1558 per element in the set (thus adding a large amount of per-element space 1559 overhead). It offers guaranteed log(n) performance, which is not particularly 1560 fast from a complexity standpoint (particularly if the elements of the set are 1561 expensive to compare, like strings), and has extremely high constant factors for 1562 lookup, insertion and removal.</p> 1563 1564 <p>The advantages of std::set are that its iterators are stable (deleting or 1565 inserting an element from the set does not affect iterators or pointers to other 1566 elements) and that iteration over the set is guaranteed to be in sorted order. 1567 If the elements in the set are large, then the relative overhead of the pointers 1568 and malloc traffic is not a big deal, but if the elements of the set are small, 1569 std::set is almost never a good choice.</p> 1570 1571 </div> 1572 1573 <!-- _______________________________________________________________________ --> 1574 <h4> 1575 <a name="dss_setvector">"llvm/ADT/SetVector.h"</a> 1576 </h4> 1577 1578 <div> 1579 <p>LLVM's SetVector<Type> is an adapter class that combines your choice of 1580 a set-like container along with a <a href="#ds_sequential">Sequential 1581 Container</a>. The important property 1582 that this provides is efficient insertion with uniquing (duplicate elements are 1583 ignored) with iteration support. It implements this by inserting elements into 1584 both a set-like container and the sequential container, using the set-like 1585 container for uniquing and the sequential container for iteration. 1586 </p> 1587 1588 <p>The difference between SetVector and other sets is that the order of 1589 iteration is guaranteed to match the order of insertion into the SetVector. 1590 This property is really important for things like sets of pointers. Because 1591 pointer values are non-deterministic (e.g. vary across runs of the program on 1592 different machines), iterating over the pointers in the set will 1593 not be in a well-defined order.</p> 1594 1595 <p> 1596 The drawback of SetVector is that it requires twice as much space as a normal 1597 set and has the sum of constant factors from the set-like container and the 1598 sequential container that it uses. Use it *only* if you need to iterate over 1599 the elements in a deterministic order. SetVector is also expensive to delete 1600 elements out of (linear time), unless you use it's "pop_back" method, which is 1601 faster. 1602 </p> 1603 1604 <p><tt>SetVector</tt> is an adapter class that defaults to 1605 using <tt>std::vector</tt> and a size 16 <tt>SmallSet</tt> for the underlying 1606 containers, so it is quite expensive. However, 1607 <tt>"llvm/ADT/SetVector.h"</tt> also provides a <tt>SmallSetVector</tt> 1608 class, which defaults to using a <tt>SmallVector</tt> and <tt>SmallSet</tt> 1609 of a specified size. If you use this, and if your sets are dynamically 1610 smaller than <tt>N</tt>, you will save a lot of heap traffic.</p> 1611 1612 </div> 1613 1614 <!-- _______________________________________________________________________ --> 1615 <h4> 1616 <a name="dss_uniquevector">"llvm/ADT/UniqueVector.h"</a> 1617 </h4> 1618 1619 <div> 1620 1621 <p> 1622 UniqueVector is similar to <a href="#dss_setvector">SetVector</a>, but it 1623 retains a unique ID for each element inserted into the set. It internally 1624 contains a map and a vector, and it assigns a unique ID for each value inserted 1625 into the set.</p> 1626 1627 <p>UniqueVector is very expensive: its cost is the sum of the cost of 1628 maintaining both the map and vector, it has high complexity, high constant 1629 factors, and produces a lot of malloc traffic. It should be avoided.</p> 1630 1631 </div> 1632 1633 <!-- _______________________________________________________________________ --> 1634 <h4> 1635 <a name="dss_immutableset">"llvm/ADT/ImmutableSet.h"</a> 1636 </h4> 1637 1638 <div> 1639 1640 <p> 1641 ImmutableSet is an immutable (functional) set implementation based on an AVL 1642 tree. 1643 Adding or removing elements is done through a Factory object and results in the 1644 creation of a new ImmutableSet object. 1645 If an ImmutableSet already exists with the given contents, then the existing one 1646 is returned; equality is compared with a FoldingSetNodeID. 1647 The time and space complexity of add or remove operations is logarithmic in the 1648 size of the original set. 1649 1650 <p> 1651 There is no method for returning an element of the set, you can only check for 1652 membership. 1653 1654 </div> 1655 1656 1657 <!-- _______________________________________________________________________ --> 1658 <h4> 1659 <a name="dss_otherset">Other Set-Like Container Options</a> 1660 </h4> 1661 1662 <div> 1663 1664 <p> 1665 The STL provides several other options, such as std::multiset and the various 1666 "hash_set" like containers (whether from C++ TR1 or from the SGI library). We 1667 never use hash_set and unordered_set because they are generally very expensive 1668 (each insertion requires a malloc) and very non-portable. 1669 </p> 1670 1671 <p>std::multiset is useful if you're not interested in elimination of 1672 duplicates, but has all the drawbacks of std::set. A sorted vector (where you 1673 don't delete duplicate entries) or some other approach is almost always 1674 better.</p> 1675 1676 </div> 1677 1678 </div> 1679 1680 <!-- ======================================================================= --> 1681 <h3> 1682 <a name="ds_map">Map-Like Containers (std::map, DenseMap, etc)</a> 1683 </h3> 1684 1685 <div> 1686 Map-like containers are useful when you want to associate data to a key. As 1687 usual, there are a lot of different ways to do this. :) 1688 1689 <!-- _______________________________________________________________________ --> 1690 <h4> 1691 <a name="dss_sortedvectormap">A sorted 'vector'</a> 1692 </h4> 1693 1694 <div> 1695 1696 <p> 1697 If your usage pattern follows a strict insert-then-query approach, you can 1698 trivially use the same approach as <a href="#dss_sortedvectorset">sorted vectors 1699 for set-like containers</a>. The only difference is that your query function 1700 (which uses std::lower_bound to get efficient log(n) lookup) should only compare 1701 the key, not both the key and value. This yields the same advantages as sorted 1702 vectors for sets. 1703 </p> 1704 </div> 1705 1706 <!-- _______________________________________________________________________ --> 1707 <h4> 1708 <a name="dss_stringmap">"llvm/ADT/StringMap.h"</a> 1709 </h4> 1710 1711 <div> 1712 1713 <p> 1714 Strings are commonly used as keys in maps, and they are difficult to support 1715 efficiently: they are variable length, inefficient to hash and compare when 1716 long, expensive to copy, etc. StringMap is a specialized container designed to 1717 cope with these issues. It supports mapping an arbitrary range of bytes to an 1718 arbitrary other object.</p> 1719 1720 <p>The StringMap implementation uses a quadratically-probed hash table, where 1721 the buckets store a pointer to the heap allocated entries (and some other 1722 stuff). The entries in the map must be heap allocated because the strings are 1723 variable length. The string data (key) and the element object (value) are 1724 stored in the same allocation with the string data immediately after the element 1725 object. This container guarantees the "<tt>(char*)(&Value+1)</tt>" points 1726 to the key string for a value.</p> 1727 1728 <p>The StringMap is very fast for several reasons: quadratic probing is very 1729 cache efficient for lookups, the hash value of strings in buckets is not 1730 recomputed when looking up an element, StringMap rarely has to touch the 1731 memory for unrelated objects when looking up a value (even when hash collisions 1732 happen), hash table growth does not recompute the hash values for strings 1733 already in the table, and each pair in the map is store in a single allocation 1734 (the string data is stored in the same allocation as the Value of a pair).</p> 1735 1736 <p>StringMap also provides query methods that take byte ranges, so it only ever 1737 copies a string if a value is inserted into the table.</p> 1738 1739 <p>StringMap iteratation order, however, is not guaranteed to be deterministic, 1740 so any uses which require that should instead use a std::map.</p> 1741 </div> 1742 1743 <!-- _______________________________________________________________________ --> 1744 <h4> 1745 <a name="dss_indexedmap">"llvm/ADT/IndexedMap.h"</a> 1746 </h4> 1747 1748 <div> 1749 <p> 1750 IndexedMap is a specialized container for mapping small dense integers (or 1751 values that can be mapped to small dense integers) to some other type. It is 1752 internally implemented as a vector with a mapping function that maps the keys to 1753 the dense integer range. 1754 </p> 1755 1756 <p> 1757 This is useful for cases like virtual registers in the LLVM code generator: they 1758 have a dense mapping that is offset by a compile-time constant (the first 1759 virtual register ID).</p> 1760 1761 </div> 1762 1763 <!-- _______________________________________________________________________ --> 1764 <h4> 1765 <a name="dss_densemap">"llvm/ADT/DenseMap.h"</a> 1766 </h4> 1767 1768 <div> 1769 1770 <p> 1771 DenseMap is a simple quadratically probed hash table. It excels at supporting 1772 small keys and values: it uses a single allocation to hold all of the pairs that 1773 are currently inserted in the map. DenseMap is a great way to map pointers to 1774 pointers, or map other small types to each other. 1775 </p> 1776 1777 <p> 1778 There are several aspects of DenseMap that you should be aware of, however. The 1779 iterators in a DenseMap are invalidated whenever an insertion occurs, unlike 1780 map. Also, because DenseMap allocates space for a large number of key/value 1781 pairs (it starts with 64 by default), it will waste a lot of space if your keys 1782 or values are large. Finally, you must implement a partial specialization of 1783 DenseMapInfo for the key that you want, if it isn't already supported. This 1784 is required to tell DenseMap about two special marker values (which can never be 1785 inserted into the map) that it needs internally.</p> 1786 1787 <p> 1788 DenseMap's find_as() method supports lookup operations using an alternate key 1789 type. This is useful in cases where the normal key type is expensive to 1790 construct, but cheap to compare against. The DenseMapInfo is responsible for 1791 defining the appropriate comparison and hashing methods for each alternate 1792 key type used. 1793 </p> 1794 1795 </div> 1796 1797 <!-- _______________________________________________________________________ --> 1798 <h4> 1799 <a name="dss_valuemap">"llvm/ADT/ValueMap.h"</a> 1800 </h4> 1801 1802 <div> 1803 1804 <p> 1805 ValueMap is a wrapper around a <a href="#dss_densemap">DenseMap</a> mapping 1806 Value*s (or subclasses) to another type. When a Value is deleted or RAUW'ed, 1807 ValueMap will update itself so the new version of the key is mapped to the same 1808 value, just as if the key were a WeakVH. You can configure exactly how this 1809 happens, and what else happens on these two events, by passing 1810 a <code>Config</code> parameter to the ValueMap template.</p> 1811 1812 </div> 1813 1814 <!-- _______________________________________________________________________ --> 1815 <h4> 1816 <a name="dss_intervalmap">"llvm/ADT/IntervalMap.h"</a> 1817 </h4> 1818 1819 <div> 1820 1821 <p> IntervalMap is a compact map for small keys and values. It maps key 1822 intervals instead of single keys, and it will automatically coalesce adjacent 1823 intervals. When then map only contains a few intervals, they are stored in the 1824 map object itself to avoid allocations.</p> 1825 1826 <p> The IntervalMap iterators are quite big, so they should not be passed around 1827 as STL iterators. The heavyweight iterators allow a smaller data structure.</p> 1828 1829 </div> 1830 1831 <!-- _______________________________________________________________________ --> 1832 <h4> 1833 <a name="dss_map"><map></a> 1834 </h4> 1835 1836 <div> 1837 1838 <p> 1839 std::map has similar characteristics to <a href="#dss_set">std::set</a>: it uses 1840 a single allocation per pair inserted into the map, it offers log(n) lookup with 1841 an extremely large constant factor, imposes a space penalty of 3 pointers per 1842 pair in the map, etc.</p> 1843 1844 <p>std::map is most useful when your keys or values are very large, if you need 1845 to iterate over the collection in sorted order, or if you need stable iterators 1846 into the map (i.e. they don't get invalidated if an insertion or deletion of 1847 another element takes place).</p> 1848 1849 </div> 1850 1851 <!-- _______________________________________________________________________ --> 1852 <h4> 1853 <a name="dss_inteqclasses">"llvm/ADT/IntEqClasses.h"</a> 1854 </h4> 1855 1856 <div> 1857 1858 <p>IntEqClasses provides a compact representation of equivalence classes of 1859 small integers. Initially, each integer in the range 0..n-1 has its own 1860 equivalence class. Classes can be joined by passing two class representatives to 1861 the join(a, b) method. Two integers are in the same class when findLeader() 1862 returns the same representative.</p> 1863 1864 <p>Once all equivalence classes are formed, the map can be compressed so each 1865 integer 0..n-1 maps to an equivalence class number in the range 0..m-1, where m 1866 is the total number of equivalence classes. The map must be uncompressed before 1867 it can be edited again.</p> 1868 1869 </div> 1870 1871 <!-- _______________________________________________________________________ --> 1872 <h4> 1873 <a name="dss_immutablemap">"llvm/ADT/ImmutableMap.h"</a> 1874 </h4> 1875 1876 <div> 1877 1878 <p> 1879 ImmutableMap is an immutable (functional) map implementation based on an AVL 1880 tree. 1881 Adding or removing elements is done through a Factory object and results in the 1882 creation of a new ImmutableMap object. 1883 If an ImmutableMap already exists with the given key set, then the existing one 1884 is returned; equality is compared with a FoldingSetNodeID. 1885 The time and space complexity of add or remove operations is logarithmic in the 1886 size of the original map. 1887 1888 </div> 1889 1890 <!-- _______________________________________________________________________ --> 1891 <h4> 1892 <a name="dss_othermap">Other Map-Like Container Options</a> 1893 </h4> 1894 1895 <div> 1896 1897 <p> 1898 The STL provides several other options, such as std::multimap and the various 1899 "hash_map" like containers (whether from C++ TR1 or from the SGI library). We 1900 never use hash_set and unordered_set because they are generally very expensive 1901 (each insertion requires a malloc) and very non-portable.</p> 1902 1903 <p>std::multimap is useful if you want to map a key to multiple values, but has 1904 all the drawbacks of std::map. A sorted vector or some other approach is almost 1905 always better.</p> 1906 1907 </div> 1908 1909 </div> 1910 1911 <!-- ======================================================================= --> 1912 <h3> 1913 <a name="ds_bit">Bit storage containers (BitVector, SparseBitVector)</a> 1914 </h3> 1915 1916 <div> 1917 <p>Unlike the other containers, there are only two bit storage containers, and 1918 choosing when to use each is relatively straightforward.</p> 1919 1920 <p>One additional option is 1921 <tt>std::vector<bool></tt>: we discourage its use for two reasons 1) the 1922 implementation in many common compilers (e.g. commonly available versions of 1923 GCC) is extremely inefficient and 2) the C++ standards committee is likely to 1924 deprecate this container and/or change it significantly somehow. In any case, 1925 please don't use it.</p> 1926 1927 <!-- _______________________________________________________________________ --> 1928 <h4> 1929 <a name="dss_bitvector">BitVector</a> 1930 </h4> 1931 1932 <div> 1933 <p> The BitVector container provides a dynamic size set of bits for manipulation. 1934 It supports individual bit setting/testing, as well as set operations. The set 1935 operations take time O(size of bitvector), but operations are performed one word 1936 at a time, instead of one bit at a time. This makes the BitVector very fast for 1937 set operations compared to other containers. Use the BitVector when you expect 1938 the number of set bits to be high (IE a dense set). 1939 </p> 1940 </div> 1941 1942 <!-- _______________________________________________________________________ --> 1943 <h4> 1944 <a name="dss_smallbitvector">SmallBitVector</a> 1945 </h4> 1946 1947 <div> 1948 <p> The SmallBitVector container provides the same interface as BitVector, but 1949 it is optimized for the case where only a small number of bits, less than 1950 25 or so, are needed. It also transparently supports larger bit counts, but 1951 slightly less efficiently than a plain BitVector, so SmallBitVector should 1952 only be used when larger counts are rare. 1953 </p> 1954 1955 <p> 1956 At this time, SmallBitVector does not support set operations (and, or, xor), 1957 and its operator[] does not provide an assignable lvalue. 1958 </p> 1959 </div> 1960 1961 <!-- _______________________________________________________________________ --> 1962 <h4> 1963 <a name="dss_sparsebitvector">SparseBitVector</a> 1964 </h4> 1965 1966 <div> 1967 <p> The SparseBitVector container is much like BitVector, with one major 1968 difference: Only the bits that are set, are stored. This makes the 1969 SparseBitVector much more space efficient than BitVector when the set is sparse, 1970 as well as making set operations O(number of set bits) instead of O(size of 1971 universe). The downside to the SparseBitVector is that setting and testing of random bits is O(N), and on large SparseBitVectors, this can be slower than BitVector. In our implementation, setting or testing bits in sorted order 1972 (either forwards or reverse) is O(1) worst case. Testing and setting bits within 128 bits (depends on size) of the current bit is also O(1). As a general statement, testing/setting bits in a SparseBitVector is O(distance away from last set bit). 1973 </p> 1974 </div> 1975 1976 </div> 1977 1978 </div> 1979 1980 <!-- *********************************************************************** --> 1981 <h2> 1982 <a name="common">Helpful Hints for Common Operations</a> 1983 </h2> 1984 <!-- *********************************************************************** --> 1985 1986 <div> 1987 1988 <p>This section describes how to perform some very simple transformations of 1989 LLVM code. This is meant to give examples of common idioms used, showing the 1990 practical side of LLVM transformations. <p> Because this is a "how-to" section, 1991 you should also read about the main classes that you will be working with. The 1992 <a href="#coreclasses">Core LLVM Class Hierarchy Reference</a> contains details 1993 and descriptions of the main classes that you should know about.</p> 1994 1995 <!-- NOTE: this section should be heavy on example code --> 1996 <!-- ======================================================================= --> 1997 <h3> 1998 <a name="inspection">Basic Inspection and Traversal Routines</a> 1999 </h3> 2000 2001 <div> 2002 2003 <p>The LLVM compiler infrastructure have many different data structures that may 2004 be traversed. Following the example of the C++ standard template library, the 2005 techniques used to traverse these various data structures are all basically the 2006 same. For a enumerable sequence of values, the <tt>XXXbegin()</tt> function (or 2007 method) returns an iterator to the start of the sequence, the <tt>XXXend()</tt> 2008 function returns an iterator pointing to one past the last valid element of the 2009 sequence, and there is some <tt>XXXiterator</tt> data type that is common 2010 between the two operations.</p> 2011 2012 <p>Because the pattern for iteration is common across many different aspects of 2013 the program representation, the standard template library algorithms may be used 2014 on them, and it is easier to remember how to iterate. First we show a few common 2015 examples of the data structures that need to be traversed. Other data 2016 structures are traversed in very similar ways.</p> 2017 2018 <!-- _______________________________________________________________________ --> 2019 <h4> 2020 <a name="iterate_function">Iterating over the </a><a 2021 href="#BasicBlock"><tt>BasicBlock</tt></a>s in a <a 2022 href="#Function"><tt>Function</tt></a> 2023 </h4> 2024 2025 <div> 2026 2027 <p>It's quite common to have a <tt>Function</tt> instance that you'd like to 2028 transform in some way; in particular, you'd like to manipulate its 2029 <tt>BasicBlock</tt>s. To facilitate this, you'll need to iterate over all of 2030 the <tt>BasicBlock</tt>s that constitute the <tt>Function</tt>. The following is 2031 an example that prints the name of a <tt>BasicBlock</tt> and the number of 2032 <tt>Instruction</tt>s it contains:</p> 2033 2034 <div class="doc_code"> 2035 <pre> 2036 // <i>func is a pointer to a Function instance</i> 2037 for (Function::iterator i = func->begin(), e = func->end(); i != e; ++i) 2038 // <i>Print out the name of the basic block if it has one, and then the</i> 2039 // <i>number of instructions that it contains</i> 2040 errs() << "Basic block (name=" << i->getName() << ") has " 2041 << i->size() << " instructions.\n"; 2042 </pre> 2043 </div> 2044 2045 <p>Note that i can be used as if it were a pointer for the purposes of 2046 invoking member functions of the <tt>Instruction</tt> class. This is 2047 because the indirection operator is overloaded for the iterator 2048 classes. In the above code, the expression <tt>i->size()</tt> is 2049 exactly equivalent to <tt>(*i).size()</tt> just like you'd expect.</p> 2050 2051 </div> 2052 2053 <!-- _______________________________________________________________________ --> 2054 <h4> 2055 <a name="iterate_basicblock">Iterating over the </a><a 2056 href="#Instruction"><tt>Instruction</tt></a>s in a <a 2057 href="#BasicBlock"><tt>BasicBlock</tt></a> 2058 </h4> 2059 2060 <div> 2061 2062 <p>Just like when dealing with <tt>BasicBlock</tt>s in <tt>Function</tt>s, it's 2063 easy to iterate over the individual instructions that make up 2064 <tt>BasicBlock</tt>s. Here's a code snippet that prints out each instruction in 2065 a <tt>BasicBlock</tt>:</p> 2066 2067 <div class="doc_code"> 2068 <pre> 2069 // <i>blk is a pointer to a BasicBlock instance</i> 2070 for (BasicBlock::iterator i = blk->begin(), e = blk->end(); i != e; ++i) 2071 // <i>The next statement works since operator<<(ostream&,...)</i> 2072 // <i>is overloaded for Instruction&</i> 2073 errs() << *i << "\n"; 2074 </pre> 2075 </div> 2076 2077 <p>However, this isn't really the best way to print out the contents of a 2078 <tt>BasicBlock</tt>! Since the ostream operators are overloaded for virtually 2079 anything you'll care about, you could have just invoked the print routine on the 2080 basic block itself: <tt>errs() << *blk << "\n";</tt>.</p> 2081 2082 </div> 2083 2084 <!-- _______________________________________________________________________ --> 2085 <h4> 2086 <a name="iterate_institer">Iterating over the </a><a 2087 href="#Instruction"><tt>Instruction</tt></a>s in a <a 2088 href="#Function"><tt>Function</tt></a> 2089 </h4> 2090 2091 <div> 2092 2093 <p>If you're finding that you commonly iterate over a <tt>Function</tt>'s 2094 <tt>BasicBlock</tt>s and then that <tt>BasicBlock</tt>'s <tt>Instruction</tt>s, 2095 <tt>InstIterator</tt> should be used instead. You'll need to include <a 2096 href="/doxygen/InstIterator_8h-source.html"><tt>llvm/Support/InstIterator.h</tt></a>, 2097 and then instantiate <tt>InstIterator</tt>s explicitly in your code. Here's a 2098 small example that shows how to dump all instructions in a function to the standard error stream:<p> 2099 2100 <div class="doc_code"> 2101 <pre> 2102 #include "<a href="/doxygen/InstIterator_8h-source.html">llvm/Support/InstIterator.h</a>" 2103 2104 // <i>F is a pointer to a Function instance</i> 2105 for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I) 2106 errs() << *I << "\n"; 2107 </pre> 2108 </div> 2109 2110 <p>Easy, isn't it? You can also use <tt>InstIterator</tt>s to fill a 2111 work list with its initial contents. For example, if you wanted to 2112 initialize a work list to contain all instructions in a <tt>Function</tt> 2113 F, all you would need to do is something like:</p> 2114 2115 <div class="doc_code"> 2116 <pre> 2117 std::set<Instruction*> worklist; 2118 // or better yet, SmallPtrSet<Instruction*, 64> worklist; 2119 2120 for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I) 2121 worklist.insert(&*I); 2122 </pre> 2123 </div> 2124 2125 <p>The STL set <tt>worklist</tt> would now contain all instructions in the 2126 <tt>Function</tt> pointed to by F.</p> 2127 2128 </div> 2129 2130 <!-- _______________________________________________________________________ --> 2131 <h4> 2132 <a name="iterate_convert">Turning an iterator into a class pointer (and 2133 vice-versa)</a> 2134 </h4> 2135 2136 <div> 2137 2138 <p>Sometimes, it'll be useful to grab a reference (or pointer) to a class 2139 instance when all you've got at hand is an iterator. Well, extracting 2140 a reference or a pointer from an iterator is very straight-forward. 2141 Assuming that <tt>i</tt> is a <tt>BasicBlock::iterator</tt> and <tt>j</tt> 2142 is a <tt>BasicBlock::const_iterator</tt>:</p> 2143 2144 <div class="doc_code"> 2145 <pre> 2146 Instruction& inst = *i; // <i>Grab reference to instruction reference</i> 2147 Instruction* pinst = &*i; // <i>Grab pointer to instruction reference</i> 2148 const Instruction& inst = *j; 2149 </pre> 2150 </div> 2151 2152 <p>However, the iterators you'll be working with in the LLVM framework are 2153 special: they will automatically convert to a ptr-to-instance type whenever they 2154 need to. Instead of dereferencing the iterator and then taking the address of 2155 the result, you can simply assign the iterator to the proper pointer type and 2156 you get the dereference and address-of operation as a result of the assignment 2157 (behind the scenes, this is a result of overloading casting mechanisms). Thus 2158 the last line of the last example,</p> 2159 2160 <div class="doc_code"> 2161 <pre> 2162 Instruction *pinst = &*i; 2163 </pre> 2164 </div> 2165 2166 <p>is semantically equivalent to</p> 2167 2168 <div class="doc_code"> 2169 <pre> 2170 Instruction *pinst = i; 2171 </pre> 2172 </div> 2173 2174 <p>It's also possible to turn a class pointer into the corresponding iterator, 2175 and this is a constant time operation (very efficient). The following code 2176 snippet illustrates use of the conversion constructors provided by LLVM 2177 iterators. By using these, you can explicitly grab the iterator of something 2178 without actually obtaining it via iteration over some structure:</p> 2179 2180 <div class="doc_code"> 2181 <pre> 2182 void printNextInstruction(Instruction* inst) { 2183 BasicBlock::iterator it(inst); 2184 ++it; // <i>After this line, it refers to the instruction after *inst</i> 2185 if (it != inst->getParent()->end()) errs() << *it << "\n"; 2186 } 2187 </pre> 2188 </div> 2189 2190 <p>Unfortunately, these implicit conversions come at a cost; they prevent 2191 these iterators from conforming to standard iterator conventions, and thus 2192 from being usable with standard algorithms and containers. For example, they 2193 prevent the following code, where <tt>B</tt> is a <tt>BasicBlock</tt>, 2194 from compiling:</p> 2195 2196 <div class="doc_code"> 2197 <pre> 2198 llvm::SmallVector<llvm::Instruction *, 16>(B->begin(), B->end()); 2199 </pre> 2200 </div> 2201 2202 <p>Because of this, these implicit conversions may be removed some day, 2203 and <tt>operator*</tt> changed to return a pointer instead of a reference.</p> 2204 2205 </div> 2206 2207 <!--_______________________________________________________________________--> 2208 <h4> 2209 <a name="iterate_complex">Finding call sites: a slightly more complex 2210 example</a> 2211 </h4> 2212 2213 <div> 2214 2215 <p>Say that you're writing a FunctionPass and would like to count all the 2216 locations in the entire module (that is, across every <tt>Function</tt>) where a 2217 certain function (i.e., some <tt>Function</tt>*) is already in scope. As you'll 2218 learn later, you may want to use an <tt>InstVisitor</tt> to accomplish this in a 2219 much more straight-forward manner, but this example will allow us to explore how 2220 you'd do it if you didn't have <tt>InstVisitor</tt> around. In pseudo-code, this 2221 is what we want to do:</p> 2222 2223 <div class="doc_code"> 2224 <pre> 2225 initialize callCounter to zero 2226 for each Function f in the Module 2227 for each BasicBlock b in f 2228 for each Instruction i in b 2229 if (i is a CallInst and calls the given function) 2230 increment callCounter 2231 </pre> 2232 </div> 2233 2234 <p>And the actual code is (remember, because we're writing a 2235 <tt>FunctionPass</tt>, our <tt>FunctionPass</tt>-derived class simply has to 2236 override the <tt>runOnFunction</tt> method):</p> 2237 2238 <div class="doc_code"> 2239 <pre> 2240 Function* targetFunc = ...; 2241 2242 class OurFunctionPass : public FunctionPass { 2243 public: 2244 OurFunctionPass(): callCounter(0) { } 2245 2246 virtual runOnFunction(Function& F) { 2247 for (Function::iterator b = F.begin(), be = F.end(); b != be; ++b) { 2248 for (BasicBlock::iterator i = b->begin(), ie = b->end(); i != ie; ++i) { 2249 if (<a href="#CallInst">CallInst</a>* callInst = <a href="#isa">dyn_cast</a><<a 2250 href="#CallInst">CallInst</a>>(&*i)) { 2251 // <i>We know we've encountered a call instruction, so we</i> 2252 // <i>need to determine if it's a call to the</i> 2253 // <i>function pointed to by m_func or not.</i> 2254 if (callInst->getCalledFunction() == targetFunc) 2255 ++callCounter; 2256 } 2257 } 2258 } 2259 } 2260 2261 private: 2262 unsigned callCounter; 2263 }; 2264 </pre> 2265 </div> 2266 2267 </div> 2268 2269 <!--_______________________________________________________________________--> 2270 <h4> 2271 <a name="calls_and_invokes">Treating calls and invokes the same way</a> 2272 </h4> 2273 2274 <div> 2275 2276 <p>You may have noticed that the previous example was a bit oversimplified in 2277 that it did not deal with call sites generated by 'invoke' instructions. In 2278 this, and in other situations, you may find that you want to treat 2279 <tt>CallInst</tt>s and <tt>InvokeInst</tt>s the same way, even though their 2280 most-specific common base class is <tt>Instruction</tt>, which includes lots of 2281 less closely-related things. For these cases, LLVM provides a handy wrapper 2282 class called <a 2283 href="http://llvm.org/doxygen/classllvm_1_1CallSite.html"><tt>CallSite</tt></a>. 2284 It is essentially a wrapper around an <tt>Instruction</tt> pointer, with some 2285 methods that provide functionality common to <tt>CallInst</tt>s and 2286 <tt>InvokeInst</tt>s.</p> 2287 2288 <p>This class has "value semantics": it should be passed by value, not by 2289 reference and it should not be dynamically allocated or deallocated using 2290 <tt>operator new</tt> or <tt>operator delete</tt>. It is efficiently copyable, 2291 assignable and constructable, with costs equivalents to that of a bare pointer. 2292 If you look at its definition, it has only a single pointer member.</p> 2293 2294 </div> 2295 2296 <!--_______________________________________________________________________--> 2297 <h4> 2298 <a name="iterate_chains">Iterating over def-use & use-def chains</a> 2299 </h4> 2300 2301 <div> 2302 2303 <p>Frequently, we might have an instance of the <a 2304 href="/doxygen/classllvm_1_1Value.html">Value Class</a> and we want to 2305 determine which <tt>User</tt>s use the <tt>Value</tt>. The list of all 2306 <tt>User</tt>s of a particular <tt>Value</tt> is called a <i>def-use</i> chain. 2307 For example, let's say we have a <tt>Function*</tt> named <tt>F</tt> to a 2308 particular function <tt>foo</tt>. Finding all of the instructions that 2309 <i>use</i> <tt>foo</tt> is as simple as iterating over the <i>def-use</i> chain 2310 of <tt>F</tt>:</p> 2311 2312 <div class="doc_code"> 2313 <pre> 2314 Function *F = ...; 2315 2316 for (Value::use_iterator i = F->use_begin(), e = F->use_end(); i != e; ++i) 2317 if (Instruction *Inst = dyn_cast<Instruction>(*i)) { 2318 errs() << "F is used in instruction:\n"; 2319 errs() << *Inst << "\n"; 2320 } 2321 </pre> 2322 </div> 2323 2324 <p>Note that dereferencing a <tt>Value::use_iterator</tt> is not a very cheap 2325 operation. Instead of performing <tt>*i</tt> above several times, consider 2326 doing it only once in the loop body and reusing its result.</p> 2327 2328 <p>Alternatively, it's common to have an instance of the <a 2329 href="/doxygen/classllvm_1_1User.html">User Class</a> and need to know what 2330 <tt>Value</tt>s are used by it. The list of all <tt>Value</tt>s used by a 2331 <tt>User</tt> is known as a <i>use-def</i> chain. Instances of class 2332 <tt>Instruction</tt> are common <tt>User</tt>s, so we might want to iterate over 2333 all of the values that a particular instruction uses (that is, the operands of 2334 the particular <tt>Instruction</tt>):</p> 2335 2336 <div class="doc_code"> 2337 <pre> 2338 Instruction *pi = ...; 2339 2340 for (User::op_iterator i = pi->op_begin(), e = pi->op_end(); i != e; ++i) { 2341 Value *v = *i; 2342 // <i>...</i> 2343 } 2344 </pre> 2345 </div> 2346 2347 <p>Declaring objects as <tt>const</tt> is an important tool of enforcing 2348 mutation free algorithms (such as analyses, etc.). For this purpose above 2349 iterators come in constant flavors as <tt>Value::const_use_iterator</tt> 2350 and <tt>Value::const_op_iterator</tt>. They automatically arise when 2351 calling <tt>use/op_begin()</tt> on <tt>const Value*</tt>s or 2352 <tt>const User*</tt>s respectively. Upon dereferencing, they return 2353 <tt>const Use*</tt>s. Otherwise the above patterns remain unchanged.</p> 2354 2355 </div> 2356 2357 <!--_______________________________________________________________________--> 2358 <h4> 2359 <a name="iterate_preds">Iterating over predecessors & 2360 successors of blocks</a> 2361 </h4> 2362 2363 <div> 2364 2365 <p>Iterating over the predecessors and successors of a block is quite easy 2366 with the routines defined in <tt>"llvm/Support/CFG.h"</tt>. Just use code like 2367 this to iterate over all predecessors of BB:</p> 2368 2369 <div class="doc_code"> 2370 <pre> 2371 #include "llvm/Support/CFG.h" 2372 BasicBlock *BB = ...; 2373 2374 for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI) { 2375 BasicBlock *Pred = *PI; 2376 // <i>...</i> 2377 } 2378 </pre> 2379 </div> 2380 2381 <p>Similarly, to iterate over successors use 2382 succ_iterator/succ_begin/succ_end.</p> 2383 2384 </div> 2385 2386 </div> 2387 2388 <!-- ======================================================================= --> 2389 <h3> 2390 <a name="simplechanges">Making simple changes</a> 2391 </h3> 2392 2393 <div> 2394 2395 <p>There are some primitive transformation operations present in the LLVM 2396 infrastructure that are worth knowing about. When performing 2397 transformations, it's fairly common to manipulate the contents of basic 2398 blocks. This section describes some of the common methods for doing so 2399 and gives example code.</p> 2400 2401 <!--_______________________________________________________________________--> 2402 <h4> 2403 <a name="schanges_creating">Creating and inserting new 2404 <tt>Instruction</tt>s</a> 2405 </h4> 2406 2407 <div> 2408 2409 <p><i>Instantiating Instructions</i></p> 2410 2411 <p>Creation of <tt>Instruction</tt>s is straight-forward: simply call the 2412 constructor for the kind of instruction to instantiate and provide the necessary 2413 parameters. For example, an <tt>AllocaInst</tt> only <i>requires</i> a 2414 (const-ptr-to) <tt>Type</tt>. Thus:</p> 2415 2416 <div class="doc_code"> 2417 <pre> 2418 AllocaInst* ai = new AllocaInst(Type::Int32Ty); 2419 </pre> 2420 </div> 2421 2422 <p>will create an <tt>AllocaInst</tt> instance that represents the allocation of 2423 one integer in the current stack frame, at run time. Each <tt>Instruction</tt> 2424 subclass is likely to have varying default parameters which change the semantics 2425 of the instruction, so refer to the <a 2426 href="/doxygen/classllvm_1_1Instruction.html">doxygen documentation for the subclass of 2427 Instruction</a> that you're interested in instantiating.</p> 2428 2429 <p><i>Naming values</i></p> 2430 2431 <p>It is very useful to name the values of instructions when you're able to, as 2432 this facilitates the debugging of your transformations. If you end up looking 2433 at generated LLVM machine code, you definitely want to have logical names 2434 associated with the results of instructions! By supplying a value for the 2435 <tt>Name</tt> (default) parameter of the <tt>Instruction</tt> constructor, you 2436 associate a logical name with the result of the instruction's execution at 2437 run time. For example, say that I'm writing a transformation that dynamically 2438 allocates space for an integer on the stack, and that integer is going to be 2439 used as some kind of index by some other code. To accomplish this, I place an 2440 <tt>AllocaInst</tt> at the first point in the first <tt>BasicBlock</tt> of some 2441 <tt>Function</tt>, and I'm intending to use it within the same 2442 <tt>Function</tt>. I might do:</p> 2443 2444 <div class="doc_code"> 2445 <pre> 2446 AllocaInst* pa = new AllocaInst(Type::Int32Ty, 0, "indexLoc"); 2447 </pre> 2448 </div> 2449 2450 <p>where <tt>indexLoc</tt> is now the logical name of the instruction's 2451 execution value, which is a pointer to an integer on the run time stack.</p> 2452 2453 <p><i>Inserting instructions</i></p> 2454 2455 <p>There are essentially two ways to insert an <tt>Instruction</tt> 2456 into an existing sequence of instructions that form a <tt>BasicBlock</tt>:</p> 2457 2458 <ul> 2459 <li>Insertion into an explicit instruction list 2460 2461 <p>Given a <tt>BasicBlock* pb</tt>, an <tt>Instruction* pi</tt> within that 2462 <tt>BasicBlock</tt>, and a newly-created instruction we wish to insert 2463 before <tt>*pi</tt>, we do the following: </p> 2464 2465 <div class="doc_code"> 2466 <pre> 2467 BasicBlock *pb = ...; 2468 Instruction *pi = ...; 2469 Instruction *newInst = new Instruction(...); 2470 2471 pb->getInstList().insert(pi, newInst); // <i>Inserts newInst before pi in pb</i> 2472 </pre> 2473 </div> 2474 2475 <p>Appending to the end of a <tt>BasicBlock</tt> is so common that 2476 the <tt>Instruction</tt> class and <tt>Instruction</tt>-derived 2477 classes provide constructors which take a pointer to a 2478 <tt>BasicBlock</tt> to be appended to. For example code that 2479 looked like: </p> 2480 2481 <div class="doc_code"> 2482 <pre> 2483 BasicBlock *pb = ...; 2484 Instruction *newInst = new Instruction(...); 2485 2486 pb->getInstList().push_back(newInst); // <i>Appends newInst to pb</i> 2487 </pre> 2488 </div> 2489 2490 <p>becomes: </p> 2491 2492 <div class="doc_code"> 2493 <pre> 2494 BasicBlock *pb = ...; 2495 Instruction *newInst = new Instruction(..., pb); 2496 </pre> 2497 </div> 2498 2499 <p>which is much cleaner, especially if you are creating 2500 long instruction streams.</p></li> 2501 2502 <li>Insertion into an implicit instruction list 2503 2504 <p><tt>Instruction</tt> instances that are already in <tt>BasicBlock</tt>s 2505 are implicitly associated with an existing instruction list: the instruction 2506 list of the enclosing basic block. Thus, we could have accomplished the same 2507 thing as the above code without being given a <tt>BasicBlock</tt> by doing: 2508 </p> 2509 2510 <div class="doc_code"> 2511 <pre> 2512 Instruction *pi = ...; 2513 Instruction *newInst = new Instruction(...); 2514 2515 pi->getParent()->getInstList().insert(pi, newInst); 2516 </pre> 2517 </div> 2518 2519 <p>In fact, this sequence of steps occurs so frequently that the 2520 <tt>Instruction</tt> class and <tt>Instruction</tt>-derived classes provide 2521 constructors which take (as a default parameter) a pointer to an 2522 <tt>Instruction</tt> which the newly-created <tt>Instruction</tt> should 2523 precede. That is, <tt>Instruction</tt> constructors are capable of 2524 inserting the newly-created instance into the <tt>BasicBlock</tt> of a 2525 provided instruction, immediately before that instruction. Using an 2526 <tt>Instruction</tt> constructor with a <tt>insertBefore</tt> (default) 2527 parameter, the above code becomes:</p> 2528 2529 <div class="doc_code"> 2530 <pre> 2531 Instruction* pi = ...; 2532 Instruction* newInst = new Instruction(..., pi); 2533 </pre> 2534 </div> 2535 2536 <p>which is much cleaner, especially if you're creating a lot of 2537 instructions and adding them to <tt>BasicBlock</tt>s.</p></li> 2538 </ul> 2539 2540 </div> 2541 2542 <!--_______________________________________________________________________--> 2543 <h4> 2544 <a name="schanges_deleting">Deleting <tt>Instruction</tt>s</a> 2545 </h4> 2546 2547 <div> 2548 2549 <p>Deleting an instruction from an existing sequence of instructions that form a 2550 <a href="#BasicBlock"><tt>BasicBlock</tt></a> is very straight-forward: just 2551 call the instruction's eraseFromParent() method. For example:</p> 2552 2553 <div class="doc_code"> 2554 <pre> 2555 <a href="#Instruction">Instruction</a> *I = .. ; 2556 I->eraseFromParent(); 2557 </pre> 2558 </div> 2559 2560 <p>This unlinks the instruction from its containing basic block and deletes 2561 it. If you'd just like to unlink the instruction from its containing basic 2562 block but not delete it, you can use the <tt>removeFromParent()</tt> method.</p> 2563 2564 </div> 2565 2566 <!--_______________________________________________________________________--> 2567 <h4> 2568 <a name="schanges_replacing">Replacing an <tt>Instruction</tt> with another 2569 <tt>Value</tt></a> 2570 </h4> 2571 2572 <div> 2573 2574 <h5><i>Replacing individual instructions</i></h5> 2575 2576 <p>Including "<a href="/doxygen/BasicBlockUtils_8h-source.html">llvm/Transforms/Utils/BasicBlockUtils.h</a>" 2577 permits use of two very useful replace functions: <tt>ReplaceInstWithValue</tt> 2578 and <tt>ReplaceInstWithInst</tt>.</p> 2579 2580 <h5><a name="schanges_deleting">Deleting <tt>Instruction</tt>s</a></h5> 2581 2582 <div> 2583 <ul> 2584 <li><tt>ReplaceInstWithValue</tt> 2585 2586 <p>This function replaces all uses of a given instruction with a value, 2587 and then removes the original instruction. The following example 2588 illustrates the replacement of the result of a particular 2589 <tt>AllocaInst</tt> that allocates memory for a single integer with a null 2590 pointer to an integer.</p> 2591 2592 <div class="doc_code"> 2593 <pre> 2594 AllocaInst* instToReplace = ...; 2595 BasicBlock::iterator ii(instToReplace); 2596 2597 ReplaceInstWithValue(instToReplace->getParent()->getInstList(), ii, 2598 Constant::getNullValue(PointerType::getUnqual(Type::Int32Ty))); 2599 </pre></div></li> 2600 2601 <li><tt>ReplaceInstWithInst</tt> 2602 2603 <p>This function replaces a particular instruction with another 2604 instruction, inserting the new instruction into the basic block at the 2605 location where the old instruction was, and replacing any uses of the old 2606 instruction with the new instruction. The following example illustrates 2607 the replacement of one <tt>AllocaInst</tt> with another.</p> 2608 2609 <div class="doc_code"> 2610 <pre> 2611 AllocaInst* instToReplace = ...; 2612 BasicBlock::iterator ii(instToReplace); 2613 2614 ReplaceInstWithInst(instToReplace->getParent()->getInstList(), ii, 2615 new AllocaInst(Type::Int32Ty, 0, "ptrToReplacedInt")); 2616 </pre></div></li> 2617 </ul> 2618 2619 </div> 2620 2621 <h5><i>Replacing multiple uses of <tt>User</tt>s and <tt>Value</tt>s</i></h5> 2622 2623 <p>You can use <tt>Value::replaceAllUsesWith</tt> and 2624 <tt>User::replaceUsesOfWith</tt> to change more than one use at a time. See the 2625 doxygen documentation for the <a href="/doxygen/classllvm_1_1Value.html">Value Class</a> 2626 and <a href="/doxygen/classllvm_1_1User.html">User Class</a>, respectively, for more 2627 information.</p> 2628 2629 <!-- Value::replaceAllUsesWith User::replaceUsesOfWith Point out: 2630 include/llvm/Transforms/Utils/ especially BasicBlockUtils.h with: 2631 ReplaceInstWithValue, ReplaceInstWithInst --> 2632 2633 </div> 2634 2635 <!--_______________________________________________________________________--> 2636 <h4> 2637 <a name="schanges_deletingGV">Deleting <tt>GlobalVariable</tt>s</a> 2638 </h4> 2639 2640 <div> 2641 2642 <p>Deleting a global variable from a module is just as easy as deleting an 2643 Instruction. First, you must have a pointer to the global variable that you wish 2644 to delete. You use this pointer to erase it from its parent, the module. 2645 For example:</p> 2646 2647 <div class="doc_code"> 2648 <pre> 2649 <a href="#GlobalVariable">GlobalVariable</a> *GV = .. ; 2650 2651 GV->eraseFromParent(); 2652 </pre> 2653 </div> 2654 2655 </div> 2656 2657 </div> 2658 2659 <!-- ======================================================================= --> 2660 <h3> 2661 <a name="create_types">How to Create Types</a> 2662 </h3> 2663 2664 <div> 2665 2666 <p>In generating IR, you may need some complex types. If you know these types 2667 statically, you can use <tt>TypeBuilder<...>::get()</tt>, defined 2668 in <tt>llvm/Support/TypeBuilder.h</tt>, to retrieve them. <tt>TypeBuilder</tt> 2669 has two forms depending on whether you're building types for cross-compilation 2670 or native library use. <tt>TypeBuilder<T, true></tt> requires 2671 that <tt>T</tt> be independent of the host environment, meaning that it's built 2672 out of types from 2673 the <a href="/doxygen/namespacellvm_1_1types.html"><tt>llvm::types</tt></a> 2674 namespace and pointers, functions, arrays, etc. built of 2675 those. <tt>TypeBuilder<T, false></tt> additionally allows native C types 2676 whose size may depend on the host compiler. For example,</p> 2677 2678 <div class="doc_code"> 2679 <pre> 2680 FunctionType *ft = TypeBuilder<types::i<8>(types::i<32>*), true>::get(); 2681 </pre> 2682 </div> 2683 2684 <p>is easier to read and write than the equivalent</p> 2685 2686 <div class="doc_code"> 2687 <pre> 2688 std::vector<const Type*> params; 2689 params.push_back(PointerType::getUnqual(Type::Int32Ty)); 2690 FunctionType *ft = FunctionType::get(Type::Int8Ty, params, false); 2691 </pre> 2692 </div> 2693 2694 <p>See the <a href="/doxygen/TypeBuilder_8h-source.html#l00001">class 2695 comment</a> for more details.</p> 2696 2697 </div> 2698 2699 </div> 2700 2701 <!-- *********************************************************************** --> 2702 <h2> 2703 <a name="threading">Threads and LLVM</a> 2704 </h2> 2705 <!-- *********************************************************************** --> 2706 2707 <div> 2708 <p> 2709 This section describes the interaction of the LLVM APIs with multithreading, 2710 both on the part of client applications, and in the JIT, in the hosted 2711 application. 2712 </p> 2713 2714 <p> 2715 Note that LLVM's support for multithreading is still relatively young. Up 2716 through version 2.5, the execution of threaded hosted applications was 2717 supported, but not threaded client access to the APIs. While this use case is 2718 now supported, clients <em>must</em> adhere to the guidelines specified below to 2719 ensure proper operation in multithreaded mode. 2720 </p> 2721 2722 <p> 2723 Note that, on Unix-like platforms, LLVM requires the presence of GCC's atomic 2724 intrinsics in order to support threaded operation. If you need a 2725 multhreading-capable LLVM on a platform without a suitably modern system 2726 compiler, consider compiling LLVM and LLVM-GCC in single-threaded mode, and 2727 using the resultant compiler to build a copy of LLVM with multithreading 2728 support. 2729 </p> 2730 2731 <!-- ======================================================================= --> 2732 <h3> 2733 <a name="startmultithreaded">Entering and Exiting Multithreaded Mode</a> 2734 </h3> 2735 2736 <div> 2737 2738 <p> 2739 In order to properly protect its internal data structures while avoiding 2740 excessive locking overhead in the single-threaded case, the LLVM must intialize 2741 certain data structures necessary to provide guards around its internals. To do 2742 so, the client program must invoke <tt>llvm_start_multithreaded()</tt> before 2743 making any concurrent LLVM API calls. To subsequently tear down these 2744 structures, use the <tt>llvm_stop_multithreaded()</tt> call. You can also use 2745 the <tt>llvm_is_multithreaded()</tt> call to check the status of multithreaded 2746 mode. 2747 </p> 2748 2749 <p> 2750 Note that both of these calls must be made <em>in isolation</em>. That is to 2751 say that no other LLVM API calls may be executing at any time during the 2752 execution of <tt>llvm_start_multithreaded()</tt> or <tt>llvm_stop_multithreaded 2753 </tt>. It's is the client's responsibility to enforce this isolation. 2754 </p> 2755 2756 <p> 2757 The return value of <tt>llvm_start_multithreaded()</tt> indicates the success or 2758 failure of the initialization. Failure typically indicates that your copy of 2759 LLVM was built without multithreading support, typically because GCC atomic 2760 intrinsics were not found in your system compiler. In this case, the LLVM API 2761 will not be safe for concurrent calls. However, it <em>will</em> be safe for 2762 hosting threaded applications in the JIT, though <a href="#jitthreading">care 2763 must be taken</a> to ensure that side exits and the like do not accidentally 2764 result in concurrent LLVM API calls. 2765 </p> 2766 </div> 2767 2768 <!-- ======================================================================= --> 2769 <h3> 2770 <a name="shutdown">Ending Execution with <tt>llvm_shutdown()</tt></a> 2771 </h3> 2772 2773 <div> 2774 <p> 2775 When you are done using the LLVM APIs, you should call <tt>llvm_shutdown()</tt> 2776 to deallocate memory used for internal structures. This will also invoke 2777 <tt>llvm_stop_multithreaded()</tt> if LLVM is operating in multithreaded mode. 2778 As such, <tt>llvm_shutdown()</tt> requires the same isolation guarantees as 2779 <tt>llvm_stop_multithreaded()</tt>. 2780 </p> 2781 2782 <p> 2783 Note that, if you use scope-based shutdown, you can use the 2784 <tt>llvm_shutdown_obj</tt> class, which calls <tt>llvm_shutdown()</tt> in its 2785 destructor. 2786 </div> 2787 2788 <!-- ======================================================================= --> 2789 <h3> 2790 <a name="managedstatic">Lazy Initialization with <tt>ManagedStatic</tt></a> 2791 </h3> 2792 2793 <div> 2794 <p> 2795 <tt>ManagedStatic</tt> is a utility class in LLVM used to implement static 2796 initialization of static resources, such as the global type tables. Before the 2797 invocation of <tt>llvm_shutdown()</tt>, it implements a simple lazy 2798 initialization scheme. Once <tt>llvm_start_multithreaded()</tt> returns, 2799 however, it uses double-checked locking to implement thread-safe lazy 2800 initialization. 2801 </p> 2802 2803 <p> 2804 Note that, because no other threads are allowed to issue LLVM API calls before 2805 <tt>llvm_start_multithreaded()</tt> returns, it is possible to have 2806 <tt>ManagedStatic</tt>s of <tt>llvm::sys::Mutex</tt>s. 2807 </p> 2808 2809 <p> 2810 The <tt>llvm_acquire_global_lock()</tt> and <tt>llvm_release_global_lock</tt> 2811 APIs provide access to the global lock used to implement the double-checked 2812 locking for lazy initialization. These should only be used internally to LLVM, 2813 and only if you know what you're doing! 2814 </p> 2815 </div> 2816 2817 <!-- ======================================================================= --> 2818 <h3> 2819 <a name="llvmcontext">Achieving Isolation with <tt>LLVMContext</tt></a> 2820 </h3> 2821 2822 <div> 2823 <p> 2824 <tt>LLVMContext</tt> is an opaque class in the LLVM API which clients can use 2825 to operate multiple, isolated instances of LLVM concurrently within the same 2826 address space. For instance, in a hypothetical compile-server, the compilation 2827 of an individual translation unit is conceptually independent from all the 2828 others, and it would be desirable to be able to compile incoming translation 2829 units concurrently on independent server threads. Fortunately, 2830 <tt>LLVMContext</tt> exists to enable just this kind of scenario! 2831 </p> 2832 2833 <p> 2834 Conceptually, <tt>LLVMContext</tt> provides isolation. Every LLVM entity 2835 (<tt>Module</tt>s, <tt>Value</tt>s, <tt>Type</tt>s, <tt>Constant</tt>s, etc.) 2836 in LLVM's in-memory IR belongs to an <tt>LLVMContext</tt>. Entities in 2837 different contexts <em>cannot</em> interact with each other: <tt>Module</tt>s in 2838 different contexts cannot be linked together, <tt>Function</tt>s cannot be added 2839 to <tt>Module</tt>s in different contexts, etc. What this means is that is is 2840 safe to compile on multiple threads simultaneously, as long as no two threads 2841 operate on entities within the same context. 2842 </p> 2843 2844 <p> 2845 In practice, very few places in the API require the explicit specification of a 2846 <tt>LLVMContext</tt>, other than the <tt>Type</tt> creation/lookup APIs. 2847 Because every <tt>Type</tt> carries a reference to its owning context, most 2848 other entities can determine what context they belong to by looking at their 2849 own <tt>Type</tt>. If you are adding new entities to LLVM IR, please try to 2850 maintain this interface design. 2851 </p> 2852 2853 <p> 2854 For clients that do <em>not</em> require the benefits of isolation, LLVM 2855 provides a convenience API <tt>getGlobalContext()</tt>. This returns a global, 2856 lazily initialized <tt>LLVMContext</tt> that may be used in situations where 2857 isolation is not a concern. 2858 </p> 2859 </div> 2860 2861 <!-- ======================================================================= --> 2862 <h3> 2863 <a name="jitthreading">Threads and the JIT</a> 2864 </h3> 2865 2866 <div> 2867 <p> 2868 LLVM's "eager" JIT compiler is safe to use in threaded programs. Multiple 2869 threads can call <tt>ExecutionEngine::getPointerToFunction()</tt> or 2870 <tt>ExecutionEngine::runFunction()</tt> concurrently, and multiple threads can 2871 run code output by the JIT concurrently. The user must still ensure that only 2872 one thread accesses IR in a given <tt>LLVMContext</tt> while another thread 2873 might be modifying it. One way to do that is to always hold the JIT lock while 2874 accessing IR outside the JIT (the JIT <em>modifies</em> the IR by adding 2875 <tt>CallbackVH</tt>s). Another way is to only 2876 call <tt>getPointerToFunction()</tt> from the <tt>LLVMContext</tt>'s thread. 2877 </p> 2878 2879 <p>When the JIT is configured to compile lazily (using 2880 <tt>ExecutionEngine::DisableLazyCompilation(false)</tt>), there is currently a 2881 <a href="http://llvm.org/bugs/show_bug.cgi?id=5184">race condition</a> in 2882 updating call sites after a function is lazily-jitted. It's still possible to 2883 use the lazy JIT in a threaded program if you ensure that only one thread at a 2884 time can call any particular lazy stub and that the JIT lock guards any IR 2885 access, but we suggest using only the eager JIT in threaded programs. 2886 </p> 2887 </div> 2888 2889 </div> 2890 2891 <!-- *********************************************************************** --> 2892 <h2> 2893 <a name="advanced">Advanced Topics</a> 2894 </h2> 2895 <!-- *********************************************************************** --> 2896 2897 <div> 2898 <p> 2899 This section describes some of the advanced or obscure API's that most clients 2900 do not need to be aware of. These API's tend manage the inner workings of the 2901 LLVM system, and only need to be accessed in unusual circumstances. 2902 </p> 2903 2904 2905 <!-- ======================================================================= --> 2906 <h3> 2907 <a name="SymbolTable">The <tt>ValueSymbolTable</tt> class</a> 2908 </h3> 2909 2910 <div> 2911 <p>The <tt><a href="http://llvm.org/doxygen/classllvm_1_1ValueSymbolTable.html"> 2912 ValueSymbolTable</a></tt> class provides a symbol table that the <a 2913 href="#Function"><tt>Function</tt></a> and <a href="#Module"> 2914 <tt>Module</tt></a> classes use for naming value definitions. The symbol table 2915 can provide a name for any <a href="#Value"><tt>Value</tt></a>. 2916 </p> 2917 2918 <p>Note that the <tt>SymbolTable</tt> class should not be directly accessed 2919 by most clients. It should only be used when iteration over the symbol table 2920 names themselves are required, which is very special purpose. Note that not 2921 all LLVM 2922 <tt><a href="#Value">Value</a></tt>s have names, and those without names (i.e. they have 2923 an empty name) do not exist in the symbol table. 2924 </p> 2925 2926 <p>Symbol tables support iteration over the values in the symbol 2927 table with <tt>begin/end/iterator</tt> and supports querying to see if a 2928 specific name is in the symbol table (with <tt>lookup</tt>). The 2929 <tt>ValueSymbolTable</tt> class exposes no public mutator methods, instead, 2930 simply call <tt>setName</tt> on a value, which will autoinsert it into the 2931 appropriate symbol table.</p> 2932 2933 </div> 2934 2935 2936 2937 <!-- ======================================================================= --> 2938 <h3> 2939 <a name="UserLayout">The <tt>User</tt> and owned <tt>Use</tt> classes' memory layout</a> 2940 </h3> 2941 2942 <div> 2943 <p>The <tt><a href="http://llvm.org/doxygen/classllvm_1_1User.html"> 2944 User</a></tt> class provides a basis for expressing the ownership of <tt>User</tt> 2945 towards other <tt><a href="http://llvm.org/doxygen/classllvm_1_1Value.html"> 2946 Value</a></tt>s. The <tt><a href="http://llvm.org/doxygen/classllvm_1_1Use.html"> 2947 Use</a></tt> helper class is employed to do the bookkeeping and to facilitate <i>O(1)</i> 2948 addition and removal.</p> 2949 2950 <!-- ______________________________________________________________________ --> 2951 <h4> 2952 <a name="Use2User"> 2953 Interaction and relationship between <tt>User</tt> and <tt>Use</tt> objects 2954 </a> 2955 </h4> 2956 2957 <div> 2958 <p> 2959 A subclass of <tt>User</tt> can choose between incorporating its <tt>Use</tt> objects 2960 or refer to them out-of-line by means of a pointer. A mixed variant 2961 (some <tt>Use</tt>s inline others hung off) is impractical and breaks the invariant 2962 that the <tt>Use</tt> objects belonging to the same <tt>User</tt> form a contiguous array. 2963 </p> 2964 2965 <p> 2966 We have 2 different layouts in the <tt>User</tt> (sub)classes: 2967 <ul> 2968 <li><p>Layout a) 2969 The <tt>Use</tt> object(s) are inside (resp. at fixed offset) of the <tt>User</tt> 2970 object and there are a fixed number of them.</p> 2971 2972 <li><p>Layout b) 2973 The <tt>Use</tt> object(s) are referenced by a pointer to an 2974 array from the <tt>User</tt> object and there may be a variable 2975 number of them.</p> 2976 </ul> 2977 <p> 2978 As of v2.4 each layout still possesses a direct pointer to the 2979 start of the array of <tt>Use</tt>s. Though not mandatory for layout a), 2980 we stick to this redundancy for the sake of simplicity. 2981 The <tt>User</tt> object also stores the number of <tt>Use</tt> objects it 2982 has. (Theoretically this information can also be calculated 2983 given the scheme presented below.)</p> 2984 <p> 2985 Special forms of allocation operators (<tt>operator new</tt>) 2986 enforce the following memory layouts:</p> 2987 2988 <ul> 2989 <li><p>Layout a) is modelled by prepending the <tt>User</tt> object by the <tt>Use[]</tt> array.</p> 2990 2991 <pre> 2992 ...---.---.---.---.-------... 2993 | P | P | P | P | User 2994 '''---'---'---'---'-------''' 2995 </pre> 2996 2997 <li><p>Layout b) is modelled by pointing at the <tt>Use[]</tt> array.</p> 2998 <pre> 2999 .-------... 3000 | User 3001 '-------''' 3002 | 3003 v 3004 .---.---.---.---... 3005 | P | P | P | P | 3006 '---'---'---'---''' 3007 </pre> 3008 </ul> 3009 <i>(In the above figures '<tt>P</tt>' stands for the <tt>Use**</tt> that 3010 is stored in each <tt>Use</tt> object in the member <tt>Use::Prev</tt>)</i> 3011 3012 </div> 3013 3014 <!-- ______________________________________________________________________ --> 3015 <h4> 3016 <a name="Waymarking">The waymarking algorithm</a> 3017 </h4> 3018 3019 <div> 3020 <p> 3021 Since the <tt>Use</tt> objects are deprived of the direct (back)pointer to 3022 their <tt>User</tt> objects, there must be a fast and exact method to 3023 recover it. This is accomplished by the following scheme:</p> 3024 3025 A bit-encoding in the 2 LSBits (least significant bits) of the <tt>Use::Prev</tt> allows to find the 3026 start of the <tt>User</tt> object: 3027 <ul> 3028 <li><tt>00</tt> —> binary digit 0</li> 3029 <li><tt>01</tt> —> binary digit 1</li> 3030 <li><tt>10</tt> —> stop and calculate (<tt>s</tt>)</li> 3031 <li><tt>11</tt> —> full stop (<tt>S</tt>)</li> 3032 </ul> 3033 <p> 3034 Given a <tt>Use*</tt>, all we have to do is to walk till we get 3035 a stop and we either have a <tt>User</tt> immediately behind or 3036 we have to walk to the next stop picking up digits 3037 and calculating the offset:</p> 3038 <pre> 3039 .---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---------------- 3040 | 1 | s | 1 | 0 | 1 | 0 | s | 1 | 1 | 0 | s | 1 | 1 | s | 1 | S | User (or User*) 3041 '---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---------------- 3042 |+15 |+10 |+6 |+3 |+1 3043 | | | | |__> 3044 | | | |__________> 3045 | | |______________________> 3046 | |______________________________________> 3047 |__________________________________________________________> 3048 </pre> 3049 <p> 3050 Only the significant number of bits need to be stored between the 3051 stops, so that the <i>worst case is 20 memory accesses</i> when there are 3052 1000 <tt>Use</tt> objects associated with a <tt>User</tt>.</p> 3053 3054 </div> 3055 3056 <!-- ______________________________________________________________________ --> 3057 <h4> 3058 <a name="ReferenceImpl">Reference implementation</a> 3059 </h4> 3060 3061 <div> 3062 <p> 3063 The following literate Haskell fragment demonstrates the concept:</p> 3064 3065 <div class="doc_code"> 3066 <pre> 3067 > import Test.QuickCheck 3068 > 3069 > digits :: Int -> [Char] -> [Char] 3070 > digits 0 acc = '0' : acc 3071 > digits 1 acc = '1' : acc 3072 > digits n acc = digits (n `div` 2) $ digits (n `mod` 2) acc 3073 > 3074 > dist :: Int -> [Char] -> [Char] 3075 > dist 0 [] = ['S'] 3076 > dist 0 acc = acc 3077 > dist 1 acc = let r = dist 0 acc in 's' : digits (length r) r 3078 > dist n acc = dist (n - 1) $ dist 1 acc 3079 > 3080 > takeLast n ss = reverse $ take n $ reverse ss 3081 > 3082 > test = takeLast 40 $ dist 20 [] 3083 > 3084 </pre> 3085 </div> 3086 <p> 3087 Printing <test> gives: <tt>"1s100000s11010s10100s1111s1010s110s11s1S"</tt></p> 3088 <p> 3089 The reverse algorithm computes the length of the string just by examining 3090 a certain prefix:</p> 3091 3092 <div class="doc_code"> 3093 <pre> 3094 > pref :: [Char] -> Int 3095 > pref "S" = 1 3096 > pref ('s':'1':rest) = decode 2 1 rest 3097 > pref (_:rest) = 1 + pref rest 3098 > 3099 > decode walk acc ('0':rest) = decode (walk + 1) (acc * 2) rest 3100 > decode walk acc ('1':rest) = decode (walk + 1) (acc * 2 + 1) rest 3101 > decode walk acc _ = walk + acc 3102 > 3103 </pre> 3104 </div> 3105 <p> 3106 Now, as expected, printing <pref test> gives <tt>40</tt>.</p> 3107 <p> 3108 We can <i>quickCheck</i> this with following property:</p> 3109 3110 <div class="doc_code"> 3111 <pre> 3112 > testcase = dist 2000 [] 3113 > testcaseLength = length testcase 3114 > 3115 > identityProp n = n > 0 && n <= testcaseLength ==> length arr == pref arr 3116 > where arr = takeLast n testcase 3117 > 3118 </pre> 3119 </div> 3120 <p> 3121 As expected <quickCheck identityProp> gives:</p> 3122 3123 <pre> 3124 *Main> quickCheck identityProp 3125 OK, passed 100 tests. 3126 </pre> 3127 <p> 3128 Let's be a bit more exhaustive:</p> 3129 3130 <div class="doc_code"> 3131 <pre> 3132 > 3133 > deepCheck p = check (defaultConfig { configMaxTest = 500 }) p 3134 > 3135 </pre> 3136 </div> 3137 <p> 3138 And here is the result of <deepCheck identityProp>:</p> 3139 3140 <pre> 3141 *Main> deepCheck identityProp 3142 OK, passed 500 tests. 3143 </pre> 3144 3145 </div> 3146 3147 <!-- ______________________________________________________________________ --> 3148 <h4> 3149 <a name="Tagging">Tagging considerations</a> 3150 </h4> 3151 3152 <div> 3153 3154 <p> 3155 To maintain the invariant that the 2 LSBits of each <tt>Use**</tt> in <tt>Use</tt> 3156 never change after being set up, setters of <tt>Use::Prev</tt> must re-tag the 3157 new <tt>Use**</tt> on every modification. Accordingly getters must strip the 3158 tag bits.</p> 3159 <p> 3160 For layout b) instead of the <tt>User</tt> we find a pointer (<tt>User*</tt> with LSBit set). 3161 Following this pointer brings us to the <tt>User</tt>. A portable trick ensures 3162 that the first bytes of <tt>User</tt> (if interpreted as a pointer) never has 3163 the LSBit set. (Portability is relying on the fact that all known compilers place the 3164 <tt>vptr</tt> in the first word of the instances.)</p> 3165 3166 </div> 3167 3168 </div> 3169 3170 </div> 3171 3172 <!-- *********************************************************************** --> 3173 <h2> 3174 <a name="coreclasses">The Core LLVM Class Hierarchy Reference </a> 3175 </h2> 3176 <!-- *********************************************************************** --> 3177 3178 <div> 3179 <p><tt>#include "<a href="/doxygen/Type_8h-source.html">llvm/Type.h</a>"</tt> 3180 <br>doxygen info: <a href="/doxygen/classllvm_1_1Type.html">Type Class</a></p> 3181 3182 <p>The Core LLVM classes are the primary means of representing the program 3183 being inspected or transformed. The core LLVM classes are defined in 3184 header files in the <tt>include/llvm/</tt> directory, and implemented in 3185 the <tt>lib/VMCore</tt> directory.</p> 3186 3187 <!-- ======================================================================= --> 3188 <h3> 3189 <a name="Type">The <tt>Type</tt> class and Derived Types</a> 3190 </h3> 3191 3192 <div> 3193 3194 <p><tt>Type</tt> is a superclass of all type classes. Every <tt>Value</tt> has 3195 a <tt>Type</tt>. <tt>Type</tt> cannot be instantiated directly but only 3196 through its subclasses. Certain primitive types (<tt>VoidType</tt>, 3197 <tt>LabelType</tt>, <tt>FloatType</tt> and <tt>DoubleType</tt>) have hidden 3198 subclasses. They are hidden because they offer no useful functionality beyond 3199 what the <tt>Type</tt> class offers except to distinguish themselves from 3200 other subclasses of <tt>Type</tt>.</p> 3201 <p>All other types are subclasses of <tt>DerivedType</tt>. Types can be 3202 named, but this is not a requirement. There exists exactly 3203 one instance of a given shape at any one time. This allows type equality to 3204 be performed with address equality of the Type Instance. That is, given two 3205 <tt>Type*</tt> values, the types are identical if the pointers are identical. 3206 </p> 3207 3208 <!-- _______________________________________________________________________ --> 3209 <h4> 3210 <a name="m_Type">Important Public Methods</a> 3211 </h4> 3212 3213 <div> 3214 3215 <ul> 3216 <li><tt>bool isIntegerTy() const</tt>: Returns true for any integer type.</li> 3217 3218 <li><tt>bool isFloatingPointTy()</tt>: Return true if this is one of the five 3219 floating point types.</li> 3220 3221 <li><tt>bool isSized()</tt>: Return true if the type has known size. Things 3222 that don't have a size are abstract types, labels and void.</li> 3223 3224 </ul> 3225 </div> 3226 3227 <!-- _______________________________________________________________________ --> 3228 <h4> 3229 <a name="derivedtypes">Important Derived Types</a> 3230 </h4> 3231 <div> 3232 <dl> 3233 <dt><tt>IntegerType</tt></dt> 3234 <dd>Subclass of DerivedType that represents integer types of any bit width. 3235 Any bit width between <tt>IntegerType::MIN_INT_BITS</tt> (1) and 3236 <tt>IntegerType::MAX_INT_BITS</tt> (~8 million) can be represented. 3237 <ul> 3238 <li><tt>static const IntegerType* get(unsigned NumBits)</tt>: get an integer 3239 type of a specific bit width.</li> 3240 <li><tt>unsigned getBitWidth() const</tt>: Get the bit width of an integer 3241 type.</li> 3242 </ul> 3243 </dd> 3244 <dt><tt>SequentialType</tt></dt> 3245 <dd>This is subclassed by ArrayType, PointerType and VectorType. 3246 <ul> 3247 <li><tt>const Type * getElementType() const</tt>: Returns the type of each 3248 of the elements in the sequential type. </li> 3249 </ul> 3250 </dd> 3251 <dt><tt>ArrayType</tt></dt> 3252 <dd>This is a subclass of SequentialType and defines the interface for array 3253 types. 3254 <ul> 3255 <li><tt>unsigned getNumElements() const</tt>: Returns the number of 3256 elements in the array. </li> 3257 </ul> 3258 </dd> 3259 <dt><tt>PointerType</tt></dt> 3260 <dd>Subclass of SequentialType for pointer types.</dd> 3261 <dt><tt>VectorType</tt></dt> 3262 <dd>Subclass of SequentialType for vector types. A 3263 vector type is similar to an ArrayType but is distinguished because it is 3264 a first class type whereas ArrayType is not. Vector types are used for 3265 vector operations and are usually small vectors of of an integer or floating 3266 point type.</dd> 3267 <dt><tt>StructType</tt></dt> 3268 <dd>Subclass of DerivedTypes for struct types.</dd> 3269 <dt><tt><a name="FunctionType">FunctionType</a></tt></dt> 3270 <dd>Subclass of DerivedTypes for function types. 3271 <ul> 3272 <li><tt>bool isVarArg() const</tt>: Returns true if it's a vararg 3273 function</li> 3274 <li><tt> const Type * getReturnType() const</tt>: Returns the 3275 return type of the function.</li> 3276 <li><tt>const Type * getParamType (unsigned i)</tt>: Returns 3277 the type of the ith parameter.</li> 3278 <li><tt> const unsigned getNumParams() const</tt>: Returns the 3279 number of formal parameters.</li> 3280 </ul> 3281 </dd> 3282 </dl> 3283 </div> 3284 3285 </div> 3286 3287 <!-- ======================================================================= --> 3288 <h3> 3289 <a name="Module">The <tt>Module</tt> class</a> 3290 </h3> 3291 3292 <div> 3293 3294 <p><tt>#include "<a 3295 href="/doxygen/Module_8h-source.html">llvm/Module.h</a>"</tt><br> doxygen info: 3296 <a href="/doxygen/classllvm_1_1Module.html">Module Class</a></p> 3297 3298 <p>The <tt>Module</tt> class represents the top level structure present in LLVM 3299 programs. An LLVM module is effectively either a translation unit of the 3300 original program or a combination of several translation units merged by the 3301 linker. The <tt>Module</tt> class keeps track of a list of <a 3302 href="#Function"><tt>Function</tt></a>s, a list of <a 3303 href="#GlobalVariable"><tt>GlobalVariable</tt></a>s, and a <a 3304 href="#SymbolTable"><tt>SymbolTable</tt></a>. Additionally, it contains a few 3305 helpful member functions that try to make common operations easy.</p> 3306 3307 <!-- _______________________________________________________________________ --> 3308 <h4> 3309 <a name="m_Module">Important Public Members of the <tt>Module</tt> class</a> 3310 </h4> 3311 3312 <div> 3313 3314 <ul> 3315 <li><tt>Module::Module(std::string name = "")</tt> 3316 3317 <p>Constructing a <a href="#Module">Module</a> is easy. You can optionally 3318 provide a name for it (probably based on the name of the translation unit).</p> 3319 </li> 3320 3321 <li><tt>Module::iterator</tt> - Typedef for function list iterator<br> 3322 <tt>Module::const_iterator</tt> - Typedef for const_iterator.<br> 3323 3324 <tt>begin()</tt>, <tt>end()</tt> 3325 <tt>size()</tt>, <tt>empty()</tt> 3326 3327 <p>These are forwarding methods that make it easy to access the contents of 3328 a <tt>Module</tt> object's <a href="#Function"><tt>Function</tt></a> 3329 list.</p></li> 3330 3331 <li><tt>Module::FunctionListType &getFunctionList()</tt> 3332 3333 <p> Returns the list of <a href="#Function"><tt>Function</tt></a>s. This is 3334 necessary to use when you need to update the list or perform a complex 3335 action that doesn't have a forwarding method.</p> 3336 3337 <p><!-- Global Variable --></p></li> 3338 </ul> 3339 3340 <hr> 3341 3342 <ul> 3343 <li><tt>Module::global_iterator</tt> - Typedef for global variable list iterator<br> 3344 3345 <tt>Module::const_global_iterator</tt> - Typedef for const_iterator.<br> 3346 3347 <tt>global_begin()</tt>, <tt>global_end()</tt> 3348 <tt>global_size()</tt>, <tt>global_empty()</tt> 3349 3350 <p> These are forwarding methods that make it easy to access the contents of 3351 a <tt>Module</tt> object's <a 3352 href="#GlobalVariable"><tt>GlobalVariable</tt></a> list.</p></li> 3353 3354 <li><tt>Module::GlobalListType &getGlobalList()</tt> 3355 3356 <p>Returns the list of <a 3357 href="#GlobalVariable"><tt>GlobalVariable</tt></a>s. This is necessary to 3358 use when you need to update the list or perform a complex action that 3359 doesn't have a forwarding method.</p> 3360 3361 <p><!-- Symbol table stuff --> </p></li> 3362 </ul> 3363 3364 <hr> 3365 3366 <ul> 3367 <li><tt><a href="#SymbolTable">SymbolTable</a> *getSymbolTable()</tt> 3368 3369 <p>Return a reference to the <a href="#SymbolTable"><tt>SymbolTable</tt></a> 3370 for this <tt>Module</tt>.</p> 3371 3372 <p><!-- Convenience methods --></p></li> 3373 </ul> 3374 3375 <hr> 3376 3377 <ul> 3378 3379 <li><tt><a href="#Function">Function</a> *getFunction(StringRef Name) const 3380 </tt> 3381 3382 <p>Look up the specified function in the <tt>Module</tt> <a 3383 href="#SymbolTable"><tt>SymbolTable</tt></a>. If it does not exist, return 3384 <tt>null</tt>.</p></li> 3385 3386 <li><tt><a href="#Function">Function</a> *getOrInsertFunction(const 3387 std::string &Name, const <a href="#FunctionType">FunctionType</a> *T)</tt> 3388 3389 <p>Look up the specified function in the <tt>Module</tt> <a 3390 href="#SymbolTable"><tt>SymbolTable</tt></a>. If it does not exist, add an 3391 external declaration for the function and return it.</p></li> 3392 3393 <li><tt>std::string getTypeName(const <a href="#Type">Type</a> *Ty)</tt> 3394 3395 <p>If there is at least one entry in the <a 3396 href="#SymbolTable"><tt>SymbolTable</tt></a> for the specified <a 3397 href="#Type"><tt>Type</tt></a>, return it. Otherwise return the empty 3398 string.</p></li> 3399 3400 <li><tt>bool addTypeName(const std::string &Name, const <a 3401 href="#Type">Type</a> *Ty)</tt> 3402 3403 <p>Insert an entry in the <a href="#SymbolTable"><tt>SymbolTable</tt></a> 3404 mapping <tt>Name</tt> to <tt>Ty</tt>. If there is already an entry for this 3405 name, true is returned and the <a 3406 href="#SymbolTable"><tt>SymbolTable</tt></a> is not modified.</p></li> 3407 </ul> 3408 3409 </div> 3410 3411 </div> 3412 3413 <!-- ======================================================================= --> 3414 <h3> 3415 <a name="Value">The <tt>Value</tt> class</a> 3416 </h3> 3417 3418 <div> 3419 3420 <p><tt>#include "<a href="/doxygen/Value_8h-source.html">llvm/Value.h</a>"</tt> 3421 <br> 3422 doxygen info: <a href="/doxygen/classllvm_1_1Value.html">Value Class</a></p> 3423 3424 <p>The <tt>Value</tt> class is the most important class in the LLVM Source 3425 base. It represents a typed value that may be used (among other things) as an 3426 operand to an instruction. There are many different types of <tt>Value</tt>s, 3427 such as <a href="#Constant"><tt>Constant</tt></a>s,<a 3428 href="#Argument"><tt>Argument</tt></a>s. Even <a 3429 href="#Instruction"><tt>Instruction</tt></a>s and <a 3430 href="#Function"><tt>Function</tt></a>s are <tt>Value</tt>s.</p> 3431 3432 <p>A particular <tt>Value</tt> may be used many times in the LLVM representation 3433 for a program. For example, an incoming argument to a function (represented 3434 with an instance of the <a href="#Argument">Argument</a> class) is "used" by 3435 every instruction in the function that references the argument. To keep track 3436 of this relationship, the <tt>Value</tt> class keeps a list of all of the <a 3437 href="#User"><tt>User</tt></a>s that is using it (the <a 3438 href="#User"><tt>User</tt></a> class is a base class for all nodes in the LLVM 3439 graph that can refer to <tt>Value</tt>s). This use list is how LLVM represents 3440 def-use information in the program, and is accessible through the <tt>use_</tt>* 3441 methods, shown below.</p> 3442 3443 <p>Because LLVM is a typed representation, every LLVM <tt>Value</tt> is typed, 3444 and this <a href="#Type">Type</a> is available through the <tt>getType()</tt> 3445 method. In addition, all LLVM values can be named. The "name" of the 3446 <tt>Value</tt> is a symbolic string printed in the LLVM code:</p> 3447 3448 <div class="doc_code"> 3449 <pre> 3450 %<b>foo</b> = add i32 1, 2 3451 </pre> 3452 </div> 3453 3454 <p><a name="nameWarning">The name of this instruction is "foo".</a> <b>NOTE</b> 3455 that the name of any value may be missing (an empty string), so names should 3456 <b>ONLY</b> be used for debugging (making the source code easier to read, 3457 debugging printouts), they should not be used to keep track of values or map 3458 between them. For this purpose, use a <tt>std::map</tt> of pointers to the 3459 <tt>Value</tt> itself instead.</p> 3460 3461 <p>One important aspect of LLVM is that there is no distinction between an SSA 3462 variable and the operation that produces it. Because of this, any reference to 3463 the value produced by an instruction (or the value available as an incoming 3464 argument, for example) is represented as a direct pointer to the instance of 3465 the class that 3466 represents this value. Although this may take some getting used to, it 3467 simplifies the representation and makes it easier to manipulate.</p> 3468 3469 <!-- _______________________________________________________________________ --> 3470 <h4> 3471 <a name="m_Value">Important Public Members of the <tt>Value</tt> class</a> 3472 </h4> 3473 3474 <div> 3475 3476 <ul> 3477 <li><tt>Value::use_iterator</tt> - Typedef for iterator over the 3478 use-list<br> 3479 <tt>Value::const_use_iterator</tt> - Typedef for const_iterator over 3480 the use-list<br> 3481 <tt>unsigned use_size()</tt> - Returns the number of users of the 3482 value.<br> 3483 <tt>bool use_empty()</tt> - Returns true if there are no users.<br> 3484 <tt>use_iterator use_begin()</tt> - Get an iterator to the start of 3485 the use-list.<br> 3486 <tt>use_iterator use_end()</tt> - Get an iterator to the end of the 3487 use-list.<br> 3488 <tt><a href="#User">User</a> *use_back()</tt> - Returns the last 3489 element in the list. 3490 <p> These methods are the interface to access the def-use 3491 information in LLVM. As with all other iterators in LLVM, the naming 3492 conventions follow the conventions defined by the <a href="#stl">STL</a>.</p> 3493 </li> 3494 <li><tt><a href="#Type">Type</a> *getType() const</tt> 3495 <p>This method returns the Type of the Value.</p> 3496 </li> 3497 <li><tt>bool hasName() const</tt><br> 3498 <tt>std::string getName() const</tt><br> 3499 <tt>void setName(const std::string &Name)</tt> 3500 <p> This family of methods is used to access and assign a name to a <tt>Value</tt>, 3501 be aware of the <a href="#nameWarning">precaution above</a>.</p> 3502 </li> 3503 <li><tt>void replaceAllUsesWith(Value *V)</tt> 3504 3505 <p>This method traverses the use list of a <tt>Value</tt> changing all <a 3506 href="#User"><tt>User</tt>s</a> of the current value to refer to 3507 "<tt>V</tt>" instead. For example, if you detect that an instruction always 3508 produces a constant value (for example through constant folding), you can 3509 replace all uses of the instruction with the constant like this:</p> 3510 3511 <div class="doc_code"> 3512 <pre> 3513 Inst->replaceAllUsesWith(ConstVal); 3514 </pre> 3515 </div> 3516 3517 </ul> 3518 3519 </div> 3520 3521 </div> 3522 3523 <!-- ======================================================================= --> 3524 <h3> 3525 <a name="User">The <tt>User</tt> class</a> 3526 </h3> 3527 3528 <div> 3529 3530 <p> 3531 <tt>#include "<a href="/doxygen/User_8h-source.html">llvm/User.h</a>"</tt><br> 3532 doxygen info: <a href="/doxygen/classllvm_1_1User.html">User Class</a><br> 3533 Superclass: <a href="#Value"><tt>Value</tt></a></p> 3534 3535 <p>The <tt>User</tt> class is the common base class of all LLVM nodes that may 3536 refer to <a href="#Value"><tt>Value</tt></a>s. It exposes a list of "Operands" 3537 that are all of the <a href="#Value"><tt>Value</tt></a>s that the User is 3538 referring to. The <tt>User</tt> class itself is a subclass of 3539 <tt>Value</tt>.</p> 3540 3541 <p>The operands of a <tt>User</tt> point directly to the LLVM <a 3542 href="#Value"><tt>Value</tt></a> that it refers to. Because LLVM uses Static 3543 Single Assignment (SSA) form, there can only be one definition referred to, 3544 allowing this direct connection. This connection provides the use-def 3545 information in LLVM.</p> 3546 3547 <!-- _______________________________________________________________________ --> 3548 <h4> 3549 <a name="m_User">Important Public Members of the <tt>User</tt> class</a> 3550 </h4> 3551 3552 <div> 3553 3554 <p>The <tt>User</tt> class exposes the operand list in two ways: through 3555 an index access interface and through an iterator based interface.</p> 3556 3557 <ul> 3558 <li><tt>Value *getOperand(unsigned i)</tt><br> 3559 <tt>unsigned getNumOperands()</tt> 3560 <p> These two methods expose the operands of the <tt>User</tt> in a 3561 convenient form for direct access.</p></li> 3562 3563 <li><tt>User::op_iterator</tt> - Typedef for iterator over the operand 3564 list<br> 3565 <tt>op_iterator op_begin()</tt> - Get an iterator to the start of 3566 the operand list.<br> 3567 <tt>op_iterator op_end()</tt> - Get an iterator to the end of the 3568 operand list. 3569 <p> Together, these methods make up the iterator based interface to 3570 the operands of a <tt>User</tt>.</p></li> 3571 </ul> 3572 3573 </div> 3574 3575 </div> 3576 3577 <!-- ======================================================================= --> 3578 <h3> 3579 <a name="Instruction">The <tt>Instruction</tt> class</a> 3580 </h3> 3581 3582 <div> 3583 3584 <p><tt>#include "</tt><tt><a 3585 href="/doxygen/Instruction_8h-source.html">llvm/Instruction.h</a>"</tt><br> 3586 doxygen info: <a href="/doxygen/classllvm_1_1Instruction.html">Instruction Class</a><br> 3587 Superclasses: <a href="#User"><tt>User</tt></a>, <a 3588 href="#Value"><tt>Value</tt></a></p> 3589 3590 <p>The <tt>Instruction</tt> class is the common base class for all LLVM 3591 instructions. It provides only a few methods, but is a very commonly used 3592 class. The primary data tracked by the <tt>Instruction</tt> class itself is the 3593 opcode (instruction type) and the parent <a 3594 href="#BasicBlock"><tt>BasicBlock</tt></a> the <tt>Instruction</tt> is embedded 3595 into. To represent a specific type of instruction, one of many subclasses of 3596 <tt>Instruction</tt> are used.</p> 3597 3598 <p> Because the <tt>Instruction</tt> class subclasses the <a 3599 href="#User"><tt>User</tt></a> class, its operands can be accessed in the same 3600 way as for other <a href="#User"><tt>User</tt></a>s (with the 3601 <tt>getOperand()</tt>/<tt>getNumOperands()</tt> and 3602 <tt>op_begin()</tt>/<tt>op_end()</tt> methods).</p> <p> An important file for 3603 the <tt>Instruction</tt> class is the <tt>llvm/Instruction.def</tt> file. This 3604 file contains some meta-data about the various different types of instructions 3605 in LLVM. It describes the enum values that are used as opcodes (for example 3606 <tt>Instruction::Add</tt> and <tt>Instruction::ICmp</tt>), as well as the 3607 concrete sub-classes of <tt>Instruction</tt> that implement the instruction (for 3608 example <tt><a href="#BinaryOperator">BinaryOperator</a></tt> and <tt><a 3609 href="#CmpInst">CmpInst</a></tt>). Unfortunately, the use of macros in 3610 this file confuses doxygen, so these enum values don't show up correctly in the 3611 <a href="/doxygen/classllvm_1_1Instruction.html">doxygen output</a>.</p> 3612 3613 <!-- _______________________________________________________________________ --> 3614 <h4> 3615 <a name="s_Instruction"> 3616 Important Subclasses of the <tt>Instruction</tt> class 3617 </a> 3618 </h4> 3619 <div> 3620 <ul> 3621 <li><tt><a name="BinaryOperator">BinaryOperator</a></tt> 3622 <p>This subclasses represents all two operand instructions whose operands 3623 must be the same type, except for the comparison instructions.</p></li> 3624 <li><tt><a name="CastInst">CastInst</a></tt> 3625 <p>This subclass is the parent of the 12 casting instructions. It provides 3626 common operations on cast instructions.</p> 3627 <li><tt><a name="CmpInst">CmpInst</a></tt> 3628 <p>This subclass respresents the two comparison instructions, 3629 <a href="LangRef.html#i_icmp">ICmpInst</a> (integer opreands), and 3630 <a href="LangRef.html#i_fcmp">FCmpInst</a> (floating point operands).</p> 3631 <li><tt><a name="TerminatorInst">TerminatorInst</a></tt> 3632 <p>This subclass is the parent of all terminator instructions (those which 3633 can terminate a block).</p> 3634 </ul> 3635 </div> 3636 3637 <!-- _______________________________________________________________________ --> 3638 <h4> 3639 <a name="m_Instruction"> 3640 Important Public Members of the <tt>Instruction</tt> class 3641 </a> 3642 </h4> 3643 3644 <div> 3645 3646 <ul> 3647 <li><tt><a href="#BasicBlock">BasicBlock</a> *getParent()</tt> 3648 <p>Returns the <a href="#BasicBlock"><tt>BasicBlock</tt></a> that 3649 this <tt>Instruction</tt> is embedded into.</p></li> 3650 <li><tt>bool mayWriteToMemory()</tt> 3651 <p>Returns true if the instruction writes to memory, i.e. it is a 3652 <tt>call</tt>,<tt>free</tt>,<tt>invoke</tt>, or <tt>store</tt>.</p></li> 3653 <li><tt>unsigned getOpcode()</tt> 3654 <p>Returns the opcode for the <tt>Instruction</tt>.</p></li> 3655 <li><tt><a href="#Instruction">Instruction</a> *clone() const</tt> 3656 <p>Returns another instance of the specified instruction, identical 3657 in all ways to the original except that the instruction has no parent 3658 (ie it's not embedded into a <a href="#BasicBlock"><tt>BasicBlock</tt></a>), 3659 and it has no name</p></li> 3660 </ul> 3661 3662 </div> 3663 3664 </div> 3665 3666 <!-- ======================================================================= --> 3667 <h3> 3668 <a name="Constant">The <tt>Constant</tt> class and subclasses</a> 3669 </h3> 3670 3671 <div> 3672 3673 <p>Constant represents a base class for different types of constants. It 3674 is subclassed by ConstantInt, ConstantArray, etc. for representing 3675 the various types of Constants. <a href="#GlobalValue">GlobalValue</a> is also 3676 a subclass, which represents the address of a global variable or function. 3677 </p> 3678 3679 <!-- _______________________________________________________________________ --> 3680 <h4>Important Subclasses of Constant</h4> 3681 <div> 3682 <ul> 3683 <li>ConstantInt : This subclass of Constant represents an integer constant of 3684 any width. 3685 <ul> 3686 <li><tt>const APInt& getValue() const</tt>: Returns the underlying 3687 value of this constant, an APInt value.</li> 3688 <li><tt>int64_t getSExtValue() const</tt>: Converts the underlying APInt 3689 value to an int64_t via sign extension. If the value (not the bit width) 3690 of the APInt is too large to fit in an int64_t, an assertion will result. 3691 For this reason, use of this method is discouraged.</li> 3692 <li><tt>uint64_t getZExtValue() const</tt>: Converts the underlying APInt 3693 value to a uint64_t via zero extension. IF the value (not the bit width) 3694 of the APInt is too large to fit in a uint64_t, an assertion will result. 3695 For this reason, use of this method is discouraged.</li> 3696 <li><tt>static ConstantInt* get(const APInt& Val)</tt>: Returns the 3697 ConstantInt object that represents the value provided by <tt>Val</tt>. 3698 The type is implied as the IntegerType that corresponds to the bit width 3699 of <tt>Val</tt>.</li> 3700 <li><tt>static ConstantInt* get(const Type *Ty, uint64_t Val)</tt>: 3701 Returns the ConstantInt object that represents the value provided by 3702 <tt>Val</tt> for integer type <tt>Ty</tt>.</li> 3703 </ul> 3704 </li> 3705 <li>ConstantFP : This class represents a floating point constant. 3706 <ul> 3707 <li><tt>double getValue() const</tt>: Returns the underlying value of 3708 this constant. </li> 3709 </ul> 3710 </li> 3711 <li>ConstantArray : This represents a constant array. 3712 <ul> 3713 <li><tt>const std::vector<Use> &getValues() const</tt>: Returns 3714 a vector of component constants that makeup this array. </li> 3715 </ul> 3716 </li> 3717 <li>ConstantStruct : This represents a constant struct. 3718 <ul> 3719 <li><tt>const std::vector<Use> &getValues() const</tt>: Returns 3720 a vector of component constants that makeup this array. </li> 3721 </ul> 3722 </li> 3723 <li>GlobalValue : This represents either a global variable or a function. In 3724 either case, the value is a constant fixed address (after linking). 3725 </li> 3726 </ul> 3727 </div> 3728 3729 </div> 3730 3731 <!-- ======================================================================= --> 3732 <h3> 3733 <a name="GlobalValue">The <tt>GlobalValue</tt> class</a> 3734 </h3> 3735 3736 <div> 3737 3738 <p><tt>#include "<a 3739 href="/doxygen/GlobalValue_8h-source.html">llvm/GlobalValue.h</a>"</tt><br> 3740 doxygen info: <a href="/doxygen/classllvm_1_1GlobalValue.html">GlobalValue 3741 Class</a><br> 3742 Superclasses: <a href="#Constant"><tt>Constant</tt></a>, 3743 <a href="#User"><tt>User</tt></a>, <a href="#Value"><tt>Value</tt></a></p> 3744 3745 <p>Global values (<a href="#GlobalVariable"><tt>GlobalVariable</tt></a>s or <a 3746 href="#Function"><tt>Function</tt></a>s) are the only LLVM values that are 3747 visible in the bodies of all <a href="#Function"><tt>Function</tt></a>s. 3748 Because they are visible at global scope, they are also subject to linking with 3749 other globals defined in different translation units. To control the linking 3750 process, <tt>GlobalValue</tt>s know their linkage rules. Specifically, 3751 <tt>GlobalValue</tt>s know whether they have internal or external linkage, as 3752 defined by the <tt>LinkageTypes</tt> enumeration.</p> 3753 3754 <p>If a <tt>GlobalValue</tt> has internal linkage (equivalent to being 3755 <tt>static</tt> in C), it is not visible to code outside the current translation 3756 unit, and does not participate in linking. If it has external linkage, it is 3757 visible to external code, and does participate in linking. In addition to 3758 linkage information, <tt>GlobalValue</tt>s keep track of which <a 3759 href="#Module"><tt>Module</tt></a> they are currently part of.</p> 3760 3761 <p>Because <tt>GlobalValue</tt>s are memory objects, they are always referred to 3762 by their <b>address</b>. As such, the <a href="#Type"><tt>Type</tt></a> of a 3763 global is always a pointer to its contents. It is important to remember this 3764 when using the <tt>GetElementPtrInst</tt> instruction because this pointer must 3765 be dereferenced first. For example, if you have a <tt>GlobalVariable</tt> (a 3766 subclass of <tt>GlobalValue)</tt> that is an array of 24 ints, type <tt>[24 x 3767 i32]</tt>, then the <tt>GlobalVariable</tt> is a pointer to that array. Although 3768 the address of the first element of this array and the value of the 3769 <tt>GlobalVariable</tt> are the same, they have different types. The 3770 <tt>GlobalVariable</tt>'s type is <tt>[24 x i32]</tt>. The first element's type 3771 is <tt>i32.</tt> Because of this, accessing a global value requires you to 3772 dereference the pointer with <tt>GetElementPtrInst</tt> first, then its elements 3773 can be accessed. This is explained in the <a href="LangRef.html#globalvars">LLVM 3774 Language Reference Manual</a>.</p> 3775 3776 <!-- _______________________________________________________________________ --> 3777 <h4> 3778 <a name="m_GlobalValue"> 3779 Important Public Members of the <tt>GlobalValue</tt> class 3780 </a> 3781 </h4> 3782 3783 <div> 3784 3785 <ul> 3786 <li><tt>bool hasInternalLinkage() const</tt><br> 3787 <tt>bool hasExternalLinkage() const</tt><br> 3788 <tt>void setInternalLinkage(bool HasInternalLinkage)</tt> 3789 <p> These methods manipulate the linkage characteristics of the <tt>GlobalValue</tt>.</p> 3790 <p> </p> 3791 </li> 3792 <li><tt><a href="#Module">Module</a> *getParent()</tt> 3793 <p> This returns the <a href="#Module"><tt>Module</tt></a> that the 3794 GlobalValue is currently embedded into.</p></li> 3795 </ul> 3796 3797 </div> 3798 3799 </div> 3800 3801 <!-- ======================================================================= --> 3802 <h3> 3803 <a name="Function">The <tt>Function</tt> class</a> 3804 </h3> 3805 3806 <div> 3807 3808 <p><tt>#include "<a 3809 href="/doxygen/Function_8h-source.html">llvm/Function.h</a>"</tt><br> doxygen 3810 info: <a href="/doxygen/classllvm_1_1Function.html">Function Class</a><br> 3811 Superclasses: <a href="#GlobalValue"><tt>GlobalValue</tt></a>, 3812 <a href="#Constant"><tt>Constant</tt></a>, 3813 <a href="#User"><tt>User</tt></a>, 3814 <a href="#Value"><tt>Value</tt></a></p> 3815 3816 <p>The <tt>Function</tt> class represents a single procedure in LLVM. It is 3817 actually one of the more complex classes in the LLVM hierarchy because it must 3818 keep track of a large amount of data. The <tt>Function</tt> class keeps track 3819 of a list of <a href="#BasicBlock"><tt>BasicBlock</tt></a>s, a list of formal 3820 <a href="#Argument"><tt>Argument</tt></a>s, and a 3821 <a href="#SymbolTable"><tt>SymbolTable</tt></a>.</p> 3822 3823 <p>The list of <a href="#BasicBlock"><tt>BasicBlock</tt></a>s is the most 3824 commonly used part of <tt>Function</tt> objects. The list imposes an implicit 3825 ordering of the blocks in the function, which indicate how the code will be 3826 laid out by the backend. Additionally, the first <a 3827 href="#BasicBlock"><tt>BasicBlock</tt></a> is the implicit entry node for the 3828 <tt>Function</tt>. It is not legal in LLVM to explicitly branch to this initial 3829 block. There are no implicit exit nodes, and in fact there may be multiple exit 3830 nodes from a single <tt>Function</tt>. If the <a 3831 href="#BasicBlock"><tt>BasicBlock</tt></a> list is empty, this indicates that 3832 the <tt>Function</tt> is actually a function declaration: the actual body of the 3833 function hasn't been linked in yet.</p> 3834 3835 <p>In addition to a list of <a href="#BasicBlock"><tt>BasicBlock</tt></a>s, the 3836 <tt>Function</tt> class also keeps track of the list of formal <a 3837 href="#Argument"><tt>Argument</tt></a>s that the function receives. This 3838 container manages the lifetime of the <a href="#Argument"><tt>Argument</tt></a> 3839 nodes, just like the <a href="#BasicBlock"><tt>BasicBlock</tt></a> list does for 3840 the <a href="#BasicBlock"><tt>BasicBlock</tt></a>s.</p> 3841 3842 <p>The <a href="#SymbolTable"><tt>SymbolTable</tt></a> is a very rarely used 3843 LLVM feature that is only used when you have to look up a value by name. Aside 3844 from that, the <a href="#SymbolTable"><tt>SymbolTable</tt></a> is used 3845 internally to make sure that there are not conflicts between the names of <a 3846 href="#Instruction"><tt>Instruction</tt></a>s, <a 3847 href="#BasicBlock"><tt>BasicBlock</tt></a>s, or <a 3848 href="#Argument"><tt>Argument</tt></a>s in the function body.</p> 3849 3850 <p>Note that <tt>Function</tt> is a <a href="#GlobalValue">GlobalValue</a> 3851 and therefore also a <a href="#Constant">Constant</a>. The value of the function 3852 is its address (after linking) which is guaranteed to be constant.</p> 3853 3854 <!-- _______________________________________________________________________ --> 3855 <h4> 3856 <a name="m_Function"> 3857 Important Public Members of the <tt>Function</tt> class 3858 </a> 3859 </h4> 3860 3861 <div> 3862 3863 <ul> 3864 <li><tt>Function(const </tt><tt><a href="#FunctionType">FunctionType</a> 3865 *Ty, LinkageTypes Linkage, const std::string &N = "", Module* Parent = 0)</tt> 3866 3867 <p>Constructor used when you need to create new <tt>Function</tt>s to add 3868 the program. The constructor must specify the type of the function to 3869 create and what type of linkage the function should have. The <a 3870 href="#FunctionType"><tt>FunctionType</tt></a> argument 3871 specifies the formal arguments and return value for the function. The same 3872 <a href="#FunctionType"><tt>FunctionType</tt></a> value can be used to 3873 create multiple functions. The <tt>Parent</tt> argument specifies the Module 3874 in which the function is defined. If this argument is provided, the function 3875 will automatically be inserted into that module's list of 3876 functions.</p></li> 3877 3878 <li><tt>bool isDeclaration()</tt> 3879 3880 <p>Return whether or not the <tt>Function</tt> has a body defined. If the 3881 function is "external", it does not have a body, and thus must be resolved 3882 by linking with a function defined in a different translation unit.</p></li> 3883 3884 <li><tt>Function::iterator</tt> - Typedef for basic block list iterator<br> 3885 <tt>Function::const_iterator</tt> - Typedef for const_iterator.<br> 3886 3887 <tt>begin()</tt>, <tt>end()</tt> 3888 <tt>size()</tt>, <tt>empty()</tt> 3889 3890 <p>These are forwarding methods that make it easy to access the contents of 3891 a <tt>Function</tt> object's <a href="#BasicBlock"><tt>BasicBlock</tt></a> 3892 list.</p></li> 3893 3894 <li><tt>Function::BasicBlockListType &getBasicBlockList()</tt> 3895 3896 <p>Returns the list of <a href="#BasicBlock"><tt>BasicBlock</tt></a>s. This 3897 is necessary to use when you need to update the list or perform a complex 3898 action that doesn't have a forwarding method.</p></li> 3899 3900 <li><tt>Function::arg_iterator</tt> - Typedef for the argument list 3901 iterator<br> 3902 <tt>Function::const_arg_iterator</tt> - Typedef for const_iterator.<br> 3903 3904 <tt>arg_begin()</tt>, <tt>arg_end()</tt> 3905 <tt>arg_size()</tt>, <tt>arg_empty()</tt> 3906 3907 <p>These are forwarding methods that make it easy to access the contents of 3908 a <tt>Function</tt> object's <a href="#Argument"><tt>Argument</tt></a> 3909 list.</p></li> 3910 3911 <li><tt>Function::ArgumentListType &getArgumentList()</tt> 3912 3913 <p>Returns the list of <a href="#Argument"><tt>Argument</tt></a>s. This is 3914 necessary to use when you need to update the list or perform a complex 3915 action that doesn't have a forwarding method.</p></li> 3916 3917 <li><tt><a href="#BasicBlock">BasicBlock</a> &getEntryBlock()</tt> 3918 3919 <p>Returns the entry <a href="#BasicBlock"><tt>BasicBlock</tt></a> for the 3920 function. Because the entry block for the function is always the first 3921 block, this returns the first block of the <tt>Function</tt>.</p></li> 3922 3923 <li><tt><a href="#Type">Type</a> *getReturnType()</tt><br> 3924 <tt><a href="#FunctionType">FunctionType</a> *getFunctionType()</tt> 3925 3926 <p>This traverses the <a href="#Type"><tt>Type</tt></a> of the 3927 <tt>Function</tt> and returns the return type of the function, or the <a 3928 href="#FunctionType"><tt>FunctionType</tt></a> of the actual 3929 function.</p></li> 3930 3931 <li><tt><a href="#SymbolTable">SymbolTable</a> *getSymbolTable()</tt> 3932 3933 <p> Return a pointer to the <a href="#SymbolTable"><tt>SymbolTable</tt></a> 3934 for this <tt>Function</tt>.</p></li> 3935 </ul> 3936 3937 </div> 3938 3939 </div> 3940 3941 <!-- ======================================================================= --> 3942 <h3> 3943 <a name="GlobalVariable">The <tt>GlobalVariable</tt> class</a> 3944 </h3> 3945 3946 <div> 3947 3948 <p><tt>#include "<a 3949 href="/doxygen/GlobalVariable_8h-source.html">llvm/GlobalVariable.h</a>"</tt> 3950 <br> 3951 doxygen info: <a href="/doxygen/classllvm_1_1GlobalVariable.html">GlobalVariable 3952 Class</a><br> 3953 Superclasses: <a href="#GlobalValue"><tt>GlobalValue</tt></a>, 3954 <a href="#Constant"><tt>Constant</tt></a>, 3955 <a href="#User"><tt>User</tt></a>, 3956 <a href="#Value"><tt>Value</tt></a></p> 3957 3958 <p>Global variables are represented with the (surprise surprise) 3959 <tt>GlobalVariable</tt> class. Like functions, <tt>GlobalVariable</tt>s are also 3960 subclasses of <a href="#GlobalValue"><tt>GlobalValue</tt></a>, and as such are 3961 always referenced by their address (global values must live in memory, so their 3962 "name" refers to their constant address). See 3963 <a href="#GlobalValue"><tt>GlobalValue</tt></a> for more on this. Global 3964 variables may have an initial value (which must be a 3965 <a href="#Constant"><tt>Constant</tt></a>), and if they have an initializer, 3966 they may be marked as "constant" themselves (indicating that their contents 3967 never change at runtime).</p> 3968 3969 <!-- _______________________________________________________________________ --> 3970 <h4> 3971 <a name="m_GlobalVariable"> 3972 Important Public Members of the <tt>GlobalVariable</tt> class 3973 </a> 3974 </h4> 3975 3976 <div> 3977 3978 <ul> 3979 <li><tt>GlobalVariable(const </tt><tt><a href="#Type">Type</a> *Ty, bool 3980 isConstant, LinkageTypes& Linkage, <a href="#Constant">Constant</a> 3981 *Initializer = 0, const std::string &Name = "", Module* Parent = 0)</tt> 3982 3983 <p>Create a new global variable of the specified type. If 3984 <tt>isConstant</tt> is true then the global variable will be marked as 3985 unchanging for the program. The Linkage parameter specifies the type of 3986 linkage (internal, external, weak, linkonce, appending) for the variable. 3987 If the linkage is InternalLinkage, WeakAnyLinkage, WeakODRLinkage, 3988 LinkOnceAnyLinkage or LinkOnceODRLinkage, then the resultant 3989 global variable will have internal linkage. AppendingLinkage concatenates 3990 together all instances (in different translation units) of the variable 3991 into a single variable but is only applicable to arrays. See 3992 the <a href="LangRef.html#modulestructure">LLVM Language Reference</a> for 3993 further details on linkage types. Optionally an initializer, a name, and the 3994 module to put the variable into may be specified for the global variable as 3995 well.</p></li> 3996 3997 <li><tt>bool isConstant() const</tt> 3998 3999 <p>Returns true if this is a global variable that is known not to 4000 be modified at runtime.</p></li> 4001 4002 <li><tt>bool hasInitializer()</tt> 4003 4004 <p>Returns true if this <tt>GlobalVariable</tt> has an intializer.</p></li> 4005 4006 <li><tt><a href="#Constant">Constant</a> *getInitializer()</tt> 4007 4008 <p>Returns the initial value for a <tt>GlobalVariable</tt>. It is not legal 4009 to call this method if there is no initializer.</p></li> 4010 </ul> 4011 4012 </div> 4013 4014 </div> 4015 4016 <!-- ======================================================================= --> 4017 <h3> 4018 <a name="BasicBlock">The <tt>BasicBlock</tt> class</a> 4019 </h3> 4020 4021 <div> 4022 4023 <p><tt>#include "<a 4024 href="/doxygen/BasicBlock_8h-source.html">llvm/BasicBlock.h</a>"</tt><br> 4025 doxygen info: <a href="/doxygen/classllvm_1_1BasicBlock.html">BasicBlock 4026 Class</a><br> 4027 Superclass: <a href="#Value"><tt>Value</tt></a></p> 4028 4029 <p>This class represents a single entry single exit section of the code, 4030 commonly known as a basic block by the compiler community. The 4031 <tt>BasicBlock</tt> class maintains a list of <a 4032 href="#Instruction"><tt>Instruction</tt></a>s, which form the body of the block. 4033 Matching the language definition, the last element of this list of instructions 4034 is always a terminator instruction (a subclass of the <a 4035 href="#TerminatorInst"><tt>TerminatorInst</tt></a> class).</p> 4036 4037 <p>In addition to tracking the list of instructions that make up the block, the 4038 <tt>BasicBlock</tt> class also keeps track of the <a 4039 href="#Function"><tt>Function</tt></a> that it is embedded into.</p> 4040 4041 <p>Note that <tt>BasicBlock</tt>s themselves are <a 4042 href="#Value"><tt>Value</tt></a>s, because they are referenced by instructions 4043 like branches and can go in the switch tables. <tt>BasicBlock</tt>s have type 4044 <tt>label</tt>.</p> 4045 4046 <!-- _______________________________________________________________________ --> 4047 <h4> 4048 <a name="m_BasicBlock"> 4049 Important Public Members of the <tt>BasicBlock</tt> class 4050 </a> 4051 </h4> 4052 4053 <div> 4054 <ul> 4055 4056 <li><tt>BasicBlock(const std::string &Name = "", </tt><tt><a 4057 href="#Function">Function</a> *Parent = 0)</tt> 4058 4059 <p>The <tt>BasicBlock</tt> constructor is used to create new basic blocks for 4060 insertion into a function. The constructor optionally takes a name for the new 4061 block, and a <a href="#Function"><tt>Function</tt></a> to insert it into. If 4062 the <tt>Parent</tt> parameter is specified, the new <tt>BasicBlock</tt> is 4063 automatically inserted at the end of the specified <a 4064 href="#Function"><tt>Function</tt></a>, if not specified, the BasicBlock must be 4065 manually inserted into the <a href="#Function"><tt>Function</tt></a>.</p></li> 4066 4067 <li><tt>BasicBlock::iterator</tt> - Typedef for instruction list iterator<br> 4068 <tt>BasicBlock::const_iterator</tt> - Typedef for const_iterator.<br> 4069 <tt>begin()</tt>, <tt>end()</tt>, <tt>front()</tt>, <tt>back()</tt>, 4070 <tt>size()</tt>, <tt>empty()</tt> 4071 STL-style functions for accessing the instruction list. 4072 4073 <p>These methods and typedefs are forwarding functions that have the same 4074 semantics as the standard library methods of the same names. These methods 4075 expose the underlying instruction list of a basic block in a way that is easy to 4076 manipulate. To get the full complement of container operations (including 4077 operations to update the list), you must use the <tt>getInstList()</tt> 4078 method.</p></li> 4079 4080 <li><tt>BasicBlock::InstListType &getInstList()</tt> 4081 4082 <p>This method is used to get access to the underlying container that actually 4083 holds the Instructions. This method must be used when there isn't a forwarding 4084 function in the <tt>BasicBlock</tt> class for the operation that you would like 4085 to perform. Because there are no forwarding functions for "updating" 4086 operations, you need to use this if you want to update the contents of a 4087 <tt>BasicBlock</tt>.</p></li> 4088 4089 <li><tt><a href="#Function">Function</a> *getParent()</tt> 4090 4091 <p> Returns a pointer to <a href="#Function"><tt>Function</tt></a> the block is 4092 embedded into, or a null pointer if it is homeless.</p></li> 4093 4094 <li><tt><a href="#TerminatorInst">TerminatorInst</a> *getTerminator()</tt> 4095 4096 <p> Returns a pointer to the terminator instruction that appears at the end of 4097 the <tt>BasicBlock</tt>. If there is no terminator instruction, or if the last 4098 instruction in the block is not a terminator, then a null pointer is 4099 returned.</p></li> 4100 4101 </ul> 4102 4103 </div> 4104 4105 </div> 4106 4107 <!-- ======================================================================= --> 4108 <h3> 4109 <a name="Argument">The <tt>Argument</tt> class</a> 4110 </h3> 4111 4112 <div> 4113 4114 <p>This subclass of Value defines the interface for incoming formal 4115 arguments to a function. A Function maintains a list of its formal 4116 arguments. An argument has a pointer to the parent Function.</p> 4117 4118 </div> 4119 4120 </div> 4121 4122 <!-- *********************************************************************** --> 4123 <hr> 4124 <address> 4125 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img 4126 src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a> 4127 <a href="http://validator.w3.org/check/referer"><img 4128 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01 Strict"></a> 4129 4130 <a href="mailto:dhurjati (a] cs.uiuc.edu">Dinakar Dhurjati</a> and 4131 <a href="mailto:sabre (a] nondot.org">Chris Lattner</a><br> 4132 <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> 4133 Last modified: $Date$ 4134 </address> 4135 4136 </body> 4137 </html> 4138