=======================
Writing an LLVM Backend
=======================

.. toctree::
   :hidden:

   HowToUseInstrMappings

.. contents::
   :local:

Introduction
============

This document describes techniques for writing compiler backends that convert
the LLVM Intermediate Representation (IR) to code for a specified machine or
other languages. Code intended for a specific machine can take the form of
either assembly code or binary code (usable for a JIT compiler).

The backend of LLVM features a target-independent code generator that may
create output for several types of target CPUs --- including X86, PowerPC,
ARM, and SPARC. The backend may also be used to generate code targeted at SPUs
of the Cell processor or GPUs to support the execution of compute kernels.

The document focuses on existing examples found in subdirectories of
``llvm/lib/Target`` in a downloaded LLVM release. In particular, this document
focuses on the example of creating a static compiler (one that emits text
assembly) for a SPARC target, because SPARC has fairly standard
characteristics, such as a RISC instruction set and straightforward calling
conventions.

Audience
--------

The audience for this document is anyone who needs to write an LLVM backend to
generate code for a specific hardware or software target.

Prerequisite Reading
--------------------

These essential documents must be read before reading this document:

* `LLVM Language Reference Manual <LangRef.html>`_ --- a reference manual for
  the LLVM assembly language.

* :doc:`CodeGenerator` --- a guide to the components (classes and code
  generation algorithms) for translating the LLVM internal representation into
  machine code for a specified target.
  Pay particular attention to the
  descriptions of code generation stages: Instruction Selection, Scheduling and
  Formation, SSA-based Optimization, Register Allocation, Prolog/Epilog Code
  Insertion, Late Machine Code Optimizations, and Code Emission.

* :doc:`TableGen/index` --- a document that describes the TableGen
  (``tblgen``) application that manages domain-specific information to support
  LLVM code generation. TableGen processes input from a target description
  file (``.td`` suffix) and generates C++ code that can be used for code
  generation.

* :doc:`WritingAnLLVMPass` --- The assembly printer is a ``FunctionPass``, as
  are several ``SelectionDAG`` processing steps.

To follow the SPARC examples in this document, have a copy of `The SPARC
Architecture Manual, Version 8 <http://www.sparc.org/standards/V8.pdf>`_ for
reference. For details about the ARM instruction set, refer to the `ARM
Architecture Reference Manual <http://infocenter.arm.com/>`_. For more about
the GNU Assembler format (``GAS``), see `Using As
<http://sourceware.org/binutils/docs/as/index.html>`_, especially for the
assembly printer. "Using As" contains a list of target machine dependent
features.

Basic Steps
-----------

To write a compiler backend for LLVM that converts the LLVM IR to code for a
specified target (machine or other language), follow these steps:

* Create a subclass of the ``TargetMachine`` class that describes
  characteristics of your target machine. Copy existing examples of specific
  ``TargetMachine`` class and header files; for example, start with
  ``SparcTargetMachine.cpp`` and ``SparcTargetMachine.h``, but change the file
  names for your target. Similarly, change code that references "``Sparc``" to
  reference your target.

* Describe the register set of the target.
  Use TableGen to generate code for
  register definition, register aliases, and register classes from a
  target-specific ``RegisterInfo.td`` input file. You should also write
  additional code for a subclass of the ``TargetRegisterInfo`` class that
  represents the register file data used for register allocation and also
  describes the interactions between registers.

* Describe the instruction set of the target. Use TableGen to generate code
  for target-specific instructions from target-specific versions of
  ``TargetInstrFormats.td`` and ``TargetInstrInfo.td``. You should write
  additional code for a subclass of the ``TargetInstrInfo`` class to represent
  machine instructions supported by the target machine.

* Describe the selection and conversion of the LLVM IR from a Directed Acyclic
  Graph (DAG) representation of instructions to native target-specific
  instructions. Use TableGen to generate code that matches patterns and
  selects instructions based on additional information in a target-specific
  version of ``TargetInstrInfo.td``. Write code for ``XXXISelDAGToDAG.cpp``,
  where ``XXX`` identifies the specific target, to perform pattern matching and
  DAG-to-DAG instruction selection. Also write code in ``XXXISelLowering.cpp``
  to replace or remove operations and data types that are not supported
  natively in a SelectionDAG.

* Write code for an assembly printer that converts LLVM IR to a GAS format for
  your target machine. You should add assembly strings to the instructions
  defined in your target-specific version of ``TargetInstrInfo.td``. You
  should also write code for a subclass of ``AsmPrinter`` that performs the
  LLVM-to-assembly conversion and a trivial subclass of ``TargetAsmInfo``.

* Optionally, add support for subtargets (i.e., variants with different
  capabilities).
  You should also write code for a subclass of the
  ``TargetSubtarget`` class, which allows you to use the ``-mcpu=`` and
  ``-mattr=`` command-line options.

* Optionally, add JIT support and create a machine code emitter (subclass of
  ``TargetJITInfo``) that is used to emit binary code directly into memory.

In the ``.cpp`` and ``.h`` files, initially stub up these methods and then
implement them later. Initially, you may not know which private members the
class will need and which components will need to be subclassed.

Preliminaries
-------------

To actually create your compiler backend, you need to create and modify a few
files. The absolute minimum is discussed here. But to actually use the LLVM
target-independent code generator, you must perform the steps described in the
:doc:`LLVM Target-Independent Code Generator <CodeGenerator>` document.

First, you should create a subdirectory under ``lib/Target`` to hold all the
files related to your target. If your target is called "Dummy", create the
directory ``lib/Target/Dummy``.

In this new directory, create a ``Makefile``. It is easiest to copy a
``Makefile`` of another target and modify it. It should at least contain the
``LEVEL``, ``LIBRARYNAME`` and ``TARGET`` variables, and then include
``$(LEVEL)/Makefile.common``. The library can be named ``LLVMDummy`` (for
example, see the MIPS target). Alternatively, you can split the library into
``LLVMDummyCodeGen`` and ``LLVMDummyAsmPrinter``, the latter of which should be
implemented in a subdirectory below ``lib/Target/Dummy`` (for example, see the
PowerPC target).

Note that these two naming schemes are hardcoded into ``llvm-config``. Using
any other naming scheme will confuse ``llvm-config`` and produce a lot of
(seemingly unrelated) linker errors when linking ``llc``.

To make your target actually do something, you need to implement a subclass of
``TargetMachine``. This implementation should typically be in the file
``lib/Target/DummyTargetMachine.cpp``, but any file in the ``lib/Target``
directory will be built and should work. To use LLVM's target-independent code
generator, you should do what all current machine backends do: create a
subclass of ``LLVMTargetMachine``. (To create a target from scratch, create a
subclass of ``TargetMachine``.)

To get LLVM to actually build and link your target, you need to add it to the
``TARGETS_TO_BUILD`` variable. To do this, you modify the configure script to
know about your target when parsing the ``--enable-targets`` option. Search
the configure script for ``TARGETS_TO_BUILD``, add your target to the lists
there (some creativity required), and then reconfigure. Alternatively, you can
change ``autoconf/configure.ac`` and regenerate configure by running
``./autoconf/AutoRegen.sh``.

Target Machine
==============

``LLVMTargetMachine`` is designed as a base class for targets implemented with
the LLVM target-independent code generator. The ``LLVMTargetMachine`` class
should be specialized by a concrete target class that implements the various
virtual methods. ``LLVMTargetMachine`` is defined as a subclass of
``TargetMachine`` in ``include/llvm/Target/TargetMachine.h``. The
``TargetMachine`` class implementation (``TargetMachine.cpp``) also processes
numerous command-line options.

To create a concrete target-specific subclass of ``LLVMTargetMachine``, start
by copying an existing ``TargetMachine`` class and header. You should name the
files that you create to reflect your specific target. For instance, for the
SPARC target, name the files ``SparcTargetMachine.h`` and
``SparcTargetMachine.cpp``.

For a target machine ``XXX``, the implementation of ``XXXTargetMachine`` must
have access methods to obtain objects that represent target components. These
methods are named ``get*Info``, and are intended to obtain the instruction set
(``getInstrInfo``), register set (``getRegisterInfo``), stack frame layout
(``getFrameInfo``), and similar information. ``XXXTargetMachine`` must also
implement the ``getDataLayout`` method to access an object with target-specific
data characteristics, such as data type size and alignment requirements.

For instance, for the SPARC target, the header file ``SparcTargetMachine.h``
declares prototypes for several ``get*Info`` and ``getDataLayout`` methods that
simply return a class member.

.. code-block:: c++

  namespace llvm {

  class Module;

  class SparcTargetMachine : public LLVMTargetMachine {
    const DataLayout DataLayout;       // Calculates type size & alignment
    SparcSubtarget Subtarget;
    SparcInstrInfo InstrInfo;
    TargetFrameInfo FrameInfo;

  protected:
    virtual const TargetAsmInfo *createTargetAsmInfo() const;

  public:
    SparcTargetMachine(const Module &M, const std::string &FS);

    virtual const SparcInstrInfo *getInstrInfo() const { return &InstrInfo; }
    virtual const TargetFrameInfo *getFrameInfo() const { return &FrameInfo; }
    virtual const TargetSubtarget *getSubtargetImpl() const { return &Subtarget; }
    virtual const TargetRegisterInfo *getRegisterInfo() const {
      return &InstrInfo.getRegisterInfo();
    }
    virtual const DataLayout *getDataLayout() const { return &DataLayout; }
    static unsigned getModuleMatchQuality(const Module &M);

    // Pass Pipeline Configuration
    virtual bool addInstSelector(PassManagerBase &PM, bool Fast);
    virtual bool addPreEmitPass(PassManagerBase &PM, bool Fast);
  };

  } // end namespace llvm

* ``getInstrInfo()``
* ``getRegisterInfo()``
* ``getFrameInfo()``
* ``getDataLayout()``
* ``getSubtargetImpl()``

For some targets, you also need to support the following methods:

* ``getTargetLowering()``
* ``getJITInfo()``

Some architectures, such as GPUs, do not support jumping to an arbitrary
program location and implement branching using masked execution and loops using
special instructions around the loop body. In order to avoid CFG modifications
that introduce irreducible control flow not handled by such hardware, a target
must call ``setRequiresStructuredCFG(true)`` when being initialized.

In addition, the ``XXXTargetMachine`` constructor should specify a
``TargetDescription`` string that determines the data layout for the target
machine, including characteristics such as pointer size, alignment, and
endianness. For example, the constructor for ``SparcTargetMachine`` contains
the following:

.. code-block:: c++

  SparcTargetMachine::SparcTargetMachine(const Module &M, const std::string &FS)
    : DataLayout("E-p:32:32-f128:128:128"),
      Subtarget(M, FS), InstrInfo(Subtarget),
      FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
  }

Hyphens separate portions of the ``TargetDescription`` string.

* An upper-case "``E``" in the string indicates a big-endian target data model.
  A lower-case "``e``" indicates little-endian.

* "``p:``" is followed by pointer information: size, ABI alignment, and
  preferred alignment. If only two figures follow "``p:``", then the first
  value is pointer size, and the second value is both ABI and preferred
  alignment.

* Then a letter for numeric type alignment: "``i``", "``f``", "``v``", or
  "``a``" (corresponding to integer, floating point, vector, or aggregate).
  "``i``", "``v``", or "``a``" are followed by ABI alignment and preferred
  alignment. "``f``" is followed by three values: the first indicates the size
  of a long double, then ABI alignment, and then preferred alignment.

Target Registration
===================

You must also register your target with the ``TargetRegistry``, which is what
other LLVM tools use to look up and use your target at runtime. The
``TargetRegistry`` can be used directly, but for most targets there are helper
templates which should take care of the work for you.

All targets should declare a global ``Target`` object which is used to
represent the target during registration. Then, in the target's ``TargetInfo``
library, the target should define that object and use the ``RegisterTarget``
template to register the target. For example, the Sparc registration code
looks like this:

.. code-block:: c++

  Target llvm::TheSparcTarget;

  extern "C" void LLVMInitializeSparcTargetInfo() {
    RegisterTarget<Triple::sparc, /*HasJIT=*/false>
      X(TheSparcTarget, "sparc", "Sparc");
  }

This allows the ``TargetRegistry`` to look up the target by name or by target
triple. In addition, most targets will also register additional features which
are available in separate libraries. These registration steps are separate,
because some clients may wish to only link in some parts of the target --- the
JIT code generator does not require the use of the assembly printer, for
example. Here is an example of registering the Sparc assembly printer:

.. code-block:: c++

  extern "C" void LLVMInitializeSparcAsmPrinter() {
    RegisterAsmPrinter<SparcAsmPrinter> X(TheSparcTarget);
  }

For more information, see "`llvm/Target/TargetRegistry.h
</doxygen/TargetRegistry_8h-source.html>`_".
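
The registration pattern can be illustrated without any LLVM headers. The
following is a minimal, self-contained sketch and not LLVM's actual
``TargetRegistry``: the names ``ToyTarget``, ``ToyRegistry``,
``RegisterToyTarget``, and ``RegisterToyAsmPrinter`` are hypothetical
stand-ins. It only shows the idea that constructing a helper object inside an
initialization function records a target under a name, and that an optional
component such as an assembly printer is attached in a separate step.

.. code-block:: c++

  #include <cassert>
  #include <map>
  #include <string>

  // Hypothetical stand-in for llvm::Target: a name, a description, and a flag
  // recording whether an optional component (an assembly printer) was attached.
  struct ToyTarget {
    std::string Name;
    std::string Desc;
    bool HasAsmPrinter = false;
  };

  // Hypothetical registry mapping a target name to a registered target object.
  class ToyRegistry {
  public:
    static std::map<std::string, ToyTarget *> &targets() {
      static std::map<std::string, ToyTarget *> Map; // constructed on first use
      return Map;
    }
    // Returns the registered target, or nullptr when the name is unknown.
    static ToyTarget *lookup(const std::string &Name) {
      auto It = targets().find(Name);
      return It == targets().end() ? nullptr : It->second;
    }
  };

  // Helper whose constructor performs the registration, in the same spirit as
  // the RegisterTarget object "X" in the Sparc example.
  struct RegisterToyTarget {
    RegisterToyTarget(ToyTarget &T, const char *Name, const char *Desc) {
      T.Name = Name;
      T.Desc = Desc;
      ToyRegistry::targets()[Name] = &T;
    }
  };

  // A separate helper for an optional component, registered independently.
  struct RegisterToyAsmPrinter {
    explicit RegisterToyAsmPrinter(ToyTarget &T) { T.HasAsmPrinter = true; }
  };

  ToyTarget TheToySparcTarget;

  void initializeToySparcTargetInfo() {
    RegisterToyTarget X(TheToySparcTarget, "sparc", "Sparc");
  }

  void initializeToySparcAsmPrinter() {
    RegisterToyAsmPrinter X(TheToySparcTarget);
  }

In this sketch, a client that only needs lookup by name calls
``initializeToySparcTargetInfo()``. Calling ``initializeToySparcAsmPrinter()``
as well is what pulls in the printer, which is the reason for keeping the two
steps apart.
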

Register Set and Register Classes
=================================

You should describe a concrete target-specific class that represents the
register file of a target machine. This class is called ``XXXRegisterInfo``
(where ``XXX`` identifies the target) and represents the register file data
that is used for register allocation. It also describes the interactions
between registers.

You also need to define register classes to categorize related registers. A
register class should be added for groups of registers that are all treated the
same way for some instruction. Typical examples are register classes for
integer, floating-point, or vector registers. A register allocator allows an
instruction to use any register in a specified register class to perform the
instruction in a similar manner. Instructions are assigned virtual registers
from these sets, and register classes let the target-independent register
allocator automatically choose the actual registers.

Much of the code for registers, including register definition, register
aliases, and register classes, is generated by TableGen from
``XXXRegisterInfo.td`` input files and placed in ``XXXGenRegisterInfo.h.inc``
and ``XXXGenRegisterInfo.inc`` output files. Some of the code in the
implementation of ``XXXRegisterInfo`` requires hand-coding.

Defining a Register
-------------------

The ``XXXRegisterInfo.td`` file typically starts with register definitions for
a target machine. The ``Register`` class (specified in ``Target.td``) is used
to define an object for each register. The specified string ``n`` becomes the
``Name`` of the register. The basic ``Register`` object does not have any
subregisters and does not specify any aliases.

.. code-block:: llvm

  class Register<string n> {
    string Namespace = "";
    string AsmName = n;
    string Name = n;
    int SpillSize = 0;
    int SpillAlignment = 0;
    list<Register> Aliases = [];
    list<Register> SubRegs = [];
    list<int> DwarfNumbers = [];
  }

For example, in the ``X86RegisterInfo.td`` file, there are register definitions
that utilize the ``Register`` class, such as:

.. code-block:: llvm

  def AL : Register<"AL">, DwarfRegNum<[0, 0, 0]>;

This defines the register ``AL`` and assigns it values (with ``DwarfRegNum``)
that are used by ``gcc``, ``gdb``, or a debug information writer to identify a
register. For register ``AL``, ``DwarfRegNum`` takes an array of 3 values
representing 3 different modes: the first element is for X86-64, the second for
exception handling (EH) on X86-32, and the third is generic. -1 is a special
Dwarf number that indicates the gcc number is undefined, and -2 indicates the
register number is invalid for this mode.

From the previously described line in the ``X86RegisterInfo.td`` file, TableGen
generates this code in the ``X86GenRegisterInfo.inc`` file:

.. code-block:: c++

  static const unsigned GR8[] = { X86::AL, ... };

  const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };

  const TargetRegisterDesc RegisterDescriptors[] = {
    ...
  { "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet }, ...

From the register info file, TableGen generates a ``TargetRegisterDesc`` object
for each register. ``TargetRegisterDesc`` is defined in
``include/llvm/Target/TargetRegisterInfo.h`` with the following fields:

.. code-block:: c++

  struct TargetRegisterDesc {
    const char     *AsmName;      // Assembly language name for the register
    const char     *Name;         // Printable name for the reg (for debugging)
    const unsigned *AliasSet;     // Register Alias Set
    const unsigned *SubRegs;      // Sub-register set
    const unsigned *ImmSubRegs;   // Immediate sub-register set
    const unsigned *SuperRegs;    // Super-register set
  };

TableGen uses the entire target description file (``.td``) to determine text
names for the register (in the ``AsmName`` and ``Name`` fields of
``TargetRegisterDesc``) and the relationships of other registers to the defined
register (in the other ``TargetRegisterDesc`` fields). In this example, other
definitions establish the registers "``AX``", "``EAX``", and "``RAX``" as
aliases for one another, so TableGen generates a null-terminated array
(``AL_AliasSet``) for this register alias set.

The ``Register`` class is commonly used as a base class for more complex
classes. In ``Target.td``, the ``Register`` class is the base for the
``RegisterWithSubRegs`` class that is used to define registers that need to
specify subregisters in the ``SubRegs`` list, as shown here:

.. code-block:: llvm

  class RegisterWithSubRegs<string n, list<Register> subregs> : Register<n> {
    let SubRegs = subregs;
  }

In ``SparcRegisterInfo.td``, additional register classes are defined for SPARC:
a ``Register`` subclass, ``SparcReg``, and further subclasses: ``Ri``, ``Rf``,
and ``Rd``. SPARC registers are identified by 5-bit ID numbers, which is a
feature common to these subclasses. Note the use of "``let``" expressions to
override values that are initially defined in a superclass (such as the
``SubRegs`` field in the ``Rd`` class).

.. code-block:: llvm

  class SparcReg<string n> : Register<n> {
    field bits<5> Num;
    let Namespace = "SP";
  }
  // Ri - 32-bit integer registers
  class Ri<bits<5> num, string n> :
  SparcReg<n> {
    let Num = num;
  }
  // Rf - 32-bit floating-point registers
  class Rf<bits<5> num, string n> :
  SparcReg<n> {
    let Num = num;
  }
  // Rd - Slots in the FP register file for 64-bit floating-point values.
  class Rd<bits<5> num, string n, list<Register> subregs> : SparcReg<n> {
    let Num = num;
    let SubRegs = subregs;
  }

In the ``SparcRegisterInfo.td`` file, there are register definitions that
utilize these subclasses of ``Register``, such as:

.. code-block:: llvm

  def G0 : Ri< 0, "G0">, DwarfRegNum<[0]>;
  def G1 : Ri< 1, "G1">, DwarfRegNum<[1]>;
  ...
  def F0 : Rf< 0, "F0">, DwarfRegNum<[32]>;
  def F1 : Rf< 1, "F1">, DwarfRegNum<[33]>;
  ...
  def D0 : Rd< 0, "F0", [F0, F1]>, DwarfRegNum<[32]>;
  def D1 : Rd< 2, "F2", [F2, F3]>, DwarfRegNum<[34]>;

The last two registers shown above (``D0`` and ``D1``) are double-precision
floating-point registers that are aliases for pairs of single-precision
floating-point sub-registers. In addition to aliases, the sub-register and
super-register relationships of the defined register are in fields of a
register's ``TargetRegisterDesc``.

Defining a Register Class
-------------------------

The ``RegisterClass`` class (specified in ``Target.td``) is used to define an
object that represents a group of related registers and also defines the
default allocation order of the registers. A target description file
``XXXRegisterInfo.td`` that uses ``Target.td`` can construct register classes
using the following class:

.. code-block:: llvm

  class RegisterClass<string namespace,
  list<ValueType> regTypes, int alignment, dag regList> {
    string Namespace = namespace;
    list<ValueType> RegTypes = regTypes;
    int Size = 0;  // spill size, in bits; zero lets tblgen pick the size
    int Alignment = alignment;

    // CopyCost is the cost of copying a value between two registers
    // default value 1 means a single instruction
    // A negative value means copying is extremely expensive or impossible
    int CopyCost = 1;
    dag MemberList = regList;

    // for register classes that are subregisters of this class
    list<RegisterClass> SubRegClassList = [];

    code MethodProtos = [{}];  // to insert arbitrary code
    code MethodBodies = [{}];
  }

To define a ``RegisterClass``, use the following four arguments:

* The first argument of the definition is the name of the namespace.

* The second argument is a list of ``ValueType`` register type values that are
  defined in ``include/llvm/CodeGen/ValueTypes.td``. Defined values include
  integer types (such as ``i16``, ``i32``, and ``i1`` for Boolean),
  floating-point types (``f32``, ``f64``), and vector types (for example,
  ``v8i16`` for an ``8 x i16`` vector). All registers in a ``RegisterClass``
  must have the same ``ValueType``, but some registers may store vector data in
  different configurations. For example, a register that can process a 128-bit
  vector may be able to handle 16 8-bit integer elements, 8 16-bit integers, 4
  32-bit integers, and so on.

* The third argument of the ``RegisterClass`` definition specifies the
  alignment required of the registers when they are stored or loaded to
  memory.

* The final argument, ``regList``, specifies which registers are in this class.
  If an alternative allocation order method is not specified, then ``regList``
  also defines the order of allocation used by the register allocator.
  Besides
  simply listing registers with ``(add R0, R1, ...)``, more advanced set
  operators are available. See ``include/llvm/Target/Target.td`` for more
  information.

In ``SparcRegisterInfo.td``, three ``RegisterClass`` objects are defined:
``FPRegs``, ``DFPRegs``, and ``IntRegs``. For all three register classes, the
first argument defines the namespace with the string "``SP``". ``FPRegs``
defines a group of 32 single-precision floating-point registers (``F0`` to
``F31``); ``DFPRegs`` defines a group of 16 double-precision registers
(``D0-D15``).

.. code-block:: llvm

  // F0, F1, F2, ..., F31
  def FPRegs : RegisterClass<"SP", [f32], 32, (sequence "F%u", 0, 31)>;

  def DFPRegs : RegisterClass<"SP", [f64], 64,
                              (add D0, D1, D2, D3, D4, D5, D6, D7, D8,
                                   D9, D10, D11, D12, D13, D14, D15)>;

  def IntRegs : RegisterClass<"SP", [i32], 32,
      (add L0, L1, L2, L3, L4, L5, L6, L7,
           I0, I1, I2, I3, I4, I5,
           O0, O1, O2, O3, O4, O5, O7,
           G1,
           // Non-allocatable regs:
           G2, G3, G4,
           O6,        // stack ptr
           I6,        // frame ptr
           I7,        // return address
           G0,        // constant zero
           G5, G6, G7 // reserved for kernel
      )>;

Using ``SparcRegisterInfo.td`` with TableGen generates several output files
that are intended for inclusion in other source code that you write.
``SparcRegisterInfo.td`` generates ``SparcGenRegisterInfo.h.inc``, which should
be included in the header file for the implementation of the SPARC register
implementation that you write (``SparcRegisterInfo.h``). In
``SparcGenRegisterInfo.h.inc`` a new structure is defined called
``SparcGenRegisterInfo`` that uses ``TargetRegisterInfo`` as its base. It also
specifies types, based upon the defined register classes: ``DFPRegsClass``,
``FPRegsClass``, and ``IntRegsClass``.

``SparcRegisterInfo.td`` also generates ``SparcGenRegisterInfo.inc``, which is
included at the bottom of ``SparcRegisterInfo.cpp``, the SPARC register
implementation. The code below shows only the generated integer registers and
associated register classes. The order of registers in ``IntRegs`` reflects
the order in the definition of ``IntRegs`` in the target description file.

.. code-block:: c++

  // IntRegs Register Class...
  static const unsigned IntRegs[] = {
    SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
    SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3,
    SP::I4, SP::I5, SP::O0, SP::O1, SP::O2, SP::O3,
    SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3,
    SP::G4, SP::O6, SP::I6, SP::I7, SP::G0, SP::G5,
    SP::G6, SP::G7,
  };

  // IntRegsVTs Register Class Value Types...
  static const MVT::ValueType IntRegsVTs[] = {
    MVT::i32, MVT::Other
  };

  namespace SP {   // Register class instances
    DFPRegsClass    DFPRegsRegClass;
    FPRegsClass     FPRegsRegClass;
    IntRegsClass    IntRegsRegClass;
  ...
    // IntRegs Sub-register Classess...
    static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
      NULL
    };
  ...
    // IntRegs Super-register Classess...
    static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
      NULL
    };
  ...
    // IntRegs Register Class sub-classes...
    static const TargetRegisterClass* const IntRegsSubclasses [] = {
      NULL
    };
  ...
    // IntRegs Register Class super-classes...
    static const TargetRegisterClass* const IntRegsSuperclasses [] = {
      NULL
    };

    IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID,
      IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses,
      IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
  }

The register allocators will avoid using reserved registers, and callee-saved
registers are not used until all the volatile registers have been used. That
is usually good enough, but in some cases it may be necessary to provide custom
allocation orders.

Implement a subclass of ``TargetRegisterInfo``
----------------------------------------------

The final step is to hand-code portions of ``XXXRegisterInfo``, which
implements the interface described in ``TargetRegisterInfo.h`` (see
:ref:`TargetRegisterInfo`). These functions return ``0``, ``NULL``, or
``false``, unless overridden. Here is a list of functions that are overridden
for the SPARC implementation in ``SparcRegisterInfo.cpp``:

* ``getCalleeSavedRegs`` --- Returns a list of callee-saved registers in the
  order of the desired callee-save stack frame offset.

* ``getReservedRegs`` --- Returns a bitset indexed by physical register
  numbers, indicating if a particular register is unavailable.

* ``hasFP`` --- Returns a Boolean indicating if a function should have a
  dedicated frame pointer register.

* ``eliminateCallFramePseudoInstr`` --- If call frame setup or destroy pseudo
  instructions are used, this can be called to eliminate them.

* ``eliminateFrameIndex`` --- Eliminates abstract frame indices from
  instructions that may use them.

* ``emitPrologue`` --- Inserts prologue code into the function.

* ``emitEpilogue`` --- Inserts epilogue code into the function.

.. _instruction-set:

Instruction Set
===============

During the early stages of code generation, the LLVM IR code is converted to a
``SelectionDAG`` with nodes that are instances of the ``SDNode`` class
containing target instructions. An ``SDNode`` has an opcode, operands, type
requirements, and operation properties. The properties record, for example,
whether an operation is commutative or whether it loads from memory. The
various operation node types are described in the
``include/llvm/CodeGen/SelectionDAGNodes.h`` file (values of the ``NodeType``
enum in the ``ISD`` namespace).

TableGen uses the following target description (``.td``) input files to
generate much of the code for instruction definition:

* ``Target.td`` --- Where the ``Instruction``, ``Operand``, ``InstrInfo``, and
  other fundamental classes are defined.

* ``TargetSelectionDAG.td`` --- Used by ``SelectionDAG`` instruction selection
  generators, contains ``SDTC*`` classes (selection DAG type constraint),
  definitions of ``SelectionDAG`` nodes (such as ``imm``, ``cond``, ``bb``,
  ``add``, ``fadd``, ``sub``), and pattern support (``Pattern``, ``Pat``,
  ``PatFrag``, ``PatLeaf``, ``ComplexPattern``).

* ``XXXInstrFormats.td`` --- Patterns for definitions of target-specific
  instructions.

* ``XXXInstrInfo.td`` --- Target-specific definitions of instruction templates,
  condition codes, and instructions of an instruction set. For architecture
  modifications, a different file name may be used. For example, for Pentium
  with SSE instructions, this file is ``X86InstrSSE.td``, and for Pentium with
  MMX, this file is ``X86InstrMMX.td``.

There is also a target-specific ``XXX.td`` file, where ``XXX`` is the name of
the target. The ``XXX.td`` file includes the other ``.td`` input files, but
its contents are only directly important for subtargets.

You should describe a concrete target-specific class ``XXXInstrInfo`` that
represents machine instructions supported by a target machine.
``XXXInstrInfo`` contains an array of ``XXXInstrDescriptor`` objects, each of
which describes one instruction. An instruction descriptor defines:

* Opcode mnemonic
* Number of operands
* List of implicit register definitions and uses
* Target-independent properties (such as memory access, is commutable)
* Target-specific flags

The ``Instruction`` class (defined in ``Target.td``) is mostly used as a base
for more complex instruction classes.

.. code-block:: llvm

  class Instruction {
    string Namespace = "";
    dag OutOperandList;    // A dag containing the MI def operand list.
    dag InOperandList;     // A dag containing the MI use operand list.
    string AsmString = ""; // The .s format to print the instruction with.
    list<dag> Pattern;     // Set to the DAG pattern for this instruction.
    list<Register> Uses = [];
    list<Register> Defs = [];
    list<Predicate> Predicates = [];  // predicates turned into isel match code
    ... remainder not shown for space ...
  }

A ``SelectionDAG`` node (``SDNode``) should contain an object representing a
target-specific instruction that is defined in ``XXXInstrInfo.td``. The
instruction objects should represent instructions from the architecture manual
of the target machine (such as the SPARC Architecture Manual for the SPARC
target).

A single instruction from the architecture manual is often modeled as multiple
target instructions, depending upon its operands. For example, a manual might
describe an add instruction that takes a register or an immediate operand. An
LLVM target could model this with two instructions named ``ADDri`` and
``ADDrr``.
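
As a rough illustration of why one manual instruction becomes several target
instructions, the following self-contained sketch picks between two opcodes
from the kind of the second operand. The types ``ToyOperand`` and
``ToyOpcode`` and the functions ``fitsSimm13`` and ``selectAdd`` are
hypothetical and are not part of LLVM. The 13-bit signed immediate range is an
assumption taken from the ``simm13`` field of the SPARC V8 format 3 encoding.
In a real backend this choice is normally made by the TableGen-generated
pattern matcher rather than by hand-written code.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // Hypothetical operand model: either a register number or an immediate value.
  struct ToyOperand {
    bool IsImm;
    int32_t Value; // register number when !IsImm, constant when IsImm
  };

  enum class ToyOpcode { ADDrr, ADDri, NeedsMaterialization };

  // True when V fits a 13-bit signed immediate field: [-4096, 4095].
  inline bool fitsSimm13(int32_t V) { return V >= -4096 && V <= 4095; }

  // Choose the target instruction for "add reg, RHS".
  inline ToyOpcode selectAdd(const ToyOperand &RHS) {
    if (!RHS.IsImm)
      return ToyOpcode::ADDrr; // register + register form
    if (fitsSimm13(RHS.Value))
      return ToyOpcode::ADDri; // register + immediate form
    // The constant is too wide for the immediate field, so it would first
    // have to be placed in a register and the register form used instead.
    return ToyOpcode::NeedsMaterialization;
  }

For instance, ``selectAdd({true, 42})`` yields ``ToyOpcode::ADDri``, while a
register operand such as ``selectAdd({false, 5})`` yields
``ToyOpcode::ADDrr``.
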

You should define a class for each instruction category and define each opcode
as a subclass of the category with appropriate parameters such as the fixed
binary encoding of opcodes and extended opcodes. You should map the register
bits to the bits of the instruction in which they are encoded (for the JIT).
You should also specify how the instruction is printed when the automatic
assembly printer is used.

As is described in the SPARC Architecture Manual, Version 8, there are three
major 32-bit formats for instructions. Format 1 is only for the ``CALL``
instruction. Format 2 is for branch on condition codes and ``SETHI`` (set high
bits of a register) instructions. Format 3 is for other instructions.

Each of these formats has corresponding classes in ``SparcInstrFormats.td``.
``InstSP`` is a base class for other instruction classes. Additional base
classes are specified for more precise formats: for example in
``SparcInstrFormats.td``, ``F2_1`` is for ``SETHI``, and ``F2_2`` is for
branches. There are three other base classes: ``F3_1`` for register/register
operations, ``F3_2`` for register/immediate operations, and ``F3_3`` for
floating-point operations. ``SparcInstrInfo.td`` also adds the base class
``Pseudo`` for synthetic SPARC instructions.

``SparcInstrInfo.td`` largely consists of operand and instruction definitions
for the SPARC target. In ``SparcInstrInfo.td``, the following target
description file entry, ``LDrr``, defines the Load Integer instruction for a
Word (the ``LD`` SPARC opcode) from a memory address to a register. The first
parameter, the value 3 (``11``\ :sub:`2`), is the operation value for this
category of operation. The second parameter (``000000``\ :sub:`2`) is the
specific operation value for ``LD``/Load Word.
The third parameter is the
output destination, which is a register operand and defined in the ``Register``
target description file (``IntRegs``).

.. code-block:: llvm

  def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$dst), (ins MEMrr:$addr),
                   "ld [$addr], $dst",
                   [(set i32:$dst, (load ADDRrr:$addr))]>;

The fourth parameter is the input source, which uses the address operand
``MEMrr`` that is defined earlier in ``SparcInstrInfo.td``:

.. code-block:: llvm

  def MEMrr : Operand<i32> {
    let PrintMethod = "printMemOperand";
    let MIOperandInfo = (ops IntRegs, IntRegs);
  }

The fifth parameter is a string that is used by the assembly printer and can be
left as an empty string until the assembly printer interface is implemented.
The sixth and final parameter is the pattern used to match the instruction
during the SelectionDAG Select Phase described in :doc:`CodeGenerator`.
This parameter is detailed in the next section, :ref:`instruction-selector`.

Instruction class definitions are not overloaded for different operand types,
so separate versions of instructions are needed for register, memory, or
immediate value operands. For example, to perform a Load Integer instruction
for a Word from an immediate operand to a register, the following instruction
class is defined:

.. code-block:: llvm

  def LDri : F3_2 <3, 0b000000, (outs IntRegs:$dst), (ins MEMri:$addr),
                   "ld [$addr], $dst",
                   [(set i32:$dst, (load ADDRri:$addr))]>;

Writing these definitions for so many similar instructions can involve a lot of
cut and paste. In ``.td`` files, the ``multiclass`` directive enables the
creation of templates to define several instruction classes at once (using the
``defm`` directive).
For example in ``SparcInstrInfo.td``, the ``multiclass``
pattern ``F3_12`` is defined to create 2 instruction classes each time
``F3_12`` is invoked:

.. code-block:: llvm

  multiclass F3_12 <string OpcStr, bits<6> Op3Val, SDNode OpNode> {
    def rr  : F3_1 <2, Op3Val,
                   (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
                   !strconcat(OpcStr, " $b, $c, $dst"),
                   [(set i32:$dst, (OpNode i32:$b, i32:$c))]>;
    def ri  : F3_2 <2, Op3Val,
                   (outs IntRegs:$dst), (ins IntRegs:$b, i32imm:$c),
                   !strconcat(OpcStr, " $b, $c, $dst"),
                   [(set i32:$dst, (OpNode i32:$b, simm13:$c))]>;
  }

So when the ``defm`` directive is used for the ``XOR`` and ``ADD``
instructions, as seen below, it creates four instruction objects: ``XORrr``,
``XORri``, ``ADDrr``, and ``ADDri``.

.. code-block:: llvm

  defm XOR   : F3_12<"xor", 0b000011, xor>;
  defm ADD   : F3_12<"add", 0b000000, add>;

``SparcInstrInfo.td`` also includes definitions for condition codes that are
referenced by branch instructions. The following definitions in
``SparcInstrInfo.td`` indicate the bit location of the SPARC condition code.
For example, the 10\ :sup:`th` bit represents the "greater than" condition for
integers, and the 22\ :sup:`nd` bit represents the "greater than" condition for
floats.

.. code-block:: llvm

  def ICC_NE  : ICC_VAL< 9>;  // Not Equal
  def ICC_E   : ICC_VAL< 1>;  // Equal
  def ICC_G   : ICC_VAL<10>;  // Greater
  ...
  def FCC_U   : FCC_VAL<23>;  // Unordered
  def FCC_G   : FCC_VAL<22>;  // Greater
  def FCC_UG  : FCC_VAL<21>;  // Unordered or Greater
  ...

(Note that ``Sparc.h`` also defines enums that correspond to the same SPARC
condition codes. Care must be taken to ensure the values in ``Sparc.h``
correspond to the values in ``SparcInstrInfo.td``. I.e., ``SPCC::ICC_NE = 9``,
``SPCC::FCC_U = 23`` and so on.)
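
One way to catch a mismatch between the two sets of values early is a
compile-time check. The sketch below is a stand-alone illustration, not code
from the Sparc backend: the namespaces ``SPCC_sketch`` and ``td_values`` are
hypothetical, and the constants are copied by hand from the six ``.td``
definitions listed in this section.

```cpp
#include <cassert>

// Hypothetical mirror of the enum a target header might declare.
namespace SPCC_sketch {
enum CondCodes {
  ICC_E = 1,    // Equal
  ICC_NE = 9,   // Not Equal
  ICC_G = 10,   // Greater
  FCC_UG = 21,  // Unordered or Greater
  FCC_G = 22,   // Greater
  FCC_U = 23    // Unordered
};
} // namespace SPCC_sketch

// Values transcribed by hand from the .td definitions (an assumption of this
// sketch; a real backend would have no such second copy).
namespace td_values {
constexpr int ICC_E = 1, ICC_NE = 9, ICC_G = 10;
constexpr int FCC_UG = 21, FCC_G = 22, FCC_U = 23;
} // namespace td_values

// A mismatch between the two copies fails the build instead of miscompiling.
static_assert(SPCC_sketch::ICC_E == td_values::ICC_E, "ICC_E out of sync");
static_assert(SPCC_sketch::ICC_NE == td_values::ICC_NE, "ICC_NE out of sync");
static_assert(SPCC_sketch::ICC_G == td_values::ICC_G, "ICC_G out of sync");
static_assert(SPCC_sketch::FCC_UG == td_values::FCC_UG, "FCC_UG out of sync");
static_assert(SPCC_sketch::FCC_G == td_values::FCC_G, "FCC_G out of sync");
static_assert(SPCC_sketch::FCC_U == td_values::FCC_U, "FCC_U out of sync");
```

The check costs nothing at run time, but it only helps if the second copy is
updated whenever the ``.td`` file changes.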

Instruction Operand Mapping
---------------------------

The code generator backend maps instruction operands to fields in the
instruction. Operands are assigned to unbound fields in the instruction in the
order they are defined. Fields are bound when they are assigned a value. For
example, the Sparc target defines the ``XNORrr`` instruction as a ``F3_1``
format instruction having three operands.

.. code-block:: llvm

  def XNORrr  : F3_1<2, 0b000111,
                     (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
                     "xnor $b, $c, $dst",
                     [(set i32:$dst, (not (xor i32:$b, i32:$c)))]>;

The instruction templates in ``SparcInstrFormats.td`` show the base class for
``F3_1`` is ``InstSP``.

.. code-block:: llvm

  class InstSP<dag outs, dag ins, string asmstr, list<dag> pattern> : Instruction {
    field bits<32> Inst;
    let Namespace = "SP";
    bits<2> op;
    let Inst{31-30} = op;
    dag OutOperandList = outs;
    dag InOperandList = ins;
    let AsmString   = asmstr;
    let Pattern = pattern;
  }

``InstSP`` leaves the ``op`` field unbound.

.. code-block:: llvm

  class F3<dag outs, dag ins, string asmstr, list<dag> pattern>
      : InstSP<outs, ins, asmstr, pattern> {
    bits<5> rd;
    bits<6> op3;
    bits<5> rs1;
    let op{1} = 1;   // Op = 2 or 3
    let Inst{29-25} = rd;
    let Inst{24-19} = op3;
    let Inst{18-14} = rs1;
  }

``F3`` binds the ``op`` field and defines the ``rd``, ``op3``, and ``rs1``
fields. ``F3`` format instructions will bind the operands to the ``rd``,
``op3``, and ``rs1`` fields.

.. code-block:: llvm

  class F3_1<bits<2> opVal, bits<6> op3val, dag outs, dag ins,
             string asmstr, list<dag> pattern> : F3<outs, ins, asmstr, pattern> {
    bits<8> asi = 0; // asi not currently used
    bits<5> rs2;
    let op         = opVal;
    let op3        = op3val;
    let Inst{13}   = 0;     // i field = 0
    let Inst{12-5} = asi;   // address space identifier
    let Inst{4-0}  = rs2;
  }

``F3_1`` binds the ``op3`` field and defines the ``rs2`` field. ``F3_1``
format instructions will bind the operands to the ``rd``, ``rs1``, and ``rs2``
fields. This results in the ``XNORrr`` instruction binding ``$dst``, ``$b``,
and ``$c`` operands to the ``rd``, ``rs1``, and ``rs2`` fields respectively.

Instruction Operand Name Mapping
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

TableGen will also generate a function called getNamedOperandIdx() which
can be used to look up an operand's index in a MachineInstr based on its
TableGen name. Setting the UseNamedOperandTable bit in an instruction's
TableGen definition will add all of its operands to an enumeration in the
llvm::XXX::OpName namespace and also add an entry for it into the OperandMap
table, which can be queried using getNamedOperandIdx().

.. code-block:: c++

  int DstIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::dst); // => 0
  int BIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::b);     // => 1
  int CIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::c);     // => 2
  int DIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::d);     // => -1

  ...

The entries in the OpName enum are taken verbatim from the TableGen definitions,
so operands with lowercase names will have lowercase entries in the enum.

To include the getNamedOperandIdx() function in your backend, you will need
to define a few preprocessor macros in XXXInstrInfo.cpp and XXXInstrInfo.h.
For example:

XXXInstrInfo.cpp:

.. code-block:: c++

  #define GET_INSTRINFO_NAMED_OPS // For getNamedOperandIdx() function
  #include "XXXGenInstrInfo.inc"

XXXInstrInfo.h:

.. code-block:: c++

  #define GET_INSTRINFO_OPERAND_ENUM // For OpName enum
  #include "XXXGenInstrInfo.inc"

  namespace XXX {
    int16_t getNamedOperandIdx(uint16_t Opcode, uint16_t NamedIndex);
  } // End namespace XXX

Instruction Operand Types
^^^^^^^^^^^^^^^^^^^^^^^^^

TableGen will also generate an enumeration consisting of all named Operand
types defined in the backend, in the llvm::XXX::OpTypes namespace.
Some common immediate Operand types (for instance i8, i32, i64, f32, f64)
are defined for all targets in ``include/llvm/Target/Target.td``, and are
available in each Target's OpTypes enum. Also, only named Operand types appear
in the enumeration: anonymous types are ignored.
For example, the X86 backend defines ``brtarget`` and ``brtarget8``, both
instances of the TableGen ``Operand`` class, which represent branch target
operands:

.. code-block:: llvm

  def brtarget : Operand<OtherVT>;
  def brtarget8 : Operand<OtherVT>;

This results in:

.. code-block:: c++

  namespace X86 {
  namespace OpTypes {
  enum OperandType {
    ...
    brtarget,
    brtarget8,
    ...
    i32imm,
    i64imm,
    ...
    OPERAND_TYPE_LIST_END
  };
  } // End namespace OpTypes
  } // End namespace X86

In typical TableGen fashion, to use the enum, you will need to define a
preprocessor macro:

.. code-block:: c++

  #define GET_INSTRINFO_OPERAND_TYPES_ENUM // For OpTypes enum
  #include "XXXGenInstrInfo.inc"


Instruction Scheduling
----------------------

Instruction itineraries can be queried using MCDesc::getSchedClass(). The
value can be named by an enumeration in the llvm::XXX::Sched namespace generated
by TableGen in XXXGenInstrInfo.inc.
The names of the schedule classes are
the same as provided in XXXSchedule.td plus a default NoItinerary class.

Instruction Relation Mapping
----------------------------

This TableGen feature is used to relate instructions with each other. It is
particularly useful when you have multiple instruction formats and need to
switch between them after instruction selection. This entire feature is driven
by relation models which can be defined in ``XXXInstrInfo.td`` files
according to the target-specific instruction set. Relation models are defined
using the ``InstrMapping`` class as a base. TableGen parses all the models
and generates instruction relation maps using the specified information.
Relation maps are emitted as tables in the ``XXXGenInstrInfo.inc`` file
along with the functions to query them. For detailed information on how to
use this feature, please refer to :doc:`HowToUseInstrMappings`.

Implement a subclass of ``TargetInstrInfo``
-------------------------------------------

The final step is to hand code portions of ``XXXInstrInfo``, which implements
the interface described in ``TargetInstrInfo.h`` (see :ref:`TargetInstrInfo`).
These functions return ``0`` or a Boolean or they assert, unless overridden.
Here's a list of functions that are overridden for the SPARC implementation in
``SparcInstrInfo.cpp``:

* ``isLoadFromStackSlot`` --- If the specified machine instruction is a direct
  load from a stack slot, return the register number of the destination and the
  ``FrameIndex`` of the stack slot.

* ``isStoreToStackSlot`` --- If the specified machine instruction is a direct
  store to a stack slot, return the register number of the source and the
  ``FrameIndex`` of the stack slot.

* ``copyPhysReg`` --- Copy values between a pair of physical registers.

* ``storeRegToStackSlot`` --- Store a register value to a stack slot.

* ``loadRegFromStackSlot`` --- Load a register value from a stack slot.

* ``storeRegToAddr`` --- Store a register value to memory.

* ``loadRegFromAddr`` --- Load a register value from memory.

* ``foldMemoryOperand`` --- Attempt to combine instructions of any load or
  store instruction for the specified operand(s).

Branch Folding and If Conversion
--------------------------------

Performance can be improved by combining instructions or by eliminating
instructions that are never reached. The ``AnalyzeBranch`` method in
``XXXInstrInfo`` may be implemented to examine conditional instructions and
remove unnecessary instructions. ``AnalyzeBranch`` looks at the end of a
machine basic block (MBB) for opportunities for improvement, such as branch
folding and if conversion. The ``BranchFolder`` and ``IfConverter`` machine
function passes (see the source files ``BranchFolding.cpp`` and
``IfConversion.cpp`` in the ``lib/CodeGen`` directory) call ``AnalyzeBranch``
to improve the control flow graph that represents the instructions.

Several implementations of ``AnalyzeBranch`` (for ARM, Alpha, and X86) can be
examined as models for your own ``AnalyzeBranch`` implementation. Since SPARC
does not implement a useful ``AnalyzeBranch``, the ARM target implementation is
shown below.

``AnalyzeBranch`` returns a Boolean value and takes four parameters:

* ``MachineBasicBlock &MBB`` --- The incoming block to be examined.

* ``MachineBasicBlock *&TBB`` --- A destination block that is returned. For a
  conditional branch that evaluates to true, ``TBB`` is the destination.

* ``MachineBasicBlock *&FBB`` --- For a conditional branch that evaluates to
  false, ``FBB`` is returned as the destination.

* ``std::vector<MachineOperand> &Cond`` --- List of operands to evaluate a
  condition for a conditional branch.

In the simplest case, if a block ends without a branch, then it falls through
to the successor block. No destination blocks are specified for either ``TBB``
or ``FBB``, so both parameters return ``NULL``. The start of the
``AnalyzeBranch`` (see code below for the ARM target) shows the function
parameters and the code for the simplest case.

.. code-block:: c++

  bool ARMInstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
                                   MachineBasicBlock *&TBB,
                                   MachineBasicBlock *&FBB,
                                   std::vector<MachineOperand> &Cond) const
  {
    MachineBasicBlock::iterator I = MBB.end();
    if (I == MBB.begin() || !isUnpredicatedTerminator(--I))
      return false;

If a block ends with a single unconditional branch instruction, then
``AnalyzeBranch`` (shown below) should return the destination of that branch in
the ``TBB`` parameter.

.. code-block:: c++

    if (LastOpc == ARM::B || LastOpc == ARM::tB) {
      TBB = LastInst->getOperand(0).getMBB();
      return false;
    }

If a block ends with two unconditional branches, then the second branch is
never reached. In that situation, as shown below, remove the last branch
instruction and return the penultimate branch in the ``TBB`` parameter.

.. code-block:: c++

    if ((SecondLastOpc == ARM::B || SecondLastOpc == ARM::tB) &&
        (LastOpc == ARM::B || LastOpc == ARM::tB)) {
      TBB = SecondLastInst->getOperand(0).getMBB();
      I = LastInst;
      I->eraseFromParent();
      return false;
    }

A block may end with a single conditional branch instruction that falls through
to the successor block if the condition evaluates to false.
In that case,
``AnalyzeBranch`` (shown below) should return the destination of that
conditional branch in the ``TBB`` parameter and a list of operands in the
``Cond`` parameter to evaluate the condition.

.. code-block:: c++

    if (LastOpc == ARM::Bcc || LastOpc == ARM::tBcc) {
      // Block ends with fall-through condbranch.
      TBB = LastInst->getOperand(0).getMBB();
      Cond.push_back(LastInst->getOperand(1));
      Cond.push_back(LastInst->getOperand(2));
      return false;
    }

If a block ends with both a conditional branch and an ensuing unconditional
branch, then ``AnalyzeBranch`` (shown below) should return the conditional
branch destination (assuming it corresponds to a conditional evaluation of
"``true``") in the ``TBB`` parameter and the unconditional branch destination
in the ``FBB`` (corresponding to a conditional evaluation of "``false``"). A
list of operands to evaluate the condition should be returned in the ``Cond``
parameter.

.. code-block:: c++

    unsigned SecondLastOpc = SecondLastInst->getOpcode();

    if ((SecondLastOpc == ARM::Bcc && LastOpc == ARM::B) ||
        (SecondLastOpc == ARM::tBcc && LastOpc == ARM::tB)) {
      TBB = SecondLastInst->getOperand(0).getMBB();
      Cond.push_back(SecondLastInst->getOperand(1));
      Cond.push_back(SecondLastInst->getOperand(2));
      FBB = LastInst->getOperand(0).getMBB();
      return false;
    }

For the last two cases (ending with a single conditional branch or ending with
one conditional and one unconditional branch), the operands returned in the
``Cond`` parameter can be passed to methods of other instructions to create new
branches or perform other operations. An implementation of ``AnalyzeBranch``
requires the helper methods ``RemoveBranch`` and ``InsertBranch`` to manage
subsequent operations.
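
The cases walked through in this section can be collected into one small
model. The sketch below is a toy that does not use any LLVM classes:
``ToyInst``, ``BranchInfo``, and ``analyzeBranchSketch`` are hypothetical
names, block destinations are plain strings, and a condition is a single
integer rather than a list of ``MachineOperand`` values. It follows the
convention of returning ``false`` on success and ``true`` when the block
cannot be analyzed; treating every unlisted terminator combination as
"cannot analyze" is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for machine instructions; none of these are LLVM types.
enum class Kind { Other, Br, CondBr };

struct ToyInst {
  Kind kind;
  std::string target; // destination block name (branches only)
  int cond;           // condition operand (CondBr only)
};

struct BranchInfo {
  std::string TBB; // empty string plays the role of a null block
  std::string FBB;
  std::vector<int> Cond;
};

// Returns false on success, true if the terminators are not understood.
bool analyzeBranchSketch(std::vector<ToyInst> &block, BranchInfo &out) {
  // Find where the run of trailing branch instructions starts.
  std::size_t n = block.size();
  std::size_t firstTerm = n;
  while (firstTerm > 0 && block[firstTerm - 1].kind != Kind::Other)
    --firstTerm;
  std::size_t numTerms = n - firstTerm;

  if (numTerms == 0)
    return false; // Fall-through: TBB and FBB stay empty.
  if (numTerms > 2)
    return true;  // Three or more terminators: give up.

  ToyInst last = block[n - 1];
  if (numTerms == 1) {
    out.TBB = last.target;
    if (last.kind == Kind::CondBr)
      out.Cond.push_back(last.cond); // Conditional branch, then fall-through.
    return false;
  }

  ToyInst secondLast = block[n - 2];
  if (secondLast.kind == Kind::Br && last.kind == Kind::Br) {
    out.TBB = secondLast.target;
    block.pop_back(); // The second unconditional branch is unreachable.
    return false;
  }
  if (secondLast.kind == Kind::CondBr && last.kind == Kind::Br) {
    out.TBB = secondLast.target;
    out.Cond.push_back(secondLast.cond);
    out.FBB = last.target;
    return false;
  }
  return true; // Any other pair is left unanalyzed in this sketch.
}
```

For a block ending in a conditional branch to ``then`` followed by an
unconditional branch to ``else``, the function reports ``TBB`` as ``then``,
``FBB`` as ``else``, and one entry in ``Cond``.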

``AnalyzeBranch`` should return false indicating success in most circumstances.
``AnalyzeBranch`` should only return true when the method is stumped about what
to do, for example, if a block has three terminating branches.
``AnalyzeBranch`` may return true if it encounters a terminator it cannot
handle, such as an indirect branch.

.. _instruction-selector:

Instruction Selector
====================

LLVM uses a ``SelectionDAG`` to represent LLVM IR instructions, and nodes of
the ``SelectionDAG`` ideally represent native target instructions. During code
generation, instruction selection passes are performed to convert non-native
DAG instructions into native target-specific instructions. The pass described
in ``XXXISelDAGToDAG.cpp`` is used to match patterns and perform DAG-to-DAG
instruction selection. Optionally, a pass may be defined (in
``XXXBranchSelector.cpp``) to perform similar DAG-to-DAG operations for branch
instructions. Later, the code in ``XXXISelLowering.cpp`` replaces or removes
operations and data types not supported natively (legalizes) in a
``SelectionDAG``.

TableGen generates code for instruction selection using the following target
description input files:

* ``XXXInstrInfo.td`` --- Contains definitions of instructions in a
  target-specific instruction set, generates ``XXXGenDAGISel.inc``, which is
  included in ``XXXISelDAGToDAG.cpp``.

* ``XXXCallingConv.td`` --- Contains the calling and return value conventions
  for the target architecture, and it generates ``XXXGenCallingConv.inc``,
  which is included in ``XXXISelLowering.cpp``.

The implementation of an instruction selection pass must include a header that
declares the ``FunctionPass`` class or a subclass of ``FunctionPass``.
In
``XXXTargetMachine.cpp``, a Pass Manager (PM) should add each instruction
selection pass into the queue of passes to run.

The LLVM static compiler (``llc``) is an excellent tool for visualizing the
contents of DAGs. To display the ``SelectionDAG`` before or after specific
processing phases, use the command line options for ``llc``, described at
:ref:`SelectionDAG-Process`.

To describe instruction selector behavior, you should add patterns for lowering
LLVM code into a ``SelectionDAG`` as the last parameter of the instruction
definitions in ``XXXInstrInfo.td``. For example, in ``SparcInstrInfo.td``,
this entry defines a register store operation, and the last parameter describes
a pattern with the store DAG operator.

.. code-block:: llvm

  def STrr  : F3_1< 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
                   "st $src, [$addr]", [(store i32:$src, ADDRrr:$addr)]>;

``ADDRrr`` is a memory mode that is also defined in ``SparcInstrInfo.td``:

.. code-block:: llvm

  def ADDRrr : ComplexPattern<i32, 2, "SelectADDRrr", [], []>;

The definition of ``ADDRrr`` refers to ``SelectADDRrr``, which is a function
defined in an implementation of the Instruction Selector (such as
``SparcISelDAGToDAG.cpp``).

In ``lib/Target/TargetSelectionDAG.td``, the DAG operator for store is defined
below:

.. code-block:: llvm

  def store : PatFrag<(ops node:$val, node:$ptr),
                      (st node:$val, node:$ptr), [{
    if (StoreSDNode *ST = dyn_cast<StoreSDNode>(N))
      return !ST->isTruncatingStore() &&
             ST->getAddressingMode() == ISD::UNINDEXED;
    return false;
  }]>;

``XXXInstrInfo.td`` also generates (in ``XXXGenDAGISel.inc``) the
``SelectCode`` method that is used to call the appropriate processing method
for an instruction.
In this example, ``SelectCode`` calls ``Select_ISD_STORE``
for the ``ISD::STORE`` opcode.

.. code-block:: c++

  SDNode *SelectCode(SDValue N) {
    ...
    MVT::ValueType NVT = N.getNode()->getValueType(0);
    switch (N.getOpcode()) {
    case ISD::STORE: {
      switch (NVT) {
      default:
        return Select_ISD_STORE(N);
        break;
      }
      break;
    }
    ...

The pattern for ``STrr`` is matched, so elsewhere in ``XXXGenDAGISel.inc``,
code for ``STrr`` is created for ``Select_ISD_STORE``. The ``Emit_22`` method
is also generated in ``XXXGenDAGISel.inc`` to complete the processing of this
instruction.

.. code-block:: c++

  SDNode *Select_ISD_STORE(const SDValue &N) {
    SDValue Chain = N.getOperand(0);
    if (Predicate_store(N.getNode())) {
      SDValue N1 = N.getOperand(1);
      SDValue N2 = N.getOperand(2);
      SDValue CPTmp0;
      SDValue CPTmp1;

      // Pattern: (st:void i32:i32:$src,
      //           ADDRrr:i32:$addr)<<P:Predicate_store>>
      // Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
      // Pattern complexity = 13  cost = 1  size = 0
      if (SelectADDRrr(N, N2, CPTmp0, CPTmp1) &&
          N1.getNode()->getValueType(0) == MVT::i32 &&
          N2.getNode()->getValueType(0) == MVT::i32) {
        return Emit_22(N, SP::STrr, CPTmp0, CPTmp1);
      }
  ...

The SelectionDAG Legalize Phase
-------------------------------

The Legalize phase converts a DAG to use types and operations that are natively
supported by the target. For natively unsupported types and operations, you
need to add code to the target-specific ``XXXTargetLowering`` implementation to
convert unsupported types and operations to supported ones.

In the constructor for the ``XXXTargetLowering`` class, first use the
``addRegisterClass`` method to specify which types are supported and which
register classes are associated with them.
The code for the register classes
is generated by TableGen from ``XXXRegisterInfo.td`` and placed in
``XXXGenRegisterInfo.h.inc``. For example, the implementation of the
constructor for the SparcTargetLowering class (in ``SparcISelLowering.cpp``)
starts with the following code:

.. code-block:: c++

  addRegisterClass(MVT::i32, SP::IntRegsRegisterClass);
  addRegisterClass(MVT::f32, SP::FPRegsRegisterClass);
  addRegisterClass(MVT::f64, SP::DFPRegsRegisterClass);

You should examine the node types in the ``ISD`` namespace
(``include/llvm/CodeGen/SelectionDAGNodes.h``) and determine which operations
the target natively supports. For operations that do **not** have native
support, add a callback to the constructor for the ``XXXTargetLowering`` class,
so the instruction selection process knows what to do. The ``TargetLowering``
class callback methods (declared in ``llvm/Target/TargetLowering.h``) are:

* ``setOperationAction`` --- General operation.
* ``setLoadExtAction`` --- Load with extension.
* ``setTruncStoreAction`` --- Truncating store.
* ``setIndexedLoadAction`` --- Indexed load.
* ``setIndexedStoreAction`` --- Indexed store.
* ``setConvertAction`` --- Type conversion.
* ``setCondCodeAction`` --- Support for a given condition code.

Note: on older releases, ``setLoadXAction`` is used instead of
``setLoadExtAction``. Also, on older releases, ``setCondCodeAction`` may not
be supported. Examine your release to see what methods are specifically
supported.

These callbacks are used to determine whether an operation works with a
specified type (or types). In all cases, the third parameter is a
``LegalizeAction`` type enum value: ``Promote``, ``Expand``, ``Custom``, or
``Legal``. ``SparcISelLowering.cpp`` contains examples of all four
``LegalizeAction`` values.
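
Conceptually, these callbacks fill in a table keyed by operation and type,
and any pair that is never mentioned is treated as ``Legal``. The following
stand-alone sketch models only that idea. It is not the ``TargetLowering``
implementation: ``ActionTableSketch`` and ``makeSparcLikeTable`` are
hypothetical names, the enums list only a handful of operations and types,
and the entries are the SPARC settings this document discusses.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Toy enums; the real ISD and MVT enumerations are far larger.
enum class Action { Legal, Promote, Expand, Custom };
enum class Op { ADD, FSIN, FCOS, FP_TO_SINT, CTPOP };
enum class Ty { i1, i32, f32, f64 };

class ActionTableSketch {
  std::map<std::pair<Op, Ty>, Action> table_;

public:
  void setOperationAction(Op op, Ty ty, Action a) { table_[{op, ty}] = a; }

  // Anything that was never registered is assumed to be natively supported.
  Action getOperationAction(Op op, Ty ty) const {
    auto it = table_.find({op, ty});
    return it == table_.end() ? Action::Legal : it->second;
  }
};

// Mirrors the order of calls in a lowering constructor: set the conservative
// default first, then override it when the subtarget supports the operation.
ActionTableSketch makeSparcLikeTable(bool isV9) {
  ActionTableSketch t;
  t.setOperationAction(Op::FSIN, Ty::f32, Action::Expand);
  t.setOperationAction(Op::FCOS, Ty::f32, Action::Expand);
  t.setOperationAction(Op::FP_TO_SINT, Ty::i32, Action::Custom);
  t.setOperationAction(Op::CTPOP, Ty::i32, Action::Expand);
  if (isV9)
    t.setOperationAction(Op::CTPOP, Ty::i32, Action::Legal);
  return t;
}
```

Because later calls overwrite earlier ones, the order in which a constructor
registers actions matters when a subtarget feature relaxes a default.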

Promote
^^^^^^^

For an operation without native support for a given type, the specified type
may be promoted to a larger type that is supported. For example, SPARC does
not support a sign-extending load for Boolean values (``i1`` type), so in
``SparcISelLowering.cpp`` the third parameter below, ``Promote``, changes
``i1`` type values to a larger type before loading.

.. code-block:: c++

  setLoadExtAction(ISD::SEXTLOAD, MVT::i1, Promote);

Expand
^^^^^^

For a type without native support, a value may need to be broken down further,
rather than promoted. For an operation without native support, a combination
of other operations may be used to similar effect. In SPARC, the
floating-point sine and cosine trig operations are supported by expansion to
other operations, as indicated by the third parameter, ``Expand``, to
``setOperationAction``:

.. code-block:: c++

  setOperationAction(ISD::FSIN, MVT::f32, Expand);
  setOperationAction(ISD::FCOS, MVT::f32, Expand);

Custom
^^^^^^

For some operations, simple type promotion or operation expansion may be
insufficient. In some cases, a special intrinsic function must be implemented.

For example, a constant value may require special treatment, or an operation
may require spilling and restoring registers in the stack and working with
register allocators.

As seen in ``SparcISelLowering.cpp`` code below, to perform a type conversion
from a floating point value to a signed integer, first the
``setOperationAction`` should be called with ``Custom`` as the third parameter:

.. code-block:: c++

  setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);

In the ``LowerOperation`` method, for each ``Custom`` operation, a case
statement should be added to indicate what function to call.
In the following
code, an ``FP_TO_SINT`` opcode will call the ``LowerFP_TO_SINT`` method:

.. code-block:: c++

  SDValue SparcTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) {
    switch (Op.getOpcode()) {
    case ISD::FP_TO_SINT: return LowerFP_TO_SINT(Op, DAG);
    ...
    }
  }

Finally, the ``LowerFP_TO_SINT`` method is implemented, using an FP register to
convert the floating-point value to an integer.

.. code-block:: c++

  static SDValue LowerFP_TO_SINT(SDValue Op, SelectionDAG &DAG) {
    assert(Op.getValueType() == MVT::i32);
    Op = DAG.getNode(SPISD::FTOI, MVT::f32, Op.getOperand(0));
    return DAG.getNode(ISD::BITCAST, MVT::i32, Op);
  }

Legal
^^^^^

The ``Legal`` ``LegalizeAction`` enum value simply indicates that an operation
**is** natively supported. ``Legal`` represents the default condition, so it
is rarely used. In ``SparcISelLowering.cpp``, the action for ``CTPOP`` (an
operation to count the bits set in an integer) is natively supported only for
SPARC v9. The following code enables the ``Expand`` conversion technique for
non-v9 SPARC implementations.

.. code-block:: c++

  setOperationAction(ISD::CTPOP, MVT::i32, Expand);
  ...
  if (TM.getSubtarget<SparcSubtarget>().isV9())
    setOperationAction(ISD::CTPOP, MVT::i32, Legal);

Calling Conventions
-------------------

To support target-specific calling conventions, ``XXXCallingConv.td`` uses
interfaces (such as ``CCIfType`` and ``CCAssignToReg``) that are defined in
``lib/Target/TargetCallingConv.td``. TableGen can take the target descriptor
file ``XXXCallingConv.td`` and generate the header file
``XXXGenCallingConv.inc``, which is typically included in
``XXXISelLowering.cpp``.
You can use the interfaces in
``TargetCallingConv.td`` to specify:

* The order of parameter allocation.

* Where parameters and return values are placed (that is, on the stack or in
  registers).

* Which registers may be used.

* Whether the caller or callee unwinds the stack.

The following example demonstrates the use of the ``CCIfType`` and
``CCAssignToReg`` interfaces. If the ``CCIfType`` predicate is true (that is,
if the current argument is of type ``f32`` or ``f64``), then the action is
performed. In this case, the ``CCAssignToReg`` action assigns the argument
value to the first available register: either ``R0`` or ``R1``.

.. code-block:: llvm

  CCIfType<[f32,f64], CCAssignToReg<[R0, R1]>>

``SparcCallingConv.td`` contains definitions for a target-specific return-value
calling convention (``RetCC_Sparc32``) and a basic 32-bit C calling convention
(``CC_Sparc32``). The definition of ``RetCC_Sparc32`` (shown below) indicates
which registers are used for specified scalar return types. A single-precision
float is returned in register ``F0``, and a double-precision float goes to
register ``D0``. A 32-bit integer is returned in register ``I0`` or ``I1``.

.. code-block:: llvm

  def RetCC_Sparc32 : CallingConv<[
    CCIfType<[i32], CCAssignToReg<[I0, I1]>>,
    CCIfType<[f32], CCAssignToReg<[F0]>>,
    CCIfType<[f64], CCAssignToReg<[D0]>>
  ]>;

The definition of ``CC_Sparc32`` in ``SparcCallingConv.td`` introduces
``CCAssignToStack``, which assigns the value to a stack slot with the specified
size and alignment. In the example below, the first parameter, 4, indicates
the size of the slot, and the second parameter, also 4, indicates the stack
alignment along 4-byte units. (Special cases: if size is zero, then the ABI
size is used; if alignment is zero, then the ABI alignment is used.)
1488 1489 .. code-block:: llvm 1490 1491 def CC_Sparc32 : CallingConv<[ 1492 // All arguments get passed in integer registers if there is space. 1493 CCIfType<[i32, f32, f64], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>, 1494 CCAssignToStack<4, 4> 1495 ]>; 1496 1497 ``CCDelegateTo`` is another commonly used interface, which tries to find a 1498 specified sub-calling convention, and, if a match is found, it is invoked. In 1499 the following example (in ``X86CallingConv.td``), the definition of 1500 ``RetCC_X86_32_C`` ends with ``CCDelegateTo``. After the current value is 1501 assigned to the register ``ST0`` or ``ST1``, the ``RetCC_X86Common`` is 1502 invoked. 1503 1504 .. code-block:: llvm 1505 1506 def RetCC_X86_32_C : CallingConv<[ 1507 CCIfType<[f32], CCAssignToReg<[ST0, ST1]>>, 1508 CCIfType<[f64], CCAssignToReg<[ST0, ST1]>>, 1509 CCDelegateTo<RetCC_X86Common> 1510 ]>; 1511 1512 ``CCIfCC`` is an interface that attempts to match the given name to the current 1513 calling convention. If the name identifies the current calling convention, 1514 then a specified action is invoked. In the following example (in 1515 ``X86CallingConv.td``), if the ``Fast`` calling convention is in use, then 1516 ``RetCC_X86_32_Fast`` is invoked. If the ``SSECall`` calling convention is in 1517 use, then ``RetCC_X86_32_SSE`` is invoked. 1518 1519 .. code-block:: llvm 1520 1521 def RetCC_X86_32 : CallingConv<[ 1522 CCIfCC<"CallingConv::Fast", CCDelegateTo<RetCC_X86_32_Fast>>, 1523 CCIfCC<"CallingConv::X86_SSECall", CCDelegateTo<RetCC_X86_32_SSE>>, 1524 CCDelegateTo<RetCC_X86_32_C> 1525 ]>; 1526 1527 Other calling convention interfaces include: 1528 1529 * ``CCIf <predicate, action>`` --- If the predicate matches, apply the action. 1530 1531 * ``CCIfInReg <action>`` --- If the argument is marked with the "``inreg``" 1532 attribute, then apply the action. 1533 1534 * ``CCIfNest <action>`` --- If the argument is marked with the "``nest``" 1535 attribute, then apply the action. 

* ``CCIfNotVarArg <action>`` --- If the current function does not take a
  variable number of arguments, apply the action.

* ``CCAssignToRegWithShadow <registerList, shadowList>`` --- similar to
  ``CCAssignToReg``, but with a shadow list of registers.

* ``CCPassByVal <size, align>`` --- Assign value to a stack slot with the
  minimum specified size and alignment.

* ``CCPromoteToType <type>`` --- Promote the current value to the specified
  type.

* ``CallingConv <[actions]>`` --- Define each calling convention that is
  supported.

Assembly Printer
================

During the code emission stage, the code generator may utilize an LLVM pass to
produce assembly output. To do this, implement a printer that converts LLVM IR
to a GAS-format assembly language for your target machine, using the following
steps:

* Define all the assembly strings for your target, adding them to the
  instructions defined in the ``XXXInstrInfo.td`` file. (See
  :ref:`instruction-set`.) TableGen will produce an output file
  (``XXXGenAsmWriter.inc``) with an implementation of the ``printInstruction``
  method for the ``XXXAsmPrinter`` class.

* Write ``XXXTargetAsmInfo.h``, which contains the bare-bones declaration of
  the ``XXXTargetAsmInfo`` class (a subclass of ``TargetAsmInfo``).

* Write ``XXXTargetAsmInfo.cpp``, which contains target-specific values for
  ``TargetAsmInfo`` properties and sometimes new implementations for methods.

* Write ``XXXAsmPrinter.cpp``, which implements the ``AsmPrinter`` class that
  performs the LLVM-to-assembly conversion.

The code in ``XXXTargetAsmInfo.h`` is usually a trivial declaration of the
``XXXTargetAsmInfo`` class for use in ``XXXTargetAsmInfo.cpp``.
Similarly,
``XXXTargetAsmInfo.cpp`` usually has a few declarations of ``XXXTargetAsmInfo``
replacement values that override the default values in ``TargetAsmInfo.cpp``.
For example, in ``SparcTargetAsmInfo.cpp``:

.. code-block:: c++

  SparcTargetAsmInfo::SparcTargetAsmInfo(const SparcTargetMachine &TM) {
    Data16bitsDirective = "\t.half\t";
    Data32bitsDirective = "\t.word\t";
    Data64bitsDirective = 0;  // .xword is only supported by V9.
    ZeroDirective = "\t.skip\t";
    CommentString = "!";
    ConstantPoolSection = "\t.section \".rodata\",#alloc\n";
  }

The X86 assembly printer implementation (``X86TargetAsmInfo``) is an example
where the target-specific ``TargetAsmInfo`` class uses an overridden method:
``ExpandInlineAsm``.

A target-specific implementation of ``AsmPrinter`` is written in
``XXXAsmPrinter.cpp``, which implements the ``AsmPrinter`` class that converts
the LLVM to printable assembly. The implementation must include the following
headers that have declarations for the ``AsmPrinter`` and
``MachineFunctionPass`` classes. The ``MachineFunctionPass`` is a subclass of
``FunctionPass``.

.. code-block:: c++

  #include "llvm/CodeGen/AsmPrinter.h"
  #include "llvm/CodeGen/MachineFunctionPass.h"

As a ``FunctionPass``, ``AsmPrinter`` first calls ``doInitialization`` to set
up the ``AsmPrinter``. In ``SparcAsmPrinter``, a ``Mangler`` object is
instantiated to process variable names.

In ``XXXAsmPrinter.cpp``, the ``runOnMachineFunction`` method (declared in
``MachineFunctionPass``) must be implemented for ``XXXAsmPrinter``. In
``MachineFunctionPass``, the ``runOnFunction`` method invokes
``runOnMachineFunction``.
Target-specific implementations of
``runOnMachineFunction`` differ, but generally do the following to process each
machine function:

* Call ``SetupMachineFunction`` to perform initialization.

* Call ``EmitConstantPool`` to print out (to the output stream) constants which
  have been spilled to memory.

* Call ``EmitJumpTableInfo`` to print out jump tables used by the current
  function.

* Print out the label for the current function.

* Print out the code for the function, including basic block labels and the
  assembly for the instruction (using ``printInstruction``).

The ``XXXAsmPrinter`` implementation must also include the code generated by
TableGen that is output in the ``XXXGenAsmWriter.inc`` file. The code in
``XXXGenAsmWriter.inc`` contains an implementation of the ``printInstruction``
method that may call these methods:

* ``printOperand``
* ``printMemOperand``
* ``printCCOperand`` (for conditional statements)
* ``printDataDirective``
* ``printDeclare``
* ``printImplicitDef``
* ``printInlineAsm``

The implementations of ``printDeclare``, ``printImplicitDef``,
``printInlineAsm``, and ``printLabel`` in ``AsmPrinter.cpp`` are generally
adequate for printing assembly and do not need to be overridden.

The ``printOperand`` method is implemented with a long ``switch``/``case``
statement for the type of operand: register, immediate, basic block, external
symbol, global address, constant pool index, or jump table index. For an
instruction with a memory address operand, the ``printMemOperand`` method
should be implemented to generate the proper output. Similarly,
``printCCOperand`` should be used to print a conditional operand.

``doFinalization`` should be overridden in ``XXXAsmPrinter``, and it should be
called to shut down the assembly printer.
During ``doFinalization``, global
variables and constants are printed to output.

Subtarget Support
=================

Subtarget support is used to inform the code generation process of instruction
set variations for a given chip set. For example, the LLVM SPARC
implementation provided covers three major versions of the SPARC microprocessor
architecture: Version 8 (V8, which is a 32-bit architecture), Version 9 (V9, a
64-bit architecture), and the UltraSPARC architecture. V8 has 16
double-precision floating-point registers that are also usable as either 32
single-precision or 8 quad-precision registers. V8 is also purely big-endian.
V9 has 32 double-precision floating-point registers that are also usable as 16
quad-precision registers, but cannot be used as single-precision registers.
The UltraSPARC architecture combines V9 with UltraSPARC Visual Instruction Set
extensions.

If subtarget support is needed, you should implement a target-specific
``XXXSubtarget`` class for your architecture. This class should process the
command-line options ``-mcpu=`` and ``-mattr=``.

TableGen uses definitions in the ``Target.td`` and ``Sparc.td`` files to
generate code in ``SparcGenSubtarget.inc``. In ``Target.td``, shown below, the
``SubtargetFeature`` interface is defined. The first 4 string parameters of
the ``SubtargetFeature`` interface are a feature name, an attribute set by the
feature, the value of the attribute, and a description of the feature. (The
fifth parameter is a list of features whose presence is implied, and its
default value is an empty array.)

.. code-block:: llvm

  class SubtargetFeature<string n, string a, string v, string d,
                         list<SubtargetFeature> i = []> {
    string Name = n;
    string Attribute = a;
    string Value = v;
    string Desc = d;
    list<SubtargetFeature> Implies = i;
  }

In the ``Sparc.td`` file, the ``SubtargetFeature`` is used to define the
following features.

.. code-block:: llvm

  def FeatureV9 : SubtargetFeature<"v9", "IsV9", "true",
                       "Enable SPARC-V9 instructions">;
  def FeatureV8Deprecated : SubtargetFeature<"deprecated-v8",
                       "V8DeprecatedInsts", "true",
                       "Enable deprecated V8 instructions in V9 mode">;
  def FeatureVIS : SubtargetFeature<"vis", "IsVIS", "true",
                       "Enable UltraSPARC Visual Instruction Set extensions">;

Elsewhere in ``Sparc.td``, the ``Proc`` class is defined and then is used to
define particular SPARC processor subtypes that may have the previously
described features.

.. code-block:: llvm

  class Proc<string Name, list<SubtargetFeature> Features>
    : Processor<Name, NoItineraries, Features>;

  def : Proc<"generic",         []>;
  def : Proc<"v8",              []>;
  def : Proc<"supersparc",      []>;
  def : Proc<"sparclite",       []>;
  def : Proc<"f934",            []>;
  def : Proc<"hypersparc",      []>;
  def : Proc<"sparclite86x",    []>;
  def : Proc<"sparclet",        []>;
  def : Proc<"tsc701",          []>;
  def : Proc<"v9",              [FeatureV9]>;
  def : Proc<"ultrasparc",      [FeatureV9, FeatureV8Deprecated]>;
  def : Proc<"ultrasparc3",     [FeatureV9, FeatureV8Deprecated]>;
  def : Proc<"ultrasparc3-vis", [FeatureV9, FeatureV8Deprecated, FeatureVIS]>;

From ``Target.td`` and ``Sparc.td`` files, the resulting
``SparcGenSubtarget.inc`` specifies enum values to identify the features,
arrays of constants to represent the CPU features and CPU subtypes, and the
``ParseSubtargetFeatures`` method that parses the features string that sets
specified
subtarget options. The generated ``SparcGenSubtarget.inc`` file
should be included in ``SparcSubtarget.cpp``. The target-specific
implementation of the ``XXXSubtarget`` constructor should follow this
pseudocode:

.. code-block:: c++

  XXXSubtarget::XXXSubtarget(const Module &M, const std::string &FS) {
    // Set the default features
    // Determine default and user specified characteristics of the CPU
    // Call ParseSubtargetFeatures(FS, CPU) to parse the features string
    // Perform any additional operations
  }

JIT Support
===========

The implementation of a target machine optionally includes a Just-In-Time (JIT)
code generator that emits machine code and auxiliary structures as binary
output that can be written directly to memory. To do this, implement JIT code
generation by performing the following steps:

* Write an ``XXXCodeEmitter.cpp`` file that contains a machine function pass
  that transforms target-machine instructions into relocatable machine
  code.

* Write an ``XXXJITInfo.cpp`` file that implements the JIT interfaces for
  target-specific code-generation activities, such as emitting machine code and
  stubs.

* Modify ``XXXTargetMachine`` so that it provides a ``TargetJITInfo`` object
  through its ``getJITInfo`` method.

There are several different approaches to writing the JIT support code. For
instance, TableGen and target descriptor files may be used for creating a JIT
code generator, but are not mandatory. For the Alpha and PowerPC target
machines, TableGen is used to generate ``XXXGenCodeEmitter.inc``, which
contains the binary coding of machine instructions and the
``getBinaryCodeForInstr`` method to access those codes. Other JIT
implementations do not.

Both ``XXXJITInfo.cpp`` and ``XXXCodeEmitter.cpp`` must include the
``llvm/CodeGen/MachineCodeEmitter.h`` header file that defines the
``MachineCodeEmitter`` class containing code for several callback functions
that write data (in bytes, words, strings, etc.) to the output stream.

Machine Code Emitter
--------------------

In ``XXXCodeEmitter.cpp``, a target-specific implementation of the ``Emitter``
class is implemented as a function pass (subclass of ``MachineFunctionPass``).
The target-specific implementation of ``runOnMachineFunction`` (invoked by
``runOnFunction`` in ``MachineFunctionPass``) iterates through each
``MachineBasicBlock`` and calls ``emitInstruction`` to process each instruction
and emit binary code. ``emitInstruction`` is largely implemented with case
statements on the instruction types defined in ``XXXInstrInfo.h``. For
example, in ``X86CodeEmitter.cpp``, the ``emitInstruction`` method is built
around the following ``switch``/``case`` statements:

.. code-block:: c++

  switch (Desc->TSFlags & X86::FormMask) {
  case X86II::Pseudo:  // for not yet implemented instructions
     ...               // or pseudo-instructions
     break;
  case X86II::RawFrm:  // for instructions with a fixed opcode value
     ...
     break;
  case X86II::AddRegFrm: // for instructions that have one register operand
     ...                 // added to their opcode
     break;
  case X86II::MRMDestReg:// for instructions that use the Mod/RM byte
     ...                 // to specify a destination (register)
     break;
  case X86II::MRMDestMem:// for instructions that use the Mod/RM byte
     ...                 // to specify a destination (memory)
     break;
  case X86II::MRMSrcReg: // for instructions that use the Mod/RM byte
     ...                 // to specify a source (register)
     break;
  case X86II::MRMSrcMem: // for instructions that use the Mod/RM byte
     ...                 // to specify a source (memory)
     break;
  case X86II::MRM0r: case X86II::MRM1r:  // for instructions that operate on
  case X86II::MRM2r: case X86II::MRM3r:  // a REGISTER r/m operand and
  case X86II::MRM4r: case X86II::MRM5r:  // use the Mod/RM byte and a field
  case X86II::MRM6r: case X86II::MRM7r:  // to hold extended opcode data
     ...
     break;
  case X86II::MRM0m: case X86II::MRM1m:  // for instructions that operate on
  case X86II::MRM2m: case X86II::MRM3m:  // a MEMORY r/m operand and
  case X86II::MRM4m: case X86II::MRM5m:  // use the Mod/RM byte and a field
  case X86II::MRM6m: case X86II::MRM7m:  // to hold extended opcode data
     ...
     break;
  case X86II::MRMInitReg: // for instructions whose source and
     ...                  // destination are the same register
     break;
  }

The implementations of these case statements often first emit the opcode and
then get the operand(s). Then depending upon the operand, helper methods may
be called to process the operand(s). For example, in ``X86CodeEmitter.cpp``,
for the ``X86II::AddRegFrm`` case, the first data emitted (by ``emitByte``) is
the opcode added to the register operand. Then an object representing the
machine operand, ``MO1``, is extracted. The helper methods such as
``isImmediate``, ``isGlobalAddress``, ``isExternalSymbol``,
``isConstantPoolIndex``, and ``isJumpTableIndex`` determine the operand type.
(``X86CodeEmitter.cpp`` also has private methods such as ``emitConstant``,
``emitGlobalAddress``, ``emitExternalSymbolAddress``, ``emitConstPoolAddress``,
and ``emitJumpTableAddress`` that emit the data into the output stream.)

.. code-block:: c++

  case X86II::AddRegFrm:
    MCE.emitByte(BaseOpcode + getX86RegNum(MI.getOperand(CurOp++).getReg()));

    if (CurOp != NumOps) {
      const MachineOperand &MO1 = MI.getOperand(CurOp++);
      unsigned Size = X86InstrInfo::sizeOfImm(Desc);
      if (MO1.isImmediate())
        emitConstant(MO1.getImm(), Size);
      else {
        unsigned rt = Is64BitMode ? X86::reloc_pcrel_word
          : (IsPIC ? X86::reloc_picrel_word : X86::reloc_absolute_word);
        if (Opcode == X86::MOV64ri)
          rt = X86::reloc_absolute_dword;  // FIXME: add X86II flag?
        if (MO1.isGlobalAddress()) {
          bool NeedStub = isa<Function>(MO1.getGlobal());
          bool isLazy = gvNeedsLazyPtr(MO1.getGlobal());
          emitGlobalAddress(MO1.getGlobal(), rt, MO1.getOffset(), 0,
                            NeedStub, isLazy);
        } else if (MO1.isExternalSymbol())
          emitExternalSymbolAddress(MO1.getSymbolName(), rt);
        else if (MO1.isConstantPoolIndex())
          emitConstPoolAddress(MO1.getIndex(), rt);
        else if (MO1.isJumpTableIndex())
          emitJumpTableAddress(MO1.getIndex(), rt);
      }
    }
    break;

In the previous example, ``XXXCodeEmitter.cpp`` uses the variable ``rt``, which
is a ``RelocationType`` enum that may be used to relocate addresses (for
example, a global address with a PIC base offset). The ``RelocationType`` enum
for that target is defined in the short target-specific ``XXXRelocations.h``
file. The ``RelocationType`` is used by the ``relocate`` method defined in
``XXXJITInfo.cpp`` to rewrite addresses for referenced global symbols.

For example, ``X86Relocations.h`` specifies the following relocation types for
the X86 addresses. In all four cases, the relocated value is added to the
value already in memory. For ``reloc_pcrel_word`` and ``reloc_picrel_word``,
there is an additional initial adjustment.

.. code-block:: c++

  enum RelocationType {
    reloc_pcrel_word = 0,    // add reloc value after adjusting for the PC loc
    reloc_picrel_word = 1,   // add reloc value after adjusting for the PIC base
    reloc_absolute_word = 2, // absolute relocation; no additional adjustment
    reloc_absolute_dword = 3 // absolute relocation; no additional adjustment
  };

Target JIT Info
---------------

``XXXJITInfo.cpp`` implements the JIT interfaces for target-specific
code-generation activities, such as emitting machine code and stubs. At
minimum, a target-specific version of ``XXXJITInfo`` implements the following:

* ``getLazyResolverFunction`` --- Initializes the JIT, gives the target a
  function that is used for compilation.

* ``emitFunctionStub`` --- Returns a native function with a specified address
  for a callback function.

* ``relocate`` --- Changes the addresses of referenced globals, based on
  relocation types.

* Callback functions that are wrappers to a function stub that is used when the
  real target is not initially known.

``getLazyResolverFunction`` is generally trivial to implement. It stores the
incoming parameter in the global ``JITCompilerFunction`` and returns the
callback function that will be used as a function wrapper. For the Alpha target
(in ``AlphaJITInfo.cpp``), the ``getLazyResolverFunction`` implementation is
simply:

.. code-block:: c++

  TargetJITInfo::LazyResolverFn AlphaJITInfo::getLazyResolverFunction(
                                              JITCompilerFn F) {
    JITCompilerFunction = F;
    return AlphaCompilationCallback;
  }

For the X86 target, the ``getLazyResolverFunction`` implementation is a little
more complicated, because it returns a different callback function for
processors with SSE instructions and XMM registers.
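
As a rough illustration of that idea, the following self-contained sketch
selects one of two callbacks from a flag. The type aliases, the ``HasSSE``
parameter, and both callback names are hypothetical stand-ins chosen for this
example; they are not the declarations used in ``X86JITInfo.cpp``, and how the
real code detects SSE support is not shown here.

.. code-block:: c++

  #include <cassert>

  // Hypothetical stand-ins for the JIT function-pointer types.
  using JITCompilerFn = void (*)(void *);
  using LazyResolverFn = void (*)();

  // Global that remembers the JIT compiler entry point, as in the Alpha example.
  static JITCompilerFn JITCompilerFunction = nullptr;

  // Placeholder callbacks (hypothetical names); real ones need register access.
  static void CompilationCallbackNoSSE() {}
  static void CompilationCallbackSSE() {}

  // Assumption: the caller already knows whether SSE is available.
  LazyResolverFn getLazyResolverFunction(JITCompilerFn F, bool HasSSE) {
    assert(F && "expected a JIT compiler function");
    JITCompilerFunction = F;  // remember the compiler for later callbacks
    return HasSSE ? &CompilationCallbackSSE : &CompilationCallbackNoSSE;
  }

Passing the feature flag as a parameter keeps the sketch free of any CPU
detection code; a real target would obtain that information by its own means.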

The callback function initially saves and later restores the callee register
values, incoming arguments, and frame and return address. The callback
function needs low-level access to the registers or stack, so it is typically
implemented with assembler.
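
To make the ``relocate`` step listed earlier in this section more concrete, the
sketch below applies the four X86 relocation types to a plain byte buffer. It
is a simplified model under stated assumptions, not the code in
``X86JITInfo.cpp``: ``ToyReloc``, ``applyToyRelocation``, and ``readWord`` are
hypothetical names, addresses are plain integers, and the PC-relative
adjustment assumes that the PC points just past the 4-byte field being patched.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <cstring>
  #include <vector>

  enum RelocationType {
    reloc_pcrel_word = 0,
    reloc_picrel_word = 1,
    reloc_absolute_word = 2,
    reloc_absolute_dword = 3
  };

  // Hypothetical relocation record used only by this sketch.
  struct ToyReloc {
    RelocationType Type;
    std::size_t Offset;    // where the field to patch sits in the buffer
    std::uint64_t Target;  // resolved address of the referenced symbol
  };

  // Reads a 32-bit field from the buffer in host byte order.
  std::uint32_t readWord(const std::vector<std::uint8_t> &Code,
                         std::size_t Offset) {
    assert(Offset + 4 <= Code.size());
    std::uint32_t V;
    std::memcpy(&V, Code.data() + Offset, sizeof(V));
    return V;
  }

  // Adds the (possibly adjusted) target address to the value already stored.
  // BaseAddr is the assumed load address of Code[0]; PICBase is assumed too.
  void applyToyRelocation(std::vector<std::uint8_t> &Code,
                          std::uint64_t BaseAddr, std::uint64_t PICBase,
                          const ToyReloc &R) {
    if (R.Type == reloc_absolute_dword) {
      assert(R.Offset + 8 <= Code.size());
      std::uint64_t V;
      std::memcpy(&V, Code.data() + R.Offset, sizeof(V));
      V += R.Target;  // 64-bit field, no extra adjustment
      std::memcpy(Code.data() + R.Offset, &V, sizeof(V));
      return;
    }
    assert(R.Offset + 4 <= Code.size());
    std::uint64_t Value = R.Target;
    if (R.Type == reloc_pcrel_word)
      Value -= BaseAddr + R.Offset + 4;  // assumed: PC is just past the field
    else if (R.Type == reloc_picrel_word)
      Value -= PICBase;                  // make the value PIC-base relative
    std::uint32_t V = readWord(Code, R.Offset);
    V += static_cast<std::uint32_t>(Value);
    std::memcpy(Code.data() + R.Offset, &V, sizeof(V));
  }

For instance, with ``BaseAddr`` set to ``0x1000``, patching a zeroed field at
offset 0 with a ``reloc_pcrel_word`` relocation whose target is ``0x1010``
stores ``0xC`` in this model.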