===========================
TableGen Language Reference
===========================

.. contents::
   :local:

.. warning::
   This document is extremely rough. If you find something lacking, please
   fix it, file a documentation bug, or ask about it on llvm-dev.

Introduction
============

This document is meant to be a normative spec about the TableGen language
in and of itself (i.e. how to understand a given construct in terms of how
it affects the final set of records represented by the TableGen file). If
you are unsure if this document is really what you are looking for, please
read the :doc:`introduction to TableGen <index>` first.

Notation
========

The lexical and syntax notation used here is intended to imitate
`Python's`_. In particular, for lexical definitions, the productions
operate at the character level and there is no implied whitespace between
elements. The syntax definitions operate at the token level, so there is
implied whitespace between tokens.

.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation

Lexical Analysis
================

TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
comments.

The following is a listing of the basic punctuation tokens::

   - + [ ] { } ( ) < > : ; .  = ? #

Numeric literals take one of the following forms:

.. TableGen actually will lex some pretty strange sequences and interpret
   them as numbers. What is shown here is an attempt to approximate what it
   "should" accept.

.. productionlist::
   TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
   DecimalInteger: ["+" | "-"] ("0"..."9")+
   HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
   BinInteger: "0b" ("0" | "1")+

One aspect to note is that the :token:`DecimalInteger` token *includes* the
``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
most languages do.

Also note that :token:`BinInteger` creates a value of type ``bits<n>``
(where ``n`` is the number of bits).  This will implicitly convert to
integers when needed.
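
As a minimal sketch of the three literal forms, each one can initialize a
field. The record name ``Literals`` and its field names are made up for
illustration::

   def Literals {
     int Dec = -42;         // DecimalInteger; the "-" is part of the token
     int Hex = 0x2A;        // HexInteger
     bits<4> Bin = 0b1010;  // BinInteger, four binary digits
   }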

TableGen has identifier-like tokens:

.. productionlist::
   ualpha: "a"..."z" | "A"..."Z" | "_"
   TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
   TokVarName: "$" `ualpha` (`ualpha` |  "0"..."9")*

Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
begin with a number. In case of ambiguity, a token will be interpreted as a
numeric literal rather than an identifier.
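
For illustration, and assuming the lexer behaves as the productions above
describe, a record name may begin with digits as long as a letter or
underscore follows. The name ``2Addr`` is hypothetical::

   // "2Addr" matches TokIdentifier: digits, then a letter, then more letters.
   def 2Addr;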

TableGen also has two string-like literals:

.. productionlist::
   TokString: '"' <non-'"' characters and C-like escapes> '"'
   TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"

:token:`TokCodeFragment` is essentially a multiline string literal
delimited by ``[{`` and ``}]``.

.. note::
   The current implementation accepts the following C-like escapes::

      \\ \' \" \t \n
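
A short sketch of both literal kinds follows. The record and field names
are invented for this example::

   def Strings {
     string Quoted = "say \"hi\"\n";   // uses the \" and \n escapes
     code Body = [{
       return X + 1;
     }];                               // everything between [{ and }] is the value
   }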

TableGen also has the following keywords::

   bit   bits      class   code         dag
   def   foreach   defm    field        in
   int   let       list    multiclass   string

TableGen also has "bang operators" which have a
wide variety of meanings:

.. productionlist::
   BangOperator: one of
               :!eq     !if      !head    !tail      !con
               :!add    !shl     !sra     !srl       !and
               :!cast   !empty   !subst   !foreach   !listconcat   !strconcat

Syntax
======

TableGen has an ``include`` mechanism. It does not play a role in the
syntax per se, since it is lexically replaced with the contents of the
included file.

.. productionlist::
   IncludeDirective: "include" `TokString`

TableGen's top-level production consists of "objects".

.. productionlist::
   TableGenFile: `Object`*
   Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`

``class``\es
------------

.. productionlist::
   Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`

A ``class`` declaration creates a record which other records can inherit
from. A class can be parametrized by a list of "template arguments", whose
values can be used in the class body.

A given class can only be defined once. A ``class`` declaration is
considered to define the class if any of the following is true:

.. break ObjectBody into its constituents so that they are present here?

#. The :token:`TemplateArgList` is present.
#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.

You can declare an empty class by giving an empty :token:`TemplateArgList`
and an empty :token:`ObjectBody`. This can serve as a restricted form of
forward declaration: note that records deriving from the forward-declared
class will inherit no fields from it since the record expansion is done
when the record is parsed.

.. productionlist::
   TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"
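
The following sketch shows a forward declaration next to a parametrized
class definition. All names (``Operand``, ``Register``, ``R0``, ``D0``) are
hypothetical and chosen only for this example::

   // Forward declaration: no template arguments, no base classes, empty body.
   class Operand;

   // A class definition with two template arguments; "width" has a default.
   class Register<string name, int width = 32> {
     string Name = name;
     int Width = width;
   }

   def R0 : Register<"r0">;        // Width takes the default value
   def D0 : Register<"d0", 64>;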

Declarations
------------

.. Omitting mention of arcane "field" prefix to discourage its use.

The declaration syntax is pretty much what you would expect as a C++
programmer.

.. productionlist::
   Declaration: `Type` `TokIdentifier` ["=" `Value`]

It assigns the value to the identifier.

Types
-----

.. productionlist::
   Type: "string" | "code" | "bit" | "int" | "dag"
       :| "bits" "<" `TokInteger` ">"
       :| "list" "<" `Type` ">"
       :| `ClassID`
   ClassID: `TokIdentifier`

Both ``string`` and ``code`` correspond to the string type; the difference
is purely to indicate programmer intention.

The :token:`ClassID` must identify a class that has been previously
declared or defined.
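
As a rough illustration, the record below declares one field of each type.
The names ``Reg``, ``R0``, ``ops`` and ``TypesDemo`` are assumptions made
for this sketch, not part of the language::

   class Reg;
   def R0 : Reg;
   def ops;   // used only as a dag operator below

   def TypesDemo {
     bit Flag = 1;
     int Count = 3;
     string Name = "r0";
     code Snippet = [{ return 0; }];
     bits<8> Enc = 0x12;
     list<int> Sizes = [8, 16, 32];
     dag Pattern = (ops R0);
     Reg Base = R0;            // ClassID type: Reg was declared above
   }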

Values
------

.. productionlist::
   Value: `SimpleValue` `ValueSuffix`*
   ValueSuffix: "{" `RangeList` "}"
              :| "[" `RangeList` "]"
              :| "." `TokIdentifier`
   RangeList: `RangePiece` ("," `RangePiece`)*
   RangePiece: `TokInteger`
             :| `TokInteger` "-" `TokInteger`
             :| `TokInteger` `TokInteger`

The peculiar last form of :token:`RangePiece` is due to the fact that the
"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
instead of "1", "-", and "5".
The :token:`RangeList` can be thought of as specifying a "list slice" in
some contexts.
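
The three suffix forms can be sketched as follows. ``Src`` and ``SliceDemo``
are hypothetical records introduced only for this example::

   def Src {
     int X = 5;
   }

   def SliceDemo {
     bits<8> Value = 0xB4;
     bits<4> Low = Value{3-0};      // "{...}" suffix: bits 3 down to 0
     bits<2> High = Value{7-6};     // bits 7 and 6
     list<int> All = [10, 20, 30, 40];
     list<int> Mid = All[1-2];      // "[...]" suffix: elements 1 and 2
     int FromSrc = Src.X;           // "." suffix: field X of record Src
   }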

:token:`SimpleValue` has a number of forms:

.. productionlist::
   SimpleValue: `TokIdentifier`

The value will be the variable referenced by the identifier. It can be one
of:

.. The code for this is exceptionally abstruse. These examples are a
   best-effort attempt.

* name of a ``def``, such as the use of ``Bar`` in::

     def Bar : SomeClass {
       int X = 5;
     }

     def Foo {
       SomeClass Baz = Bar;
     }

* value local to a ``def``, such as the use of ``Bar`` in::

     def Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg of a ``class``, such as the use of ``Bar`` in::

     class Foo<int Bar> {
       int Baz = Bar;
     }

* value local to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo<int Bar> {
       int Baz = Bar;
     }

.. productionlist::
   SimpleValue: `TokInteger`

This represents the numeric value of the integer.

.. productionlist::
   SimpleValue: `TokString`+

Multiple adjacent string literals are concatenated as in C/C++. The value
is the concatenation of the strings.

.. productionlist::
   SimpleValue: `TokCodeFragment`

The value is the string value of the code fragment.

.. productionlist::
   SimpleValue: "?"

``?`` represents an "unset" initializer.

.. productionlist::
   SimpleValue: "{" `ValueList` "}"
   ValueList: [`ValueListNE`]
   ValueListNE: `Value` ("," `Value`)*

This represents a sequence of bits, as would be used to initialize a
``bits<n>`` field (where ``n`` is the number of bits).
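
A minimal sketch of a bit-sequence initializer, with invented record and
field names::

   def BitsInit {
     bits<4> Enc = { 1, 0, 1, 1 };   // one value per bit of the field
   }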

.. productionlist::
   SimpleValue: `ClassID` "<" `ValueListNE` ">"

This generates a new anonymous record definition (as would be created by an
unnamed ``def`` inheriting from the given class with the given template
arguments) and the value is the value of that record definition.
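
For example, assuming a hypothetical class ``Pair``, the initializer below
creates an anonymous record and stores a reference to it in ``P``::

   class Pair<int a, int b> {
     int First = a;
     int Second = b;
   }

   def Holder {
     Pair P = Pair<1, 2>;   // value is a new anonymous record derived from Pair
   }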

.. productionlist::
   SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]

A list initializer. The optional :token:`Type` can be used to indicate a
specific element type, otherwise the element type will be deduced from the
given values.
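
A brief sketch of list initializers with and without the explicit element
type (the record and field names are illustrative only)::

   def Lists {
     list<int> Sizes = [8, 16, 32];      // element type deduced from the values
     list<string> Names = ["a", "b"];
     list<int> None = []<int>;           // empty list with an explicit type
   }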

.. The initial `DagArg` of the dag must start with an identifier or
   !cast, but this is more of an implementation detail and so for now just
   leave it out.

.. productionlist::
   SimpleValue: "(" `DagArg` `DagArgList` ")"
   DagArgList: `DagArg` ("," `DagArg`)*
   DagArg: `Value` [":" `TokVarName`] | `TokVarName`

The initial :token:`DagArg` is called the "operator" of the dag.
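
As an illustration, the dag below uses a record named ``add`` as its
operator and names each argument with a :token:`TokVarName`. Both ``add``
and ``GPR`` are placeholder records defined just for this sketch::

   def add;   // serves as the dag operator
   def GPR;

   def Pattern {
     dag Ops = (add GPR:$lhs, GPR:$rhs);
   }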

.. productionlist::
   SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"
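
A few bang operators in use, as a non-exhaustive sketch. The record names
are invented, and the comments give the values these expressions are
expected to evaluate to::

   class Reg;
   def R0 : Reg;

   def BangDemo {
     int Sum = !add(2, 3);                        // 5
     string Joined = !strconcat("foo", "bar");    // "foobar"
     int Pick = !if(!eq(Sum, 5), 1, 0);           // 1
     list<int> Rest = !tail([1, 2, 3]);           // [2, 3]
     Reg Looked = !cast<Reg>("R0");               // typed form: looks up R0 by name
   }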

Bodies
------

.. productionlist::
   ObjectBody: `BaseClassList` `Body`
   BaseClassList: [":" `BaseClassListNE`]
   BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
   SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
   DefmID: `TokIdentifier`

The version with the :token:`MultiClassID` is only valid in the
:token:`BaseClassList` of a ``defm``.
The :token:`MultiClassID` should be the name of a ``multiclass``.

.. put this somewhere else

It is after parsing the base class list that the "let stack" is applied.

.. productionlist::
   Body: ";" | "{" BodyList "}"
   BodyList: BodyItem*
   BodyItem: `Declaration` ";"
           :| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";"

The ``let`` form allows overriding the value of an inherited field.
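
The two :token:`Body` forms and both kinds of :token:`BodyItem` can be
sketched like this, using a hypothetical ``Inst`` class::

   class Inst {
     int Size = 4;
     bit IsBranch = 0;
   }

   def NOP : Inst;              // Body is just ";"
   def JMP : Inst {
     let IsBranch = 1;          // override an inherited field
     int Target = 0;            // add a new field with a Declaration
   }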

``def``
-------

.. TODO::
   There can be pastes in the names here, like ``#NAME#``. Look into that
   and document it (it boils down to ParseIDValue with IDParseMode ==
   ParseNameMode). ParseObjectName calls into the general ParseValue, with
   the only difference from "arbitrary expression parsing" being IDParseMode
   == Mode.

.. productionlist::
   Def: "def" `TokIdentifier` `ObjectBody`

Defines a record whose name is given by the :token:`TokIdentifier`. The
fields of the record are inherited from the base classes and defined in the
body.

Special handling occurs if this ``def`` appears inside a ``multiclass`` or
a ``foreach``.

``defm``
--------

.. productionlist::
   Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";"

Note that in the :token:`BaseClassList`, all of the ``multiclass``'s must
precede any ``class``'s that appear.

``foreach``
-----------

.. productionlist::
   Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
          :| "foreach" `Declaration` "in" `Object`

The value assigned to the variable in the declaration is iterated over and
the object or object list is reevaluated with the variable set at each
iterated value.
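
A small sketch of the single-object form follows. It assumes two things
that the grammar above does not spell out: that the iteration variable is
written without a type, and that ``#`` pastes the current value into the
record name. The ``Reg`` class is hypothetical::

   class Reg<int n> {
     int Num = n;
   }

   // Intended to define one record per list element: R0, R1 and R2.
   foreach i = [0, 1, 2] in
     def R#i : Reg<i>;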

Top-Level ``let``
-----------------

.. productionlist::
   Let:  "let" `LetList` "in" "{" `Object`* "}"
      :| "let" `LetList` "in" `Object`
   LetList: `LetItem` ("," `LetItem`)*
   LetItem: `TokIdentifier` [`RangeList`] "=" `Value`

This is effectively equivalent to ``let`` inside the body of a record
except that it applies to multiple records at a time. The bindings are
applied at the end of parsing the base classes of a record.
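
Both forms of the top-level ``let`` are sketched below with a made-up
``Inst`` class. Each binding overrides the field inherited from ``Inst`` in
every record it covers::

   class Inst {
     bit IsBranch = 0;
     int Size = 4;
   }

   // Braced form: applies to every object in the block.
   let IsBranch = 1, Size = 2 in {
     def JMP : Inst;
     def BEQ : Inst;
   }

   // Single-object form.
   let Size = 8 in
   def CALL : Inst;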

``multiclass``
--------------

.. productionlist::
   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
             : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
   BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
   MultiClassID: `TokIdentifier`
   MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`
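
To tie ``multiclass`` and ``defm`` together, here is a minimal sketch. The
names ``Inst``, ``Tagged``, ``RegImm``, ``ADD`` and ``SUB`` are assumptions
made for this example. Each ``defm`` instantiates every ``def`` in the
multiclass, with the ``defm`` name prepended to the inner record name::

   class Inst<string name> {
     string Name = name;
   }
   class Tagged {
     bit IsTagged = 1;
   }

   multiclass RegImm<string base> {
     def rr : Inst<!strconcat(base, "_rr")>;
     def ri : Inst<!strconcat(base, "_ri")>;
   }

   defm ADD : RegImm<"add">;           // defines ADDrr and ADDri
   defm SUB : RegImm<"sub">, Tagged;   // the multiclass precedes the class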