===========================
TableGen Language Reference
===========================

.. contents::
   :local:

.. warning::
   This document is extremely rough. If you find something lacking, please
   fix it, file a documentation bug, or ask about it on llvmdev.

Introduction
============

This document is meant to be a normative spec about the TableGen language
in and of itself (i.e. how to understand a given construct in terms of how
it affects the final set of records represented by the TableGen file). If
you are unsure if this document is really what you are looking for, please
read the :doc:`introduction to TableGen <index>` first.

Notation
========

The lexical and syntax notation used here is intended to imitate
`Python's`_. In particular, for lexical definitions, the productions
operate at the character level and there is no implied whitespace between
elements. The syntax definitions operate at the token level, so there is
implied whitespace between tokens.

.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation

Lexical Analysis
================

TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
comments.

The following is a listing of the basic punctuation tokens::

   - + [ ] { } ( ) < > : ; .  = ? #

Numeric literals take one of the following forms:

.. TableGen actually will lex some pretty strange sequences and interpret
   them as numbers. What is shown here is an attempt to approximate what it
   "should" accept.

.. productionlist::
   TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
   DecimalInteger: ["+" | "-"] ("0"..."9")+
   HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
   BinInteger: "0b" ("0" | "1")+

One aspect to note is that the :token:`DecimalInteger` token *includes* the
``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
most languages do.

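As a rough illustration of the three literal forms, the following sketch
uses a hypothetical record named ``Literals``; the field names are invented
for the example, and ``Dec``, ``Hex`` and ``Bin`` all denote the value 42::

   def Literals {
     int Dec = 42;
     int Neg = -7;         // "-7" is lexed as a single DecimalInteger
     int Hex = 0x2A;
     int Bin = 0b101010;
   }
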
TableGen has identifier-like tokens:

.. productionlist::
   ualpha: "a"..."z" | "A"..."Z" | "_"
   TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
   TokVarName: "$" `ualpha` (`ualpha` |  "0"..."9")*

Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
begin with a number. In case of ambiguity, a token will be interpreted as a
numeric literal rather than an identifier.

TableGen also has two string-like literals:

.. productionlist::
   TokString: '"' <non-'"' characters and C-like escapes> '"'
   TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"

:token:`TokCodeFragment` is essentially a multiline string literal
delimited by ``[{`` and ``}]``.

.. note::
   The current implementation accepts the following C-like escapes::

      \\ \' \" \t \n

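A minimal sketch of both string-like literals, using a hypothetical record
``Strings`` with invented field names. Note that the code fragment may
contain ``[``, ``]``, ``{`` and ``}`` freely as long as the sequence ``}]``
does not appear before the intended end::

   def Strings {
     string Quoted = "tab:\t quote:\" backslash:\\";
     code Body = [{
       if (x) { return y[0]; }
     }];
   }
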
TableGen also has the following keywords::

   bit   bits      class   code         dag
   def   foreach   defm    field        in
   int   let       list    multiclass   string

TableGen also has "bang operators" which have a
wide variety of meanings:

.. productionlist::
   BangOperator: one of
               :!eq     !if      !head    !tail      !con
               :!add    !shl     !sra     !srl
               :!cast   !empty   !subst   !foreach   !listconcat   !strconcat

Syntax
======

TableGen has an ``include`` mechanism. It does not play a role in the
syntax per se, since it is lexically replaced with the contents of the
included file.

.. productionlist::
   IncludeDirective: "include" `TokString`

TableGen's top-level production consists of "objects".

.. productionlist::
   TableGenFile: `Object`*
   Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`

``class``\es
------------

.. productionlist::
   Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`

A ``class`` declaration creates a record which other records can inherit
from. A class can be parametrized by a list of "template arguments", whose
values can be used in the class body.

A given class can only be defined once. A ``class`` declaration is
considered to define the class if any of the following is true:

.. break ObjectBody into its constituents so that they are present here?

#. The :token:`TemplateArgList` is present.
#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.

You can declare an empty class by giving an empty :token:`TemplateArgList`
and an empty :token:`ObjectBody`. This can serve as a restricted form of
forward declaration: note that records deriving from the forward-declared
class will inherit no fields from it since the record expansion is done
when the record is parsed.

.. productionlist::
   TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"

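The following sketch shows a forward declaration next to a parametrized
class definition. The names ``Operand``, ``Register``, ``R0`` and ``R1``
are hypothetical; the second template argument uses the optional default
value permitted by :token:`Declaration`::

   // Forward declaration: no template arguments, base classes, or body.
   class Operand;

   // Definition with two template arguments; the second has a default.
   class Register<string n, int enc = 0> {
     string Name = n;
     int Encoding = enc;
   }

   def R0 : Register<"r0">;      // Encoding takes the default, 0
   def R1 : Register<"r1", 1>;
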
Declarations
------------

.. Omitting mention of arcane "field" prefix to discourage its use.

The declaration syntax is pretty much what you would expect as a C++
programmer.

.. productionlist::
   Declaration: `Type` `TokIdentifier` ["=" `Value`]

It assigns the value to the identifier.

Types
-----

.. productionlist::
   Type: "string" | "code" | "bit" | "int" | "dag"
       :| "bits" "<" `TokInteger` ">"
       :| "list" "<" `Type` ">"
       :| `ClassID`
   ClassID: `TokIdentifier`

Both ``string`` and ``code`` correspond to the string type; the difference
is purely to indicate programmer intention.

The :token:`ClassID` must identify a class that has been previously
declared or defined.

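To make the type grammar concrete, here is a sketch with one declaration
per :token:`Type` alternative. The class ``Operand`` and the records
``Imm`` and ``TypesDemo`` are made up for the example::

   class Operand {
     int Width = 32;
   }
   def Imm : Operand;

   def TypesDemo {
     string    Name  = "add";
     code      Body  = [{ return 0; }];
     bit       Flag  = 1;
     int       Count = 10;
     bits<4>   Nib   = { 1, 0, 1, 0 };
     list<int> Ints  = [1, 2, 3];
     dag       Pat   = (Imm 1, 2);
     Operand   Op    = Imm;          // a ClassID used as a type
   }
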
Values
------

.. productionlist::
   Value: `SimpleValue` `ValueSuffix`*
   ValueSuffix: "{" `RangeList` "}"
              :| "[" `RangeList` "]"
              :| "." `TokIdentifier`
   RangeList: `RangePiece` ("," `RangePiece`)*
   RangePiece: `TokInteger`
             :| `TokInteger` "-" `TokInteger`
             :| `TokInteger` `TokInteger`

The peculiar last form of :token:`RangePiece` is due to the fact that the
"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
instead of "1", "-", and "5".
The :token:`RangeList` can be thought of as specifying a "list slice" in
some contexts.


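A small sketch of the two bracketed suffixes, assuming a hypothetical
record ``Slices``. The ``{...}`` suffix selects bits and the ``[...]``
suffix selects list elements; ranges such as ``7-4`` are written without
spaces and so are lexed in the "peculiar" two-integer form described
above::

   def Slices {
     bits<8>   Opcode = 0x5A;
     bits<4>   High   = Opcode{7-4};     // bits 7 down to 4
     bits<2>   Low    = Opcode{1, 0};    // explicitly listed bits
     list<int> Vals   = [10, 20, 30, 40];
     list<int> Mid    = Vals[1-2];       // elements at index 1 and 2
   }
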
:token:`SimpleValue` has a number of forms:

.. productionlist::
   SimpleValue: `TokIdentifier`

The value will be the variable referenced by the identifier. It can be one
of:

.. The code for this is exceptionally abstruse. These examples are a
   best-effort attempt.

* name of a ``def``, such as the use of ``Bar`` in::

     def Bar : SomeClass {
       int X = 5;
     }

     def Foo {
       SomeClass Baz = Bar;
     }

* value local to a ``def``, such as the use of ``Bar`` in::

     def Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg of a ``class``, such as the use of ``Bar`` in::

     class Foo<int Bar> {
       int Baz = Bar;
     }

* value local to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo<int Bar> {
       int Baz = Bar;
     }

.. productionlist::
   SimpleValue: `TokInteger`

This represents the numeric value of the integer.

.. productionlist::
   SimpleValue: `TokString`+

Multiple adjacent string literals are concatenated like in C/C++. The value
is the concatenation of the strings.

.. productionlist::
   SimpleValue: `TokCodeFragment`

The value is the string value of the code fragment.

.. productionlist::
   SimpleValue: "?"

``?`` represents an "unset" initializer.

.. productionlist::
   SimpleValue: "{" `ValueList` "}"
   ValueList: [`ValueListNE`]
   ValueListNE: `Value` ("," `Value`)*

This represents a sequence of bits, as would be used to initialize a
``bits<n>`` field (where ``n`` is the number of bits).

.. productionlist::
   SimpleValue: `ClassID` "<" `ValueListNE` ">"

This generates a new anonymous record definition (as would be created by an
unnamed ``def`` inheriting from the given class with the given template
arguments) and the value is the value of that record definition.

.. productionlist::
   SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]

A list initializer. The optional :token:`Type` can be used to indicate a
specific element type; otherwise the element type will be deduced from the
given values.

.. The initial `DagArg` of the dag must start with an identifier or
   !cast, but this is more of an implementation detail and so for now just
   leave it out.

.. productionlist::
   SimpleValue: "(" `DagArg` `DagArgList` ")"
   DagArgList: `DagArg` ("," `DagArg`)*
   DagArg: `Value` [":" `TokVarName`] | `TokVarName`

The initial :token:`DagArg` is called the "operator" of the dag.

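The following sketch gathers several of the :token:`SimpleValue` forms
above into one record. All names (``Pair``, ``ops``, ``Demo`` and the
field names) are hypothetical and chosen only for illustration::

   class Pair<int a, int b> {
     int First = a;
     int Second = b;
   }
   def ops;

   def Demo {
     string    Joined = "foo" "bar";          // adjacent strings concatenate
     int       Unset  = ?;                    // no value assigned
     bits<4>   Mask   = { 1, 1, 0, 0 };       // sequence of bits
     Pair      P      = Pair<1, 2>;           // anonymous record of class Pair
     list<int> Empty  = []<int>;              // explicit element type
     dag       Args   = (ops 1:$lhs, 2:$rhs); // "ops" is the operator
   }
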
.. productionlist::
   SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"

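This document does not describe the individual operators, so the following
is only a sketch of how a few of them are commonly written. The class
``Sized`` and the records ``V4`` and ``UsesCast`` are hypothetical::

   class Sized<int n> {
     string Name    = !strconcat("v", "i32");
     int    Double  = !add(n, n);
     int    Shifted = !shl(n, 2);
     string Kind    = !if(!eq(n, 1), "scalar", "vector");
   }
   def V4 : Sized<4>;

   def UsesCast {
     // The optional "<" Type ">" names the type to cast to.
     Sized Ref = !cast<Sized>("V4");
   }
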
Bodies
------

.. productionlist::
   ObjectBody: `BaseClassList` `Body`
   BaseClassList: [":" `BaseClassListNE`]
   BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
   SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
   DefmID: `TokIdentifier`

The version with the :token:`MultiClassID` is only valid in the
:token:`BaseClassList` of a ``defm``.
The :token:`MultiClassID` should be the name of a ``multiclass``.

.. put this somewhere else

It is after parsing the base class list that the "let stack" is applied.

.. productionlist::
   Body: ";" | "{" BodyList "}"
   BodyList: BodyItem*
   BodyItem: `Declaration` ";"
           :| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";"

The ``let`` form allows overriding the value of an inherited field.

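For instance, a body can mix both kinds of :token:`BodyItem`. In this
sketch the class ``Inst`` and the record ``JMP`` are invented names::

   class Inst {
     string Asm = "";
     bit IsBranch = 0;
   }

   def JMP : Inst {
     let Asm = "jmp";      // overrides the inherited field
     let IsBranch = 1;
     int Size = 2;         // declares a new field
   }
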
``def``
-------

.. TODO::
   There can be pastes in the names here, like ``#NAME#``. Look into that
   and document it (it boils down to ParseIDValue with IDParseMode ==
   ParseNameMode). ParseObjectName calls into the general ParseValue, with
   the only difference from "arbitrary expression parsing" being IDParseMode
   == Mode.

.. productionlist::
   Def: "def" `TokIdentifier` `ObjectBody`

Defines a record whose name is given by the :token:`TokIdentifier`. The
fields of the record are inherited from the base classes and defined in the
body.

Special handling occurs if this ``def`` appears inside a ``multiclass`` or
a ``foreach``.

``defm``
--------

.. productionlist::
   Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";"

Note that in the :token:`BaseClassList`, all of the ``multiclass``'s must
precede any ``class``'s that appear.

``foreach``
-----------

.. productionlist::
   Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
          :| "foreach" `Declaration` "in" `Object`

The value assigned to the variable in the declaration is iterated over and
the object or object list is reevaluated with the variable set at each
iterated value.

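A sketch of the braced form, written in the untyped ``i = [...]`` style
that appears in the examples of the TableGen introduction; the record
names are hypothetical. The body is evaluated once per list element, and
``#`` pastes the current value into each name, giving ``R0`` through
``R3`` and ``F0`` through ``F3``::

   foreach i = [0, 1, 2, 3] in {
     def R#i;
     def F#i;
   }
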
Top-Level ``let``
-----------------

.. productionlist::
   Let:  "let" `LetList` "in" "{" `Object`* "}"
      :| "let" `LetList` "in" `Object`
   LetList: `LetItem` ("," `LetItem`)*
   LetItem: `TokIdentifier` [`RangeList`] "=" `Value`

This is effectively equivalent to ``let`` inside the body of a record
except that it applies to multiple records at a time. The bindings are
applied at the end of parsing the base classes of a record.

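Both forms of the production can be sketched as follows, with an invented
class ``Inst`` and invented record names::

   class Inst {
     bit IsCall = 0;
     bit IsReturn = 0;
   }

   // Braced form: the binding applies to every record in the block.
   let IsCall = 1 in {
     def CALL  : Inst;
     def CALLr : Inst;
   }

   // Single-object form.
   let IsReturn = 1 in
     def RET : Inst;
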
``multiclass``
--------------

.. productionlist::
   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
             : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
   BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
   MultiClassID: `TokIdentifier`
   MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`

    385