===========================
TableGen Language Reference
===========================

.. sectionauthor:: Sean Silva <silvas@purdue.edu>

.. contents::
   :local:

.. warning::
   This document is extremely rough. If you find something lacking, please
   fix it, file a documentation bug, or ask about it on llvmdev.

Introduction
============

This document is meant to be a normative spec about the TableGen language
in and of itself (i.e. how to understand a given construct in terms of how
it affects the final set of records represented by the TableGen file). If
you are unsure whether this document is really what you are looking for,
please read :doc:`/TableGenFundamentals` first.

Notation
========

The lexical and syntax notation used here is intended to imitate
`Python's`_. In particular, for lexical definitions, the productions
operate at the character level and there is no implied whitespace between
elements. The syntax definitions operate at the token level, so there is
implied whitespace between tokens.

.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation

Lexical Analysis
================

TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
comments.

The following is a listing of the basic punctuation tokens::

   - + [ ] { } ( ) < > : ; .  = ? #

Numeric literals take one of the following forms:

.. TableGen actually will lex some pretty strange sequences and interpret
   them as numbers. What is shown here is an attempt to approximate what it
   "should" accept.

.. productionlist::
   TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
   DecimalInteger: ["+" | "-"] ("0"..."9")+
   HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
   BinInteger: "0b" ("0" | "1")+

One aspect to note is that the :token:`DecimalInteger` token *includes* the
``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
most languages do.
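
As a small illustration of these forms (a sketch only; the record and field
names are made up for this example and it has not been checked against a
particular TableGen release), each initializer below is a single
:token:`TokInteger`::

   def IntegerForms {
     int Dec = 42;        // DecimalInteger
     int Neg = -7;        // the "-" is part of the token, not an operator
     int Hex = 0x2A;      // HexInteger
     int Bin = 0b101010;  // BinInteger
   }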

TableGen has identifier-like tokens:

.. productionlist::
   ualpha: "a"..."z" | "A"..."Z" | "_"
   TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
   TokVarName: "$" `ualpha` (`ualpha` |  "0"..."9")*

Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
begin with a digit. In case of ambiguity, a token will be interpreted as a
numeric literal rather than an identifier.
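
For instance, assuming the lexer behaves as described above, the name in the
following (hypothetical) definition is a single :token:`TokIdentifier` even
though it starts with a digit, whereas text such as ``0x1F`` or ``0b101``
would still be read as a numeric literal::

   def 3AddrForm;   // "3AddrForm" is one identifier, not "3" then "AddrForm"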

TableGen also has two string-like literals:

.. productionlist::
   TokString: '"' <non-'"' characters and C-like escapes> '"'
   TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"

.. note::
   The current implementation accepts the following C-like escapes::

      \\ \' \" \t \n
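
A short sketch of both literal forms (the record and field names are
hypothetical)::

   def Literals {
     string Message = "tab:\t quote:\" backslash:\\ newline:\n";
     code Body = [{ return Value + 1; }];
   }

Because a code fragment ends at the first ``}]``, its text cannot itself
contain that sequence.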

TableGen also has the following keywords::

   bit   bits      class   code         dag
   def   foreach   defm    field        in
   int   let       list    multiclass   string

TableGen also has "bang operators" which have a wide variety of meanings:

.. productionlist::
   BangOperator: one of
               :!eq     !if      !head    !tail      !con
               :!add    !shl     !sra     !srl
               :!cast   !empty   !subst   !foreach   !strconcat
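
This document does not define the individual operators. As a rough sketch of
four of them, assuming the usual behavior (``!strconcat`` joins strings,
``!add`` adds integers, ``!eq`` tests equality and ``!if`` selects one of two
values), and with all names made up for the example::

   class Named<string prefix, int n> {
     string Name = !strconcat(prefix, "_reg");
     int Twice = !add(n, n);
     int IsZero = !if(!eq(n, 0), 1, 0);
   }

   def Example : Named<"gpr", 4>;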

Syntax
======

TableGen has an ``include`` mechanism. It does not play a role in the
syntax per se, since it is lexically replaced with the contents of the
included file.

.. productionlist::
   IncludeDirective: "include" `TokString`
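
For example, assuming a file named ``Common.td`` (a placeholder name) can be
found by TableGen, its contents are spliced in at the point of the
directive::

   include "Common.td"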

TableGen's top-level production consists of "objects".

.. productionlist::
   TableGenFile: `Object`*
   Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`

``class``\es
------------

.. productionlist::
   Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`

A ``class`` declaration creates a record which other records can inherit
from. A class can be parametrized by a list of "template arguments", whose
values can be used in the class body.

A given class can only be defined once. A ``class`` declaration is
considered to define the class if any of the following is true:

.. break ObjectBody into its constituents so that they are present here?

#. The :token:`TemplateArgList` is present.
#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.

You can declare an empty class by giving an empty :token:`TemplateArgList`
and an empty :token:`ObjectBody`. This can serve as a restricted form of
forward declaration: note that records deriving from the forward-declared
class will inherit no fields from it since the record expansion is done
when the record is parsed.
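
The rules above can be sketched as follows (all names are hypothetical)::

   class Operand;               // declaration only, defines nothing

   class Register<string n> {   // definition: template argument and body
     string Name = n;
   }

   def R0 : Register<"r0">;     // R0 inherits Name, with n bound to "r0"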

.. productionlist::
   TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"

Declarations
------------

.. Omitting mention of arcane "field" prefix to discourage its use.

The declaration syntax is pretty much what you would expect as a C++
programmer.

.. productionlist::
   Declaration: `Type` `TokIdentifier` ["=" `Value`]

A declaration introduces the identifier with the given type and, if a value
is given, assigns that value to the identifier.

Types
-----

.. productionlist::
   Type: "string" | "code" | "bit" | "int" | "dag"
       :| "bits" "<" `TokInteger` ">"
       :| "list" "<" `Type` ">"
       :| `ClassID`
   ClassID: `TokIdentifier`

Both ``string`` and ``code`` correspond to the string type; the difference
is purely to indicate programmer intention.

The :token:`ClassID` must identify a class that has been previously
declared or defined.
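
A sketch that declares one field of each type (all names are invented for the
example; the initializer forms used here are the ones listed under Values)::

   class Unit {
     int Width = 1;
   }

   def U0 : Unit;
   def ops;

   def TypeSamples {
     string Name = "add";
     code Hook = [{ return true; }];
     bit IsBranch = 0;
     int Latency = 3;
     bits<4> Opcode = { 0, 1, 1, 0 };
     list<int> Sizes = [8, 16, 32];
     dag Operands = (ops U0, U0);
     Unit Owner = U0;                  // a field whose type is a ClassID
   }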

Values
------

.. productionlist::
   Value: `SimpleValue` `ValueSuffix`*
   ValueSuffix: "{" `RangeList` "}"
              :| "[" `RangeList` "]"
              :| "." `TokIdentifier`
   RangeList: `RangePiece` ("," `RangePiece`)*
   RangePiece: `TokInteger`
             :| `TokInteger` "-" `TokInteger`
             :| `TokInteger` `TokInteger`

The peculiar last form of :token:`RangePiece` is due to the fact that the
"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
instead of "1", "-", and "5".
The :token:`RangeList` can be thought of as specifying a "list slice" in
some contexts.
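
A sketch of the three suffix forms (hypothetical names; not checked against
a particular TableGen release)::

   def Point {
     int X = 5;
   }

   def Slices {
     bits<8> Byte = 0b10110100;
     bits<4> High = Byte{7-4};          // "7-4" is lexed as 7 and -4
     list<int> All = [10, 20, 30, 40];
     list<int> Middle = All[1-2];       // elements 1 and 2 of All
     int PX = Point.X;                  // field X of the record Point
   }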

:token:`SimpleValue` has a number of forms:

.. productionlist::
   SimpleValue: `TokIdentifier`

The value will be the variable referenced by the identifier. It can be one
of:

.. The code for this is exceptionally abstruse. These examples are a
   best-effort attempt.

* name of a ``def``, such as the use of ``Bar`` in::

     def Bar : SomeClass {
       int X = 5;
     }

     def Foo {
       SomeClass Baz = Bar;
     }

* value local to a ``def``, such as the use of ``Bar`` in::

     def Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg of a ``class``, such as the use of ``Bar`` in::

     class Foo<int Bar> {
       int Baz = Bar;
     }

* value local to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo<int Bar> {
       int Baz = Bar;
     }

.. productionlist::
   SimpleValue: `TokInteger`

This represents the numeric value of the integer.

.. productionlist::
   SimpleValue: `TokString`+

Multiple adjacent string literals are concatenated like in C/C++. The value
is the concatenation of the strings.

.. productionlist::
   SimpleValue: `TokCodeFragment`

The value is the string value of the code fragment.

.. productionlist::
   SimpleValue: "?"

``?`` represents an "unset" initializer.

.. productionlist::
   SimpleValue: "{" `ValueList` "}"
   ValueList: [`ValueListNE`]
   ValueListNE: `Value` ("," `Value`)*

This represents a sequence of bits, as would be used to initialize a
``bits<n>`` field (where ``n`` is the number of bits).

.. productionlist::
   SimpleValue: `ClassID` "<" `ValueListNE` ">"

This generates a new anonymous record definition (as would be created by an
unnamed ``def`` inheriting from the given class with the given template
arguments) and the value is the value of that record definition.
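
A sketch of this form, using hypothetical names::

   class Imm<int v> {
     int Value = v;
   }

   def UsesAnonymous {
     Imm Four = Imm<4>;   // refers to a new anonymous record built from Imm<4>
   }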

.. productionlist::
   SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]

A list initializer. The optional :token:`Type` can be used to indicate a
specific element type; otherwise the element type will be deduced from the
given values.
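
For example (hypothetical names; a sketch rather than verified input)::

   def Lists {
     list<int> Deduced = [1, 2, 3];      // element type deduced from values
     list<string> Empty = []<string>;    // element type given explicitly
   }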

.. The initial `DagArg` of the dag must start with an identifier or
   !cast, but this is more of an implementation detail and so for now just
   leave it out.

.. productionlist::
   SimpleValue: "(" `DagArg` `DagArgList` ")"
   DagArgList: `DagArg` ("," `DagArg`)*
   DagArg: `Value` [":" `TokVarName`] | `TokVarName`

The initial :token:`DagArg` is called the "operator" of the dag.
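
A sketch with hypothetical names, where ``ops`` is the operator and each
argument is tagged with a :token:`TokVarName`::

   def ops;
   def GPR;

   def DagSample {
     dag Operands = (ops GPR:$dst, GPR:$src);
   }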

.. productionlist::
   SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"

Bodies
------

.. productionlist::
   ObjectBody: `BaseClassList` `Body`
   BaseClassList: [":" `BaseClassListNE`]
   BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
   SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
   DefmID: `TokIdentifier`

The version with the :token:`MultiClassID` is only valid in the
:token:`BaseClassList` of a ``defm``.
The :token:`MultiClassID` should be the name of a ``multiclass``.

.. put this somewhere else

The "let stack" is applied after the base class list has been parsed.

.. productionlist::
   Body: ";" | "{" BodyList "}"
   BodyList: BodyItem*
   BodyItem: `Declaration` ";"
           :| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";"

The ``let`` form allows overriding the value of an inherited field.
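
A sketch of an override, using hypothetical names::

   class Inst {
     int Latency = 1;
   }

   def MUL : Inst {
     let Latency = 3;   // replaces the value inherited from Inst
     int Units = 2;     // an ordinary Declaration adds a new field
   }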

``def``
-------

.. TODO::
   There can be pastes in the names here, like ``#NAME#``. Look into that
   and document it (it boils down to ParseIDValue with IDParseMode ==
   ParseNameMode). ParseObjectName calls into the general ParseValue, with
   the only difference from "arbitrary expression parsing" being
   IDParseMode == Mode.

.. productionlist::
   Def: "def" `TokIdentifier` `ObjectBody`

Defines a record whose name is given by the :token:`TokIdentifier`. The
fields of the record are inherited from the base classes and defined in the
body.

Special handling occurs if this ``def`` appears inside a ``multiclass`` or
a ``foreach``.

``defm``
--------

.. productionlist::
   Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";"

Note that in the :token:`BaseClassListNE`, all of the ``multiclass``\es must
precede any ``class``\es that appear.

``foreach``
-----------

.. productionlist::
   Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
          :| "foreach" `Declaration` "in" `Object`

The value assigned to the variable in the declaration is iterated over and
the object or object list is reevaluated with the variable set at each
iterated value.

Top-Level ``let``
-----------------

.. productionlist::
   Let:  "let" `LetList` "in" "{" `Object`* "}"
      :| "let" `LetList` "in" `Object`
   LetList: `LetItem` ("," `LetItem`)*
   LetItem: `TokIdentifier` [`RangeList`] "=" `Value`

This is effectively equivalent to ``let`` inside the body of a record
except that it applies to multiple records at a time. The bindings are
applied at the end of parsing the base classes of a record.
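
A sketch of both forms, using hypothetical names::

   class Inst {
     bit isBranch = 0;
   }

   let isBranch = 1 in {
     def JMP : Inst;      // isBranch is 1 in both JMP and JZ
     def JZ : Inst;
   }

   let isBranch = 1 in
   def CALL : Inst;       // single-object form, no braces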

``multiclass``
--------------

.. productionlist::
   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
             : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
   BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
   MultiClassID: `TokIdentifier`
   MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`
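
As a sketch of how ``multiclass`` and ``defm`` fit together (all names are
hypothetical, and the naming behavior described afterwards is that of the
existing implementation rather than something specified above)::

   class Inst<string asm> {
     string Asm = asm;
   }

   multiclass RegOrImm<string mnemonic> {
     def rr : Inst<!strconcat(mnemonic, " r, r")>;
     def ri : Inst<!strconcat(mnemonic, " r, i")>;
   }

   defm ADD : RegOrImm<"add">;

Each ``def`` inside the ``multiclass`` is instantiated once per ``defm``,
with the ``defm`` name prepended, so this should yield two records named
``ADDrr`` and ``ADDri``.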