===========================
TableGen Language Reference
===========================

.. contents::
   :local:

.. warning::
   This document is extremely rough. If you find something lacking, please
   fix it, file a documentation bug, or ask about it on llvmdev.

Introduction
============

This document is meant to be a normative spec about the TableGen language
in and of itself (i.e. how to understand a given construct in terms of how
it affects the final set of records represented by the TableGen file). If
you are unsure if this document is really what you are looking for, please
read the :doc:`introduction to TableGen <index>` first.

Notation
========

The lexical and syntax notation used here is intended to imitate
`Python's`_. In particular, for lexical definitions, the productions
operate at the character level and there is no implied whitespace between
elements. The syntax definitions operate at the token level, so there is
implied whitespace between tokens.

.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation

Lexical Analysis
================

TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
comments.

The following is a listing of the basic punctuation tokens::

   - + [ ] { } ( ) < > : ; . = ? #

Numeric literals take one of the following forms:

.. TableGen actually will lex some pretty strange sequences and interpret
   them as numbers. What is shown here is an attempt to approximate what it
   "should" accept.

.. productionlist::
   TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
   DecimalInteger: ["+" | "-"] ("0"..."9")+
   HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
   BinInteger: "0b" ("0" | "1")+

One aspect to note is that the :token:`DecimalInteger` token *includes* the
``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
most languages do.

TableGen has identifier-like tokens:

.. productionlist::
   ualpha: "a"..."z" | "A"..."Z" | "_"
   TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
   TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")*

Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
begin with a number. In case of ambiguity, a token will be interpreted as a
numeric literal rather than an identifier.

TableGen also has two string-like literals:

.. productionlist::
   TokString: '"' <non-'"' characters and C-like escapes> '"'
   TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"

:token:`TokCodeFragment` is essentially a multiline string literal
delimited by ``[{`` and ``}]``.

.. note::
   The current implementation accepts the following C-like escapes::

      \\ \' \" \t \n

TableGen also has the following keywords::

   bit   bits      class   code         dag
   def   foreach   defm    field        in
   int   let       list    multiclass   string

TableGen also has "bang operators" which have a
wide variety of meanings:

.. productionlist::
   BangOperator: one of
               :!eq     !if      !head    !tail      !con
               :!add    !shl     !sra     !srl
               :!cast   !empty   !subst   !foreach   !listconcat   !strconcat

Syntax
======

TableGen has an ``include`` mechanism. It does not play a role in the
syntax per se, since it is lexically replaced with the contents of the
included file.

.. productionlist::
   IncludeDirective: "include" `TokString`

TableGen's top-level production consists of "objects".

.. productionlist::
   TableGenFile: `Object`*
   Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`

``class``\es
------------

.. productionlist::
   Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`

A ``class`` declaration creates a record which other records can inherit
from. A class can be parametrized by a list of "template arguments", whose
values can be used in the class body.

A given class can only be defined once. A ``class`` declaration is
considered to define the class if any of the following is true:

.. break ObjectBody into its constituents so that they are present here?

#. The :token:`TemplateArgList` is present.
#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.

You can declare an empty class by giving an empty :token:`TemplateArgList`
and an empty :token:`ObjectBody`. This can serve as a restricted form of
forward declaration: note that records deriving from the forward-declared
class will inherit no fields from it since the record expansion is done
when the record is parsed.

.. productionlist::
   TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"

Declarations
------------

.. Omitting mention of arcane "field" prefix to discourage its use.

The declaration syntax is pretty much what you would expect as a C++
programmer.

.. productionlist::
   Declaration: `Type` `TokIdentifier` ["=" `Value`]

It assigns the value to the identifier.

Types
-----

.. productionlist::
   Type: "string" | "code" | "bit" | "int" | "dag"
       :| "bits" "<" `TokInteger` ">"
       :| "list" "<" `Type` ">"
       :| `ClassID`
   ClassID: `TokIdentifier`

Both ``string`` and ``code`` correspond to the string type; the difference
is purely to indicate programmer intention.

The :token:`ClassID` must identify a class that has been previously
declared or defined.

Values
------

.. productionlist::
   Value: `SimpleValue` `ValueSuffix`*
   ValueSuffix: "{" `RangeList` "}"
              :| "[" `RangeList` "]"
              :| "." `TokIdentifier`
   RangeList: `RangePiece` ("," `RangePiece`)*
   RangePiece: `TokInteger`
             :| `TokInteger` "-" `TokInteger`
             :| `TokInteger` `TokInteger`

The peculiar last form of :token:`RangePiece` is due to the fact that the
"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
two consecutive :token:`TokInteger` tokens, with values ``1`` and ``-5``,
instead of "1", "-", and "5".

The :token:`RangeList` can be thought of as specifying a "list slice" in
some contexts.

:token:`SimpleValue` has a number of forms:

.. productionlist::
   SimpleValue: `TokIdentifier`

The value will be the variable referenced by the identifier. It can be one
of:

.. The code for this is exceptionally abstruse. These examples are a
   best-effort attempt.

* name of a ``def``, such as the use of ``Bar`` in::

     def Bar : SomeClass {
       int X = 5;
     }

     def Foo {
       SomeClass Baz = Bar;
     }

* value local to a ``def``, such as the use of ``Bar`` in::

     def Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg of a ``class``, such as the use of ``Bar`` in::

     class Foo<int Bar> {
       int Baz = Bar;
     }

* value local to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo<int Bar> {
       int Baz = Bar;
     }

.. productionlist::
   SimpleValue: `TokInteger`

This represents the numeric value of the integer.

.. productionlist::
   SimpleValue: `TokString`+

Multiple adjacent string literals are concatenated, as in C/C++. The value
is the concatenation of the strings.

.. productionlist::
   SimpleValue: `TokCodeFragment`

The value is the string value of the code fragment.

.. productionlist::
   SimpleValue: "?"

``?`` represents an "unset" initializer.

.. productionlist::
   SimpleValue: "{" `ValueList` "}"
   ValueList: [`ValueListNE`]
   ValueListNE: `Value` ("," `Value`)*

This represents a sequence of bits, as would be used to initialize a
``bits<n>`` field (where ``n`` is the number of bits).

.. productionlist::
   SimpleValue: `ClassID` "<" `ValueListNE` ">"

This generates a new anonymous record definition (as would be created by an
unnamed ``def`` inheriting from the given class with the given template
arguments) and the value is the value of that record definition.

.. productionlist::
   SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]

A list initializer. The optional :token:`Type` can be used to indicate a
specific element type, otherwise the element type will be deduced from the
given values.

.. The initial `DagArg` of the dag must start with an identifier or
   !cast, but this is more of an implementation detail and so for now just
   leave it out.

.. productionlist::
   SimpleValue: "(" `DagArg` `DagArgList` ")"
   DagArgList: `DagArg` ("," `DagArg`)*
   DagArg: `Value` [":" `TokVarName`] | `TokVarName`

The initial :token:`DagArg` is called the "operator" of the dag.

.. productionlist::
   SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"

Bodies
------

.. productionlist::
   ObjectBody: `BaseClassList` `Body`
   BaseClassList: [":" `BaseClassListNE`]
   BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
   SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
   DefmID: `TokIdentifier`

The version with the :token:`MultiClassID` is only valid in the
:token:`BaseClassList` of a ``defm``.
The :token:`MultiClassID` should be the name of a ``multiclass``.

.. put this somewhere else

It is after parsing the base class list that the "let stack" is applied.

.. productionlist::
   Body: ";" | "{" BodyList "}"
   BodyList: BodyItem*
   BodyItem: `Declaration` ";"
           :| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";"

The ``let`` form allows overriding the value of an inherited field.

``def``
-------

.. TODO::
   There can be pastes in the names here, like ``#NAME#``. Look into that
   and document it (it boils down to ParseIDValue with IDParseMode ==
   ParseNameMode). ParseObjectName calls into the general ParseValue, with
   the only difference from "arbitrary expression parsing" being IDParseMode
   == Mode.

.. productionlist::
   Def: "def" `TokIdentifier` `ObjectBody`

Defines a record whose name is given by the :token:`TokIdentifier`. The
fields of the record are inherited from the base classes and defined in the
body.

Special handling occurs if this ``def`` appears inside a ``multiclass`` or
a ``foreach``.

``defm``
--------

.. productionlist::
   Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";"

Note that in the :token:`BaseClassList`, all of the ``multiclass``\es must
precede any ``class``\es that appear.

``foreach``
-----------

.. productionlist::
   Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
          :| "foreach" `Declaration` "in" `Object`

The value assigned to the variable in the declaration is iterated over and
the object or object list is reevaluated with the variable set at each
iterated value.

Top-Level ``let``
-----------------

.. productionlist::
   Let:   "let" `LetList` "in" "{" `Object`* "}"
      :| "let" `LetList` "in" `Object`
   LetList: `LetItem` ("," `LetItem`)*
   LetItem: `TokIdentifier` [`RangeList`] "=" `Value`

This is effectively equivalent to ``let`` inside the body of a record
except that it applies to multiple records at a time. The bindings are
applied at the end of parsing the base classes of a record.

``multiclass``
--------------

.. productionlist::
   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
             : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
   BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
   MultiClassID: `TokIdentifier`
   MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`
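
As an illustration of how the ``multiclass`` and ``defm`` productions fit
together, the following is a minimal sketch. All names in it (``Inst``,
``BinOp``, ``ADD``, ``Mnemonic``, ``IsImm``) are hypothetical and chosen only
for this example; the expansion described afterwards reflects the behavior of
the current implementation rather than anything mandated by the grammar
above::

   // A plain class with two template arguments (hypothetical names).
   class Inst<string mnemonic, bit isImm> {
     string Mnemonic = mnemonic;
     bit IsImm = isImm;
   }

   // A multiclass groups several defs that are instantiated together.
   multiclass BinOp<string mnemonic> {
     def rr : Inst<mnemonic, 0>;
     def ri : Inst<mnemonic, 1>;
   }

   // One defm instantiates every def inside the multiclass.
   defm ADD : BinOp<"add">;

With the current implementation, the ``defm`` is expected to produce two
records whose names combine the ``defm`` name with each inner ``def`` name
(here ``ADDrr`` and ``ADDri``), each inheriting its fields from ``Inst``.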