===========================
TableGen Language Reference
===========================

.. contents::
   :local:

.. warning::
   This document is extremely rough. If you find something lacking, please
   fix it, file a documentation bug, or ask about it on llvm-dev.

Introduction
============

This document is meant to be a normative spec about the TableGen language
in and of itself (i.e. how to understand a given construct in terms of how
it affects the final set of records represented by the TableGen file). If
you are unsure if this document is really what you are looking for, please
read the :doc:`introduction to TableGen <index>` first.

Notation
========

The lexical and syntax notation used here is intended to imitate
`Python's`_. In particular, for lexical definitions, the productions
operate at the character level and there is no implied whitespace between
elements. The syntax definitions operate at the token level, so there is
implied whitespace between tokens.

.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation

Lexical Analysis
================

TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
comments.

The following is a listing of the basic punctuation tokens::

   - + [ ] { } ( ) < > : ; . = ? #

Numeric literals take one of the following forms:

.. TableGen actually will lex some pretty strange sequences and interpret
   them as numbers. What is shown here is an attempt to approximate what it
   "should" accept.

.. productionlist::
   TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
   DecimalInteger: ["+" | "-"] ("0"..."9")+
   HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
   BinInteger: "0b" ("0" | "1")+

One aspect to note is that the :token:`DecimalInteger` token *includes* the
``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
most languages do.

Also note that :token:`BinInteger` creates a value of type ``bits<n>``
(where ``n`` is the number of bits). This will implicitly convert to
integers when needed.

TableGen has identifier-like tokens:

.. productionlist::
   ualpha: "a"..."z" | "A"..."Z" | "_"
   TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
   TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")*

Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
begin with a number. In case of ambiguity, a token will be interpreted as a
numeric literal rather than an identifier.

TableGen also has two string-like literals:

.. productionlist::
   TokString: '"' <non-'"' characters and C-like escapes> '"'
   TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"

:token:`TokCodeFragment` is essentially a multiline string literal
delimited by ``[{`` and ``}]``.

.. note::
   The current implementation accepts the following C-like escapes::

      \\ \' \" \t \n

TableGen also has the following keywords::

   bit   bits      class   code         dag
   def   foreach   defm    field        in
   int   let       list    multiclass   string

TableGen also has "bang operators" which have a
wide variety of meanings:

.. productionlist::
   BangOperator: one of
               :!eq     !if      !head    !tail      !con
               :!add    !shl     !sra     !srl       !and
               :!cast   !empty   !subst   !foreach   !listconcat   !strconcat

Syntax
======

TableGen has an ``include`` mechanism.
It does not play a role in the
syntax per se, since it is lexically replaced with the contents of the
included file.

.. productionlist::
   IncludeDirective: "include" `TokString`

TableGen's top-level production consists of "objects".

.. productionlist::
   TableGenFile: `Object`*
   Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`

``class``\es
------------

.. productionlist::
   Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`

A ``class`` declaration creates a record which other records can inherit
from. A class can be parametrized by a list of "template arguments", whose
values can be used in the class body.

A given class can only be defined once. A ``class`` declaration is
considered to define the class if any of the following is true:

.. break ObjectBody into its constituents so that they are present here?

#. The :token:`TemplateArgList` is present.
#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.

You can declare an empty class by giving an empty :token:`TemplateArgList`
and an empty :token:`ObjectBody`. This can serve as a restricted form of
forward declaration: note that records deriving from the forward-declared
class will inherit no fields from it since the record expansion is done
when the record is parsed.

.. productionlist::
   TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"

Declarations
------------

.. Omitting mention of arcane "field" prefix to discourage its use.

The declaration syntax is pretty much what you would expect as a C++
programmer.

.. productionlist::
   Declaration: `Type` `TokIdentifier` ["=" `Value`]

It assigns the value to the identifier.

Types
-----

.. productionlist::
   Type: "string" | "code" | "bit" | "int" | "dag"
       :| "bits" "<" `TokInteger` ">"
       :| "list" "<" `Type` ">"
       :| `ClassID`
   ClassID: `TokIdentifier`

Both ``string`` and ``code`` correspond to the string type; the difference
is purely to indicate programmer intention.

The :token:`ClassID` must identify a class that has been previously
declared or defined.

Values
------

.. productionlist::
   Value: `SimpleValue` `ValueSuffix`*
   ValueSuffix: "{" `RangeList` "}"
              :| "[" `RangeList` "]"
              :| "." `TokIdentifier`
   RangeList: `RangePiece` ("," `RangePiece`)*
   RangePiece: `TokInteger`
             :| `TokInteger` "-" `TokInteger`
             :| `TokInteger` `TokInteger`

The peculiar last form of :token:`RangePiece` is due to the fact that the
"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
instead of "1", "-", and "5".
The :token:`RangeList` can be thought of as specifying a "list slice" in
some contexts.


:token:`SimpleValue` has a number of forms:


.. productionlist::
   SimpleValue: `TokIdentifier`

The value will be the variable referenced by the identifier. It can be one
of:

.. The code for this is exceptionally abstruse. These examples are a
   best-effort attempt.

* name of a ``def``, such as the use of ``Bar`` in::

     def Bar : SomeClass {
       int X = 5;
     }

     def Foo {
       SomeClass Baz = Bar;
     }

* value local to a ``def``, such as the use of ``Bar`` in::

     def Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg of a ``class``, such as the use of ``Bar`` in::

     class Foo<int Bar> {
       int Baz = Bar;
     }

* value local to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo<int Bar> {
       int Baz = Bar;
     }

.. productionlist::
   SimpleValue: `TokInteger`

This represents the numeric value of the integer.

.. productionlist::
   SimpleValue: `TokString`+

Multiple adjacent string literals are concatenated like in C/C++. The value
is the concatenation of the strings.

.. productionlist::
   SimpleValue: `TokCodeFragment`

The value is the string value of the code fragment.

.. productionlist::
   SimpleValue: "?"

``?`` represents an "unset" initializer.

.. productionlist::
   SimpleValue: "{" `ValueList` "}"
   ValueList: [`ValueListNE`]
   ValueListNE: `Value` ("," `Value`)*

This represents a sequence of bits, as would be used to initialize a
``bits<n>`` field (where ``n`` is the number of bits).

.. productionlist::
   SimpleValue: `ClassID` "<" `ValueListNE` ">"

This generates a new anonymous record definition (as would be created by an
unnamed ``def`` inheriting from the given class with the given template
arguments) and the value is the value of that record definition.

.. productionlist::
   SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]

A list initializer.
The optional :token:`Type` can be used to indicate a
specific element type, otherwise the element type will be deduced from the
given values.

.. The initial `DagArg` of the dag must start with an identifier or
   !cast, but this is more of an implementation detail and so for now just
   leave it out.

.. productionlist::
   SimpleValue: "(" `DagArg` `DagArgList` ")"
   DagArgList: `DagArg` ("," `DagArg`)*
   DagArg: `Value` [":" `TokVarName`] | `TokVarName`

The initial :token:`DagArg` is called the "operator" of the dag.

.. productionlist::
   SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"

Bodies
------

.. productionlist::
   ObjectBody: `BaseClassList` `Body`
   BaseClassList: [":" `BaseClassListNE`]
   BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
   SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
   DefmID: `TokIdentifier`

The version with the :token:`MultiClassID` is only valid in the
:token:`BaseClassList` of a ``defm``.
The :token:`MultiClassID` should be the name of a ``multiclass``.

.. put this somewhere else

It is after parsing the base class list that the "let stack" is applied.

.. productionlist::
   Body: ";" | "{" BodyList "}"
   BodyList: BodyItem*
   BodyItem: `Declaration` ";"
           :| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";"

The ``let`` form allows overriding the value of an inherited field.

``def``
-------

.. TODO::
   There can be pastes in the names here, like ``#NAME#``. Look into that
   and document it (it boils down to ParseIDValue with IDParseMode ==
   ParseNameMode). ParseObjectName calls into the general ParseValue, with
   the only difference from "arbitrary expression parsing" being IDParseMode
   == Mode.

.. productionlist::
   Def: "def" `TokIdentifier` `ObjectBody`

Defines a record whose name is given by the :token:`TokIdentifier`. The
fields of the record are inherited from the base classes and defined in the
body.

Special handling occurs if this ``def`` appears inside a ``multiclass`` or
a ``foreach``.

``defm``
--------

.. productionlist::
   Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";"

Note that in the :token:`BaseClassList`, all of the ``multiclass``'s must
precede any ``class``'s that appear.

``foreach``
-----------

.. productionlist::
   Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
          :| "foreach" `Declaration` "in" `Object`

The value assigned to the variable in the declaration is iterated over and
the object or object list is reevaluated with the variable set at each
iterated value.

Top-Level ``let``
-----------------

.. productionlist::
   Let: "let" `LetList` "in" "{" `Object`* "}"
      :| "let" `LetList` "in" `Object`
   LetList: `LetItem` ("," `LetItem`)*
   LetItem: `TokIdentifier` [`RangeList`] "=" `Value`

This is effectively equivalent to ``let`` inside the body of a record
except that it applies to multiple records at a time. The bindings are
applied at the end of parsing the base classes of a record.

``multiclass``
--------------

.. productionlist::
   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
             : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
   BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
   MultiClassID: `TokIdentifier`
   MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`
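
As an informal illustration of how a ``multiclass`` and a ``defm`` fit
together, consider the sketch below. The names ``Inst``, ``RegImm``,
``AsmString``, ``Opcode``, ``ADD`` and ``SUB`` are hypothetical and chosen
only for this example; they are not part of TableGen or of any backend. The
sketch assumes the usual behaviour in which the name given to ``defm`` is
prepended to the name of each ``def`` inside the ``multiclass``::

   // Hypothetical base class with two template arguments.
   class Inst<string asm, int opcode> {
     string AsmString = asm;
     int Opcode = opcode;
   }

   // Hypothetical multiclass that stamps out two related records.
   multiclass RegImm<string mnemonic, int base> {
     def rr : Inst<!strconcat(mnemonic, " $dst, $src"), base>;
     def ri : Inst<!strconcat(mnemonic, " $dst, $imm"), !add(base, 1)>;
   }

   // Each defm instantiates every def in the multiclass.
   defm ADD : RegImm<"add", 16>;
   defm SUB : RegImm<"sub", 32>;

Under that assumption, the two ``defm`` lines would be expected to produce
four records named ``ADDrr``, ``ADDri``, ``SUBrr`` and ``SUBri``, each
inheriting its fields from ``Inst``.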