# Syscall descriptions syntax

Pseudo-formal grammar of syscall description:

```
syscallname "(" [arg ["," arg]*] ")" [type]
arg = argname type
argname = identifier
type = typename [ "[" type-options "]" ]
typename = "const" | "intN" | "intptr" | "flags" | "array" | "ptr" |
	   "buffer" | "string" | "strconst" | "filename" | "len" |
	   "bytesize" | "bytesizeN" | "bitsize" | "vma" | "proc"
type-options = [type-opt ["," type-opt]*]
```

Common type-options include:

```
"opt" - the argument is optional (like mmap fd argument, or accept peer argument)
```

The rest of the type-options are type-specific:

```
"const": integer constant, type-options:
	value, underlying type (one of "intN", "intptr")
"intN"/"intptr": an integer without a particular meaning, type-options:
	optional range of values (e.g. "5:10", or "100:200")
"flags": a set of flags, type-options:
	reference to flags description (see below)
"array": a variable/fixed-length array, type-options:
	type of elements, optional size (fixed "5", or ranged "5:10", boundaries inclusive)
"ptr"/"ptr64": a pointer to an object, type-options:
	type of the object; direction (in/out/inout)
	ptr64 has size of 8 bytes regardless of target pointer size
"buffer": a pointer to a memory buffer (like read/write buffer argument), type-options:
	direction (in/out/inout)
"string": a zero-terminated memory buffer (no pointer indirection implied), type-options:
	either a string value in quotes for constant strings (e.g. "foo"),
	or a reference to string flags (special value `filename` produces file names),
	optionally followed by a buffer size (string values will be padded with \x00 to that size)
"stringnoz": a non-zero-terminated memory buffer (no pointer indirection implied), type-options:
	either a string value in quotes for constant strings (e.g. "foo"),
	or a reference to string flags
"fmt": a string representation of an integer (not zero-terminated), type-options:
	format (one of "dec", "hex", "oct") and the value (a resource, int, flags, const or proc)
	the resulting data is always fixed-size (formatted as "%020llu", "0x%016llx" or "%023llo", respectively)
"fileoff": offset within a file
"len": length of another field (for array it is number of elements), type-options:
	argname of the object
"bytesize": similar to "len", but always denotes the size in bytes, type-options:
	argname of the object
"bitsize": similar to "len", but always denotes the size in bits, type-options:
	argname of the object
"vma": a pointer to a set of pages (used as input for mmap/munmap/mremap/madvise), type-options:
	optional number of pages (e.g. vma[7]), or a range of pages (e.g. vma[2-4])
"proc": per-process int (see description below), type-options:
	value range start, how many values per process, underlying type
"text": machine code of the specified type, type-options:
	text type (x86_real, x86_16, x86_32, x86_64, arm64)
"void": type with static size 0
	mostly useful inside of templates and varlen unions, can't be syscall argument
```
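
The fixed-size output of `fmt` can be illustrated with a short Python sketch. It is not syzkaller code: the helper name `fmt_value` is hypothetical, and the sketch only mirrors the three printf-style formats quoted above, assuming the value is treated as an unsigned 64-bit integer.

```python
# Hypothetical helper mirroring the printf formats listed for "fmt".
FORMATS = {
    "dec": "%020d",    # C "%020llu": 20 decimal digits, zero-padded
    "hex": "0x%016x",  # C "0x%016llx": "0x" followed by 16 hex digits
    "oct": "%023o",    # C "%023llo": 23 octal digits, zero-padded
}

def fmt_value(kind: str, value: int) -> str:
    """Render value as a fixed-size string (value is masked to 64 bits)."""
    return FORMATS[kind] % (value & 0xFFFFFFFFFFFFFFFF)

# The length depends only on the format, never on the value.
for kind in ("dec", "hex", "oct"):
    print(kind, len(fmt_value(kind, 1)), len(fmt_value(kind, 2**64 - 1)))
```

Under these assumptions the lengths are 20, 18 and 23 bytes for `dec`, `hex` and `oct`, even for the largest 64-bit value.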

`flags` and `len` also take a trailing underlying-type type-option when used in structs, unions or pointers.

Flags are described as:

```
flagname = const ["," const]*
```

or for string flags as:

```
flagname = "\"" literal "\"" ["," "\"" literal "\""]*
```

## Ints

`int8`, `int16`, `int32` and `int64` denote an integer of the corresponding size.
`intptr` denotes a pointer-sized integer, i.e. C `long` type.

Appending the `be` suffix (e.g. `int16be`) makes an integer big-endian.

A range of values for an integer can be specified in the format `int32[0:100]`.

To denote a bitfield of size N use `int64:N`.

These various kinds of ints can be used as base types for `const`, `flags`, `len` and `proc`.

```
example_struct {
	f0	int8			# random 1-byte integer
	f1	const[0x42, int16be]	# const 2-byte integer with value 0x4200 (big-endian 0x42)
	f2	int32[0:100]		# random 4-byte integer with values from 0 to 100 inclusive
	f3	int64:20		# random 20-bit bitfield
}
```
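
The comment on `f1` is easier to follow with the bytes written out. The Python sketch below is only an illustration and assumes the comment's "0x4200" refers to how a little-endian host would read the two bytes back. It also shows the value bound implied by a 20-bit bitfield.

```python
import struct

# const[0x42, int16be]: the value 0x42 stored as a big-endian 2-byte integer.
raw = struct.pack(">H", 0x42)
print(raw.hex())  # 0042

# The same two bytes read back as a little-endian 16-bit integer.
as_little_endian = struct.unpack("<H", raw)[0]
print(hex(as_little_endian))  # 0x4200

# int64:20 holds 20 bits, so its values lie in [0, 2**20 - 1].
BITFIELD_MAX = (1 << 20) - 1
print(BITFIELD_MAX)  # 1048575
```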

## Structs

Structs are described as:

```
structname "{" "\n"
	(fieldname type "\n")+
"}" ("[" attribute* "]")?
```

Structs can have attributes specified in square brackets after the struct.
Attributes are:

```
"packed": the struct has no padding and has default alignment 1
"align_N": the struct has alignment N
"size": the struct is padded up to the specified size
```
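
How the three attributes interact can be sketched in Python. This is a simplified model, not syzkaller's layout code: it assumes C-like natural alignment for fields of non-packed structs, and the function names `align_up` and `struct_size` are hypothetical.

```python
def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment

def struct_size(fields, packed=False, align=None, size=None):
    """fields is a list of (field_size, field_alignment) pairs in bytes."""
    offset = 0
    natural_align = 1
    for field_size, field_align in fields:
        if not packed:
            # Assumed C-like rule: pad so the field starts at its alignment.
            offset = align_up(offset, field_align)
            natural_align = max(natural_align, field_align)
        offset += field_size
    # align_N overrides the default alignment (1 for packed structs).
    total = align_up(offset, align if align is not None else natural_align)
    if size is not None:
        if total > size:
            raise ValueError("struct is larger than the requested size")
        total = size  # "size": pad up to the specified size
    return total

fields = [(1, 1), (4, 4)]  # an int8 followed by an int32
print(struct_size(fields))                        # 8
print(struct_size(fields, packed=True))           # 5
print(struct_size(fields, packed=True, align=4))  # 8
print(struct_size(fields, size=16))               # 16
```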

## Unions

Unions are described as:

```
unionname "[" "\n"
	(fieldname type "\n")+
"]"
```

Unions can have a trailing "varlen" attribute (specified in square brackets after the union),
which means that the union's length is not the maximum of all option lengths,
but rather the length of the particular chosen option.
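
The effect of `varlen` on the union's size can be stated as a small Python sketch. The function name `union_size` and the option names are hypothetical and only restate the rule above.

```python
def union_size(option_sizes, chosen, varlen=False):
    """Size in bytes of a union once the option `chosen` has been picked."""
    if varlen:
        # varlen: only the chosen option determines the length.
        return option_sizes[chosen]
    # default: every instance is as large as the largest option.
    return max(option_sizes.values())

sizes = {"small": 1, "big": 8}  # hypothetical options and their sizes
print(union_size(sizes, "small"))               # 8
print(union_size(sizes, "small", varlen=True))  # 1
```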

## Resources

Resources represent values that need to be passed from the output of one syscall to the input of another syscall. For example, the `close` syscall requires an input value (fd) previously returned by the `open` or `pipe` syscall. To achieve this, `fd` is declared as a resource. Resources are described as:

```
"resource" identifier "[" underlying_type "]" [ ":" const ("," const)* ]
```

`underlying_type` is either one of `int8`, `int16`, `int32`, `int64`, `intptr` or another resource (which models inheritance; for example, a socket is a subtype of fd). The optional set of constants represents the resource's special values, for example, `0xffffffffffffffff` (-1) for "no fd", or `AT_FDCWD` for "the current dir". Special values are used once in a while as resource values. If no special values are specified, a special value of `0` is used. Resources can then be used as types, for example:

```
resource fd[int32]: 0xffffffffffffffff, AT_FDCWD, 1000000
resource sock[fd]
resource sock_unix[sock]

socket(...) sock
accept(fd sock, ...) sock
listen(fd sock, backlog int32)
```
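
The inheritance relation between the resources declared above can be modelled as a walk up a parent chain. The Python sketch below is an illustration only: `PARENT` and `is_subtype` are hypothetical names, and the code assumes that a more specific resource may be used wherever one of its ancestors is expected.

```python
# Parent links taken from the resource declarations above.
PARENT = {"fd": None, "sock": "fd", "sock_unix": "sock"}

def is_subtype(resource, wanted):
    """Return True if `resource` is `wanted` or inherits from it."""
    while resource is not None:
        if resource == wanted:
            return True
        resource = PARENT[resource]
    return False

print(is_subtype("sock_unix", "fd"))  # True: sock_unix -> sock -> fd
print(is_subtype("fd", "sock"))       # False: a plain fd is not a socket
```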

## Type Aliases

Complex types that are often repeated can be given short type aliases using the
following syntax:

```
type identifier underlying_type
```

For example:

```
type signalno int32[0:65]
type net_port proc[20000, 4, int16be]
```

A type alias can then be used in place of its underlying type in any context.
The underlying type must be described as if it were a struct field, that is,
with the base type if one is required. A type alias can nevertheless also be used
as a syscall argument. Underlying types are currently restricted to integer types,
`ptr`, `ptr64`, `const`, `flags` and `proc` types.

There are some builtin type aliases:
```
type bool8	int8[0:1]
type bool16	int16[0:1]
type bool32	int32[0:1]
type bool64	int64[0:1]
type boolptr	intptr[0:1]

type filename string[filename]
```

## Type Templates

Type templates can be declared as follows:
```
type buffer[DIR] ptr[DIR, array[int8]]
type fileoff[BASE] BASE
type nlattr[TYPE, PAYLOAD] {
	nla_len		len[parent, int16]
	nla_type	const[TYPE, int16]
	payload		PAYLOAD
} [align_4]
```

and later used as follows:
```
syscall(a buffer[in], b fileoff[int64], c ptr[in, nlattr[FOO, int32]])
```
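
Conceptually, using a template replaces each parameter name in the template body with the supplied argument. The Python sketch below models this as plain textual substitution. That is an assumption made for illustration, not a description of how the syzkaller compiler implements templates, and `expand_template` is a hypothetical name.

```python
import re

def expand_template(params, body, args):
    """Substitute template parameters in body with the given arguments."""
    if len(params) != len(args):
        raise ValueError("wrong number of template arguments")
    mapping = dict(zip(params, args))
    # Match parameter names only as whole words.
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(p) for p in params) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(0)], body)

print(expand_template(["DIR"], "ptr[DIR, array[int8]]", ["in"]))
# ptr[in, array[int8]]
print(expand_template(["TYPE", "PAYLOAD"], "const[TYPE, int16]", ["FOO", "int32"]))
# const[FOO, int16]
```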

There is a builtin type template `optional` defined as:
```
type optional[T] [
	val	T
	void	void
] [varlen]
```

## Length

You can specify the length of a particular field in a struct or of a named argument by using the `len`, `bytesize` and `bitsize` types, for example:

```
write(fd fd, buf buffer[in], count len[buf]) len[buf]

sock_fprog {
	len	len[filter, int16]
	filter	ptr[in, array[sock_filter]]
}
```

If `len`'s argument is a pointer (or a `buffer`), then the length of the pointee argument is used.

To denote the length of a field in N-byte words use `bytesizeN`; possible values for N are 1, 2, 4 and 8.
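
The difference between the length types is easiest to see on an array. The Python sketch below is illustrative only: the function name `length_values` is hypothetical, and it assumes `bytesizeN` is the byte size divided by N using integer division.

```python
def length_values(elem_size, count, word=1):
    """Length values for an array of `count` elements of `elem_size` bytes."""
    total_bytes = elem_size * count
    return {
        "len": count,                       # number of elements
        "bytesize": total_bytes,            # size in bytes
        "bitsize": total_bytes * 8,         # size in bits
        "bytesizeN": total_bytes // word,   # size in N-byte words (assumed integer division)
    }

# An array of four int32 elements, measured in 8-byte words.
print(length_values(elem_size=4, count=4, word=8))
# {'len': 4, 'bytesize': 16, 'bitsize': 128, 'bytesizeN': 2}
```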

To denote the length of the parent struct, you can use `len[parent, int8]`.
To denote the length of a higher-level parent when structs are embedded into one another, you can specify the type name of the particular parent:

```
s1 {
    f0      len[s2]  # length of s2
}

s2 {
    f0      s1
    f1      array[int32]
}
```

## Proc

The `proc` type can be used to denote per-process integers.
The idea is to have a separate range of values for each executor, so they don't interfere.

The simplest example is a port number.
The `proc[20000, 4, int16be]` type means that we want to generate an `int16be`
integer starting from `20000` and assign `4` values for each process.
As a result the executor number `n` will get values in the `[20000 + n * 4, 20000 + (n + 1) * 4)` range.
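
The range formula above can be checked with a few lines of Python. The function name `proc_values` is hypothetical, and the sketch only evaluates the half-open interval given in the text.

```python
def proc_values(start, per_proc, n):
    """Values available to executor number n for proc[start, per_proc, ...]."""
    return list(range(start + n * per_proc, start + (n + 1) * per_proc))

print(proc_values(20000, 4, 0))  # [20000, 20001, 20002, 20003]
print(proc_values(20000, 4, 1))  # [20004, 20005, 20006, 20007]
```

Because the interval is half-open, the ranges of consecutive executors never overlap.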

## Integer Constants

Integer constants can be specified as decimal literals, as `0x`-prefixed
hex literals, as `'`-surrounded char literals, or as symbolic constants
extracted from kernel headers or defined by `define` directives. For example:

```
foo(a const[10], b const[-10])
foo(a const[0xabcd])
foo(a int8['a':'z'])
foo(a const[PATH_MAX])
foo(a ptr[in, array[int8, MY_PATH_MAX]])
define MY_PATH_MAX	PATH_MAX + 2
```
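
The four literal forms can be told apart with a small amount of logic. The following Python sketch is not the syzkaller parser: `parse_const` is a hypothetical helper, and the symbol table is filled in by hand with illustrative values (4096 is the usual Linux value of `PATH_MAX`), whereas real symbolic constants come from kernel headers or `define` directives.

```python
def parse_const(token, symbols):
    """Resolve a decimal, 0x-hex, 'c' char, or symbolic constant to an int."""
    if len(token) == 3 and token[0] == "'" and token[2] == "'":
        return ord(token[1])       # char literal such as 'a'
    try:
        return int(token, 0)       # decimal or 0x-prefixed hex
    except ValueError:
        return symbols[token]      # symbolic constant

# Hand-written stand-in for values extracted from headers and defines.
symbols = {"PATH_MAX": 4096}
symbols["MY_PATH_MAX"] = symbols["PATH_MAX"] + 2

for token in ["10", "-10", "0xabcd", "'a'", "PATH_MAX", "MY_PATH_MAX"]:
    print(token, parse_const(token, symbols))
```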

## Misc

Description files also contain `include` directives that refer to Linux kernel header files,
`incdir` directives that refer to custom Linux kernel header directories
and `define` directives that define symbolic constant values.