1 # Compiling P4 to EBPF 2 3 Mihai Budiu - mbudiu (a] barefootnetworks.com 4 5 September 22, 2015 6 7 ## Abstract 8 9 This document describes a prototype compiler that translates programs 10 written in the P4 programming languages to eBPF programs. The 11 translation is performed by generating programs written in a subset of 12 the C programming language, that are converted to EBPF using the BPF 13 Compiler Collection tools. 14 15 The compiler code is licensed under an [Apache v2.0 license] 16 (http://www.apache.org/licenses/LICENSE-2.0.html). 17 18 ## Preliminaries 19 20 In this section we give a brief overview of P4 and EBPF. A detailed 21 treatment of these topics is outside the scope of this text. 22 23 ### P4 24 25 P4 (http://p4.org) is a domain-specific programming language for 26 specifying the behavior of the dataplanes of network-forwarding 27 elements. The name of the programming language comes from the title 28 of a paper published in the proceedings of SIGCOMM Computer 29 Communications Review in 2014: 30 http://www.sigcomm.org/ccr/papers/2014/July/0000000.0000004: 31 "Programming Protocol-Independent Packet Processors". 32 33 P4 itself is protocol-independent but allows programmers to express a 34 rich set of data plane behaviors and protocols. The core P4 35 abstractions are: 36 37 * Header definitions describe the format (the set of fields and their 38 sizes) of each header within a packet. 39 40 * Parse graphs (finite-state machines) describe the permitted header 41 sequences within received packets. 42 43 * Tables associate keys to actions. P4 tables generalize traditional 44 forwarding tables; they can be used to implement routing tables, 45 flow lookup tables, access-control lists, etc. 46 47 * Actions describe how packet header fields and metadata are manipulated. 48 49 * Match-action units stitch together tables and actions, and perform 50 the following sequence of operations: 51 52 * Construct lookup keys from packet fields or computed metadata, 53 54 * Use the constructed lookup key to index into tables, choosing an 55 action to execute, 56 57 * Finally, execute the selected action. 58 59 * Control flow is expressed as an imperative program describing the 60 data-dependent packet processing within a pipeline, including the 61 data-dependent sequence of match-action unit invocations. 62 63 P4 programs describe the behavior of network-processing dataplanes. A 64 P4 program is designed to operate in concert with a separate *control 65 plane* program. The control plane is responsible for managing at 66 runtime the contents of the P4 tables. P4 cannot be used to specify 67 control-planes; however, a P4 program implicitly specifies the 68 interface between the data-plane and the control-plane. 69 70 The P4 language is under active development; the current stable 71 version is 1.0.2 (see http://p4.org/spec); a reference implementation 72 of a compiler and associated tools is freely available using a Apache 73 2 open-source license (see http://p4.org/code). 74 75 ### EBPF 76 77 #### Safe code 78 79 EBPF is a acronym that stands for Extended Berkeley Packet Filters. 80 In essence EBPF is a low-level programming language (similar to 81 machine code); EBPF programs are traditionally executed by a virtual 82 machine that resides in the Linux kernel. EBPF programs can be 83 inserted and removed from a live kernel using dynamic code 84 instrumentation. The main feature of EBPF programs is their *static 85 safety*: prior to execution all EBPF programs have to be validated as 86 being safe, and unsafe programs cannot be executed. A safe program 87 provably cannot compromise the machine it is running on: 88 89 * it can only access a restricted memory region (on the local stack) 90 91 * it can run only for a limited amount of time; during execution it 92 cannot block, sleep or take any locks 93 94 * it cannot use any kernel resources with the exception of a limited 95 set of kernel services which have been specifically whitelisted, 96 including operations to manipulate tables (described below) 97 98 #### Kernel hooks 99 100 EBPF programs are inserted into the kernel using *hooks*. There are 101 several types of hooks available: 102 103 * any function entry point in the kernel can act as a hook; attaching 104 an EBPF program to a function `foo()` will cause the EBPF program to 105 execute every time some kernel thread executes `foo()`. 106 107 * EBPF programs can also be attached using the Linux Traffic Control 108 (TC) subsystem, in the network packet processing datapath. Such 109 programs can be used as TC classifiers and actions. 110 111 * EBPF programs can also be attached to sockets or network interfaces. 112 In this case they can be used for processing packets that flow 113 through the socket/interface. 114 115 EBPF programs can be used for many purposes; the main use cases are 116 dynamic tracing and monitoring, and packet processing. We are mostly 117 interested in the latter use case in this document. 118 119 #### EBPF Tables 120 121 The EBPF runtime exposes a bi-directional kernel-userspace data 122 communication channel, called *tables* (also called maps in some EBPF 123 documents and code samples). EBPF tables are essentially key-value 124 stores, where keys and values are arbitrary fixed-size bitstrings. 125 The key width, value width and table size (maximum number of entries 126 that can be stored) are declared statically, at table creation time. 127 128 In user-space tables handles are exposed as file descriptors. Both 129 user- and kernel-space programs can manipulate tables, by inserting, 130 deleting, looking up, modifying, and enumerating entries in a table. 131 132 In kernel space the keys and values are exposed as pointers to the raw 133 underlying data stored in the table, whereas in user-space the 134 pointers point to copies of the data. 135 136 #### Concurrency 137 138 An important aspect to understand related to EBPF is the execution 139 model. An EBPF program is triggered by a kernel hook; multiple 140 instances of the same kernel hook can be running simultaneously on 141 different cores. 142 143 Each table however has a single instances across all the cores. A 144 single table may be accessed simultaneously by multiple instances of 145 the same EBPF program running as separate kernel threads on different 146 cores. EBPF tables are native kernel objects, and access to the table 147 contents is protected using the kernel RCU mechanism. This makes 148 access to table entries safe under concurrent execution; for example, 149 the memory associated to a value cannot be accidentally freed while an 150 EBPF program holds a pointer to the respective value. However, 151 accessing tables is prone to data races; since EBPF programs cannot 152 use locks, some of these races often cannot be avoided. 153 154 EBPF and the associated tools are also under active development, and 155 new capabilities are added frequently. The P4 compiler generates code 156 that can be compiled using the BPF Compiler Collection (BCC) 157 (https://github.com/iovisor/bcc) 158 159 ## Compiling P4 to EBPF 160 161 From the above description it is apparent that the P4 and EBPF 162 programming languages have different expressive powers. However, 163 there is a significant overlap in their capabilities, in particular, 164 in the domain of network packet processing. The following image 165 illustrates the situation: 166 167 ![P4 and EBPF overlap in capabilities](scope.png) 168 169 We expect that the overlapping region will grow in size as both P4 and 170 EBPF continue to mature. 171 172 The current version of the P4 to EBPF compiler translates programs 173 written in the version 1.1 of the P4 programming language to programs 174 written in a restricted subset of C. The subset of C is chosen such 175 that it should be compilable to EBPF using BCC. 176 177 ``` 178 -------------- ------- 179 P4 ---> | P4-to-EBPF | ---> C ----> | BCC | --> EBPF 180 -------------- ------- 181 ``` 182 183 The P4 program only describes the packet processing *data plane*, that 184 runs in the Linux kernel. The *control plane* must be separately 185 implemented by the user. The BCC tools simplify this task 186 considerably, by generating C and/or Python APIs that expose the 187 dataplane/control-plane APIs. 188 189 ### Dependencies 190 191 EBPF programs require a Linux kernel with version 4.2 or newer. 192 193 In order to use the P4 to EBPF compiler the following software must be installed: 194 195 * The compiler itself is written in the Python (v2.x) programming 196 language. 197 198 * the P4 compiler front-end: (https://github.com/p4lang/p4-hlir). 199 This is required for parsing the P4 programs. 200 201 * the BCC compiler collection tools: (https://github.com/iovisor/bcc). 202 This is required for compiling the generated code. Also, BCC comes 203 with a set of Python utilities which can be used to implement 204 control-plane programs that operate in concert with the kernel EBPF 205 datapath. 206 207 The P4 to EBPF compiler generates code that is designed for being used 208 as a classifier using the Linux TC subsystem. 209 210 Furthermore, the test code provided is written using the Python (v3.x) 211 programming language and requires several Python packages to be 212 installed. 213 214 ### Supported capabilities 215 216 The current version of the P4 to EBPF compiler supports a relatively 217 narrow subset of the P4 language, but still powerful enough to write 218 very complex packet filters and simple packet forwarding engines. In 219 the spirit of open-source "release early, release often", we expect 220 that the compiler's capabilities will improve gradually. 221 222 * Packet filtering is performed using the `drop()` action. Packets 223 that are not dropped will be forwarded. 224 225 * Packet forwarding is performed by setting the 226 `standard_metadata.egress_port` to the index of the destination 227 network interface 228 229 Here are some limitations imposed on the P4 programs: 230 231 * Currently both the ingress and the egress P4 pipelines are executed 232 at the same hook (wherever the user chooses to insert the generated 233 EBPF program). In the future the compiler should probably generate 234 two separate EBPF programs. 235 236 * arbitrary parsers can be compiled, but the BCC compiler will reject 237 parsers that contain cycles 238 239 * arithmetic on data wider than 32 bits is not supported 240 241 * checksum computations are not implemented. In consequence, programs 242 that IP/TCP/UDP headers will produce incorrect packet headers. 243 244 * EBPF does not offer support for ternary or LPM tables 245 246 * P4 cloning and recirculation and not supported 247 248 * meters and registers are not supported; only direct counters are 249 currently supported. EBPF can potentially support registers and 250 arbitrary counters, so these may appear in the future. 251 252 * learning (i.e. `generate_digest`) is not implemented 253 254 ### Translating P4 to C 255 256 To simplify the translation, the P4 programmer should refrain using 257 identifiers whose name starts with `ebpf_`. 258 259 The following table provides a brief summary of how each P4 construct 260 is mapped to a corresponding C construct: 261 262 #### Translating parsers 263 264 P4 Construct | C Translation 265 ----------|------------ 266 `header_type` | `struct` type 267 `header` | `struct` instance with an additional `valid` bit 268 `metadata` | `struct` instance 269 parser state | code block 270 state transition | `goto` statement 271 `extract` | load/shift/mask data from packet buffer 272 273 #### Translating match-action pipelines 274 275 P4 Construct | C Translation 276 ----------|------------ 277 table | 2 EBPF tables: second one used just for the default action 278 table key | `struct` type 279 table `actions` block | tagged `union` with all possible actions 280 `action` arguments | `struct` 281 table `reads` | EBPF table access 282 `action` body | code block 283 table `apply` | `switch` statement 284 counters | additional EBPF table 285 286 ### Code organization 287 288 The compiler code is organized in two folders: 289 290 * `compiler`: the complete compiler source code, in Python v2.x 291 The compiler entry point is `p4toEbpf.py`. 292 293 * `test`: testing code and data. There are two testing programs: 294 * `testP4toEbpf.py`: which compiles all P4 files in the testprograms folder 295 296 * `endToEndTest.py`: which compiles and executes the simple.p4 297 program, and includes a simple control plane 298 299 Currently the compiler contains no installation capabilities. 300 301 ### Invoking the compiler 302 303 Invoking the compiler is just a matter of invoking the python program 304 with a suitable input P4 file: 305 306 ``` 307 p4toEbpf.py file.p4 -o file.c 308 ``` 309 310 #### Compiler options 311 312 The P4 compiler first runs the C preprocessor on the input P4 file. 313 Some of the command-line options are passed directly to the 314 preprocessor. 315 316 The following compiler options are available: 317 318 Option | Meaning 319 -------|-------- 320 `-D macro` | Option passed to C preprocessor 321 `-I path` | Option passed to C preprocessor 322 `-U macro` | Option passed to C preprocessor 323 `-g [router|filter]` | Controls whether the generated code behaves like a router or a filter. 324 `-o outoutFile` | writes the generated C code to the specified output file. 325 326 The `-g` option controls the nature of the generated code: 327 328 * `-g filter` generates a filter; the only P4 action that has an 329 effect is the `drop()` action. Setting metadata in P4 (e.g., 330 `egress_port`) has no effect. 331 332 * `-g router` generates a simple router; both `drop()` and 333 `egress_port` impact packet processing. 334 335 #### Using the generated code 336 337 The resulting file contains the complete data structures, tables, and 338 a C function named `ebpf_filter` that implements the P4-specified 339 data-plane. This C file can be manipulated using the BCC tools; 340 please refer to the BCC project documentation and sample test files of 341 the P4 to EBPF source code for an in-depth understanding. A minimal 342 Python program that compiles and loads into the kernel the generated 343 file into EBPF is: 344 345 ``` 346 #!/usr/bin/env python3 347 from bcc import BPF 348 349 b = BPF(src_file="file.c", debug=0) 350 fn = b.load_func("ebpf_filter", BPF.SCHED_CLS) 351 ``` 352 353 ##### Connecting the generated program with the TC 354 355 The EBPF code that is generated is intended to be used as a classifier 356 attached to the ingress packet path using the Linux TC subsystem. The 357 same EBPF code should be attached to all interfaces. Note however 358 that all EBPF code instances share a single set of tables, which are 359 used to control the program behavior. 360 361 The following code fragment illustrates how the EBPF code can be 362 hooked up to the `eth0` interface using a Python program. (The `fn` 363 variable is the one produced by the previous code fragment). 364 365 ``` 366 from pyroute2 import IPRoute 367 368 ipr = IPRoute() 369 interface_name="eth0" 370 if_index = ipr.link_lookup(ifname=interface_name)[0] 371 ipr.tc("add", "ingress", if_index, "ffff:") 372 ipr.tc("add-filter", "bpf", if_index, ":1", fd=fn.fd, 373 name=fn.name, parent="ffff:", action="ok", classid=1) 374 ``` 375