{{+bindTo:partials.standard_nacl_article}}

<section id="nacl-sfi-model-on-x86-64-systems">
<span id="x86-64-sandbox"></span><h1 id="nacl-sfi-model-on-x86-64-systems"><span id="x86-64-sandbox"></span>NaCl SFI model on x86-64 systems</h1>
<div class="contents local" id="contents" style="display: none">
<ul class="small-gap">
<li><a class="reference internal" href="#summary" id="id5">Summary</a></li>
<li><a class="reference internal" href="#binary-format" id="id6">Binary Format</a></li>
<li><a class="reference internal" href="#runtime-invariants" id="id7">Runtime Invariants</a></li>
<li><a class="reference internal" href="#text-segment-rules" id="id8">Text Segment Rules</a></li>
<li><a class="reference internal" href="#list-of-pseudo-instructions" id="id9">List of Pseudo-instructions</a></li>
</ul>

</div><section id="summary">
<h2 id="summary">Summary</h2>
<p>This document describes the Software Fault Isolation
(SFI) model for executable code that can be run in Native Client on an
x86-64 system. An overview of this model can be found in the paper:
<a class="reference external" href="https://research.google.com/pubs/archive/35649.pdf">Adapting Software Fault Isolation to Contemporary CPU Architectures</a>.
The primary focus of the SFI model is a Windows x86-64 system, but the
same techniques can be applied to run identical x86-64 binaries on
other x86-64 systems such as Linux, Mac, FreeBSD, etc., so the
description of the SFI model tries to abstract away system
dependencies when possible.</p>
<p>Please note: throughout this document we use the AT&T notation for
assembler syntax, in which the target operand appears last, e.g.
<code>mov src, dst</code>.</p>
</section><section id="binary-format">
<h2 id="binary-format">Binary Format</h2>
<p>The format of Native Client executable binaries is identical to the
x86-64 ELF binary format (<a class="reference external" href="http://en.wikipedia.org/wiki/Executable_and_Linkable_Format">[0]</a>, <a class="reference external" href="http://www.sco.com/developers/devspecs/gabi41.pdf">[1]</a>, <a class="reference external" href="http://www.sco.com/developers/gabi/latest/contents.html">[2]</a>, <a class="reference external" href="http://downloads.openwatcom.org/ftp/devel/docs/elf-64-gen.pdf">[3]</a>) for
Linux or BSD with a few extra requirements. The additional rules that
a Native Client ELF binary must follow are:</p>
<ul class="small-gap">
<li>The ELF magic OS ABI field must be 123.</li>
<li>The ELF magic OS ABI VERSION field must be 5.</li>
<li>The ELF e_flags field must be 0x200000 (32-byte alignment).</li>
<li>There must be exactly one PT_LOAD text segment. It must begin at
0x20000 (128 KiB) and be marked RX (no W). The contents of the text
segment must follow <a class="reference internal" href="#x86-64-text-segment-rules"><em>Text Segment Rules</em></a>.</li>
<li>There can be at most one PT_LOAD data segment marked R.</li>
<li>There can be at most one PT_LOAD data segment marked RW.</li>
<li>There can be at most one PT_GNU_STACK segment. It must be marked RW.</li>
<li>All segments must end before the limit address (4 GiB).</li>
</ul>
</section><section id="runtime-invariants">
<h2 id="runtime-invariants">Runtime Invariants</h2>
<p>To ensure fault isolation at runtime, the system must maintain a
number of runtime <em>invariants</em> across the lifetime of the running
program. Both the <em>Validator</em> and the <em>Service Runtime</em> are
responsible for maintaining the invariants.
See the paper for the
rationale for the invariants:</p>
<ul class="small-gap">
<li><code>RIP</code> always points to a valid instruction boundary (the validator must
ensure this for direct jumps and direct calls).</li>
<li><code>R15</code> (aka <code>RBASE</code> and <code>RZP</code>) is never modified by code (the
validator must ensure this). The low 32 bits of <code>RZP</code> are all zero
(the loader must ensure this).</li>
<li><code>RIP</code>, <code>RBP</code> and <code>RSP</code> are always in the <strong>safe zone</strong>: between
<code>R15</code> and <code>R15+4GiB</code>.</li>
</ul>
<blockquote>
<div><ul class="small-gap">
<li>Exception: <code>RSP</code> and <code>RBP</code> are allowed to be in the range of
<code>0..4GiB</code> inside <em>pseudo-instructions</em>: <code>naclrestbp</code>,
<code>naclrestsp</code>, <code>naclspadj</code>, <code>naclasp</code>, <code>naclssp</code>.</li>
</ul>
</div></blockquote>
<ul class="small-gap">
<li>84GiB are allocated for the NaCl module (i.e.
<strong>untrusted region</strong>):</li>
</ul>
<blockquote>
<div><ul class="small-gap">
<li><code>R15-40GiB..R15</code> and <code>R15+4GiB..R15+44GiB</code> are buffer zones with
PROT_NONE flags.</li>
<li>The 4GiB <em>safe zone</em> has pages with either PROT_WRITE or PROT_EXEC
but must not have PROT_WRITE+PROT_EXEC pages.</li>
<li>All executable code in PROT_EXEC pages is validatable and
guaranteed to obey the invariants.</li>
</ul>
</div></blockquote>
<ul class="small-gap">
<li>Trampoline/springboard code is mapped to a non-writable region in
the <em>untrusted 84GiB region</em>; each trampoline/springboard is 32-byte
aligned and fits within a single <em>bundle</em>.</li>
<li>The OS must not put any internal structures/code into the untrusted
region at any time (for example, the OS dynamic linker must not be used).</li>
</ul>
</section><section id="text-segment-rules">
<span id="x86-64-text-segment-rules"></span><h2 id="text-segment-rules"><span id="x86-64-text-segment-rules"></span>Text Segment Rules</h2>
<ul class="small-gap">
<li>The validation process must ensure that the text segment complies
with the following rules.
The validation process must complete
successfully strictly before executing any instruction of the
untrusted code.</li>
<li>The following instructions are illegal and must be rejected by the
validator (the list is not exhaustive as the validator uses a
whitelist, not a blacklist; this means there is a large but finite
list of instructions the validator allows, not a small list of
instructions the validator rejects):</li>
</ul>
<blockquote>
<div><ul class="small-gap">
<li>any privileged instructions</li>
<li><code>mov</code> to/from segment registers</li>
<li><code>int</code></li>
<li><code>pusha</code>/<code>popa</code> (not dangerous but not needed for GCC)</li>
</ul>
</div></blockquote>
<ul class="small-gap">
<li>There must be space for at least 32 bytes after the text segment and
before the next segment in ELF (towards higher addresses) that ends
strictly at a 64K boundary (a minimum page size for untrusted
code). This space will be padded with HLT instructions as part of
the validation process, along with the optional 64K page.</li>
<li>Neither instructions nor <em>pseudo-instructions</em> are permitted to span
a 32-byte boundary.</li>
<li>The ELF entry address must be 32-byte aligned.</li>
<li>Direct <code>CALL</code>/<code>JMP</code> targets:</li>
</ul>
<blockquote>
<div><ul class="small-gap">
<li>must point to a valid instruction boundary</li>
<li>must not point into a <em>pseudo-instruction</em></li>
<li>must not point between a <em>restricted register</em> (see below for
definition) producer instruction and its corresponding restricted
register consumer instruction.</li>
</ul>
</div></blockquote>
<ul class="small-gap">
<li><code>CALL</code> instructions must be 5 bytes before a 32-byte boundary, so
that the return address will be 32-byte aligned.</li>
<li>Indirect call targets must be 32-byte aligned.
Instead of indirect
<code>CALL</code>/<code>JMP</code> x, use <code>naclcall</code> and <code>nacljmp</code> (see below for
definitions of these <em>pseudo-instructions</em>).</li>
<li>All instructions that <strong>read</strong> or <strong>write</strong> from/to memory must use
one of the four registers <code>RZP</code>, <code>RIP</code>, <code>RBP</code> or <code>RSP</code> as a
base, a restricted (see below) register as an index (multiplied by 0, 1, 2,
4 or 8) and an optional constant displacement.</li>
</ul>
<blockquote>
<div><ul class="small-gap">
<li><p class="first">Exception to this rule: string instructions are allowed if used in
the following sequences (the sequences should not cross <em>bundle</em>
boundaries; segment overrides are disallowed):</p>
<pre>
mov %edi, %edi
lea (%rZP,%rdi),%rdi
[rep] stos ; other string instructions can be used here
</pre>
<p>Note: this is identical to the <em>pseudo-instruction</em>: <code>[rep] stos
%?ax, %nacl:(%rdi),%rZP</code></p>
</li>
</ul>
</div></blockquote>
<ul class="small-gap">
<li>An operand of an instruction is said to be a <strong>restricted register</strong> iff
it is a register that is the target of a 32-bit move in the
immediately preceding instruction in the same <em>bundle</em> (consider the
previous instruction as an additional sandboxing prefix):</li>
</ul>
<blockquote>
<div><pre>
; any 32-bit register can be used here; the first operand is
; unrestricted but often is the same register
mov ..., %eXX
</pre>
</div></blockquote>
<ul class="small-gap">
<li>Instructions capable of changing <code>%RBP</code> and <code>%RSP</code> are
forbidden, except the instruction sequences in the whitelist below,
which must not cross <em>bundle</em> boundaries:</li>
</ul>
<blockquote>
<div><pre>
mov %rbp, %rsp
mov %rsp, %rbp
mov ..., %ebp
; restoration of %RBP from memory,
; register or stack - keeps the
; invariant intact
add %rZP, %rbp
mov ..., %esp
; restoration of %RSP from memory, register or stack - keeps the
; invariant intact
add %rZP, %rsp
lea xxx(%rbp), %esp
add %rZP, %rsp ; restoration of %RSP from %RBP with adjust
sub ..., %esp
add %rZP, %rsp ; stack space allocation
add ..., %esp
add %rZP, %rsp ; stack space deallocation
and $XX, %rsp ; alignment; XX must be between -128 and -1
pushq ...
popq ... ; except pop %RSP, pop %RBP
</pre>
</div></blockquote>
</section><section id="list-of-pseudo-instructions">
<h2 id="list-of-pseudo-instructions">List of Pseudo-instructions</h2>
<p>Pseudo-instructions were introduced to let the compiler maintain the
invariants without needing to know the code alignment rules. The
assembler guarantees that none of the <em>pseudo-instructions</em> in
the table below crosses a 32-byte <em>bundle</em> boundary. In addition to the pseudo-instructions, one
pseudo-operand prefix is introduced: <code>%nacl</code>.
Presence of the
<code>%nacl</code> operand prefix ensures that:</p>
<ul class="small-gap">
<li>The instruction <code>mov %eXX, %eXX</code> is added immediately before the
actual instruction using the prefix <code>%nacl</code> (where <code>%eXX</code> is the 32-bit
part of the index register of the actual instruction; for example, in the
operand <code>%nacl:(,%r11)</code>, the notation <code>%eXX</code> refers to
<code>%r11d</code>).</li>
<li>The resulting sequence of two instructions does not cross the
<em>bundle</em> boundary.</li>
</ul>
<p>For example, the instruction:</p>
<pre>
mov %eax,%nacl:(%r15,%rdi,2)
</pre>
<p>is translated by the assembler to:</p>
<pre>
mov %edi,%edi
mov %eax,(%r15,%rdi,2)
</pre>
<p>The complete list of introduced <em>pseudo-instructions</em> is as follows:</p>
<table border=1>
<tbody>
<tr>
<td>Pseudo-instruction</td>
<td>Is translated to<br/>
</td>
</tr>
<tr>
<td>[rep] cmps %nacl:(%rsi),%nacl:(%rdi),%rZP<br/>
<i>(sandboxed cmps)</i><br/>
</td>
<td>mov %esi,%esi<br/>
lea (%rZP,%rsi,1),%rsi<br/>
mov %edi,%edi<br/>
lea (%rZP,%rdi,1),%rdi<br/>
[rep] cmps (%rsi),(%rdi)<i><br/>
</i>
</td>
</tr>
<tr>
<td>[rep] movs %nacl:(%rsi),%nacl:(%rdi),%rZP<br/>
<i>(sandboxed movs)</i><br/>
</td>
<td>mov %esi,%esi<br/>
lea (%rZP,%rsi,1),%rsi<br/>
mov %edi,%edi<br/>
lea (%rZP,%rdi,1),%rdi<br/>
[rep] movs (%rsi),(%rdi)<i><br/>
</i>
</td>
</tr>
<tr>
<td>naclasp ...,%rZP<br/>
<i>(sandboxed stack increment)</i></td>
<td>add ...,%esp<br/>
add %rZP,%rsp</td>
</tr>
<tr>
<td>naclcall %eXX,%rZP<br/>
<i>(sandboxed indirect call)</i></td>
<td>and $-32, %eXX<br/>
add %rZP, %rXX<br/>
call *%rXX<br/>
<i>Note: the assembler ensures all calls (including
naclcall) will end at the bundle boundary.</i></td>
</tr>
<tr>
<td>nacljmp %eXX,%rZP<br/>
<i>(sandboxed indirect jump)</i></td>
<td>and $-32,%eXX<br/>
add %rZP,%rXX<br/>
jmp *%rXX<br/>
</td>
</tr>
<tr>
<td>naclrestbp ...,%rZP<br/>
<i>(sandboxed %ebp/rbp restore)</i></td>
<td>mov ...,%ebp<br/>
add %rZP,%rbp</td>
</tr>
<tr>
<td>naclrestsp ...,%rZP<br/>
<i>(sandboxed %esp/rsp restore)</i></td>
<td>mov ...,%esp<br/>
add %rZP,%rsp</td>
</tr>
<tr>
<td>naclrestsp_noflags ...,%rZP<br/>
<i>(sandboxed %esp/rsp restore; uses lea, which leaves the flags unchanged)</i></td>
<td>mov ...,%esp<br/>
lea (%rsp,%rZP,1),%rsp</td>
</tr>
<tr>
<td>naclspadj $N,%rZP<br/>
<i>(sandboxed %esp/rsp restore from %rbp; includes $N offset)</i></td>
<td>lea N(%rbp),%esp<br/>
add %rZP,%rsp</td>
</tr>
<tr>
<td>naclssp ...,%rZP<br/>
<i>(sandboxed stack decrement)</i></td>
<td>sub ...,%esp<br/>
add %rZP,%rsp</td>
</tr>
<tr>
<td>[rep] scas %nacl:(%rdi),%?ax,%rZP<br/>
<i>(sandboxed scas)</i></td>
<td>mov %edi,%edi<br/>
lea (%rZP,%rdi,1),%rdi<br/>
[rep] scas (%rdi),%?ax<br/>
</td>
</tr>
<tr>
<td>[rep] stos %?ax,%nacl:(%rdi),%rZP<br/>
<i>(sandboxed stos)</i></td>
<td>mov %edi,%edi<br/>
lea (%rZP,%rdi,1),%rdi<br/>
[rep] stos %?ax,(%rdi)<br/>
</td>
</tr>
</tbody>
</table></section></section>

{{/partials.standard_nacl_article}}