{{+bindTo:partials.standard_nacl_article}}

<section id="nacl-sfi-model-on-x86-64-systems">
<span id="x86-64-sandbox"></span><h1 id="nacl-sfi-model-on-x86-64-systems"><span id="x86-64-sandbox"></span>NaCl SFI model on x86-64 systems</h1>
<div class="contents local" id="contents" style="display: none">
<ul class="small-gap">
<li><a class="reference internal" href="#summary" id="id5">Summary</a></li>
<li><a class="reference internal" href="#binary-format" id="id6">Binary Format</a></li>
<li><a class="reference internal" href="#runtime-invariants" id="id7">Runtime Invariants</a></li>
<li><a class="reference internal" href="#text-segment-rules" id="id8">Text Segment Rules</a></li>
<li><a class="reference internal" href="#list-of-pseudo-instructions" id="id9">List of Pseudo-instructions</a></li>
</ul>

</div><section id="summary">
<h2 id="summary">Summary</h2>
<p>This document describes the details of the Software Fault Isolation
(SFI) model for executable code that can be run in Native Client on an
x86-64 system. An overview of this model can be found in the paper
<a class="reference external" href="https://research.google.com/pubs/archive/35649.pdf">Adapting Software Fault Isolation to Contemporary CPU Architectures</a>.
The primary focus of the SFI model is a Windows x86-64 system, but the
same techniques can be applied to run identical x86-64 binaries on
other x86-64 systems such as Linux, Mac, FreeBSD, etc., so the
description of the SFI model tries to abstract away system
dependencies when possible.</p>
<p>Please note: throughout this document we use the AT&amp;T notation for
assembler syntax, in which the target operand appears last, e.g. <code>mov
src, dst</code>.</p>
</section><section id="binary-format">
<h2 id="binary-format">Binary Format</h2>
<p>The format of Native Client executable binaries is identical to the
x86-64 ELF binary format (<a class="reference external" href="http://en.wikipedia.org/wiki/Executable_and_Linkable_Format">[0]</a>, <a class="reference external" href="http://www.sco.com/developers/devspecs/gabi41.pdf">[1]</a>, <a class="reference external" href="http://www.sco.com/developers/gabi/latest/contents.html">[2]</a>, <a class="reference external" href="http://downloads.openwatcom.org/ftp/devel/docs/elf-64-gen.pdf">[3]</a>) for
Linux or BSD, with a few extra requirements. The additional rules that
a Native Client ELF binary must follow are:</p>
<ul class="small-gap">
<li>The ELF magic OS ABI field must be 123.</li>
<li>The ELF magic OS ABI VERSION field must be 5.</li>
<li>The ELF e_flags field must be 0x200000 (32-byte alignment).</li>
<li>There must be exactly one PT_LOAD text segment. It must begin at
0x20000 (128 KiB) and be marked RX (no W). The contents of the text
segment must follow <a class="reference internal" href="#x86-64-text-segment-rules"><em>Text Segment Rules</em></a>.</li>
<li>There can be at most one PT_LOAD data segment marked R.</li>
<li>There can be at most one PT_LOAD data segment marked RW.</li>
<li>There can be at most one PT_GNU_STACK segment. It must be marked RW.</li>
<li>All segments must end before the limit address (4 GiB).</li>
</ul>
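<p>The three header-field rules above can be checked mechanically. The following is a minimal sketch, assuming a little-endian ELF64 header laid out as in the generic ELF specification (OS ABI at byte 7 of <code>e_ident</code>, ABI version at byte 8, <code>e_flags</code> at offset 48). The names <code>check_nacl_elf_header</code> and <code>make_sample_header</code> are hypothetical helpers for illustration, not part of any NaCl tool, and the segment rules are not checked here.</p>

```python
import struct

NACL_OSABI = 123        # required OS ABI value (rule above)
NACL_ABIVERSION = 5     # required OS ABI VERSION value (rule above)
NACL_EFLAGS = 0x200000  # required e_flags value (32-byte alignment)


def check_nacl_elf_header(header: bytes) -> list:
    """Return a list of violations found in the first 64 bytes of an ELF64 file."""
    if len(header) < 64 or header[:4] != b"\x7fELF":
        return ["not an ELF64 header"]
    problems = []
    if header[7] != NACL_OSABI:
        problems.append("OS ABI field is %d, expected 123" % header[7])
    if header[8] != NACL_ABIVERSION:
        problems.append("OS ABI VERSION field is %d, expected 5" % header[8])
    # e_flags follows e_ident(16), e_type(2), e_machine(2), e_version(4),
    # e_entry(8), e_phoff(8) and e_shoff(8), so it sits at offset 48.
    (e_flags,) = struct.unpack_from("<I", header, 48)
    if e_flags != NACL_EFLAGS:
        problems.append("e_flags is %#x, expected 0x200000" % e_flags)
    return problems


def make_sample_header(osabi=NACL_OSABI, abiversion=NACL_ABIVERSION,
                       e_flags=NACL_EFLAGS) -> bytes:
    """Build a synthetic 64-byte ELF64 header (illustrative values only)."""
    e_ident = b"\x7fELF" + bytes([2, 1, 1, osabi, abiversion]) + bytes(7)
    rest = struct.pack("<HHIQQQIHHHHHH",
                       2, 62, 1,         # e_type, e_machine (x86-64), e_version
                       0x20000, 64, 0,   # e_entry, e_phoff, e_shoff
                       e_flags, 64, 56, 1, 64, 0, 0)
    return e_ident + rest
```

<p>For instance, <code>check_nacl_elf_header(make_sample_header())</code> returns an empty list, while a header built with <code>osabi=0</code> yields a single violation.</p>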
</section><section id="runtime-invariants">
<h2 id="runtime-invariants">Runtime Invariants</h2>
<p>To ensure fault isolation at runtime, the system must maintain a
number of runtime <em>invariants</em> across the lifetime of the running
program. Both the <em>Validator</em> and the <em>Service Runtime</em> are
responsible for maintaining the invariants. See the paper for the
rationale behind the invariants:</p>
<ul class="small-gap">
<li><code>RIP</code> always points to a valid instruction boundary (the validator must
ensure this for direct jumps and direct calls).</li>
<li><code>R15</code> (aka <code>RBASE</code> and <code>RZP</code>) is never modified by code (the
validator must ensure this). The low 32 bits of <code>RZP</code> are all zero
(the loader must ensure this).</li>
<li><code>RIP</code>, <code>RBP</code> and <code>RSP</code> are always in the <strong>safe zone</strong>: between
<code>R15</code> and <code>R15+4GiB</code>.</li>
</ul>
<blockquote>
<div><ul class="small-gap">
<li>Exception: <code>RSP</code> and <code>RBP</code> are allowed to be in the range of
<code>0..4GiB</code> inside <em>pseudo-instructions</em>: <code>naclrestbp</code>,
<code>naclrestsp</code>, <code>naclspadj</code>, <code>naclasp</code>, <code>naclssp</code>.</li>
</ul>
</div></blockquote>
<ul class="small-gap">
<li>84GiB are allocated for the NaCl module (i.e. the <strong>untrusted region</strong>):</li>
</ul>
<blockquote>
<div><ul class="small-gap">
<li><code>R15-40GiB..R15</code> and <code>R15+4GiB..R15+44GiB</code> are buffer zones with
PROT_NONE flags.</li>
<li>The 4GiB <em>safe zone</em> has pages with either PROT_WRITE or PROT_EXEC
but must not have PROT_WRITE+PROT_EXEC pages.</li>
<li>All executable code in PROT_EXEC pages is validatable and
guaranteed to obey the invariants.</li>
</ul>
</div></blockquote>
<ul class="small-gap">
<li>Trampoline/springboard code is mapped to a non-writable region in
the <em>untrusted 84GiB region</em>; each trampoline/springboard is 32-byte
aligned and fits within a single <em>bundle</em>.</li>
<li>The OS must not put any internal structures/code into the untrusted
region at any time (for example, the OS dynamic linker is not used).</li>
</ul>
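<p>The address-space layout described by these invariants can be written down as plain arithmetic. The sketch below is illustrative only: the function names are hypothetical, and treating every range as half-open (start included, end excluded) is an assumption, since the text above only says "between".</p>

```python
GiB = 1 << 30


def untrusted_region(r15: int) -> dict:
    """Return the three parts of the untrusted region as (start, end) pairs."""
    if r15 & 0xFFFFFFFF:
        # The loader must place R15 so that its low 32 bits are all zero.
        raise ValueError("low 32 bits of R15 must be zero")
    return {
        "lower_guard": (r15 - 40 * GiB, r15),            # PROT_NONE
        "safe_zone": (r15, r15 + 4 * GiB),               # W or X pages, never both
        "upper_guard": (r15 + 4 * GiB, r15 + 44 * GiB),  # PROT_NONE
    }


def in_safe_zone(addr: int, r15: int) -> bool:
    """Check the invariant that RIP, RBP and RSP must satisfy."""
    return r15 <= addr < r15 + 4 * GiB
```

<p>The two 40GiB buffer zones and the 4GiB safe zone add up to the 84GiB stated above.</p>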
</section><section id="text-segment-rules">
<span id="x86-64-text-segment-rules"></span><h2 id="text-segment-rules"><span id="x86-64-text-segment-rules"></span>Text Segment Rules</h2>
<ul class="small-gap">
<li>The validation process must ensure that the text segment complies
with the following rules. The validation process must complete
successfully strictly before executing any instruction of the
untrusted code.</li>
<li>The following instructions are illegal and must be rejected by the
validator (the list is not exhaustive as the validator uses a
whitelist, not a blacklist; this means there is a large but finite
list of instructions the validator allows, not a small list of
instructions the validator rejects):</li>
</ul>
<blockquote>
<div><ul class="small-gap">
<li>any privileged instructions</li>
<li><code>mov</code> to/from segment registers</li>
<li><code>int</code></li>
<li><code>pusha</code>/<code>popa</code> (not dangerous but not needed for GCC)</li>
</ul>
</div></blockquote>
<ul class="small-gap">
<li>There must be at least 32 bytes of space after the text segment and
before the next ELF segment (towards higher addresses), and this space
must end strictly at a 64K boundary (the minimum page size for untrusted
code). This space will be padded with HLT instructions as part of
the validation process, along with the optional 64K page.</li>
<li>Neither instructions nor <em>pseudo-instructions</em> are permitted to span
a 32-byte boundary.</li>
<li>The ELF entry address must be 32-byte aligned.</li>
<li>Direct <code>CALL</code>/<code>JUMP</code> targets:</li>
</ul>
<blockquote>
<div><ul class="small-gap">
<li>must point to a valid instruction boundary</li>
<li>must not point into a <em>pseudo-instruction</em></li>
<li>must not point between a <em>restricted register</em> (see below for
definition) producer instruction and its corresponding restricted
register consumer instruction.</li>
</ul>
</div></blockquote>
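<p>The bundle-boundary and direct-target rules above lend themselves to a simple check over decoded instructions. The sketch below assumes the text segment has already been decoded into <code>(offset, length)</code> pairs and that the offsets lying inside pseudo-instructions, or between a restricted-register producer and its consumer, have been collected separately. All names are hypothetical, and this is not how the real validator is structured.</p>

```python
BUNDLE_SIZE = 32  # bytes per bundle


def spans_bundle(offset: int, length: int) -> bool:
    """True if the instruction's first and last byte fall in different bundles."""
    return offset // BUNDLE_SIZE != (offset + length - 1) // BUNDLE_SIZE


def check_text(instructions, direct_targets, forbidden_targets=frozenset()):
    """Return a list of rule violations.

    instructions      -- iterable of (offset, length) for each decoded instruction
    direct_targets    -- offsets targeted by direct CALL/JMP instructions
    forbidden_targets -- instruction starts that may not be targeted (inside a
                         pseudo-instruction or a producer/consumer pair)
    """
    instructions = list(instructions)
    starts = {offset for offset, _ in instructions}
    errors = []
    for offset, length in instructions:
        if spans_bundle(offset, length):
            errors.append("instruction at %#x spans a bundle boundary" % offset)
    for target in direct_targets:
        if target not in starts:
            errors.append("target %#x is not an instruction boundary" % target)
        elif target in forbidden_targets:
            errors.append("target %#x lands inside a protected sequence" % target)
    return errors
```

<p>For example, an instruction of length 4 at offset 30 occupies bytes 30 to 33 and is therefore reported, because it crosses the boundary at offset 32.</p>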
<ul class="small-gap">
<li><code>CALL</code> instructions must end at a 32-byte boundary (a 5-byte direct
<code>CALL</code> therefore starts 5 bytes before the boundary), so
that the return address will be 32-byte aligned.</li>
<li>Indirect call targets must be 32-byte aligned. Instead of indirect
<code>CALL</code>/<code>JMP</code> x, use <code>nacljmp</code> and <code>naclcall</code> (see below for
definitions of these <em>pseudo-instructions</em>).</li>
<li>All instructions that <strong>read</strong> from or <strong>write</strong> to memory must use
one of the four registers <code>RZP</code>, <code>RIP</code>, <code>RBP</code> or <code>RSP</code> as a
base, a restricted register (see below) as an index (multiplied by 0, 1, 2,
4 or 8), and an optional constant displacement.</li>
</ul>
<blockquote>
<div><ul class="small-gap">
<li><p class="first">Exception to this rule: string instructions are allowed if used in
the following sequences (the sequences should not cross <em>bundle</em>
boundaries; segment overrides are disallowed):</p>
<pre>
 mov %edi, %edi
 lea (%rZP,%rdi),%rdi
 [rep] stos  ; other string instructions can be used here
</pre>
<p>Note: this is identical to the <em>pseudo-instruction</em>: <code>[rep] stos
%?ax, %nacl:(%rdi),%rZP</code></p>
</li>
</ul>
</div></blockquote>
<ul class="small-gap">
<li>An operand of an instruction is said to be a <strong>restricted register</strong> if and only if
it is a register that is the target of a 32-bit move in the
immediately preceding instruction in the same <em>bundle</em> (consider the
previous instruction as an additional sandboxing prefix):</li>
</ul>
<blockquote>
<div><pre>
 ; any 32-bit register can be used here; the first operand is
 ; unrestricted but often is the same register
 mov ..., %eXX
</pre>
</div></blockquote>
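<p>To see why the memory-access rule and the restricted-register definition fit the buffer zones described under Runtime Invariants, the worst-case effective address can be worked out directly. The sketch below rests on two x86-64 facts: a 32-bit register write zero-extends into the full 64-bit register, and a displacement is at most a sign-extended 32-bit value. It ignores the few bytes of operand width, and the function names are hypothetical. It is a consistency check, not a statement of the designers' rationale.</p>

```python
GiB = 1 << 30
MASK32 = 0xFFFFFFFF


def restrict(reg64: int) -> int:
    """Model `mov %eXX, %eXX`: the upper 32 bits of the register become zero."""
    return reg64 & MASK32


def worst_case_offsets():
    """Return (lowest, highest) address offsets relative to R15.

    Base:  RZP, RIP, RBP or RSP, all within [R15, R15 + 4GiB).
    Index: a restricted register, so 0 <= index < 4GiB, scaled by at most 8.
    Disp:  a signed 32-bit constant, so -2GiB <= disp < 2GiB.
    """
    min_base, max_base = 0, 4 * GiB - 1
    max_scaled_index = MASK32 * 8
    min_disp, max_disp = -2 * GiB, 2 * GiB - 1
    lowest = min_base + min_disp
    highest = max_base + max_scaled_index + max_disp
    return lowest, highest
```

<p>Under these assumptions the reachable offsets run from -2GiB up to just under 38GiB, which stays inside the <code>R15-40GiB..R15+44GiB</code> untrusted region, so a stray access lands in a PROT_NONE buffer zone rather than in trusted memory.</p>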
<ul class="small-gap">
<li>Instructions capable of changing <code>%RBP</code> and <code>%RSP</code> are
forbidden, except the instruction sequences in the whitelist below,
which must not cross <em>bundle</em> boundaries:</li>
</ul>
<blockquote>
<div><pre>
 mov %rbp, %rsp
 mov %rsp, %rbp
 mov ..., %ebp
 ; restoration of %RBP from memory, register or stack - keeps the
 ; invariant intact
 add %rZP, %rbp
 mov ..., %esp
 ; restoration of %RSP from memory, register or stack - keeps the
 ; invariant intact
 add %rZP, %rsp
 lea xxx(%rbp), %esp
 add %rZP, %rsp  ; restoration of %RSP from %RBP with adjust
 sub ..., %esp
 add %rZP, %rsp  ; stack space allocation
 add ..., %esp
 add %rZP, %rsp  ; stack space deallocation
 and $XX, %rsp  ; alignment; XX must be between -128 and -1
 pushq ...
 popq ...  ; except pop %RSP, pop %RBP
</pre>
</div></blockquote>
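<p>The two-instruction stack sequences above can be modelled with integer arithmetic to show how <code>%rsp</code> returns to the safe zone. This is a sketch under the assumption that a 32-bit operation on <code>%esp</code> wraps modulo 2<sup>32</sup> and zero-extends into <code>%rsp</code>. The function names mirror the pseudo-instructions but are otherwise hypothetical.</p>

```python
GiB = 1 << 30
MASK32 = 0xFFFFFFFF


def naclrestsp(value: int, r15: int) -> int:
    """mov ..., %esp ; add %rZP, %rsp"""
    esp = value & MASK32  # after the mov, %rsp is in 0..4GiB (the allowed exception)
    return r15 + esp      # after the add, %rsp is back in the safe zone


def naclssp(rsp: int, amount: int, r15: int) -> int:
    """sub ..., %esp ; add %rZP, %rsp  (stack space allocation)"""
    esp = (rsp - amount) & MASK32  # 32-bit subtraction wraps and zero-extends
    return r15 + esp


def naclasp(rsp: int, amount: int, r15: int) -> int:
    """add ..., %esp ; add %rZP, %rsp  (stack space deallocation)"""
    esp = (rsp + amount) & MASK32
    return r15 + esp
```

<p>Because the low 32 bits of <code>R15</code> are zero, the result is always in <code>R15..R15+4GiB</code>, whatever value untrusted code supplies. An underflowing <code>sub</code> wraps around inside the safe zone instead of escaping it.</p>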
</section><section id="list-of-pseudo-instructions">
<h2 id="list-of-pseudo-instructions">List of Pseudo-instructions</h2>
<p>Pseudo-instructions were introduced to let the compiler maintain the
invariants without needing to know the code alignment rules. The
assembler guarantees 32-byte alignment for all <em>pseudo-instructions</em> in
the table below. In addition to the pseudo-instructions, one
pseudo-operand prefix is introduced: <code>%nacl</code>. Presence of the
<code>%nacl</code> operand prefix ensures that:</p>
<ul class="small-gap">
<li>The instruction <code>&quot;mov %eXX, %eXX&quot;</code> is added immediately before the
actual instruction using the prefix <code>%nacl</code> (where <code>%eXX</code> is the 32-bit
part of the index register of the actual instruction; for example, in
the operand <code>%nacl:(,%r11)</code>, the notation <code>%eXX</code> refers to
<code>%r11d</code>).</li>
<li>The resulting sequence of two instructions does not cross the
<em>bundle</em> boundary.</li>
</ul>
<p>For example, the instruction:</p>
<pre>
mov %eax,%nacl:(%r15,%rdi,2)
</pre>
<p>is translated by the assembler to:</p>
<pre>
mov %edi,%edi
mov %eax,(%r15,%rdi,2)
</pre>
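<p>The textual side of this translation can be mimicked with a small string rewrite. The following toy sketch is not the NaCl assembler: <code>expand_nacl_prefix</code> and <code>low32</code> are hypothetical names, only operands of the form <code>%nacl:(base,index[,scale])</code> are handled, and the bundle-placement guarantee is not modelled.</p>

```python
import re

_LEGACY = {"rax": "eax", "rbx": "ebx", "rcx": "ecx", "rdx": "edx",
           "rsi": "esi", "rdi": "edi", "rbp": "ebp", "rsp": "esp"}
_NACL_OPERAND = re.compile(r"%nacl:\(([^)]*)\)")


def low32(reg64: str) -> str:
    """Map a 64-bit register name (without '%') to its 32-bit part."""
    if reg64 in _LEGACY:
        return _LEGACY[reg64]
    if re.fullmatch(r"r(8|9|1[0-5])", reg64):
        return reg64 + "d"  # e.g. r11 -> r11d
    raise ValueError("unknown 64-bit register: " + reg64)


def expand_nacl_prefix(instruction: str) -> list:
    """Return the instruction sequence that replaces a %nacl-prefixed operand."""
    match = _NACL_OPERAND.search(instruction)
    if match is None:
        return [instruction]
    parts = [p.strip() for p in match.group(1).split(",")]
    if len(parts) < 2 or not parts[1]:
        raise ValueError("operand has no index register to restrict")
    exx = low32(parts[1].lstrip("%"))
    plain = instruction.replace("%nacl:", "", 1)
    # The 32-bit self-move makes the index a restricted register.
    return ["mov %{0},%{0}".format(exx), plain]
```

<p>Applied to the example above, <code>expand_nacl_prefix("mov %eax,%nacl:(%r15,%rdi,2)")</code> yields the same two instructions the assembler emits.</p>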
<p>The complete list of introduced <em>pseudo-instructions</em> is as follows:</p>
<table border=1>
<tbody>
<tr>
<td>Pseudo-instruction</td>
<td>Is translated to<br/>
</td>
</tr>
<tr>
<td>[rep] cmps %nacl:(%rsi),%nacl:(%rdi),%rZP<br/>
<i>(sandboxed cmps)</i><br/>
</td>
<td>mov %esi,%esi<br/>
lea (%rZP,%rsi,1),%rsi<br/>
mov %edi,%edi<br/>
lea (%rZP,%rdi,1),%rdi<br/>
[rep] cmps (%rsi),(%rdi)<br/>
</td>
</tr>
<tr>
<td>[rep] movs %nacl:(%rsi),%nacl:(%rdi),%rZP<br/>
<i>(sandboxed movs)</i><br/>
</td>
<td>mov %esi,%esi<br/>
lea (%rZP,%rsi,1),%rsi<br/>
mov %edi,%edi<br/>
lea (%rZP,%rdi,1),%rdi<br/>
[rep] movs (%rsi),(%rdi)<br/>
</td>
</tr>
<tr>
<td>naclasp ...,%rZP<br/>
<i>(sandboxed stack increment)</i></td>
<td>add ...,%esp<br/>
add %rZP,%rsp</td>
</tr>
<tr>
<td>naclcall %eXX,%rZP<br/>
<i>(sandboxed indirect call)</i></td>
<td>and $-32, %eXX<br/>
add %rZP, %rXX<br/>
call *%rXX<br/>
<i>Note: the assembler ensures all calls (including
naclcall) will end at the bundle boundary.</i></td>
</tr>
<tr>
<td>nacljmp %eXX,%rZP<br/>
<i>(sandboxed indirect jump)</i></td>
<td>and $-32,%eXX<br/>
add %rZP,%rXX<br/>
jmp *%rXX<br/>
</td>
</tr>
<tr>
<td>naclrestbp ...,%rZP<br/>
<i>(sandboxed %ebp/rbp restore)</i></td>
<td>mov ...,%ebp<br/>
add %rZP,%rbp</td>
</tr>
<tr>
<td>naclrestsp ...,%rZP<br/>
<i>(sandboxed %esp/rsp restore)</i></td>
<td>mov ...,%esp<br/>
add %rZP,%rsp</td>
</tr>
<tr>
<td>naclrestsp_noflags ...,%rZP<br/>
<i>(sandboxed %esp/rsp restore that leaves the flags unchanged)</i></td>
<td>mov ...,%esp<br/>
lea (%rsp,%rZP,1),%rsp</td>
</tr>
<tr>
<td>naclspadj $N,%rZP<br/>
<i>(sandboxed %esp/rsp restore from %rbp; includes $N offset)</i></td>
<td>lea N(%rbp),%esp<br/>
add %rZP,%rsp</td>
</tr>
<tr>
<td>naclssp ...,%rZP<br/>
<i>(sandboxed stack decrement)</i></td>
<td>sub ...,%esp<br/>
add %rZP,%rsp</td>
</tr>
<tr>
<td>[rep] scas %nacl:(%rdi),%?ax,%rZP<br/>
<i>(sandboxed scas)</i></td>
<td>mov %edi,%edi<br/>
lea (%rZP,%rdi,1),%rdi<br/>
[rep] scas (%rdi),%?ax<br/>
</td>
</tr>
<tr>
<td>[rep] stos %?ax,%nacl:(%rdi),%rZP<br/>
<i>(sandboxed stos)</i></td>
<td>mov %edi,%edi<br/>
lea (%rZP,%rdi,1),%rdi<br/>
[rep] stos %?ax,(%rdi)<br/>
</td>
</tr>
</tbody>
</table></section></section>
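<p>As a closing illustration of the table, the <code>nacljmp</code>/<code>naclcall</code> expansion and the call-placement note can be modelled numerically. This is a sketch with hypothetical function names. It assumes that the 32-bit <code>and</code> zero-extends into the 64-bit register and that the assembler pads before a call, which the table does not spell out.</p>

```python
MASK32 = 0xFFFFFFFF
BUNDLE_SIZE = 32


def nacljmp_target(reg64: int, r15: int) -> int:
    """and $-32, %eXX ; add %rZP, %rXX  (the address that jmp/call *%rXX uses)"""
    exx = reg64 & MASK32 & ~(BUNDLE_SIZE - 1)  # keep low 32 bits, clear low 5 bits
    return r15 + exx


def call_padding(offset: int, call_length: int) -> int:
    """Padding bytes needed before a call so that it ends on a bundle boundary."""
    return -(offset + call_length) % BUNDLE_SIZE
```

<p>Whatever value untrusted code places in the register, the resulting target is 32-byte aligned and lies within <code>R15..R15+4GiB</code>. A 5-byte call that would otherwise start at offset 0 of a bundle needs 27 bytes of padding, so that it starts at offset 27 and its return address is the start of the next bundle.</p>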

{{/partials.standard_nacl_article}}