=================
DataFlowSanitizer
=================

.. toctree::
   :hidden:

   DataFlowSanitizerDesign

.. contents::
   :local:

Introduction
============

DataFlowSanitizer is a generalised dynamic data flow analysis.

Unlike other Sanitizer tools, this tool is not designed to detect a
specific class of bugs on its own.  Instead, it provides a generic
dynamic data flow analysis framework to be used by clients to help
detect application-specific issues within their own code.

Usage
=====

Applying DataFlowSanitizer to an unmodified program does not alter its
behavior.  To make use of DataFlowSanitizer, the program calls API
functions to attach tags (labels) to data so that it is tracked, and to
check the tag of a specific data item.  DataFlowSanitizer propagates
tags through the program according to its data flow.

The APIs are defined in the header file ``sanitizer/dfsan_interface.h``.
For further information about each function, please refer to the header
file.

ABI List
--------

DataFlowSanitizer uses a list of functions known as an ABI list to decide
whether a call to a specific function should use the operating system's native
ABI or a variant of this ABI that also propagates labels through function
parameters and return values.  The ABI list file also controls how labels are
propagated in the former case.  DataFlowSanitizer comes with a default ABI list
which is intended to eventually cover the glibc library on Linux.  Users may
need to extend the ABI list when a particular library or function cannot be
instrumented (e.g. because it is implemented in assembly or another language
which DataFlowSanitizer does not support), or when a function is called from a
library or function which cannot be instrumented.

DataFlowSanitizer's ABI list file is a :doc:`SanitizerSpecialCaseList`.
The pass treats every function in the ``uninstrumented`` category in the
ABI list file as conforming to the native ABI.  Unless the ABI list contains
additional categories for those functions, a call to one of those functions
will produce a warning message, as the labelling behavior of the function
is unknown.  The other supported categories are ``discard``, ``functional``
and ``custom``.

* ``discard`` -- To the extent that this function writes to (user-accessible)
  memory, it also updates labels in shadow memory (this condition is trivially
  satisfied for functions which do not write to user-accessible memory).  Its
  return value is unlabelled.
* ``functional`` -- Like ``discard``, except that the label of its return value
  is the union of the labels of its arguments.
* ``custom`` -- Instead of calling the function, a custom wrapper ``__dfsw_F``
  is called, where ``F`` is the name of the function.  This function may wrap
  the original function or provide its own implementation.  This category is
  generally used for uninstrumentable functions which write to user-accessible
  memory or which have more complex label propagation behavior.  The signature
  of ``__dfsw_F`` is based on that of ``F``, with one label of type
  ``dfsan_label`` per argument appended to the argument list.  If ``F``
  has a non-void return type, a final argument of type ``dfsan_label *``
  is appended, to which the custom function can store the label for the
  return value.  For example:

.. code-block:: c++

  // F returns void: one label per argument is appended.
  void f(int x);
  void __dfsw_f(int x, dfsan_label x_label);

  // F returns a value: a trailing dfsan_label * receives the return label.
  void *memcpy(void *dest, const void *src, size_t n);
  void *__dfsw_memcpy(void *dest, const void *src, size_t n,
                      dfsan_label dest_label, dfsan_label src_label,
                      dfsan_label n_label, dfsan_label *ret_label);
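
The wrapper convention can also be sketched with a body for a made-up
function.  In the sketch below, ``digit_value`` and its wrapper are
hypothetical, and ``dfsan_label`` is declared locally as a stand-in so that
the snippet compiles without the DataFlowSanitizer headers; a real wrapper
would take the type from ``sanitizer/dfsan_interface.h``.  The wrapper assumes
that the result depends only on ``c`` and therefore forwards the argument's
label to the return value.

.. code-block:: c++

  #include <cstdint>

  // Stand-in for the real dfsan_label type (assumption, width chosen
  // arbitrarily); normally provided by sanitizer/dfsan_interface.h.
  using dfsan_label = std::uint16_t;

  // Hypothetical function F that is treated as uninstrumentable.
  int digit_value(char c) {
    return (c >= '0' && c <= '9') ? c - '0' : -1;
  }

  // Wrapper following the __dfsw_F convention: one label per argument is
  // appended, then a dfsan_label * because F has a non-void return type.
  int __dfsw_digit_value(char c, dfsan_label c_label, dfsan_label *ret_label) {
    *ret_label = c_label;   // the result is derived from c alone
    return digit_value(c);  // delegate the actual work to the original
  }

Following the ``memcpy`` pattern used in this document, such a wrapper would
be paired with ``uninstrumented`` and ``custom`` entries for ``digit_value``
in the ABI list.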

If a function defined in the translation unit being compiled belongs to the
``uninstrumented`` category, it will be compiled so as to conform to the
native ABI.  Its arguments will be assumed to be unlabelled, but it will
propagate labels in shadow memory.

For example:

.. code-block:: none

  # main is called by the C runtime using the native ABI.
  fun:main=uninstrumented
  fun:main=discard

  # malloc only writes to its internal data structures, not user-accessible memory.
  fun:malloc=uninstrumented
  fun:malloc=discard

  # tolower is a pure function.
  fun:tolower=uninstrumented
  fun:tolower=functional

  # memcpy needs to copy the shadow from the source to the destination region.
  # This is done in a custom function.
  fun:memcpy=uninstrumented
  fun:memcpy=custom
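
The difference between ``discard`` and ``functional`` for a call's return
value can be modelled in a few lines.  The following is a toy model, not
DataFlowSanitizer's implementation: ``Label``, ``Category`` and
``return_label`` are hypothetical names, and a label is represented as a plain
set of tag names purely for illustration.

.. code-block:: c++

  #include <set>
  #include <string>
  #include <vector>

  // Toy stand-in for a label: the set of tag names that reached a value.
  // An empty set models an unlabelled value.
  using Label = std::set<std::string>;

  enum class Category { Discard, Functional };

  // Label carried by the return value of a call in this model.
  Label return_label(Category category, const std::vector<Label> &arg_labels) {
    Label result;  // discard: the return value stays unlabelled
    if (category == Category::Functional) {
      // functional: the return label is the union of the argument labels
      for (const Label &arg : arg_labels)
        result.insert(arg.begin(), arg.end());
    }
    return result;
  }

In this model, a ``functional`` function such as ``tolower`` returns a value
carrying its argument's label, while a ``discard`` function such as ``malloc``
returns an unlabelled value.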

Example
=======

The following program demonstrates label propagation: it labels three
variables and asserts that the label of each sum contains the labels of
its operands and no others.

.. code-block:: c++

  #include <sanitizer/dfsan_interface.h>
  #include <assert.h>

  int main(void) {
    int i = 1;
    dfsan_label i_label = dfsan_create_label("i", 0);
    dfsan_set_label(i_label, &i, sizeof(i));

    int j = 2;
    dfsan_label j_label = dfsan_create_label("j", 0);
    dfsan_set_label(j_label, &j, sizeof(j));

    int k = 3;
    dfsan_label k_label = dfsan_create_label("k", 0);
    dfsan_set_label(k_label, &k, sizeof(k));

    dfsan_label ij_label = dfsan_get_label(i + j);
    assert(dfsan_has_label(ij_label, i_label));
    assert(dfsan_has_label(ij_label, j_label));
    assert(!dfsan_has_label(ij_label, k_label));

    dfsan_label ijk_label = dfsan_get_label(i + j + k);
    assert(dfsan_has_label(ijk_label, i_label));
    assert(dfsan_has_label(ijk_label, j_label));
    assert(dfsan_has_label(ijk_label, k_label));

    return 0;
  }

Current status
==============

DataFlowSanitizer is a work in progress, currently under development for
x86\_64 Linux.

Design
======

Please refer to the :doc:`design document<DataFlowSanitizerDesign>`.