<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Open Projects</title>
  <link type="text/css" rel="stylesheet" href="menu.css">
  <link type="text/css" rel="stylesheet" href="content.css">
  <script type="text/javascript" src="scripts/menu.js"></script>
</head>
<body>

<div id="page">
<!--#include virtual="menu.html.incl"-->
<div id="content">

<h1>Open Projects</h1>

<p>This page lists several projects that would boost the analyzer's usability
and power. Most of the projects listed here are infrastructure-related, so this
list complements the <a href="potential_checkers.html">potential checkers
list</a>. If you are interested in tackling one of these, please send an email
to the <a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev">cfe-dev
mailing list</a> to notify other members of the community.</p>

<ul>
  <li>Core Analyzer Infrastructure
  <ul>
    <li>Explicitly model standard library functions with <tt>BodyFarm</tt>.
    <p><tt><a href="http://clang.llvm.org/doxygen/classclang_1_1BodyFarm.html">BodyFarm</a></tt>
    allows the analyzer to explicitly model functions whose definitions are
    not available during analysis. Modeling more of the widely used functions
    (such as the members of <tt>std::string</tt>) will improve the precision of
    the analysis.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Implement generalized loop execution modeling.
    <p>Currently, the analyzer simply unrolls each loop <tt>N</tt> times. This
    means that it will not execute any code after the loop if the loop is
    guaranteed to execute more than <tt>N</tt> times. This results in lost
    basic block coverage. We could continue exploring the path if we could
    model a generic <tt>i</tt>-th iteration of a loop.
    <i>(Difficulty: Hard)</i></p>
    </li>

    <li>Enhance CFG to model C++ temporaries properly.
    <p>There is an existing implementation of this, but it is incomplete and
    is disabled in the analyzer.
    <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Enhance CFG to model exception handling properly.
    <p>Currently, exceptions are treated as "black holes", and exception-handling
    control structures are poorly modeled (to be conservative). This could be
    much improved for both C++ and Objective-C exceptions.
    <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Enhance CFG to model C++ <code>new</code> more precisely.
    <p>The current representation of <code>new</code> does not provide an easy
    way for the analyzer to model the call to a memory allocation function
    (<code>operator new</code>) and the subsequent initialization of the result
    with a constructor call. The problem is discussed at length in
    <a href="http://llvm.org/bugs/show_bug.cgi?id=12014">PR12014</a>.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Enhance CFG to model C++ <code>delete</code> more precisely.
    <p>Similarly, the representation of <code>delete</code> does not include
    the call to the destructor, followed by the call to the deallocation
    function (<code>operator delete</code>). One particular issue
    (<tt>noreturn</tt> destructors) is discussed in
    <a href="http://llvm.org/bugs/show_bug.cgi?id=15599">PR15599</a>.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Track type info through casts more precisely.
    <p>The DynamicTypePropagation checker is in charge of inferring a region's
    dynamic type based on what operations the code is performing. Casts are a
    rich source of type information that the analyzer currently ignores. They
    are tricky to get right, but handling them could be very useful.
    <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Design and implement alpha-renaming.
    <p>Implement unifying two symbolic values along a path after they are
    determined to be equal via comparison. This would allow us to reduce the
    number of false positives and would be a stepping stone to more advanced
    analyses, such as summary-based interprocedural and cross-translation-unit
    analysis.
    <i>(Difficulty: Hard)</i></p>
    </li>
  </ul>
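  <p>The two steps behind <code>new</code> and <code>delete</code> that the
  CFG items above refer to can be observed in plain C++. The sketch below is
  only an illustration of the language semantics, not analyzer code:
  <code>Widget</code>, <code>events</code>, and <code>traceNewDelete</code>
  are hypothetical names, and the class-specific <code>operator new</code> and
  <code>operator delete</code> exist solely to log each step.</p>

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <vector>

// Event log, filled in the order in which the steps actually run.
static std::vector<std::string> events;

// Widget is a hypothetical class used only for this illustration.
struct Widget {
  int value;

  explicit Widget(int v) : value(v) { events.push_back("constructor"); }
  ~Widget() { events.push_back("destructor"); }

  // Class-specific allocation function, defined so the call can be logged.
  static void *operator new(std::size_t size) {
    events.push_back("operator new");
    if (void *p = std::malloc(size))
      return p;
    throw std::bad_alloc();
  }

  // Class-specific deallocation function, defined so the call can be logged.
  static void operator delete(void *p) noexcept {
    events.push_back("operator delete");
    std::free(p);
  }
};

// Runs one new/delete pair and returns the sequence of steps it triggered.
std::vector<std::string> traceNewDelete() {
  events.clear();
  Widget *w = new Widget(42); // allocation function, then constructor
  delete w;                   // destructor, then deallocation function
  return events;
}
```

  <p>Calling <code>traceNewDelete()</code> returns the four steps in the order
  allocation, construction, destruction, deallocation. A more precise CFG
  would presumably expose each of these as a separate element.</p>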
  </li>

  <li>Bug Reporting
  <ul>
    <li>Add support for displaying cross-file diagnostic paths in HTML output
    (used by <tt>scan-build</tt>).
    <p>Currently, <tt>scan-build</tt> output does not display reports that span
    multiple files. The main problem is that we do not have a good format to
    display such paths in HTML output. <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Relate bugs to checkers / "bug types".
    <p>We need to come up with an API that relates bug reports
    to the checkers that produce them, and refactor the existing code to use the
    new API. This would allow us to identify the checker from the bug report,
    which paves the way for selective control of certain checks.
    <i>(Difficulty: Easy-Medium)</i></p>
    </li>

    <li>Refactor path diagnostic generation in <a href="http://clang.llvm.org/doxygen/BugReporter_8cpp_source.html">BugReporter.cpp</a>.
    <p>It would be great to have more code reuse between the "Minimal" and
    "Extensive" PathDiagnostic generation algorithms. One idea is to create an
    IR for representing path diagnostics, which would later be used to
    generate minimal or extensive report output. <i>(Difficulty: Medium)</i></p>
    </li>
  </ul>
  </li>

  <li>Other Infrastructure
  <ul>
    <li>Rewrite <tt>scan-build</tt> (in Python).
    <p><i>(Difficulty: Easy)</i></p>
    </li>

    <li>Do a better job interposing on a compilation.
    <p>Currently, <tt>scan-build</tt> just sets the <tt>CC</tt> and <tt>CXX</tt>
    environment variables to its wrapper scripts, which then call into an
    underlying platform compiler. This is problematic for any project that
    doesn't exclusively use <tt>CC</tt> and <tt>CXX</tt> to control its
    compilers.
    <i>(Difficulty: Medium-Hard)</i></p>
    </li>

    <li>Create an <tt>analyzer_annotate</tt> attribute for the analyzer
    annotations.
    <p>We would like to put all analyzer attributes behind a fence so that we
    could add/remove them without worrying that compiler (not analyzer) users
    depend on them. Design and implement such a generic analyzer attribute in
    the compiler. <i>(Difficulty: Medium)</i></p>
    </li>
  </ul>
  </li>

  <li>Enhanced Checks
  <ul>
    <li>Implement a production-ready StreamChecker.
    <p>A SimpleStreamChecker was presented in the <i>Building a Checker in 24
    Hours</i> talk
    (<a href="http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf">slides</a>,
    <a href="http://llvm.org/devmtg/2012-11/videos/Zaks-Rose-Checker24Hours.mp4">video</a>).
    We need to implement a production version of the checker that covers a
    richer set of APIs and evaluate it by running it on real codebases.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Extend Malloc checker with reasoning about custom allocator,
    deallocator, and ownership-transfer functions.
    <p>This would require extending the MallocPessimistic checker to reason
    about annotated functions. The implementation should preferably rely on
    the <tt>analyzer_annotate</tt> attribute, as described above.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Implement an iterator invalidation checker.
    <p><i>(Difficulty: Easy)</i></p>
    </li>

    <li>Write checkers that catch copy-and-paste errors.
    <p>Take a look at the
    <a href="http://pages.cs.wisc.edu/~shanlu/paper/TSE-CPMiner.pdf">CP-Miner</a>
    paper for inspiration.
    <i>(Difficulty: Medium-Hard)</i></p>
    </li>
  </ul>
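  <p>As a rough illustration of the kind of per-path state a stream checker
  keeps, the toy model below tracks whether each stream handle is open or
  closed and flags double closes and leaks. It is a standalone sketch and does
  not use the Clang checker API; <code>StreamState</code>,
  <code>Event</code>, and <code>checkPath</code> are hypothetical names, and
  the string handles merely stand in for symbolic <code>FILE *</code>
  values.</p>

```cpp
#include <map>
#include <string>
#include <vector>

// Toy model only: none of these names come from Clang.
enum class StreamState { Opened, Closed };
enum class EventKind { Open, Close };

struct Event {
  EventKind kind;
  std::string handle; // stands in for the symbolic value of a FILE *
};

// Walks one execution path and returns the warnings a stream checker
// could emit for it.
std::vector<std::string> checkPath(const std::vector<Event> &path) {
  std::map<std::string, StreamState> state;
  std::vector<std::string> warnings;

  for (const Event &e : path) {
    if (e.kind == EventKind::Open) {
      state[e.handle] = StreamState::Opened;
      continue;
    }
    // EventKind::Close: closing an already closed handle is a double close.
    auto it = state.find(e.handle);
    if (it != state.end() && it->second == StreamState::Closed)
      warnings.push_back("double close: " + e.handle);
    else
      state[e.handle] = StreamState::Closed;
  }

  // Any handle still open at the end of the path is reported as a leak.
  for (const auto &entry : state)
    if (entry.second == StreamState::Opened)
      warnings.push_back("leak: " + entry.first);

  return warnings;
}
```

  <p>A production checker would additionally have to handle the wider family
  of stream functions, failed opens, and handles that escape the analyzed
  code, none of which this sketch attempts.</p>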
  </li>
</ul>

</div>
</div>
</body>
</html>