<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Open Projects</title>
  <link type="text/css" rel="stylesheet" href="menu.css">
  <link type="text/css" rel="stylesheet" href="content.css">
  <script type="text/javascript" src="scripts/menu.js"></script>
</head>
<body>

<div id="page">
<!--#include virtual="menu.html.incl"-->
<div id="content">

<h1>Open Projects</h1>

<p>This page lists several projects that would boost the analyzer's usability
and power. Most of the projects listed here are infrastructure-related, so this
list is an addition to the <a href="potential_checkers.html">potential checkers
list</a>. If you are interested in tackling one of these, please send an email
to the <a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev">cfe-dev
mailing list</a> to notify other members of the community.</p>

<ul>
  <li>Core Analyzer Infrastructure
  <ul>
    <li>Explicitly model standard library functions with <tt>BodyFarm</tt>.
    <p><tt><a href="http://clang.llvm.org/doxygen/classclang_1_1BodyFarm.html">BodyFarm</a></tt>
    allows the analyzer to explicitly model functions whose definitions are
    not available during analysis. Modeling more of the widely used functions
    (such as the members of <tt>std::string</tt>) will improve the precision of
    the analysis.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Implement generalized loop execution modeling.
    <p>Currently, the analyzer simply unrolls each loop <tt>N</tt> times. This
    means that it will not execute any code after the loop if the loop is
    guaranteed to execute more than <tt>N</tt> times. This results in lost
    basic block coverage. We could continue exploring the path if we could
    model a generic <tt>i</tt>-th iteration of a loop.
    <i>(Difficulty: Hard)</i></p>
    </li>

    <li>Enhance CFG to model C++ temporaries properly.
    <p>There is an existing implementation of this, but it's not complete and
    is disabled in the analyzer.
    <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Enhance CFG to model exception handling properly.
    <p>Currently, exceptions are treated as "black holes", and
    exception-handling control structures are poorly modeled (to be
    conservative). This could be much improved for both C++ and Objective-C
    exceptions.
    <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Enhance CFG to model C++ <code>new</code> more precisely.
    <p>The current representation of <code>new</code> does not provide an easy
    way for the analyzer to model the call to a memory allocation function
    (<code>operator new</code>) and then initialize the result with a
    constructor call. The problem is discussed at length in
    <a href="http://llvm.org/bugs/show_bug.cgi?id=12014">PR12014</a>.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Enhance CFG to model C++ <code>delete</code> more precisely.
    <p>Similarly, the representation of <code>delete</code> does not include
    the call to the destructor, followed by the call to the deallocation
    function (<code>operator delete</code>). One particular issue
    (<tt>noreturn</tt> destructors) is discussed in
    <a href="http://llvm.org/bugs/show_bug.cgi?id=15599">PR15599</a>.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Track type info through casts more precisely.
    <p>The DynamicTypePropagation checker is in charge of inferring a region's
    dynamic type based on what operations the code is performing. Casts are a
    rich source of type information that the analyzer currently ignores. They
    are tricky to get right, but might have very useful consequences.
    <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Design and implement alpha-renaming.
    <p>Implement unification of two symbolic values along a path after they are
    determined to be equal via comparison.
    This would allow us to reduce the
    number of false positives and would be a stepping stone toward more advanced
    analyses, such as summary-based interprocedural and cross-translation-unit
    analysis.
    <i>(Difficulty: Hard)</i></p>
    </li>
  </ul>
  </li>

  <li>Bug Reporting
  <ul>
    <li>Add support for displaying cross-file diagnostic paths in HTML output
    (used by <tt>scan-build</tt>).
    <p>Currently, <tt>scan-build</tt> output does not display reports that span
    multiple files. The main problem is that we do not have a good format to
    display such paths in HTML output. <i>(Difficulty: Medium)</i></p>
    </li>

    <li>Relate bugs to checkers / "bug types".
    <p>We need to come up with an API that relates bug reports to the checkers
    that produce them, and refactor the existing code to use the new API. This
    would allow us to identify the checker from the bug report, which paves the
    way for selective control of certain checks.
    <i>(Difficulty: Easy-Medium)</i></p>
    </li>

    <li>Refactor path diagnostic generation in <a href="http://clang.llvm.org/doxygen/BugReporter_8cpp_source.html">BugReporter.cpp</a>.
    <p>It would be great to have more code reuse between the "Minimal" and
    "Extensive" PathDiagnostic generation algorithms. One idea is to create an
    IR for representing path diagnostics, which would later be used to
    generate minimal or extensive report output. <i>(Difficulty: Medium)</i></p>
    </li>
  </ul>
  </li>

  <li>Other Infrastructure
  <ul>
    <li>Rewrite <tt>scan-build</tt> (in Python).
    <p><i>(Difficulty: Easy)</i></p>
    </li>

    <li>Do a better job interposing on a compilation.
    <p>Currently, <tt>scan-build</tt> just sets the <tt>CC</tt> and <tt>CXX</tt>
    environment variables to its wrapper scripts, which then call into an
    underlying platform compiler.
    This is problematic for any project that
    doesn't exclusively use <tt>CC</tt> and <tt>CXX</tt> to control its
    compilers.</p>
    <p><i>(Difficulty: Medium-Hard)</i></p>
    </li>

    <li>Create an <tt>analyzer_annotate</tt> attribute for the analyzer
    annotations.
    <p>We would like to put all analyzer attributes behind a fence so that we
    could add or remove them without worrying that compiler (not analyzer)
    users depend on them. Design and implement such a generic analyzer
    attribute in the compiler. <i>(Difficulty: Medium)</i></p>
    </li>
  </ul>
  </li>

  <li>Enhanced Checks
  <ul>
    <li>Implement a production-ready StreamChecker.
    <p>A SimpleStreamChecker was presented in the "Building a Checker in 24
    Hours" talk
    (<a href="http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf">slides</a>,
    <a href="http://llvm.org/devmtg/2012-11/videos/Zaks-Rose-Checker24Hours.mp4">video</a>).
    We need to implement a production version of the checker with a richer set
    of APIs and evaluate it by running it on real codebases.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Extend the Malloc checker with reasoning about custom allocator,
    deallocator, and ownership-transfer functions.
    <p>This would require extending the MallocPessimistic checker to reason
    about annotated functions. Ideally, this work would rely on
    the <tt>analyzer_annotate</tt> attribute, as described above.
    <i>(Difficulty: Easy)</i></p>
    </li>

    <li>Implement an iterator invalidation checker.
    <p><i>(Difficulty: Easy)</i></p>
    </li>

    <li>Write checkers that catch copy-and-paste errors.
    <p>Take a look at the
    <a href="http://pages.cs.wisc.edu/~shanlu/paper/TSE-CPMiner.pdf">CP-Miner</a>
    paper for inspiration.
    <i>(Difficulty: Medium-Hard)</i></p>
    </li>
  </ul>
  </li>
</ul>

</div>
</div>
</body>
</html>