<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
  <title>Clang - Expressive Diagnostics</title>
  <link type="text/css" rel="stylesheet" href="menu.css">
  <link type="text/css" rel="stylesheet" href="content.css">
  <style type="text/css">
  .warn { color:magenta; }
  .err { color:red; }
  .snip { color:darkgreen; }
  .point { color:blue; }
  </style>
</head>
<body>

<!--#include virtual="menu.html.incl"-->

<div id="content">


<!--=======================================================================-->
<h1>Expressive Diagnostics</h1>
<!--=======================================================================-->

<p>In addition to being fast and functional, we aim to make Clang extremely user
friendly.  For a command-line compiler, this largely comes down to making the
diagnostics (error and warning messages) generated by the compiler as useful as
possible.  There are several ways that we do this.  This section describes the
experience provided by the command-line compiler, contrasting Clang's output
with GCC 4.2's output in several examples.
<!--
Other clients
that embed Clang and extract equivalent information through internal APIs.-->
</p>

<h2>Column Numbers and Caret Diagnostics</h2>

<p>First, all diagnostics produced by Clang include full column number
information. The Clang command-line compiler driver uses this information
to print "point diagnostics".
(IDEs can use the information to display in-line error markup.)
Precise error location in the source is a feature provided by many commercial
compilers, but is generally missing from open source
compilers.  Knowing the exact column makes it much easier to understand
what is wrong in a particular piece of code.</p>

<p>The point (the blue "^" character) shows exactly where the problem is, even
inside of a string.  This makes it easy to jump to the problem and
helps when multiple instances of the same character occur on a line. (We'll
revisit this in the following examples.)</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only -Wformat format-strings.c</b>
  format-strings.c:91: warning: too few arguments for format
  $ <b>clang -fsyntax-only format-strings.c</b>
  format-strings.c:91:13: <span class="warn">warning:</span> '.*' specified field precision is missing a matching 'int' argument
  <span class="snip">  printf("%.*d");</span>
  <span class="point">            ^</span>
</pre>

<h2>Range Highlighting for Related Text</h2>

<p>Clang captures and accurately tracks range information for expressions,
statements, and other constructs in your program and uses this to make
diagnostics highlight related information.  In the following somewhat
nonsensical example, you don't even need to see the original source code to
understand what is wrong based on the Clang error. Because Clang prints a
point, you know exactly <em>which</em> plus it is complaining about.  The range
information highlights the left and right sides of the plus, which makes it
immediately obvious what the compiler is talking about.
Range information is very useful for
cases involving operator precedence and in many other situations.</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c:7: error: invalid operands to binary + (have 'int' and 'struct A')
  $ <b>clang -fsyntax-only t.c</b>
  t.c:7:39: <span class="err">error:</span> invalid operands to binary expression ('int' and 'struct A')
  <span class="snip">  return y + func(y ? ((SomeA.X + 40) + SomeA) / 42 + SomeA.X : SomeA.X);</span>
  <span class="point">                       ~~~~~~~~~~~~~~ ^ ~~~~~</span>
</pre>
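
<p>To make the point-and-range line concrete, here is a minimal C++ sketch of
how such a marker line could be assembled from a caret column and a set of
highlighted column ranges. The <code>ColumnRange</code> type and the
<code>renderMarkerLine</code> function are hypothetical names invented for this
illustration; they are not Clang's internal API, and Clang's real rendering
handles many more cases (tabs, wide characters, long lines).</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical type for illustration only; not one of Clang's classes.
struct ColumnRange {
  std::size_t begin;  // 1-based column of the first highlighted character
  std::size_t end;    // 1-based column of the last highlighted character
};

// Build the marker line printed under a source line: '~' under every
// highlighted range and '^' at the caret column. The caret is written
// last, so it wins if it overlaps a range.
std::string renderMarkerLine(std::size_t caretColumn,
                             const std::vector<ColumnRange> &ranges) {
  assert(caretColumn >= 1);

  // The line must be wide enough for the caret and every range.
  std::size_t width = caretColumn;
  for (const ColumnRange &r : ranges) {
    assert(r.begin >= 1 && r.begin <= r.end);
    if (r.end > width)
      width = r.end;
  }

  std::string line(width, ' ');
  for (const ColumnRange &r : ranges)
    for (std::size_t col = r.begin; col <= r.end; ++col)
      line[col - 1] = '~';
  line[caretColumn - 1] = '^';
  return line;
}
```

<p>Under these assumptions, calling
<code>renderMarkerLine(39, {{24, 37}, {41, 45}})</code> reproduces the marker
line shown in the example above: the two operand ranges are underlined and the
caret sits on the plus in column 39.</p>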

<h2>Precision in Wording</h2>

<p>We have also tried hard to make the diagnostics that come
out of Clang contain exactly the pertinent information about what is wrong and
why.  In the example above, we tell you what the inferred types are for
the left and right hand sides, and we don't repeat what is obvious from the
point (e.g., that this is a "binary +").</p>

<p>Many other examples abound. In the following example, not only do we tell you
that there is a problem with the * and point to it, we say exactly why and tell
you what the type is (in case it is
a complicated subexpression, such as a call to an overloaded function).  This
sort of attention to detail makes it much easier to understand and fix problems
quickly.</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c:5: error: invalid type argument of 'unary *'
  $ <b>clang -fsyntax-only t.c</b>
  t.c:5:11: <span class="err">error:</span> indirection requires pointer operand ('int' invalid)
  <span class="snip">  int y = *SomeA.X;</span>
  <span class="point">          ^~~~~~~~</span>
</pre>

<h2>No Pretty Printing of Expressions in Diagnostics</h2>

<p>Since Clang has range highlighting, it never needs to pretty print your code
back out to you.  GCC can produce inscrutable error messages in some cases when
it tries to do this.  In this example P and Q have type "int*":</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  #'exact_div_expr' not supported by pp_c_expression#'t.c:12: error: called object  is not a function
  $ <b>clang -fsyntax-only t.c</b>
  t.c:12:8: <span class="err">error:</span> called object type 'int' is not a function or function pointer
  <span class="snip">  (P-Q)();</span>
  <span class="point">  ~~~~~^</span>
</pre>

<p>This can be particularly bad in G++, which often emits errors
   containing lowered vtable references.  For example:</p>

<pre>
  $ <b>cat t.cc</b>
  struct a {
    virtual int bar();
  };
  
  struct foo : public virtual a {
  };
  
  void test(foo *P) {
    return P->bar() + *P;
  }
  $ <b>gcc-4.2 t.cc</b>
  t.cc: In function 'void test(foo*)':
  t.cc:9: error: no match for 'operator+' in '(((a*)P) + (*(long int*)(P-&gt;foo::&lt;anonymous&gt;.a::_vptr$a + -0x00000000000000020)))-&gt;a::bar() + * P'
  t.cc:9: error: return-statement with a value, in function returning 'void'
  $ <b>clang t.cc</b>
  t.cc:9:18: <span class="err">error:</span> invalid operands to binary expression ('int' and 'foo')
  <span class="snip">  return P->bar() + *P;</span>
  <span class="point">         ~~~~~~~~ ^ ~~</span>
</pre>


<h2>Typedef Preservation and Selective Unwrapping</h2>

<p>Many programmers use high-level user-defined types, typedefs, and other
syntactic sugar to refer to types in their program.  This is useful because
typedefs can abbreviate otherwise very long types, and it is useful to preserve
the type name in diagnostics.  However, sometimes very simple typedefs wrap
trivial types, and it is important to strip off the typedef to understand what
is going on.  Clang aims to handle both cases well.</p>

<p>The following example shows where it is important to preserve
a typedef in C. Here the type printed by GCC isn't even valid, but if the error
were about a very long and complicated type (as often happens in C++), the error
message would be ugly just because it was long and hard to read.</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c:15: error: invalid operands to binary / (have 'float __vector__' and 'const int *')
  $ <b>clang -fsyntax-only t.c</b>
  t.c:15:11: <span class="err">error:</span> can't convert between vector values of different size ('__m128' and 'int const *')
  <span class="snip">  myvec[1]/P;</span>
  <span class="point">  ~~~~~~~~^~</span>
</pre>

<p>The following example shows where it is useful for the compiler to expose
underlying details of a typedef. If the user was somehow confused about how the
system "pid_t" typedef is defined, Clang helpfully displays it with "aka".</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c:13: error: request for member 'x' in something not a structure or union
  $ <b>clang -fsyntax-only t.c</b>
  t.c:13:9: <span class="err">error:</span> member reference base type 'pid_t' (aka 'int') is not a structure or union
  <span class="snip">  myvar = myvar.x;</span>
  <span class="point">          ~~~~~ ^</span>
</pre>
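
<p>The "aka" form can be pictured with a small C++ sketch. The
<code>describeType</code> helper below is a hypothetical function written for
this page, not part of Clang: it keeps the spelling the user wrote and appends
the underlying type only when the two differ. Clang's actual rules for when to
unwrap a typedef are more selective than this plain string comparison, as the
'__m128' example above suggests.</p>

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of Clang: format a type for a diagnostic,
// preserving the name as written and adding "(aka '...')" only when the
// underlying type is spelled differently.
std::string describeType(const std::string &asWritten,
                         const std::string &underlying) {
  assert(!asWritten.empty() && !underlying.empty());
  std::string text = "'" + asWritten + "'";
  if (asWritten != underlying)
    text += " (aka '" + underlying + "')";
  return text;
}
```

<p>With this sketch, <code>describeType("pid_t", "int")</code> produces
<code>'pid_t' (aka 'int')</code>, while <code>describeType("int", "int")</code>
produces just <code>'int'</code>.</p>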

<p>In C++, type preservation includes retaining any qualification written into type names. For example, if we take a small snippet of code such as:</p>

<blockquote>
<pre>
namespace services {
  struct WebService {  };
}
namespace myapp {
  namespace servers {
    struct Server {  };
  }
}

using namespace myapp;
void addHTTPService(servers::Server const &amp;server, ::services::WebService const *http) {
  server += http;
}
</pre>
</blockquote>

<p>and then compile it, we see that Clang both provides more accurate information and retains the types as written by the user (e.g., "servers::Server", "::services::WebService"):</p>

<pre>
  $ <b>g++-4.2 -fsyntax-only t.cpp</b>
  t.cpp:9: error: no match for 'operator+=' in 'server += http'
  $ <b>clang -fsyntax-only t.cpp</b>
  t.cpp:9:10: <span class="err">error:</span> invalid operands to binary expression ('servers::Server const' and '::services::WebService const *')
    <span class="snip">server += http;</span>
    <span class="point">~~~~~~ ^  ~~~~</span>
</pre>

<p>Naturally, type preservation extends to uses of templates, and Clang retains information about how a particular template specialization (like <code>std::vector&lt;Real&gt;</code>) was spelled within the source code. For example:</p>

<pre>
  $ <b>g++-4.2 -fsyntax-only t.cpp</b>
  t.cpp:12: error: no match for 'operator=' in 'str = vec'
  $ <b>clang -fsyntax-only t.cpp</b>
  t.cpp:12:7: <span class="err">error:</span> incompatible type assigning 'vector&lt;Real&gt;', expected 'std::string' (aka 'class std::basic_string&lt;char&gt;')
    <span class="snip">str = vec;</span>
        <span class="point">^ ~~~</span>
</pre>

<h2>Fix-it Hints</h2>

<p>"Fix-it" hints provide advice for fixing small, localized problems
in source code. When Clang produces a diagnostic about a particular
problem that it can work around (e.g., non-standard or redundant
syntax, missing keywords, common mistakes, etc.), it may also provide
specific guidance in the form of a code transformation to correct the
problem. In the following example, Clang warns about the use of a GCC
extension that has been considered obsolete since 1993. The underlined
code should be removed and replaced with the code below the
point line (".x =" or ".y =", respectively).</p>

<pre>
  $ <b>clang t.c</b>
  t.c:5:28: <span class="warn">warning:</span> use of GNU old-style field designator extension
  <span class="snip">struct point origin = { x: 0.0, y: 0.0 };</span>
                          <span class="err">~~</span> <span class="point">^</span>
                          <span class="snip">.x = </span>
  t.c:5:36: <span class="warn">warning:</span> use of GNU old-style field designator extension
  <span class="snip">struct point origin = { x: 0.0, y: 0.0 };</span>
                                  <span class="err">~~</span> <span class="point">^</span>
                                  <span class="snip">.y = </span>
</pre>

<p>"Fix-it" hints are most useful for
working around common user errors and misconceptions. For example, C++ users
commonly forget the syntax for explicit specialization of class templates,
as in the error in the following example. Again, after describing the problem,
Clang provides the fix--add <code>template&lt;&gt;</code>--as part of the
diagnostic.</p>

<pre>
  $ <b>clang t.cpp</b>
  t.cpp:9:3: <span class="err">error:</span> template specialization requires 'template&lt;&gt;'
    <span class="snip">struct iterator_traits&lt;file_iterator&gt; {</span>
    <span class="point">^</span>
    <span class="snip">template&lt;&gt; </span>
</pre>
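
<p>A fix-it hint can be thought of as a small edit to one source line: remove
some characters at a column, then insert replacement text there. The C++ sketch
below illustrates that idea only. The <code>FixItHint</code> struct and
<code>applyFixIt</code> function here are hypothetical names defined for this
example and should not be read as Clang's own data structures.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical description of a single-line fix-it, for illustration only.
struct FixItHint {
  std::size_t column;        // 1-based column where the edit starts
  std::size_t removeLength;  // characters to remove (0 means pure insertion)
  std::string insertText;    // text to insert at that column
};

// Apply the hint to one source line and return the corrected line.
std::string applyFixIt(const std::string &line, const FixItHint &hint) {
  // The edit may start one past the last character (insertion at the end).
  assert(hint.column >= 1 && hint.column - 1 <= line.size());
  assert(hint.removeLength <= line.size() - (hint.column - 1));

  std::string result = line;
  result.replace(hint.column - 1, hint.removeLength, hint.insertText);
  return result;
}
```

<p>For instance, applying <code>{1, 0, "template&lt;&gt; "}</code> to the
<code>struct iterator_traits&lt;file_iterator&gt; {</code> line prepends the
missing <code>template&lt;&gt;</code>, which is the transformation the
diagnostic above suggests.</p>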

<h2>Automatic Macro Expansion</h2>

<p>Many errors happen in macros, which are sometimes deeply nested.  With
traditional compilers, you need to dig deep into the definition of the macro to
understand how you got into trouble.  The following simple example shows how
Clang helps by automatically printing instantiation information and
nested range information for diagnostics as they are instantiated through
macros.  It also shows how some of the other pieces work together in a bigger
example.</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c: In function 'test':
  t.c:80: error: invalid operands to binary &lt; (have 'struct mystruct' and 'float')
  $ <b>clang -fsyntax-only t.c</b>
  t.c:80:3: <span class="err">error:</span> invalid operands to binary expression ('typeof(P)' (aka 'struct mystruct') and 'typeof(F)' (aka 'float'))
  <span class="snip">  X = MYMAX(P, F);</span>
  <span class="point">      ^~~~~~~~~~~</span>
  t.c:76:94: note: instantiated from:
  <span class="snip">#define MYMAX(A,B)    __extension__ ({ __typeof__(A) __a = (A); __typeof__(B) __b = (B); __a &lt; __b ? __b : __a; })</span>
  <span class="point">                                                                                         ~~~ ^ ~~~</span>
</pre>

<p>Here's another real-world warning that occurs in the "window" Unix package (which
implements the "wwopen" class of APIs):</p>

<pre>
  $ <b>clang -fsyntax-only t.c</b>
  t.c:22:2: <span class="warn">warning:</span> type specifier missing, defaults to 'int'
  <span class="snip">        ILPAD();</span>
  <span class="point">        ^</span>
  t.c:17:17: note: instantiated from:
  <span class="snip">#define ILPAD() PAD((NROW - tt.tt_row) * 10)    /* 1 ms per char */</span>
  <span class="point">                ^</span>
  t.c:14:2: note: instantiated from:
  <span class="snip">        register i; \</span>
  <span class="point">        ^</span>
</pre>

<p>In practice, we've found that Clang's treatment of macros is actually more useful in multiply nested
macros than in simple ones.</p>

<h2>Quality of Implementation and Attention to Detail</h2>

<p>Finally, we have put a lot of work into polishing the little things, because
little things add up over time and contribute to a great user experience.</p>

<p>The following example shows a small tweak, where we tell you to put the semicolon at
the end of the line that is missing it (line 4) instead of at the beginning of
the following line (line 5).  This is particularly important with fix-it hints
and point diagnostics, because otherwise you don't get the important context.
</p>

<pre>
  $ <b>gcc-4.2 t.c</b>
  t.c: In function 'foo':
  t.c:5: error: expected ';' before '}' token
  $ <b>clang t.c</b>
  t.c:4:8: <span class="err">error:</span> expected ';' after expression
  <span class="snip">  bar()</span>
  <span class="point">       ^</span>
  <span class="point">       ;</span>
</pre>

<p>The following example shows much better error recovery than GCC. The message coming out
of GCC says nothing about the real cause of the problem. Clang tries much harder
and reports that the type name is unknown.</p>

<pre>
  $ <b>gcc-4.2 t.c</b>
  t.c:3: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token
  $ <b>clang t.c</b>
  t.c:3:1: <span class="err">error:</span> unknown type name 'foo_t'
  <span class="snip">foo_t *P = 0;</span>
  <span class="point">^</span>
</pre>

<p>The following example shows that we recover from the simple case of
forgetting a semicolon after a class or struct definition much better than GCC.</p>

<pre>
  $ <b>cat t.cc</b>
  template&lt;class T&gt;
  class a {}
  class temp {};
  a&lt;temp&gt; b;
  struct b {
  }
  $ <b>gcc-4.2 t.cc</b>
  t.cc:3: error: multiple types in one declaration
  t.cc:4: error: non-template type 'a' used as a template
  t.cc:4: error: invalid type in declaration before ';' token
  t.cc:6: error: expected unqualified-id at end of input
  $ <b>clang t.cc</b>
  t.cc:2:11: <span class="err">error:</span> expected ';' after class
  <span class="snip">class a {}</span>
  <span class="point">          ^</span>
  <span class="point">          ;</span>
  t.cc:6:2: <span class="err">error:</span> expected ';' after struct
  <span class="snip">}</span>
  <span class="point"> ^</span>
  <span class="point"> ;</span>
</pre>

<p>While each of these details is minor, we feel that they all add up to provide
a much more polished experience.</p>

</div>
</body>
</html>

    375