<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
  <title>Clang - Expressive Diagnostics</title>
  <link type="text/css" rel="stylesheet" href="menu.css" />
  <link type="text/css" rel="stylesheet" href="content.css" />
  <style type="text/css">
</style>
</head>
<body>

<!--#include virtual="menu.html.incl"-->

<div id="content">


<!--=======================================================================-->
<h1>Expressive Diagnostics</h1>
<!--=======================================================================-->

<p>In addition to being fast and functional, we aim to make Clang extremely user
friendly.  For a command-line compiler, this largely comes down to making the
diagnostics (error and warning messages) generated by the compiler as useful as
possible.  There are several ways that we do this.  This section describes the
experience provided by the command-line compiler, contrasting Clang output with
GCC 4.2's output in several examples.
<!--
Other clients
that embed Clang and extract equivalent information through internal APIs.-->
</p>

<h2>Column Numbers and Caret Diagnostics</h2>

<p>First, all diagnostics produced by Clang include full column number
information. The <code>clang</code> command-line compiler driver uses this information
to print "caret diagnostics".
(IDEs can use the information to display in-line error markup.)
Precise error location in the source is a feature provided by many commercial
compilers, but is generally missing from open source
compilers.  It makes it easy to understand exactly
what is wrong in a particular piece of code.</p>

<p>The caret (the blue "^" character) shows exactly where the problem is, even
inside of a string.  This makes it easy to jump to the problem and
helps when multiple instances of the same character occur on a line. (We'll
revisit this in the following examples.)</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only -Wformat format-strings.c</b>
  format-strings.c:91: warning: too few arguments for format
  $ <b>clang -fsyntax-only format-strings.c</b>
  format-strings.c:91:13: <font color="magenta">warning:</font> '.*' specified field precision is missing a matching 'int' argument
  <font color="darkgreen">  printf("%.*d");</font>
  <font color="blue">            ^</font>
</pre>
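<p>To make the idea concrete, here is a minimal sketch in C of how a caret line
can be derived from a 1-based column number.  The function name
<code>render_caret</code> and its buffer-based interface are assumptions made
for illustration; this is not Clang's implementation, and it ignores tab
expansion and multi-byte characters.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes "<line>\n<padding>^\n" into out.
   column is 1-based, matching the file:line:column convention.
   Returns the number of characters written, or -1 if out is too small. */
int render_caret(const char *line, int column, char *out, size_t out_size)
{
    assert(line != NULL && out != NULL && column >= 1);

    size_t len = strlen(line);
    size_t pad = (size_t)(column - 1);
    /* source line + '\n' + padding + '^' + '\n' + terminating NUL */
    size_t needed = len + 1 + pad + 1 + 1 + 1;
    if (needed > out_size)
        return -1;

    char *p = out;
    memcpy(p, line, len);
    p += len;
    *p++ = '\n';
    memset(p, ' ', pad);   /* one space per column before the caret */
    p += pad;
    *p++ = '^';
    *p++ = '\n';
    *p = '\0';
    return (int)(p - out);
}
```

<p>With the <code>printf</code> line from the example above and column 13, the
caret lands under the <code>*</code> of the format string.</p>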

<h2>Range Highlighting for Related Text</h2>

<p>Clang captures and accurately tracks range information for expressions,
statements, and other constructs in your program and uses this to make
diagnostics highlight related information.  In the following somewhat
nonsensical example, you don't even need to open the original source file to
understand what is wrong based on the Clang error. Because Clang prints a
caret, you know exactly <em>which</em> plus it is complaining about.  The range
information highlights the left and right sides of the plus, which makes it
immediately obvious what the compiler is talking about.
Range information is especially useful for
cases involving operator precedence, among many others.</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c:7: error: invalid operands to binary + (have 'int' and 'struct A')
  $ <b>clang -fsyntax-only t.c</b>
  t.c:7:39: <font color="red">error:</font> invalid operands to binary expression ('int' and 'struct A')
  <font color="darkgreen">  return y + func(y ? ((SomeA.X + 40) + SomeA) / 42 + SomeA.X : SomeA.X);</font>
  <font color="blue">                       ~~~~~~~~~~~~~~ ^ ~~~~~</font>
</pre>
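<p>The underline in the example above can be sketched as follows.  This is an
illustrative C fragment, not Clang's code: the <code>col_range</code> type and
<code>render_underline</code> function are hypothetical names, and ranges are
assumed to be half-open intervals of 1-based columns on a single line.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical half-open column range [begin, end), 1-based. */
struct col_range {
    int begin;
    int end;
};

/* Fills out with '~' under every range and '^' at the caret column,
   spaces elsewhere, then trims trailing spaces. */
void render_underline(int caret, const struct col_range *ranges,
                      size_t n_ranges, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0 && caret >= 1);

    size_t width = out_size - 1;          /* leave room for the NUL */
    memset(out, ' ', width);

    for (size_t i = 0; i < n_ranges; i++)
        for (int c = ranges[i].begin; c < ranges[i].end; c++)
            if (c >= 1 && (size_t)c <= width)
                out[c - 1] = '~';

    if ((size_t)caret <= width)
        out[caret - 1] = '^';             /* the caret wins over a range */

    size_t end = width;
    while (end > 0 && out[end - 1] == ' ')
        end--;
    out[end] = '\0';
}
```

<p>For the example above, the caret is at column 39 and the two operand ranges
cover columns 24 through 37 and 41 through 45.</p>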

<h2>Precision in Wording</h2>

<p>We have also tried hard to make the diagnostics that come
out of Clang contain exactly the pertinent information about what is wrong and
why.  In the example above, we tell you what the inferred types are for
the left and right hand sides, and we don't repeat what is obvious from the
caret (e.g., that this is a "binary +").</p>

<p>Many other examples abound. In the following example, not only do we tell you that there is a problem with the *
and point to it, we say exactly why and tell you what the type is (in case it is
a complicated subexpression, such as a call to an overloaded function).  This
sort of attention to detail makes it much easier to understand and fix problems
quickly.</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c:5: error: invalid type argument of 'unary *'
  $ <b>clang -fsyntax-only t.c</b>
  t.c:5:11: <font color="red">error:</font> indirection requires pointer operand ('int' invalid)
  <font color="darkgreen">  int y = *SomeA.X;</font>
  <font color="blue">          ^~~~~~~~</font>
</pre>

<h2>No Pretty Printing of Expressions in Diagnostics</h2>

<p>Since Clang has range highlighting, it never needs to pretty print your code
back out to you.  GCC can produce inscrutable error messages in some cases when
it tries to do this.  In this example, P and Q have type "int*":</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  #'exact_div_expr' not supported by pp_c_expression#'t.c:12: error: called object  is not a function
  $ <b>clang -fsyntax-only t.c</b>
  t.c:12:8: <font color="red">error:</font> called object type 'int' is not a function or function pointer
  <font color="darkgreen">  (P-Q)();</font>
  <font color="blue">  ~~~~~^</font>
</pre>

<p>This can be particularly bad in G++, which often emits errors
   containing lowered vtable references.  For example:</p>

<pre>
  $ <b>cat t.cc</b>
  struct a {
    virtual int bar();
  };
  
  struct foo : public virtual a {
  };
  
  void test(foo *P) {
    return P->bar() + *P;
  }
  $ <b>gcc-4.2 t.cc</b>
  t.cc: In function 'void test(foo*)':
  t.cc:9: error: no match for 'operator+' in '(((a*)P) + (*(long int*)(P-&gt;foo::&lt;anonymous&gt;.a::_vptr$a + -0x00000000000000020)))-&gt;a::bar() + * P'
  t.cc:9: error: return-statement with a value, in function returning 'void'
  $ <b>clang t.cc</b>
  t.cc:9:18: <font color="red">error:</font> invalid operands to binary expression ('int' and 'foo')
  <font color="darkgreen">  return P->bar() + *P;</font>
  <font color="blue">         ~~~~~~~~ ^ ~~</font>
</pre>


<h2>Typedef Preservation and Selective Unwrapping</h2>

<p>Many programmers use high-level user-defined types, typedefs, and other
syntactic sugar to refer to types in their program.  This is useful because they
can abbreviate otherwise very long types, and it is useful to preserve the
type name in diagnostics.  However, sometimes very simple typedefs wrap
trivial types, and it is important to strip off the typedef to understand what
is going on.  Clang aims to handle both cases well.</p>

<p>The following example shows where it is important to preserve
a typedef in C. Here the type printed by GCC isn't even valid, but if the error
were about a very long and complicated type (as often happens in C++), the error
message would be ugly just because it was long and hard to read.</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c:15: error: invalid operands to binary / (have 'float __vector__' and 'const int *')
  $ <b>clang -fsyntax-only t.c</b>
  t.c:15:11: <font color="red">error:</font> can't convert between vector values of different size ('__m128' and 'int const *')
  <font color="darkgreen">  myvec[1]/P;</font>
  <font color="blue">  ~~~~~~~~^~</font>
</pre>

<p>The following example shows where it is useful for the compiler to expose
underlying details of a typedef. If the user were somehow confused about how the
system "pid_t" typedef is defined, Clang helpfully displays it with "aka".</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c:13: error: request for member 'x' in something not a structure or union
  $ <b>clang -fsyntax-only t.c</b>
  t.c:13:9: <font color="red">error:</font> member reference base type 'pid_t' (aka 'int') is not a structure or union
  <font color="darkgreen">  myvar = myvar.x;</font>
  <font color="blue">          ~~~~~ ^</font>
</pre>
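<p>The two behaviours above (keeping <code>'__m128'</code> as written, but
printing <code>'pid_t' (aka 'int')</code>) can be sketched with a small
formatting helper.  This is a hypothetical illustration in C: the name
<code>format_type</code> and the <code>show_underlying</code> flag are
assumptions, and the flag merely stands in for whatever rule the compiler uses
to decide when unwrapping is helpful, which this page does not describe.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats a type for a diagnostic.
   as_written      - the type name as spelled in the source
   underlying      - the type with typedefs stripped
   show_underlying - caller's decision on whether unwrapping helps
   Returns what snprintf returns. */
int format_type(const char *as_written, const char *underlying,
                int show_underlying, char *out, size_t out_size)
{
    assert(as_written != NULL && underlying != NULL && out != NULL);

    /* Never print "aka" when it would only repeat the same name. */
    if (!show_underlying || strcmp(as_written, underlying) == 0)
        return snprintf(out, out_size, "'%s'", as_written);

    return snprintf(out, out_size, "'%s' (aka '%s')", as_written, underlying);
}
```

<p>Keeping the name as written first and the underlying type second means the
reader always sees the spelling used in their own code.</p>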

<p>In C++, type preservation includes retaining any qualification written into type names. For example, if we take a small snippet of code such as:</p>

<blockquote>
<pre>
namespace services {
  struct WebService {  };
}
namespace myapp {
  namespace servers {
    struct Server {  };
  }
}

using namespace myapp;
void addHTTPService(servers::Server const &amp;server, ::services::WebService const *http) {
  server += http;
}
</pre>
</blockquote>

<p>and then compile it, we see that Clang both provides more accurate information and retains the types as written by the user (e.g., "servers::Server", "::services::WebService"):</p>

<pre>
  $ <b>g++-4.2 -fsyntax-only t.cpp</b>
  t.cpp:9: error: no match for 'operator+=' in 'server += http'
  $ <b>clang -fsyntax-only t.cpp</b>
  t.cpp:9:10: <font color="red">error:</font> invalid operands to binary expression ('servers::Server const' and '::services::WebService const *')
    <font color="darkgreen">server += http;</font>
    <font color="blue">~~~~~~ ^  ~~~~</font>
</pre>

<p>Naturally, type preservation extends to uses of templates, and Clang retains information about how a particular template specialization (like <code>std::vector&lt;Real&gt;</code>) was spelled within the source code. For example:</p>

<pre>
  $ <b>g++-4.2 -fsyntax-only t.cpp</b>
  t.cpp:12: error: no match for 'operator=' in 'str = vec'
  $ <b>clang -fsyntax-only t.cpp</b>
  t.cpp:12:7: <font color="red">error:</font> incompatible type assigning 'vector&lt;Real&gt;', expected 'std::string' (aka 'class std::basic_string&lt;char&gt;')
    <font color="darkgreen">str = vec</font>;
        <font color="blue">^ ~~~</font>
</pre>

<h2>Fix-it Hints</h2>

<p>"Fix-it" hints provide advice for fixing small, localized problems
in source code. When Clang produces a diagnostic about a particular
problem that it can work around (e.g., non-standard or redundant
syntax, missing keywords, common mistakes, etc.), it may also provide
specific guidance in the form of a code transformation to correct the
problem. In the following example, Clang warns about the use of a GCC
extension that has been considered obsolete since 1993. The underlined
code should be removed, then replaced with the code below the
caret line (".x =" or ".y =", respectively).</p>

<pre>
  $ <b>clang t.c</b>
  t.c:5:28: <font color="magenta">warning:</font> use of GNU old-style field designator extension
  <font color="darkgreen">struct point origin = { x: 0.0, y: 0.0 };</font>
                          <font color="red">~~</font> <font color="blue">^</font>
                          <font color="darkgreen">.x = </font>
  t.c:5:36: <font color="magenta">warning:</font> use of GNU old-style field designator extension
  <font color="darkgreen">struct point origin = { x: 0.0, y: 0.0 };</font>
                                  <font color="red">~~</font> <font color="blue">^</font>
                                  <font color="darkgreen">.y = </font>
</pre>

<p>"Fix-it" hints are most useful for
working around common user errors and misconceptions. For example, C++ users
commonly forget the syntax for explicit specialization of class templates,
as in the following example. Again, after describing the problem,
Clang provides the fix (add <code>template&lt;&gt;</code>) as part of the
diagnostic.</p>

<pre>
  $ <b>clang t.cpp</b>
  t.cpp:9:3: <font color="red">error:</font> template specialization requires 'template&lt;&gt;'
    struct iterator_traits&lt;file_iterator&gt; {
    <font color="blue">^</font>
    <font color="darkgreen">template&lt;&gt; </font>
</pre>
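<p>A fix-it hint can be thought of as a small edit: a column, a number of
characters to remove, and text to insert.  The C sketch below applies such an
edit to a single source line.  The <code>fixit</code> structure and
<code>apply_fixit</code> function are hypothetical names chosen for this
illustration and do not describe Clang's internal representation.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fix-it: replace remove_len characters starting at the
   1-based column with insert_text. remove_len == 0 is a pure insertion. */
struct fixit {
    int column;
    size_t remove_len;
    const char *insert_text;
};

/* Applies fix to line and writes the result to out.
   Returns 0 on success, -1 if the edit is out of range or out is too small. */
int apply_fixit(const char *line, const struct fixit *fix,
                char *out, size_t out_size)
{
    assert(line != NULL && fix != NULL && out != NULL);
    assert(fix->column >= 1 && fix->insert_text != NULL);

    size_t len = strlen(line);
    size_t start = (size_t)(fix->column - 1);
    if (start > len || fix->remove_len > len - start)
        return -1;

    size_t ins = strlen(fix->insert_text);
    size_t tail = len - start - fix->remove_len;
    size_t new_len = start + ins + tail;
    if (new_len + 1 > out_size)
        return -1;

    memcpy(out, line, start);                                   /* text before the edit */
    memcpy(out + start, fix->insert_text, ins);                 /* inserted text */
    memcpy(out + start + ins, line + start + fix->remove_len, tail); /* rest of the line */
    out[new_len] = '\0';
    return 0;
}
```

<p>Inserting <code>"template&lt;&gt; "</code> at column 3 of the
<code>iterator_traits</code> line above, with nothing removed, yields the
corrected declaration.</p>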

<h2>Automatic Macro Expansion</h2>

<p>Many errors happen in macros that are sometimes deeply nested.  With
traditional compilers, you need to dig deep into the definition of the macro to
understand how you got into trouble.  The following simple example shows how
Clang helps you out by automatically printing instantiation information and
nested range information for diagnostics as they are instantiated through macros.
It also shows how some of the other pieces work in a bigger example.</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c: In function 'test':
  t.c:80: error: invalid operands to binary &lt; (have 'struct mystruct' and 'float')
  $ <b>clang -fsyntax-only t.c</b>
  t.c:80:3: <font color="red">error:</font> invalid operands to binary expression ('typeof(P)' (aka 'struct mystruct') and 'typeof(F)' (aka 'float'))
  <font color="darkgreen">  X = MYMAX(P, F);</font>
  <font color="blue">      ^~~~~~~~~~~</font>
  t.c:76:94: note: instantiated from:
  <font color="darkgreen">#define MYMAX(A,B)    __extension__ ({ __typeof__(A) __a = (A); __typeof__(B) __b = (B); __a &lt; __b ? __b : __a; })</font>
  <font color="blue">                                                                                         ~~~ ^ ~~~</font>
</pre>

<p>Here's another real-world warning that occurs in the "window" Unix package (which
implements the "wwopen" class of APIs):</p>

<pre>
  $ <b>clang -fsyntax-only t.c</b>
  t.c:22:2: <font color="magenta">warning:</font> type specifier missing, defaults to 'int'
  <font color="darkgreen">        ILPAD();</font>
  <font color="blue">        ^</font>
  t.c:17:17: note: instantiated from:
  <font color="darkgreen">#define ILPAD() PAD((NROW - tt.tt_row) * 10)    /* 1 ms per char */</font>
  <font color="blue">                ^</font>
  t.c:14:2: note: instantiated from:
  <font color="darkgreen">        register i; \</font>
  <font color="blue">        ^</font>
</pre>

<p>In practice, we've found that Clang's treatment of macros is actually more useful in multiply nested
macros than in simple ones.</p>
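<p>The chain of "instantiated from" notes in the examples above can be modeled
as a linked list of locations, walked from the use site down to the innermost
macro body.  The C sketch below is only an illustration under that assumption;
<code>macro_loc</code> and <code>format_macro_chain</code> are hypothetical
names, and the source-line and caret output is omitted.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical location record: each location points at the macro
   definition it was instantiated from, or NULL at the innermost level. */
struct macro_loc {
    const char *file;
    int line;
    int column;
    const struct macro_loc *instantiated_from;
};

/* Writes the primary diagnostic followed by one note per macro level.
   Returns the number of characters written, or -1 if out is too small. */
int format_macro_chain(const struct macro_loc *use_site, const char *message,
                       char *out, size_t out_size)
{
    assert(use_site != NULL && message != NULL && out != NULL && out_size > 0);

    int n = snprintf(out, out_size, "%s:%d:%d: %s\n",
                     use_site->file, use_site->line, use_site->column, message);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    size_t used = (size_t)n;

    for (const struct macro_loc *loc = use_site->instantiated_from;
         loc != NULL; loc = loc->instantiated_from) {
        n = snprintf(out + used, out_size - used,
                     "%s:%d:%d: note: instantiated from:\n",
                     loc->file, loc->line, loc->column);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

<p>Because every level adds its own note, the output grows with the nesting
depth, which is where it helps most.</p>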

<h2>Quality of Implementation and Attention to Detail</h2>

<p>Finally, we have put a lot of work into polishing the little things, because
little things add up over time and contribute to a great user experience.</p>

<p>The following example shows a trivial little tweak, where we tell you to put the semicolon at
the end of the line that is missing it (line 4) instead of at the beginning of
the following line (line 5).  This is particularly important with fix-it hints
and caret diagnostics, because otherwise you don't get the important context.
</p>

<pre>
  $ <b>gcc-4.2 t.c</b>
  t.c: In function 'foo':
  t.c:5: error: expected ';' before '}' token
  $ <b>clang t.c</b>
  t.c:4:8: <font color="red">error:</font> expected ';' after expression
  <font color="darkgreen">  bar()</font>
  <font color="blue">       ^</font>
  <font color="blue">       ;</font>
</pre>
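<p>One way to get this behaviour is to report the position just past the end of
the previous token rather than the position of the unexpected token.  The C
sketch below illustrates the idea on raw text; <code>src_pos</code> and
<code>expected_semi_pos</code> are hypothetical names, the reconstructed
<code>t.c</code> in the usage note is an assumption, and a real compiler would
use its token stream (and skip comments) instead of scanning whitespace.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* 1-based line and column, as printed in file:line:column. */
struct src_pos {
    int line;
    int column;
};

/* Given the offset of the token that triggered the error, returns the
   position immediately after the preceding non-whitespace character. */
struct src_pos expected_semi_pos(const char *src, size_t next_token_offset)
{
    assert(src != NULL && next_token_offset <= strlen(src));

    /* Step back over whitespace, including newlines, before the token. */
    size_t end = next_token_offset;
    while (end > 0 && isspace((unsigned char)src[end - 1]))
        end--;

    /* Convert the byte offset into a line and column. */
    struct src_pos pos = { 1, 1 };
    for (size_t i = 0; i < end; i++) {
        if (src[i] == '\n') {
            pos.line++;
            pos.column = 1;
        } else {
            pos.column++;
        }
    }
    return pos;
}
```

<p>For a five-line file whose fourth line is <code>  bar()</code> and whose
fifth line is <code>}</code>, passing the offset of the closing brace yields
line 4, column 8, matching the Clang output above.</p>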

<p>The following example shows much better error recovery than GCC. The message coming out
of GCC does not identify the actual problem, an unknown type name. Clang tries much harder
and names the unknown type directly.</p>

<pre>
  $ <b>gcc-4.2 t.c</b>
  t.c:3: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token
  $ <b>clang t.c</b>
  t.c:3:1: <font color="red">error:</font> unknown type name 'foo_t'
  <font color="darkgreen">foo_t *P = 0;</font>
  <font color="blue">^</font>
</pre>

<p>The following example shows that we recover from the simple case of
forgetting a ; after a class or struct definition much better than GCC.</p>

<pre>
  $ <b>cat t.cc</b>
  template&lt;class T&gt;
  class a {}
  class temp {};
  a&lt;temp&gt; b;
  struct b {
  }
  $ <b>gcc-4.2 t.cc</b>
  t.cc:3: error: multiple types in one declaration
  t.cc:4: error: non-template type 'a' used as a template
  t.cc:4: error: invalid type in declaration before ';' token
  t.cc:6: error: expected unqualified-id at end of input
  $ <b>clang t.cc</b>
  t.cc:2:11: <font color="red">error:</font> expected ';' after class
  <font color="darkgreen">class a {}</font>
  <font color="blue">          ^</font>
  <font color="blue">          ;</font>
  t.cc:6:2: <font color="red">error:</font> expected ';' after struct
  <font color="darkgreen">}</font>
  <font color="blue"> ^</font>
  <font color="blue"> ;</font>
</pre>

<p>While each of these details is minor, we feel that they all add up to provide
a much more polished experience.</p>

</div>
</body>
</html>