1 \documentclass{article} 2 \usepackage{fancyvrb} 3 \usepackage{graphicx} 4 \usepackage{fullpage} 5 \usepackage{relsize} 6 \usepackage{url} 7 \usepackage{hevea} 8 \usepackage[shortcuts]{extdash} 9 \usepackage{textcomp} 10 % \usepackage{verbdef} 11 12 \def\topfraction{.9} 13 \def\dbltopfraction{\topfraction} 14 \def\floatpagefraction{\topfraction} % default .5 15 \def\dblfloatpagefraction{\topfraction} % default .5 16 \def\textfraction{.1} 17 18 %HEVEA \footerfalse % Disable hevea advertisement in footer 19 20 \newcommand{\code}[1]{\ifmmode{\mbox{\relax\ttfamily{#1}}}\else{\relax\ttfamily #1}\fi} 21 %% Hevea version omits "\smaller" 22 %HEVEA \renewcommand{\code}[1]{\ifmmode{\mbox{\ttfamily{#1}}}\else{\ttfamily #1}\fi} 23 24 \newcommand{\includeimage}[2]{ 25 \begin{center} 26 \ifhevea\imgsrc{#1.png}\else 27 \resizebox{!}{#2}{\includegraphics{figures/#1}} 28 \vspace{-1.5\baselineskip} 29 \fi 30 \end{center}} 31 32 % Add line between figure and text 33 \makeatletter 34 \def\topfigrule{\kern3\p@ \hrule \kern -3.4\p@} % the \hrule is .4pt high 35 \def\botfigrule{\kern-3\p@ \hrule \kern 2.6\p@} % the \hrule is .4pt high 36 \def\dblfigrule{\kern3\p@ \hrule \kern -3.4\p@} % the \hrule is .4pt high 37 \makeatother 38 39 40 \title{Annotation File Format Specification} 41 % Hevea ignores \date, so move the date into \author 42 \author{\url{https://checkerframework.org/annotation-file-utilities/} \\ 43 \today} 44 \date{} 45 46 \begin{document} 47 48 \maketitle 49 50 %HEVEA \setcounter{tocdepth}{2} 51 \tableofcontents 52 53 \section{Purpose: External storage of annotations\label{purpose}} 54 55 Java annotations are meta-data about Java program elements, as in 56 ``\code{@Deprecated class Date \{ \ldots\ \}}''. 57 Ordinarily, Java annotations are written in the source code of a 58 \code{.java} Java source file. When \code{javac} compiles the source code, 59 it inserts the annotations in the resulting \code{.class} file (as 60 ``attributes''). 61 62 Sometimes, it is convenient to specify the annotations outside the source 63 code or the \code{.class} file. 64 \begin{itemize} 65 %BEGIN LATEX 66 \itemsep 0pt \parskip 0pt 67 %END LATEX 68 \item 69 When source code is not available, a textual file provides a format for 70 writing and storing annotations that is much easier to read and modify 71 than a \code{.class} file. Even if the eventual purpose is to insert the 72 annotations in the \code{.class} file, the annotations must be specified 73 in some textual format first. 74 \item 75 Even when source code is available, sometimes it should not be changed, 76 yet annotations must be stored somewhere for use by tools. 77 \item 78 A textual file for annotations can eliminate code clutter. A developer 79 performing some specialized task (such as code verification, 80 parallelization, etc.)\ can store annotations in an annotation file without 81 changing the main version of the source code. (The developer's private 82 version of the code could contain the annotations, but the developer 83 could move them to the separate file before committing changes.) 84 \item 85 Tool writers may find it more convenient to use a textual file, rather 86 than writing a Java or \code{.class} file parser. 87 \item 88 When debugging annotation-processing tools, a textual file format 89 (extracted from the Java or \code{.class} files) is easier to read and 90 is easier for use in testing. 91 \end{itemize} 92 93 All of these uses require an external, textual file format for Java annotations. 94 The external file format should be easy for people to create, read, and 95 modify. 96 % 97 An ``annotation file'' serves this purpose by specifying a set of 98 Java annotations. 99 The Annotation File Utilities 100 (\url{https://checkerframework.org/annotation-file-utilities/}) are a set 101 of tools that process annotation files. 102 103 The file format discussed in this document supports both standard Java SE 5 104 declaration annotations and also the type annotations that are introduced by Java SE 8. 105 The file format provides a simple syntax to represent the structure of a Java 106 program. For annotations in method bodies of \code{.class} files the annotation 107 file closely follows 108 section ``Class File Format Extensions'' of the JSR 308 design document~\cite{JSR308-webpage-201310}, 109 which explains how the annotations are stored in the \code{.class} 110 file. 111 In that sense, the current design is extremely low-level, and users 112 probably would not want to write the files by hand (but they might fill in a 113 template that a tool generated automatically). As future work, we should 114 design a more 115 user-friendly format that permits Java signatures to be directly specified. 116 For \code{.java} source files, the file format provides a separate, higher-level 117 syntax for annotations in method bodies. 118 119 120 121 %% I don't like this, as it may force distributing logically connected 122 %% elements all over a file system. Users should be permitted, but not 123 %% forced, to adopt such a file structure. -MDE 124 % Each file corresponds to exactly one 125 % ``.class'' file, so (for instance) inner classes are written in 126 % separate annotation files, named in the same ``{\tt 127 % OuterClass\$InnerClass}'' pattern as the ``.class'' file. 128 129 130 By convention, an annotation file ends with ``\code{.jaif}'' (for ``Java 131 annotation index file''), but this is not required. 132 133 134 % \verbdef\lineend|"\n"| 135 136 %BEGIN LATEX 137 \DefineShortVerb{\|} 138 \SaveVerb{newline}|\n| 139 \UndefineShortVerb{\|} 140 \newcommand{\lineend}{\bnflit{\UseVerb{newline}}} 141 %END LATEX 142 %HEVEA \newcommand{\bs}{\char"5C} 143 %HEVEA \newcommand{\lineend}{\bnflit{\bs{}n}} 144 145 % literal 146 \newcommand{\bnflit}[1]{\textrm{``}\textbf{#1}\textrm{''}} 147 % non-terminal 148 \newcommand{\bnfnt}[1]{\textsf{\emph{#1}}} 149 % comment 150 \newcommand{\bnfcmt}{\rm \# } 151 % alternative 152 \newcommand{\bnfor}{\ensuremath{|}} 153 154 155 \section{Grammar\label{grammar}} 156 157 This section describes the annotation file format in detail by presenting it in 158 the form of a grammar. Section~\ref{grammar-conventions} details the conventions 159 of the grammar. Section~\ref{java-file-grammar} shows how to represent the 160 basic structure of a Java program (classes, methods, etc.) in an annotation 161 file. Section~\ref{annotations-grammar} shows how to add annotations to an 162 annotation file. 163 164 \subsection{Grammar conventions\label{grammar-conventions}} 165 166 Throughout this document, ``name'' is any valid Java simple name or 167 binary name, ``type'' is any valid type, and ``value'' is any 168 valid Java constant, and quoted strings are literal values. 169 % 170 The Kleene qualifiers ``*'' (zero or more), ``?'' (zero or one), and ``+'' 171 (one or more) denote plurality of a grammar element. 172 % 173 A vertical bar (``\bnfor'') separates alternatives. 174 Parentheses (``()'') denote grouping, and square brackets (``[]'') 175 denote optional syntax, which is equivalent to ``( \ldots\ )\ ?''\ but more concise. 176 We use the hash/pound/octothorpe symbol (``\#'') for comments within the grammar. 177 178 In the annotation file, 179 besides its use as token separator, 180 whitespace (excluding 181 newlines) is optional with one exception: no space is permitted 182 between an ``@'' character and a subsequent name. Indentation is 183 ignored, but is encouraged to maintain readability of the hierarchy of 184 program elements in the class (see the example in Section~\ref{example}). 185 186 Comments can be written throughout the annotation file using the double-slash 187 syntax employed by Java for single-line comments: anything following 188 two adjacent slashes (``//'') until the first newline is a comment. 189 This is omitted from the grammar for simplicity. 190 Block comments (``/* \ldots\ */'') are not allowed. 191 192 The line end symbol \lineend{} is used for all the different line end 193 conventions, that is, Windows- and Unix-style newlines are supported. 194 195 196 \subsection{Java file grammar\label{java-file-grammar}} 197 198 This section shows how to represent the basic structure of a Java program 199 (classes, methods, etc.) in an annotation file. For Java elements that can 200 contain annotations, this section will reference grammar productions contained 201 in Section~\ref{annotations-grammar}, which describes how annotations are used 202 in an annotation file. 203 204 An annotation file has the same basic structure as a Java program. That is, 205 there are packages, classes, fields and methods. 206 207 The annotation file may omit certain program elements --- for instance, it 208 may mention only some of the packages in a program, or only some of the 209 classes in a package, or only some of the fields or methods of a class. 210 Program elements that do not appear in the annotation file are treated as 211 unannotated. 212 213 214 \subsubsection{Package definitions\label{package-definitions}} 215 216 At the root of an annotation file is one or more package definitions. 217 A package definition describes a package containing a list of annotation 218 definitions and classes. A package definition also contains any 219 annotations on the package (such as those from a 220 \code{package-info.java} file). 221 222 \begin{tabbing} 223 \qquad \= \kill 224 \bnfnt{annotation-file} ::= \\ 225 \qquad \bnfnt{package-definition}+ \\ 226 \\ 227 \bnfnt{package-definition} ::= \\ 228 \qquad \bnflit{package} ( \bnflit{:} ) \bnfor{} ( \bnfnt{name} \bnflit{:} \bnfnt{decl-annotation}* ) \lineend \\ 229 \qquad ( \bnfnt{annotation-definition} \bnfor{} \bnfnt{class-definition} ) * 230 \end{tabbing} 231 232 \noindent 233 Use a package line of \code{package:} for the default package. Note that 234 annotations on the default package are not allowed. 235 236 237 \subsubsection{Class definitions\label{class-definitions}} 238 239 A class definition describes the annotations present on a class declaration, 240 as well fields and methods of the class. It is organized according to 241 the hierarchy of fields and methods in the class. 242 Note that we use \bnfnt{class-definition} also for interfaces, enums, and 243 annotation types (to specify annotations in an existing annotation type, not to 244 be confused with \bnfnt{annotation-definition}s described in 245 Section~\ref{annotation-definitions}, which defines annotations to be used 246 throughout an annotation file); for syntactic simplicity, we use \bnflit{class} 247 for 248 all such definitions. 249 % TODO: add test cases for this. 250 251 Inner classes are treated as ordinary classes whose names happen to contain 252 \code{\$} signs and must be defined at the top level of a class definition file. 253 (To change this, the grammar would have to be extended with a closing 254 delimiter for classes; otherwise, it would be ambiguous whether a 255 field or method appearing after an inner class definition belonged to the 256 inner class or the outer class.) The syntax for inner class names is the same as 257 is used by the \code{javac} compiler. A good way to get an idea of the inner 258 class names for a class is to compile the class and look at the filenames of the 259 \code{.class} files that are produced. 260 261 \begin{tabbing} 262 \qquad \= \kill 263 \bnfnt{class-definition} ::= \\ 264 \qquad \bnflit{class} \bnfnt{name} \bnflit{:} \bnfnt{decl-annotation}* \lineend \\ 265 % TODO: is the order really important? eg. can fields and methods not 266 % be mixed? 267 \qquad \bnfnt{typeparam-definition}* \\ 268 \qquad \bnfnt{typeparam-bound}* \\ 269 \qquad \bnfnt{extends}* \\ 270 \qquad \bnfnt{implements}* \\ 271 \qquad \bnfnt{field-definition}* \\ 272 \qquad \bnfnt{staticinit}* \\ 273 \qquad \bnfnt{instanceinit}* \\ 274 \qquad \bnfnt{method-definition}* 275 \end{tabbing} 276 277 \noindent 278 Annotations on the \bnflit{class} line are annotations on the class declaration, 279 not the class name. 280 281 282 \paragraph{Type parameter definitions} 283 284 The \bnfnt{typeparam-definition} production defines annotations on the 285 declaration of a type parameter, such as on \code{K} and \code{T} in 286 287 \begin{verbatim} 288 public class Class<K> { 289 public <T> void m() { 290 ... 291 } 292 } 293 \end{verbatim} 294 295 or on the type parameters on the left-hand side of a member reference, 296 as on \code{String} in \code{List<String>::size}. 297 298 \begin{tabbing} 299 \qquad \= \kill 300 \bnfnt{typeparam-definition} ::= \\ 301 \qquad \bnfcmt The integer is the zero-based type parameter index. \\ 302 \qquad \bnflit{typeparam} \bnfnt{integer} \bnflit{:} \bnfnt{type-annotation}* \lineend \\ 303 \qquad \bnfnt{compound-type}* 304 \end{tabbing} 305 306 307 \paragraph{Type Parameter Bounds} 308 309 The \bnfnt{typeparam-bound} production defines annotations on a bound of a 310 type variable declaration, such as on \code{Number} and \code{Date} in 311 312 \begin{verbatim} 313 public class Class<K extends Number> { 314 public <T extends Date> void m() { 315 ... 316 } 317 } 318 \end{verbatim} 319 320 \begin{tabbing} 321 \qquad \= \kill 322 \bnfnt{typeparam-bound} ::= \\ 323 % The bound should really be a sub-element of the typeparam! 324 \qquad \bnfcmt The integers are respectively the parameter and bound indexes of \\ 325 \qquad \bnfcmt the type parameter bound~\cite{JSR308-webpage-201310}. \\ 326 \qquad \bnflit{bound} \bnfnt{integer} \bnflit{\&} \bnfnt{integer} \bnflit{:} \bnfnt{type-annotation}* \lineend \\ 327 \qquad \bnfnt{compound-type}* 328 \end{tabbing} 329 330 331 \paragraph{Implements and extends} 332 333 The \bnfnt{extends} and \bnfnt{implements} productions 334 define annotations on the names of classes a class \code{extends} or 335 \code{implements}. 336 (Note: For interface declarations, \bnfnt{implements} rather than 337 \bnfnt{extends} defines annotations on the names of extended 338 interfaces.) 339 340 \begin{tabbing} 341 \qquad \= \kill 342 \bnfnt{extends} ::= \\ 343 \qquad \bnflit{extends} \bnflit{:} \bnfnt{type-annotation}* \lineend \\ 344 \qquad \bnfnt{compound-type}* \\ 345 \\ 346 \bnfnt{implements} ::= \\ 347 \qquad \bnfcmt The integer is the zero-based index of the implemented interface. \\ 348 \qquad \bnflit{implements} \bnfnt{integer} \bnflit{:} \bnfnt{type-annotation}* \lineend \\ 349 \qquad \bnfnt{compound-type}* 350 \end{tabbing} 351 352 353 \paragraph{Static and instance initializers} 354 355 The \bnfnt{staticinit} and \bnfnt{instanceinit} productions 356 define annotations on code within static or instance initializer blocks. 357 358 \begin{tabbing} 359 \qquad \= \kill 360 \bnfnt{staticinit} ::= \\ 361 \qquad \bnfcmt The integer is the zero-based index of the implemented interface. \\ 362 \qquad \bnflit{staticinit} \bnflit{*} \bnfnt{integer} \bnflit{:} \lineend \\ 363 \qquad \bnfnt{compound-type}* 364 \\ 365 \bnfnt{instanceinit} ::= \\ 366 \qquad \bnfcmt The integer is the zero-based index of the implemented interface. \\ 367 \qquad \bnflit{instanceinit} \bnflit{*} \bnfnt{integer} \bnflit{:} \lineend \\ 368 \qquad \bnfnt{compound-type}* 369 \end{tabbing} 370 371 372 \subsubsection{Field definitions\label{field-definitons}} 373 374 A field definition can have annotations on the declaration, the type of the 375 field, or --- if in source code --- the field's initialization. 376 377 \begin{tabbing} 378 \qquad \= \kill 379 \bnfnt{field-definition} ::= \\ 380 \qquad \bnflit{field} \bnfnt{name} \bnflit{:} \bnfnt{decl-annotation}* \lineend \\ 381 \qquad \bnfnt{type-annotations}* \\ 382 \qquad \bnfnt{expression-annotations}* 383 \end{tabbing} 384 385 \noindent 386 Annotations on the \bnflit{field} line are on the field declaration, not the 387 type of the field. 388 389 The \bnfnt{expression-annotations} production specifies annotations on the 390 initialization expression of a field. If a field is initialized at declaration 391 then in bytecode the initialization is moved to the constructor when the class 392 is compiled. Therefore for bytecode, annotations on the initialization 393 expression go in the constructor (see Section~\ref{method-definitions}), rather 394 than the field definition. Source code annotations for the field initialization 395 expression are valid on the field definition. 396 397 398 \subsubsection{Method definitions\label{method-definitions}} 399 400 A method definition can have annotations on the method declaration, in the 401 method header (return type, parameters, etc.), as well as the method body. 402 403 \begin{tabbing} 404 \qquad \= \kill 405 \bnfnt{method-definition} ::= \\ 406 \qquad \bnflit{method} \bnfnt{method-key} \bnflit{:} \bnfnt{decl-annotation}* \lineend \\ 407 \qquad \bnfnt{typeparam-definition}* \\ 408 \qquad \bnfnt{typeparam-bound}* \\ 409 \qquad \bnfnt{return-type}? \\ 410 \qquad \bnfnt{receiver-definition}? \\ 411 \qquad \bnfnt{parameter-definition}* \\ 412 % TODO: method throws 413 \qquad \bnfnt{variable-definition}* \\ 414 \qquad \bnfnt{expression-annotations}* 415 \end{tabbing} 416 417 \noindent 418 The annotations on the \bnflit{method} line are on the method declaration, not 419 on the return value. The \bnfnt{method-key} consists of the name followed by the 420 signature in JVML format. For example, the following method 421 422 \begin{verbatim} 423 boolean foo(int[] i, String s) { 424 ... 425 } 426 \end{verbatim} 427 428 \noindent 429 has the \bnfnt{method-key}: 430 431 \begin{verbatim} 432 foo([ILjava/lang/String;)Z 433 \end{verbatim} 434 435 Note that the 436 signature is the erased signature of the method and does not contain generic 437 type information, but does contain the return type. Using \code{javap -s} makes 438 it easy to find the signature. The method keys ``\code{<init>}'' and 439 ``\code{<clinit>}'' are used to name instance (constructor) and class (static) 440 initialization methods. (The name of the constructor---that is, the final 441 element of the class name---can be used in place of ``\code{<init>}''.) 442 For both instance and class initializers, the ``return type'' part of the 443 signature should be \code{V} (for \code{void}). 444 445 % TODO: exception types in catch clause 446 % TODO: .class literals 447 % TODO: type arguments in constructor and method calls 448 449 450 \paragraph{Return type} 451 452 A return type defines the annotations on the return type of a method 453 declaration. It is also used for the result of a constructor. 454 455 \begin{tabbing} 456 \qquad \= \kill 457 \bnfnt{return-type} ::= \\ 458 \qquad \bnflit{return:} \bnfnt{type-annotation}* \lineend \\ 459 \qquad \bnfnt{compound-type}* 460 \end{tabbing} 461 462 463 \paragraph{Receiver definition} 464 465 A receiver definition defines the annotations on the type of the receiver 466 parameter in a method declaration. A method receiver is the implicit formal 467 parameter, \code{this}, used in non-static methods. For source code insertion, 468 the receiver parameter will be inserted if it does not already exist. 469 470 Only inner classes have a receiver. A top-level constructor does not have 471 a receiver, though it does have a result. The type of a constructor result 472 is represented as a return type. 473 474 \begin{tabbing} 475 \qquad \= \kill 476 \bnfnt{receiver-definition} ::= \\ 477 \qquad \bnflit{receiver:} \bnfnt{type-annotation}* \lineend \\ 478 \qquad \bnfnt{compound-type}* 479 \end{tabbing} 480 481 482 \paragraph{Parameter definition} 483 484 A formal parameter definition defines the annotations on a method formal 485 parameter declaration and the type of a method formal parameter, but 486 \emph{not} the receiver formal parameter. 487 488 \begin{tabbing} 489 \qquad \= \kill 490 \bnfnt{parameter-definition} ::= \\ 491 \qquad \bnfcmt The integer is the zero-based index of the formal parameter in the method. \\ 492 \qquad \bnflit{parameter} \bnfnt{integer} \bnflit{:} \bnfnt{decl-annotation}* \lineend \\ 493 \qquad \bnfnt{type-annotations}* 494 \end{tabbing} 495 496 \noindent 497 The annotations on the \bnflit{parameter} line are on the formal parameter 498 declaration, not on the type of the parameter. A parameter index of 0 is the 499 first formal parameter. The receiver parameter is not index 0. Use the 500 \bnfnt{receiver-definition} production to annotate the receiver parameter. 501 502 503 \subsection{Bytecode Locations\label{bytecode-locations}} 504 505 Certain elements in the body of a method or the initialization expression of a 506 field can be annotated. The \bnfnt{expression-annotations} rule describes the 507 annotations that can be added to a method body or a field initialization 508 expression: 509 510 \begin{tabbing} 511 \qquad \= \kill 512 \bnfnt{expression-annotations} ::= \\ 513 \qquad \bnfnt{typecast}* \\ 514 \qquad \bnfnt{instanceof}* \\ 515 \qquad \bnfnt{new}* \\ 516 \qquad \bnfnt{call}* \\ 517 \qquad \bnfnt{reference}* \\ 518 \qquad \bnfnt{lambda}* \\ 519 \qquad \bnfnt{source-insert-typecast}* \\ 520 \qquad \bnfnt{source-insert-annotation}* 521 \end{tabbing} 522 523 \noindent 524 Additionally, a variable declaration in a method body can be annotated with the 525 \bnfnt{variable-definition} rule, which appears below. 526 527 Because of the differences between Java source code and \code{.class} files, 528 the syntax for specifying code locations is different for \code{.class} files 529 and source code. For \code{.class} files we use a syntax called ``bytecode 530 offsets''. For source code we use a different syntax called ``source code 531 indexes''. These are both described below. 532 533 If you wish to be able to insert a given code annotation in both a \code{.class} file and a source 534 code file, the annotation file must redundantly specify the annotation's bytecode offset and source 535 code index. This can be done in a single \code{.jaif} file or two separate 536 \code{.jaif} files. It is not necessary to include 537 redundant information to insert annotations on signatures in both \code{.class} 538 files and source code. 539 540 Additionally, a new typecast with annotations (rather than an annotation added to an 541 existing typecast) can be inserted into source code. This uses a third 542 syntax that is described below under ``AST paths''. 543 A second way to insert a typecast is by specifying just an annotation, not 544 a full typecast (\code{insert-annotation} instead of 545 \code{insert-typecast}). In this case, the source annotation insertion 546 tool generates a full typecast if Java syntax requires one. 547 548 549 \subsubsection{Bytecode offsets\label{bytecode-offsets}} 550 551 For locations in bytecode, the 552 annotation file uses offsets into the bytecode array of the class file to 553 indicate the specific expression to which the annotation refers. Because 554 different compilation strategies yield different \code{.class} files, a 555 tool that maps such annotations from an annotation file into source code must 556 have access to the specific \code{.class} file that was used to generate 557 the annotation file. The 558 \code{javap -v} command is an effective technique to discover bytecode 559 offsets. Non-expression annotations such as those on methods, 560 fields, classes, etc., do not use a bytecode offset. 561 562 563 \subsubsection{Source code indexes\label{source-code-indexes}} 564 565 For locations in source code, the annotation file indicates the kind of 566 expression, plus a zero-based index to indicate which occurrence of that kind of 567 expression. For example, 568 569 \begin{verbatim} 570 public void method() { 571 Object o1 = new @A String(); 572 String s = (@B String) o1; 573 Object o2 = new @C Integer(0); 574 Integer i = (@D Integer) o2; 575 } 576 \end{verbatim} 577 578 \noindent 579 \code{@A} is on new, index 0. \code{@B} is on typecast, index 0. \code{@C} is on 580 new, index 1. \code{@D} is on typecast, index 1. 581 582 Source code indexes only include occurrences in the class that exactly matches 583 the name of the enclosing \bnfnt{class-definition} rule. Specifically, 584 occurrences in nested classes are not included. Use a new 585 \bnfnt{class-definition} rule with the name of the nested class for source code 586 insertions in a nested class. 587 588 589 \subsubsection{Code locations grammar\label{code-grammar}} 590 591 For each kind of expression, the grammar contains a separate location rule. 592 This location rule contains the bytecode offset syntax followed by the 593 source code index syntax. 594 595 The grammar uses \bnflit{\#} for bytecode offsets and \bnflit{*} for source code indexes. 596 597 \begin{tabbing} 598 \qquad \= \kill 599 \bnfnt{variable-location} ::= \\ 600 \qquad \bnfcmt Bytecode offset: the integers are respectively the index, start, and length \\ 601 \qquad \bnfcmt fields of the annotations on this variable~\cite{JSR308-webpage-201310}. \\ 602 \qquad (\bnfnt{integer} \bnflit{\#} \bnfnt{integer} \bnflit{+} \bnfnt{integer}) \\ 603 \qquad \bnfcmt Source code index: the \bnfnt{name} is the identifier of the local variable. \\ 604 \qquad \bnfcmt The \bnfnt{integer} is the optional zero-based index of the intended local \\ 605 \qquad \bnfcmt variable within all local variables with the given \bnfnt{name}. \\ 606 \qquad \bnfcmt The default value for the index is zero. \\ 607 \qquad \bnfor{} (\bnfnt{name} [\bnflit{*} \bnfnt{integer}]) \\ 608 \\ 609 \bnfnt{variable-definition} ::= \\ 610 \qquad \bnfcmt The annotations on the \bnflit{local} line are on the variable declaration, \\ 611 \qquad \bnfcmt not the type of the variable. \\ 612 \qquad \bnflit{local} \bnfnt{variable-location} \bnflit{:} \bnfnt{decl-annotation}* \lineend \\ 613 \qquad \bnfnt{type-annotations}* \\ 614 \\ 615 \bnfnt{typecast-location} ::= \\ 616 \qquad \bnfcmt Bytecode offset: the first integer is the offset field and the optional \\ 617 \qquad \bnfcmt second integer is the type index of an intersection type~\cite{JSR308-webpage-201310}. \\ 618 \qquad \bnfcmt The type index defaults to zero if not specified. \\ 619 \qquad (\bnflit{\#} \bnfnt{integer} [ \bnflit{,} \bnfnt{integer} ]) \\ 620 \qquad \bnfcmt Source code index: the first integer is the zero-based index of the typecast \\ 621 \qquad \bnfcmt within the method and the optional second integer is the type index of an \\ 622 \qquad \bnfcmt intersection type~\cite{JSR308-webpage-201310}. The type index defaults to zero if not specified. \\ 623 \qquad \bnfor{} (\bnflit{*} \bnfnt{integer} [ \bnflit{,} \bnfnt{integer} ]) \\ 624 \\ 625 \bnfnt{typecast} ::= \\ 626 \qquad \bnflit{typecast} \bnfnt{typecast-location} \bnflit{:} \bnfnt{type-annotation}* \lineend \\ 627 \qquad \bnfnt{compound-type}* \\ 628 \\ 629 \bnfnt{instanceof-location} ::= \\ 630 \qquad \bnfcmt Bytecode offset: the integer is the offset field of the annotation~\cite{JSR308-webpage-201310}. \\ 631 \qquad (\bnflit{\#} \bnfnt{integer}) \\ 632 \qquad \bnfcmt Source code index: the integer is the zero-based index of the \code{instanceof} \\ 633 \qquad \bnfcmt within the method. \\ 634 \qquad \bnfor{} (\bnflit{*} \bnfnt{integer}) \\ 635 \\ 636 \bnfnt{instanceof} ::= \\ 637 \qquad \bnflit{instanceof} \bnfnt{instanceof-location} \bnflit{:} \bnfnt{type-annotation}* \lineend \\ 638 \qquad \bnfnt{compound-type}* \\ 639 \\ 640 \bnfnt{new-location} ::= \\ 641 \qquad \bnfcmt Bytecode offset: the integer is the offset field of the annotation~\cite{JSR308-webpage-201310}. \\ 642 \qquad (\bnflit{\#} \bnfnt{integer}) \\ 643 \qquad \bnfcmt Source code index: the integer is the zero-based index of the object or array \\ 644 \qquad \bnfcmt creation within the method. \\ 645 \qquad \bnfor{} (\bnflit{*} \bnfnt{integer}) \\ 646 \\ 647 \bnfnt{new} ::= \\ 648 \qquad \bnflit{new} \bnfnt{new-location} \bnflit{:} \bnfnt{type-annotation}* \lineend \\ 649 \qquad \bnfnt{compound-type}* 650 \\ 651 \bnfnt{call-location} ::= \\ 652 \qquad \bnfcmt Bytecode offset: the integer is the offset field of the annotation~\cite{JSR308-webpage-201310}. \\ 653 \qquad (\bnflit{\#} \bnfnt{integer}) \\ 654 \qquad \bnfcmt Source code index: the integer is the zero-based index of the method call \\ 655 \qquad \bnfcmt within the field or method definition. \\ 656 \qquad \bnfor{} (\bnflit{*} \bnfnt{integer}) \\ 657 \\ 658 \bnfnt{call} ::= \\ 659 \qquad \bnflit{call} \bnfnt{call-location} \bnflit{:} \lineend \\ 660 \qquad \bnfnt{typearg-definition}* \\ 661 \\ 662 \bnfnt{reference-location} ::= \\ 663 \qquad \bnfcmt Bytecode offset: the integer is the offset field of the annotation~\cite{JSR308-webpage-201310}. \\ 664 \qquad (\bnflit{\#} \bnfnt{integer}) \\ 665 \qquad \bnfcmt Source code index: the integer is the zero-based index of the member \\ 666 \qquad \bnfcmt reference~\cite{JSR308-webpage-201310}. \\ 667 \qquad \bnfor{} (\bnflit{*} \bnfnt{integer}) \\ 668 \\ 669 \bnfnt{reference} ::= \\ 670 \qquad \bnflit{reference} \bnfnt{reference-location} \bnflit{:} \bnfnt{type-annotation}* \lineend \\ 671 \qquad \bnfnt{compound-type}* \\ 672 \qquad \bnfnt{typearg-definition}* \\ 673 \\ 674 \bnfnt{lambda-location} ::= \\ 675 \qquad \bnfcmt Bytecode offset: the integer is the offset field of the annotation~\cite{JSR308-webpage-201310}. \\ 676 \qquad (\bnflit{\#} \bnfnt{integer}) \\ 677 \qquad \bnfcmt Source code index: the integer is the zero-based index of the lambda \\ 678 \qquad \bnfcmt expression~\cite{JSR308-webpage-201310}. \\ 679 \qquad \bnfor{} (\bnflit{*} \bnfnt{integer}) \\ 680 \\ 681 \bnfnt{lambda} ::= \\ 682 \qquad \bnflit{lambda} \bnfnt{lambda-location} \bnflit{:} \lineend \\ 683 %\qquad \bnfnt{return-type}? \\ 684 \qquad \bnfnt{parameter-definition}* \\ 685 \qquad \bnfnt{variable-definition}* \\ 686 \qquad \bnfnt{expression-annotations}* 687 \\ 688 \qquad \= \kill 689 \bnfnt{typearg-definition} ::= \\ 690 \qquad \bnfcmt The integer is the zero-based type argument index. \\ 691 \qquad \bnflit{typearg} \bnfnt{integer} \bnflit{:} \bnfnt{type-annotation}* \lineend \\ 692 \qquad \bnfnt{compound-type}* 693 \end{tabbing} 694 695 696 \subsubsection{AST paths\label{ast-paths}} 697 698 A path through the AST (abstract 699 syntax tree) specifies an arbitrary expression in source code to modify. 700 AST paths can be used in the \code{.jaif} file to specify a location to 701 insert either a bare annotation (\bnflit{insert-annotation}) or a cast 702 (\bnflit{insert-typecast}). 703 704 For a cast insertion, the \code{.jaif} file specifies the type to cast to. 705 The annotations on the \bnflit{insert-typecast} line will be inserted on 706 the outermost type of the type to cast to. If the type to cast to is a compound 707 type then annotations on parts of the compound type are specified with the 708 \bnfnt{compound-type} rule. If there are no annotations on 709 the \bnflit{insert-typecast} line then a cast with no annotations will be 710 inserted or, if compound type annotations are specified, a cast with annotations 711 only on the compound types will be inserted. 712 713 Note that the type specified on the \bnflit{insert-typecast} line cannot contain 714 any qualified type names. For example, use \code{Entry<String, Object>} instead 715 of \code{Map.Entry<java.lang.String, java.lang.Object>}. 716 717 \begin{tabbing} 718 \bnfnt{source-insert-typecast} ::= \\ 719 \qquad \bnfcmt \bnfnt{ast-path} is described below. \\ 720 \qquad \bnfcmt \bnfnt{type} is the un-annotated type to cast to. \\ 721 \qquad \bnflit{insert-typecast} \bnfnt{ast-path}\bnflit{:} \bnfnt{type-annotation}* \bnfnt{type} \lineend \\ 722 \qquad \bnfnt{compound-type}* 723 \end{tabbing} 724 725 An AST path represents a traversal through the AST. AST paths can only be 726 used in \bnfnt{field-definition}s and \bnfnt{method-definition}s. 727 An AST path starts with the first element under the definition. For 728 methods this is \code{Block} and for fields this is \code{Variable}. 729 730 An AST path is composed of one or more AST entries, separated by commas. Each 731 AST entry is composed of a tree kind, a child selector, and an optional 732 argument. An example AST entry is: 733 734 \begin{verbatim} 735 Block.statement 1 736 \end{verbatim} 737 738 The tree kind is \code{Block}, the child selector is \code{statement} and the 739 argument is \code{1}. 740 741 The available tree kinds correspond to the Java AST tree nodes (from the package 742 \code{com.sun.source.tree}), but with ``Tree'' removed from the name. For 743 example, the class \code{com.sun.source.tree.BlockTree} is represented as 744 \code{Block}. The child selectors correspond to the method names of the given 745 Java AST tree node, with ``get'' removed from the beginning of the method name 746 and the first letter lowercased. In cases where the child selector method 747 returns a list, the method name is made singular and the AST entry also contains 748 an argument to select the index of the list to take. For example, the method 749 \code{com\-.sun\-.source\-.tree\-.Block\-Tree\-.get\-Statements()} is represented as 750 \code{Block.statement} and requires an argument to select the statement to take. 751 752 The following is an example of an entire AST path: 753 754 \begin{verbatim} 755 Block.statement 1, Switch.case 1, Case.statement 0, ExpressionStatement.expression, 756 MethodInvocation.argument 0 757 \end{verbatim} 758 759 Since the above example starts with a \code{Block} it belongs in a 760 \bnfnt{method-definition}. This AST path would select an expression that is in 761 statement 1 of the method, case 1 of the switch statement, statement 0 of the 762 case, and argument 0 of a method call (\code{ExpressionStatement} is just a 763 wrapper around an expression that can also be a statement). 764 765 The following is an example of an annotation file with AST paths used to specify 766 where to insert casts. 767 768 \begin{verbatim} 769 package p: 770 annotation @A: 771 772 class ASTPathExample: 773 774 field a: 775 insert-typecast Variable.initializer, Binary.rightOperand: @A Integer 776 777 method m()V: 778 insert-typecast Block.statement 0, Variable.initializer: @A Integer 779 insert-typecast Block.statement 1, Switch.case 1, Case.statement 0, 780 ExpressionStatement.expression, MethodInvocation.argument 0: @A Integer 781 \end{verbatim} 782 783 And the matching source code: 784 785 \begin{verbatim} 786 package p; 787 788 public class ASTPathExample { 789 790 private int a = 12 + 13; 791 792 public void m() { 793 int x = 1; 794 switch (x + 2) { 795 case 1: 796 System.out.println(1); 797 break; 798 case 2: 799 System.out.println(2 + x); 800 break; 801 default: 802 System.out.println(-1); 803 } 804 } 805 } 806 \end{verbatim} 807 808 The following is the output, with the casts inserted. 809 810 \begin{verbatim} 811 package p; 812 import p.A; 813 814 public class ASTPathExample { 815 816 private int a = 12 + ((@A Integer) (13)); 817 818 public void m() { 819 int x = ((@A Integer) (1)); 820 switch (x + 2) { 821 case 1: 822 System.out.println(1); 823 break; 824 case 2: 825 System.out.println(((@A Integer) (2 + x))); 826 break; 827 default: 828 System.out.println(-1); 829 } 830 } 831 } 832 \end{verbatim} 833 834 Using \code{insert-annotation} instead of \code{insert-typecast} yields 835 almost the same result --- it also inserts a cast. The sole difference 836 is the inability to specify the type in the cast expression. If you use 837 \code{insert-annotation}, then the annotation inserter infers the type, 838 which is \code{int} in this case. 839 840 Note that a cast can be inserted on any expression, not 841 just the deepest expression in the AST. For example, a cast could be inserted on 842 the expression \code{i + j}, the identifier \code{i}, and/or the identifier \code{j}. 843 844 To help create correct AST paths it may be useful to view the AST of a class. 845 The Checker Framework has a processor to do this. The following command will 846 output indented AST nodes for the entire input program. 847 848 \begin{verbatim} 849 javac -processor org.checkerframework.common.util.debug.TreeDebug ASTPathExample.java 850 \end{verbatim} 851 852 The following is the grammar for AST paths. 853 854 \begin{tabbing} 855 \qquad \= \kill 856 \bnfnt{ast-path} ::= \\ 857 \qquad \bnfnt{ast-entry} [ \bnflit{,} \bnfnt{ast-entry} ]+ \\ 858 \\ 859 \bnfnt{ast-entry} ::= \\ 860 \qquad \bnfnt{annotated-type} \\ 861 \qquad \bnfor{} \bnfnt{annotation} \\ 862 \qquad \bnfor{} \bnfnt{array-access} \\ 863 \qquad \bnfor{} \bnfnt{array-type} \\ 864 \qquad \bnfor{} \bnfnt{assert} \\ 865 \qquad \bnfor{} \bnfnt{assignment} \\ 866 \qquad \bnfor{} \bnfnt{binary} \\ 867 \qquad \bnfor{} \bnfnt{block} \\ 868 \qquad \bnfor{} \bnfnt{case} \\ 869 \qquad \bnfor{} \bnfnt{catch} \\ 870 \qquad \bnfor{} \bnfnt{compound-assignment} \\ 871 \qquad \bnfor{} \bnfnt{conditional-expression} \\ 872 \qquad \bnfor{} \bnfnt{do-while-loop} \\ 873 \qquad \bnfor{} \bnfnt{enhanced-for-loop} \\ 874 \qquad \bnfor{} \bnfnt{expression-statement} \\ 875 \qquad \bnfor{} \bnfnt{for-loop} \\ 876 \qquad \bnfor{} \bnfnt{if} \\ 877 \qquad \bnfor{} \bnfnt{instance-of} \\ 878 \qquad \bnfor{} \bnfnt{intersection-type} \\ 879 \qquad \bnfor{} \bnfnt{labeled-statement} \\ 880 \qquad \bnfor{} \bnfnt{lambda-expression} \\ 881 \qquad \bnfor{} \bnfnt{member-reference} \\ 882 \qquad \bnfor{} \bnfnt{member-select} \\ 883 \qquad \bnfor{} \bnfnt{method-invocation} \\ 884 \qquad \bnfor{} \bnfnt{new-array} \\ 885 \qquad \bnfor{} \bnfnt{new-class} \\ 886 \qquad \bnfor{} \bnfnt{parameterized-type} \\ 887 \qquad \bnfor{} \bnfnt{parenthesized} \\ 888 \qquad \bnfor{} \bnfnt{return} \\ 889 \qquad \bnfor{} \bnfnt{switch} \\ 890 \qquad \bnfor{} \bnfnt{synchronized} \\ 891 \qquad \bnfor{} \bnfnt{throw} \\ 892 \qquad \bnfor{} \bnfnt{try} \\ 893 \qquad \bnfor{} \bnfnt{type-cast} \\ 894 \qquad \bnfor{} \bnfnt{type-parameter} \\ 895 \qquad \bnfor{} \bnfnt{unary} \\ 896 \qquad \bnfor{} \bnfnt{union-type} \\ 897 \qquad \bnfor{} \bnfnt{variable-type} \\ 898 \qquad \bnfor{} \bnfnt{while-loop} \\ 899 \qquad \bnfor{} \bnfnt{wildcard-tree} \\ 900 \\ 901 \bnfnt{annotated-type} :: = \\ 902 \qquad \bnflit{AnnotatedType} \bnflit{.} ( ( \bnflit{annotation} \bnfnt{integer} ) \bnfor{} \bnflit{underlyingType} ) \\ 903 \\ 904 \bnfnt{annotation} ::= \\ 905 \qquad \bnflit{Annotation} \bnflit{.} ( \bnflit{type} \bnfor{} \bnflit{argument} \bnfnt{integer} ) \\ 906 \\ 907 \bnfnt{array-access} ::= \\ 908 \qquad \bnflit{ArrayAccess} \bnflit{.} ( \bnflit{expression} \bnfor{} \bnflit{index} ) \\ 909 \\ 910 \bnfnt{array-type} ::= \\ 911 \qquad \bnflit{ArrayType} \bnflit{.} \bnflit{type} \\ 912 \\ 913 \bnfnt{assert} ::= \\ 914 \qquad \bnflit{Assert} \bnflit{.} ( \bnflit{condition} \bnfor{} \bnflit{detail} ) \\ 915 \\ 916 \bnfnt{assignment} ::= \\ 917 \qquad \bnflit{Assignment} \bnflit{.} ( \bnflit{variable} \bnfor{} \bnflit{expression} ) \\ 918 \\ 919 \bnfnt{binary} ::= \\ 920 \qquad \bnflit{Binary} \bnflit{.} ( \bnflit{leftOperand} \bnfor{} \bnflit{rightOperand} ) \\ 921 \\ 922 \bnfnt{block} ::= \\ 923 \qquad \bnflit{Block} \bnflit{.} \bnflit{statement} \bnfnt{integer} \\ 924 \\ 925 \bnfnt{case} ::= \\ 926 \qquad \bnflit{Case} \bnflit{.} ( \bnflit{expression} \bnfor{} ( \bnflit{statement} \bnfnt{integer} ) ) \\ 927 \\ 928 \bnfnt{catch} ::= \\ 929 \qquad \bnflit{Catch} \bnflit{.} ( \bnflit{parameter} \bnfor{} \bnflit{block} ) \\ 930 \\ 931 \bnfnt{compound-assignment} ::= \\ 932 \qquad \bnflit{CompoundAssignment} \bnflit{.} ( \bnflit{variable} \bnfor{} \bnflit{expression} ) \\ 933 \\ 934 \bnfnt{conditional-expression} ::= \\ 935 \qquad \bnflit{ConditionalExpression} \bnflit{.} ( \bnflit{condition} \bnfor{} \bnflit{trueExpression} \bnfor{} \bnflit{falseExpression} ) \\ 936 \\ 937 \bnfnt{do-while-loop} ::= \\ 938 \qquad \bnflit{DoWhileLoop} \bnflit{.} ( \bnflit{condition} \bnfor{} \bnflit{statement} ) \\ 939 \\ 940 \bnfnt{enhanced-for-loop} ::= \\ 941 \qquad \bnflit{EnhancedForLoop} \bnflit{.} ( \bnflit{variable} \bnfor{} \bnflit{expression} \bnfor{} \bnflit{statement} ) \\ 942 \\ 943 \bnfnt{expression-statement} ::= \\ 944 \qquad \bnflit{ExpressionStatement} \bnflit{.} \bnflit{expression} \\ 945 \\ 946 \bnfnt{for-loop} ::= \\ 947 \qquad \bnflit{ForLoop} \bnflit{.} ( ( \bnflit{initializer} \bnfnt{integer} ) \bnfor{} \bnflit{condition} \bnfor{} ( \bnflit{update} \bnfnt{integer} ) \bnfor{} \bnflit{statement} ) \\ 948 \\ 949 \bnfnt{if} ::= \\ 950 \qquad \bnflit{If} \bnflit{.} ( \bnflit{condition} \bnfor{} \bnflit{thenStatement} \bnfor{} \bnflit{elseStatement} ) \\ 951 \\ 952 \bnfnt{instance-of} ::= \\ 953 \qquad \bnflit{InstanceOf} \bnflit{.} ( \bnflit{expression} \bnfor{} \bnflit{type} ) \\ 954 \\ 955 \bnfnt{intersection-type} ::= \\ 956 \qquad \bnflit{IntersectionType} \bnflit{.} \bnflit{bound} \bnfnt{integer} \\ 957 \\ 958 \bnfnt{labeled-statement} ::= \\ 959 \qquad \bnflit{LabeledStatement} \bnflit{.} \bnflit{statement} \\ 960 \\ 961 \bnfnt{lambda-expression} ::= \\ 962 \qquad \bnflit{LambdaExpression} \bnflit{.} ( ( \bnflit{parameter} \bnfnt{integer} ) \bnfor{} \bnflit{body} ) \\ 963 \\ 964 \bnfnt{member-reference} ::= \\ 965 \qquad \bnflit{MemberReference} \bnflit{.} ( \bnflit{qualifierExpression} \bnfor{} ( \bnflit{typeArgument} \bnfnt{integer} ) ) \\ 966 \\ 967 \bnfnt{member-select} ::= \\ 968 \qquad \bnflit{MemberSelect} \bnflit{.} \bnflit{expression} \\ 969 \\ 970 \bnfnt{method-invocation} ::= \\ 971 \qquad \bnflit{MethodInvocation} \bnflit{.} ( ( \bnflit{typeArgument} \bnfnt{integer} ) \bnfor{} \bnflit{methodSelect} \\ 972 \qquad \bnfor{} ( \bnflit{argument} \bnfnt{integer} ) ) \\ 973 \\ 974 \bnfnt{new-array} ::= \\ 975 \qquad \bnflit{NewArray} \bnflit{.} ( \bnflit{type} \bnfor{} ( \bnflit{dimension} \bnfor{} \bnflit{initializer} ) \bnfnt{integer} ) \\ 976 \\ 977 \bnfnt{new-class} ::= \\ 978 \qquad \bnflit{NewClass} \bnflit{.} ( \bnflit{enclosingExpression} \bnfor{} ( \bnflit{typeArgument} \bnfnt{integer} ) \bnfor{} \bnflit{identifier} \\ 979 \qquad \bnfor{} ( \bnflit{argument} \bnfnt{integer} ) \bnfor{} \bnflit{classBody} ) \\ 980 \\ 981 \bnfnt{parameterized-type} ::= \\ 982 \qquad \bnflit{ParameterizedType} \bnflit{.} ( \bnflit{type} \bnfor{} ( \bnflit{typeArgument} \bnfnt{integer} ) ) \\ 983 \\ 984 \bnfnt{parenthesized} ::= \\ 985 \qquad \bnflit{Parenthesized} \bnflit{.} \bnflit{expression} \\ 986 \\ 987 \bnfnt{return} ::= \\ 988 \qquad \bnflit{Return} \bnflit{.} \bnflit{expression} \\ 989 \\ 990 \bnfnt{switch} ::= \\ 991 \qquad \bnflit{Switch} \bnflit{.} ( \bnflit{expression} \bnfor{} ( \bnflit{case} \bnfnt{integer} ) ) \\ 992 \\ 993 \bnfnt{synchronized} ::= \\ 994 \qquad \bnflit{Synchronized} \bnflit{.} ( \bnflit{expression} \bnfor{} \bnflit{block} ) \\ 995 \\ 996 \bnfnt{throw} ::= \\ 997 \qquad \bnflit{Throw} \bnflit{.} \bnflit{expression} \\ 998 \\ 999 \bnfnt{try} ::= \\ 1000 \qquad \bnflit{Try} \bnflit{.} ( \bnflit{block} \bnfor{} ( \bnflit{catch} \bnfnt{integer} ) \bnfor{} \bnflit{finallyBlock} \bnfor{} ( \bnflit{resource} \bnfnt{integer} ) ) \\ 1001 \\ 1002 \bnfnt{type-cast} ::= \\ 1003 \qquad \bnflit{TypeCast} \bnflit{.} ( \bnflit{type} \bnfor{} \bnflit{expression} ) \\ 1004 \\ 1005 \bnfnt{type-parameter} ::= \\ 1006 \qquad \bnflit{TypeParameter} \bnflit{.} \bnflit{bound} \bnfnt{integer} \\ 1007 \\ 1008 \bnfnt{unary} ::= \\ 1009 \qquad \bnflit{Unary} \bnflit{.} \bnflit{expression} \\ 1010 \\ 1011 \bnfnt{union-type} ::= \\ 1012 \qquad \bnflit{UnionType} \bnflit{.} \bnflit{typeAlternative} \bnfnt{integer} \\ 1013 \\ 1014 \bnfnt{variable} ::= \\ 1015 \qquad \bnflit{Variable} \bnflit{.} ( \bnflit{type} \bnfor{} \bnflit{initializer} ) \\ 1016 \\ 1017 \bnfnt{while-loop} ::= \\ 1018 \qquad \bnflit{WhileLoop} \bnflit{.} ( \bnflit{condition} \bnfor{} \bnflit{statement} ) \\ 1019 \\ 1020 \bnfnt{wildcard} ::= \\ 1021 \qquad \bnflit{Wildcard} \bnflit{.} \bnflit{bound} \\ 1022 \\ 1023 \end{tabbing} 1024 1025 1026 \subsection{Annotations\label{annotations-grammar}} 1027 1028 This section describes the details of how annotations are defined, how 1029 annotations are used, and the different kinds of annotations in an annotation 1030 file. 1031 1032 1033 \subsubsection{Annotation definitions\label{annotation-definitions}} 1034 1035 An annotation definition describes the annotation's fields and their 1036 types, so that they may be referenced in a compact way throughout the 1037 annotation file. Any annotation that is used in an annotation file 1038 % either on a program element or as a field of another annotation definition. 1039 must be defined before use. 1040 (This requirement makes it impossible to define, in an 1041 annotation file, an annotation that is meta-annotated with itself.) 1042 The two exceptions to this rule are the \code{@java.lang.annotation.Target} and 1043 \code{@java.lang.annotation.Retention} meta-annotations. These meta-annotations 1044 are often used in annotation definitions so for ease of use are they not required to 1045 be defined themselves. 1046 In the annotation file, the annotation definition appears within the 1047 package that defines the annotation. The annotation may be applied to 1048 elements of any package. 1049 1050 Note that these annotation definitions should not be confused with the 1051 \code{@interface} syntax used in a Java source file to declare an annotation. An 1052 annotation definition in an annotation file is only used internally. An 1053 annotation definition in an annotation file will often mirror an 1054 \code{@interface} annotation declaration in a Java source file in order to use 1055 that annotation in an annotation file. 1056 1057 % TODO, see https://github.com/typetools/annotation-tools/issues/25 1058 % The Annotation File Utilities can read annotation definitions from the 1059 % classpath, so it is optional to define them in the annotation file. 1060 1061 \begin{tabbing} 1062 \qquad \= \kill 1063 \bnfnt{annotation-definition} ::= \\ 1064 \qquad \bnfcmt The \bnfnt{decl-annotation}s are the meta-annotations on this annotation. \\ 1065 \qquad \bnflit{annotation} \bnflit{@}\bnfnt{name} 1066 \bnflit{:} \bnfnt{decl-annotation}* \lineend \\ 1067 \qquad \bnfnt{annotation-field-definition}* \\ 1068 \\ 1069 \bnfnt{annotation-field-definition} ::= \\ 1070 \qquad \bnfnt{annotation-field-type} \bnfnt{name} \lineend \\ 1071 \\ 1072 \bnfnt{annotation-field-type} ::= \\ 1073 \qquad \bnfcmt \bnfnt{primitive-type} is any Java primitive type (\code{int}, \code{boolean}, etc.). \\ 1074 \qquad \bnfcmt These are described in detail in Section~\ref{types-and-values}. \\ 1075 \qquad (\bnfnt{primitive-type} \bnfor{} \bnflit{String} \bnfor{} \bnflit{Class} 1076 \bnfor{} (\bnflit{enum} \bnfnt{name}) \bnfor{} (\bnflit{annotation-field} \bnfnt{name})) \bnflit{[]}? \\ 1077 \qquad \bnfor{} \bnflit{unknown[]} \lineend 1078 \end{tabbing} 1079 1080 1081 \subsubsection{Annotation uses\label{annotation-uses}} 1082 1083 Java SE 8 has two kinds of annotations: ``declaration annotations'' and ``type 1084 annotations''. Declaration annotations can be written only on method formal 1085 parameters and the declarations of packages, classes, methods, fields, and local 1086 variables. Type annotations can be written on any use of a type, and on type 1087 parameter declarations. Type annotations must be meta-annotated 1088 with \code{ElementType.TYPE\_USE} and/or \code{ElementType.TYPE\_PARAMETER}. 1089 These meta-annotations are described in more detail in the JSR 308 1090 specification~\cite{JSR308-webpage-201310}. 1091 1092 The previous rules have used two productions for annotation uses in an 1093 annotation file: \bnfnt{decl-annotation} and \bnfnt{type-annotation}. 1094 The \bnfnt{decl-annotation} and \bnfnt{type-annotation} productions use the same 1095 syntax to specify an annotation. These two different rules exist only to show 1096 which type of annotation is valid in a given location. A declaration annotation 1097 must be used where the \bnfnt{decl-annotation} production is used and a type 1098 annotation must be used where the \bnfnt{type-annotation} production is used. 1099 1100 The syntax for an annotation is the same as in a Java source file. 1101 1102 \begin{tabbing} 1103 \qquad \= \kill 1104 \bnfnt{decl-annotation} ::= \\ 1105 \qquad \bnfcmt \bnfnt{annotation} must be a declaration annotation. \\ 1106 \qquad \bnfnt{annotation} \\ 1107 \\ 1108 \bnfnt{type-annotation} ::= \\ 1109 \qquad \bnfcmt \bnfnt{annotation} must be a type annotation. \\ 1110 \qquad \bnfnt{annotation} \\ 1111 \\ 1112 \bnfnt{annotation} ::= \\ 1113 \qquad \bnfcmt The name may be the annotation's simple name, unless the file \\ 1114 \qquad \bnfcmt contains definitions for two annotations with the same simple name. \\ 1115 \qquad \bnfcmt In this case, the fully-qualified name of the annotation name is required. \\ 1116 % TODO: 1117 % Perhaps we could add that if a class is in the same package 1118 % as an annotation it may always use the simple name (even if there's another 1119 % annotation with the same simple name in another package)? - MP 06/28 1120 \qquad \bnflit{@}\bnfnt{name} [ \bnflit{(} \bnfnt{annotation-field} [ \bnflit{,} \bnfnt{annotation-field} ]+ \bnflit{)} ] \\ 1121 \\ 1122 \bnfnt{annotation-field} ::= \\ 1123 \qquad \bnfcmt In Java, if a single-field annotation has a field named \\ 1124 \qquad \bnfcmt ``\code{value}'', then that field name may be elided in uses of the\\ 1125 \qquad \bnfcmt annotation: ``\code{@A(12)}'' rather than ``\code{@A(value=12)}''. \\ 1126 \qquad \bnfcmt The same convention holds in an annotation file. \\ 1127 \qquad \bnfnt{name} \bnflit{=} \bnfnt{value} 1128 \end{tabbing} 1129 1130 \noindent 1131 Certain Java elements allow both declaration and type annotations (for example, 1132 formal method parameters). For these elements, the \bnfnt{type-annotations} 1133 rule is used to differentiate between the declaration annotations and the type 1134 annotations. 1135 1136 \begin{tabbing} 1137 \qquad \= \kill 1138 \bnfnt{type-annotations} ::= \\ 1139 \qquad \bnfcmt holds the type annotations, as opposed to the declaration annotations. \\ 1140 \qquad \bnflit{type:} \bnfnt{type-annotation}* \lineend \\ 1141 \qquad \bnfnt{compound-type}* 1142 \end{tabbing} 1143 1144 1145 \paragraph{Compound type annotations} 1146 1147 A compound type is a parameterized, wildcard, array, or nested type. Annotations 1148 may be on any type in a compound type. In order to specify the location of an 1149 annotation within a compound type we use a ``type path''. A 1150 type path is composed one or more pairs of type kind and type argument index. 1151 1152 \begin{tabbing} 1153 \qquad \= \kill 1154 \bnfnt{type-kind} ::= \\ 1155 \qquad \bnflit{0} \bnfcmt annotation is deeper in this array type \\ 1156 \qquad \bnfor{} \bnflit{1} \bnfcmt annotation is deeper in this nested type \\ 1157 \qquad \bnfor{} \bnflit{2} \bnfcmt annotation is on the bound of this wildcard type argument \\ 1158 \qquad \bnfor{} \bnflit{3} \bnfcmt annotation is on the i'th type argument of this parameterized type \\ 1159 \\ 1160 \bnfnt{type-path} ::= \\ 1161 \qquad \bnfcmt The \bnfnt{integer} is the type argument index. \\ 1162 \qquad \bnfnt{type-kind} \bnflit{,} \bnfnt{integer} [ \bnflit{,} \bnfnt{type-kind} \bnflit{,} \bnfnt{integer} ]* \\ 1163 \\ 1164 \bnfnt{compound-type} ::= \\ 1165 \qquad \bnflit{inner-type} \bnfnt{type-path} \bnflit{:} \bnfnt{annotation}* \lineend 1166 \end{tabbing} 1167 1168 \noindent 1169 The type argument index used in the \bnfnt{type-path} rule must be \bnflit{0} unless the \bnfnt{type-kind} is 1170 \bnflit{3}. In this case, the type argument index selects which type argument 1171 of a parameterized type to use. 1172 1173 \urldef\cftp\url|https://checkerframework.org/jsr308/specification/java-annotation-design.html#class-file:ext:type_path| 1174 Type paths are explained in more detail, with many examples to ease 1175 understanding, in Section 3.4 of the JSR 308 Specification.\footnotemark 1176 \footnotetext{\cftp} 1177 1178 1179 \section{Example\label{example}} 1180 1181 Consider the code of Figure~\ref{fig:java-example}. 1182 Figure~\ref{fig:annotation-file-examples} shows two legal annotation files 1183 each of which represents its annotations. 1184 1185 1186 \begin{figure} 1187 \begin{verbatim} 1188 package p1; 1189 1190 import p2.*; // for the annotations @A through @D 1191 import java.util.*; 1192 1193 public @A(12) class Foo { 1194 1195 public int bar; // no annotation 1196 private @B List<@C String> baz; 1197 1198 public Foo(@D("spam") Foo this, @B List<@C String> a) { 1199 @B List<@C String> l = new LinkedList<@C String>(); 1200 l = (@B List<@C String>)l; 1201 } 1202 } 1203 \end{verbatim} 1204 \caption{Example Java code with annotations.} 1205 \label{fig:java-example} 1206 \end{figure} 1207 1208 1209 \begin{figure} 1210 \begin{tabular}{|c|c|} 1211 \hline 1212 \begin{minipage}[t]{.5\textwidth} 1213 \begin{verbatim} 1214 package p2: 1215 annotation @A: 1216 int value 1217 annotation @B: 1218 annotation @C: 1219 annotation @D: 1220 String value 1221 1222 package p1: 1223 class Foo: @A(value=12) 1224 1225 field bar: 1226 1227 field baz: @B 1228 inner-type 0: @C 1229 1230 method <init>( 1231 Ljava/util/List;)V: 1232 parameter 0: @B 1233 inner-type 0: @C 1234 receiver: @D(value="spam") 1235 local 1 #3+5: @B 1236 inner-type 0: @C 1237 typecast #7: @B 1238 inner-type 0: @C 1239 new #0: 1240 inner-type 0: @C 1241 \end{verbatim} 1242 \end{minipage} 1243 & 1244 \begin{minipage}[t]{.45\textwidth} 1245 \begin{verbatim} 1246 package p2: 1247 annotation @A 1248 int value 1249 1250 package p2: 1251 annotation @B 1252 1253 package p2: 1254 annotation @C 1255 1256 package p2: 1257 annotation @D 1258 String value 1259 1260 package p1: 1261 class Foo: @A(value=12) 1262 1263 package p1: 1264 class Foo: 1265 field baz: @B 1266 1267 package p1: 1268 class Foo: 1269 field baz: 1270 inner-type 0: @C 1271 1272 // ... definitions for p1.Foo.<init> 1273 // omitted for brevity 1274 \end{verbatim} 1275 \end{minipage} 1276 \\ 1277 \hline 1278 \end{tabular} 1279 1280 \caption{Two distinct annotation files each corresponding to the code of 1281 Figure~\ref{fig:java-example}.} 1282 \label{fig:annotation-file-examples} 1283 \end{figure} 1284 1285 1286 \section{Types and values\label{types-and-values}} 1287 1288 The Java language permits several types for annotation fields: primitives, 1289 \code{String}s, \code{java.lang.Class} tokens (possibly parameterized), 1290 enumeration constants, annotations, and one-dimensional arrays of these. 1291 1292 These \textbf{types} are represented in an annotation file as follows: 1293 \begin{itemize} 1294 \item Primitive: the name of the primitive type, such as \code{boolean}. 1295 \item String: \code{String}. 1296 \item Class token: \code{Class}; the parameterization, if any, is not 1297 represented in annotation files. 1298 \item Enumeration constant: \code{enum} followed by the binary name of 1299 the enumeration class, such as \code{enum java.lang.Thread\$State}. 1300 \item Annotation: \code{@} followed by the binary name of the annotation type. 1301 \item Array: The representation of the element type followed by \code{[]}, such 1302 as \code{String[]}, with one exception: an annotation definition may specify 1303 a field type as \code{unknown[]} if, in all occurrences of that annotation in 1304 the annotation file, the field value is a zero-length array.\footnotemark 1305 \footnotetext{There is a design flaw in the format of array field values in a 1306 class file. An array does not itself specify an element type; instead, each 1307 element specifies its type. If the annotation type \code{X} has an array field 1308 \code{arr} but \code{arr} is zero-length in every \code{@X} annotation in the 1309 class file, there is no way to determine the element type of \code{arr} from the 1310 class file. This exception makes it possible to define \code{X} when the class 1311 file is converted to an annotation file.} 1312 \end{itemize} 1313 1314 Annotation field \textbf{values} are represented in an annotation file as follows: 1315 \begin{itemize} 1316 \item Numeric primitive value: literals as they would appear in Java source 1317 code. 1318 \item Boolean: \code{true} or \code{false}. 1319 \item Character: A single character or escape sequence in single quotes, such 1320 as \code{'A'} or \code{'\char`\\''}. 1321 \item String: A string literal as it would appear in source code, such as 1322 \code{"\char`\\"Yields falsehood when quined\char`\\" yields falsehood when quined."}. 1323 \item Class token: The binary name of the class (using \code{\$} for 1324 inner classes) or the name of the primitive type or \code{void}, possibly 1325 followed by \code{[]}s representing array layers, followed by \code{.class}. 1326 Examples: \code{java.lang.Integer[].class}, \code{java.util.Map\$Entry.class}, 1327 and \code{int.class}. 1328 \item Enumeration constant: the name of the enumeration constant, such as 1329 \code{RUNNABLE}. 1330 \item Array: a sequence of elements inside \code{\char`\{\char`\}} with a comma 1331 between each pair of adjacent elements; a comma following the last element is 1332 optional as in Java. Also as in Java, the braces may be omitted if the 1333 array has only one element. 1334 Examples: \code{\char`\{1\char`\}}, \code{1}, 1335 \code{\char`\{true, false,\char`\}} and \code{\char`\{\char`\}}. 1336 \end{itemize} 1337 1338 The following example annotation file shows how types and values are represented. 1339 1340 \begin{verbatim} 1341 package p1: 1342 1343 annotation @ClassInfo: 1344 String remark 1345 Class favoriteClass 1346 Class favoriteCollection // it's probably Class<? extends Collection> 1347 // in source, but no parameterization here 1348 char favoriteLetter 1349 boolean isBuggy 1350 enum p1.DebugCategory[] defaultDebugCategories 1351 @p1.CommitInfo lastCommit 1352 1353 annotation @CommitInfo: 1354 byte[] hashCode 1355 int unixTime 1356 String author 1357 String message 1358 1359 class Foo: @p1.ClassInfo( 1360 remark="Anything named \"Foo\" is bound to be good!", 1361 favoriteClass=java.lang.reflect.Proxy.class, 1362 favoriteCollection=java.util.LinkedHashSet.class, 1363 favoriteLetter='F', 1364 isBuggy=true, 1365 defaultDebugCategories={DEBUG_TRAVERSAL, DEBUG_STORES, DEBUG_IO}, 1366 lastCommit=@p1.CommitInfo( 1367 hashCode={31, 41, 59, 26, 53, 58, 97, 92, 32, 38, 46, 26, 43, 38, 32, 79}, 1368 unixTime=1152109350, 1369 author="Joe Programmer", 1370 message="First implementation of Foo" 1371 ) 1372 ) 1373 \end{verbatim} 1374 1375 1376 \section{Alternative formats\label{alternative-formats}} 1377 1378 We mention multiple alternatives to the format described in this document. 1379 Each of them has its own merits. 1380 In the future, the other formats could be implemented, along with tools for 1381 converting among them. 1382 % Then, we can see which of the formats programmers prefer in practice. 1383 1384 1385 1386 An alternative to the format described in this document would be XML\@. 1387 % It would be easy to use an XML format to augment the one proposed here, but 1388 XML does not seem to provide any compelling advantages. Programmers 1389 interact with annotation files in two ways: textually (when reading, writing, 1390 and editing annotation files) and programmatically (when writing 1391 annotation-processing tools). Textually, XML can be 1392 very hard to read; style sheets mitigate this 1393 problem, but editing XML files remains tedious and error-prone. 1394 Programmatically, a layer of abstraction (an API) is needed in any event, so it 1395 makes little difference what the underlying textual representation is. 1396 XML files are easier to parse, but the parsing code only needs to be 1397 written once and is abstracted away by an API to the data structure. 1398 1399 1400 Another alternative is a format like the \code{.spec}/\code{.jml} files 1401 of JML~\cite{LeavensBR2006:JML}. The format is similar to Java code, but 1402 all method bodies are empty, and users can annotate the public members of a 1403 class. This is easy for Java programmers to read and understand. (It is a 1404 bit more complex to implement, but that is not particularly germane.) 1405 Because it does not permit complete specification of a class's annotations 1406 (it does not permit annotation of method bodies), it is not appropriate for 1407 certain tools, such as type inference tools. However, it might be desirable 1408 to adopt such a format for public members, and to use the format 1409 described in this document primarily for method bodies. 1410 1411 1412 The Checker Framework~\cite{DietlDEMS2011,CF} uses two additional formats for 1413 annotations. The first format is called ``stub files.'' A stub file is similar 1414 to the \code{.spec}/\code{.jml} files described in the previous paragraph. It 1415 uses Java syntax, only allows annotations on method headers and does not require 1416 method bodies. A stub file is used to add annotations to method headers of 1417 existing Java classes. For example, the Checker Framework uses stub files to add 1418 annotations to method headers of libraries (such as the JDK) without modifying 1419 the source code or bytecode of the library. A single stub file can contain 1420 multiple packages and classes. This format only allows annotations on method 1421 headers, not class headers, fields, and method bodies like in a \code{.jaif} 1422 file. Further, stub files are only used by the Checker Framework at run time, 1423 they cannot be used to insert annotations into a source or classfile. 1424 1425 1426 The Checker Framework also uses a format called an ``annotated JDK.'' The 1427 annotated JDK is a \code{.jar} file containing the JDK with annotations. It is 1428 created with the Annotation File Utilities, but the annotations are stored in a 1429 format similar to a stub file, instead of in a \code{.jaif} file. The annotated 1430 JDK starts with a source file for each file in the JDK to be annotated. Like a 1431 stub file, each source file only contains method headers with annotations. The 1432 annotated JDK also supports annotations in the class header. To build the 1433 annotated JDK \code{.jar} file, the source files are compiled, then the 1434 \code{extract-annotations} script is run on them to generate a \code{.jaif} file 1435 for each source file. The \code{insert-annotations} script then inserts the 1436 annotations contained in each \code{.jaif} file into the corresponding JDK class 1437 file. These are then packaged up into a single \code{.jar} file. Like a stub 1438 files, the annotated JDK is easier to read and write since it uses Java syntax. 1439 However, the annotated JDK requires a different file for each original Java 1440 source file. It does not allow annotations on fields and in method bodies. The 1441 annotated JDK also only contains annotations in the JDK and not other Java 1442 files. 1443 1444 1445 1446 \bibliographystyle{alpha} 1447 \bibliography{annotation-file-format,bibstring-unabbrev,types,ernst,invariants,generals,alias,crossrefs} 1448 1449 \end{document} 1450 1451 % LocalWords: java javac OuterClass InnerClass TODO Kleene MP subannotations 1452 % LocalWords: enum arr quined int pt instanceof RUNTIME JVML ILjava boolean 1453 % LocalWords: programmatically jml ernst jaif whitespace 0pt decl enums 1454 % LocalWords: filenames typeparam javap init clinit ast un lowercased io 1455 % LocalWords: ExpressionStatement AnnotatedType underlyingType ArrayType 1456 % LocalWords: ArrayAccess leftOperand rightOperand CompoundAssignment 1457 % LocalWords: ConditionalExpression trueExpression falseExpression 1458 % LocalWords: DoWhileLoop EnhancedForLoop ForLoop thenStatement NewArray 1459 % LocalWords: elseStatement InstanceOf LabeledStatement LambdaExpression 1460 % LocalWords: MemberReference qualifierExpression typeArgument NewClass 1461 % LocalWords: MemberSelect MethodInvocation methodSelect classBody 1462 % LocalWords: enclosingExpression ParameterizedType finallyBlock AScene 1463 % LocalWords: TypeCast UnionType typeAlternative WhileLoop ElementType 1464 % LocalWords: AClass AMethod AElement objectweb anno tations parseScene 1465 % LocalWords: CriterionList isSatisifiedBy CriteriaList afu getPositions 1466 % LocalWords: InPackageCriterion InClassCriterion InMethodCriterion 1467 % LocalWords: ParamCriterion inserter RUNNABLE ASM src asm 1468