<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  <link rel="stylesheet" href="resources/doc.css" charset="UTF-8" type="text/css" />
  <link rel="stylesheet" href="../coverage/jacoco-resources/prettify.css" charset="UTF-8" type="text/css" />
  <link rel="shortcut icon" href="resources/report.gif" type="image/gif" />
  <script type="text/javascript" src="../coverage/jacoco-resources/prettify.js"></script>
  <title>JaCoCo - Implementation Design</title>
</head>
<body onload="prettyPrint()">

<div class="breadcrumb">
  <a href="../index.html" class="el_report">JaCoCo</a> &gt;
  <a href="index.html" class="el_group">Documentation</a> &gt;
  <span class="el_source">Implementation Design</span>
</div>
<div id="content">

<h1>Implementation Design</h1>

<p>
  This is an unordered list of implementation design decisions. Each topic tries
  to follow this structure:
</p>

<ul>
  <li>Problem statement</li>
  <li>Proposed Solution</li>
  <li>Alternatives and Discussion</li>
</ul>


<h2>Coverage Analysis Mechanism</h2>

<p class="intro">
  Coverage information has to be collected at runtime. For this purpose JaCoCo
  creates instrumented versions of the original class definitions. The
  instrumentation process happens on-the-fly during class loading using
  so-called Java agents.
</p>

<p>
  There are several different approaches to collecting coverage information. For
  each approach different implementation techniques are known.
  The following
  diagram gives an overview with the techniques used by JaCoCo highlighted:
</p>

<img src="resources/implementation.png" alt="Coverage Implementation Techniques"/>

<p>
  Byte code instrumentation is very fast, can be implemented in pure Java and
  works with every Java VM. On-the-fly instrumentation with the Java agent
  hook can be added to the JVM without any modification of the target
  application.
</p>

<p>
  The Java agent hook requires a Java 1.5 or later JVM. Class files compiled
  with debug information (line numbers) allow for source code highlighting.
  Unfortunately, some Java language constructs get compiled to byte code that
  produces unexpected highlighting results, especially in the case of
  implicitly generated code like default constructors or control structures
  for finally statements.
</p>


<h2>Coverage Agent Isolation</h2>

<p class="intro">
  The Java agent is loaded by the application class loader. Therefore the
  classes of the agent live in the same namespace as the application classes,
  which can result in clashes, especially with the third-party library ASM. The
  JaCoCo build therefore moves all agent classes into a unique package.
</p>

<p>
  The JaCoCo build renames all classes contained in the
  <code>jacocoagent.jar</code> into classes with a
  <code>org.jacoco.agent.rt_&lt;randomid&gt;</code> prefix, including the
  required ASM library classes. The identifier is created from a random number.
  As the agent does not provide any API, no one should be affected by this
  renaming. This trick also allows JaCoCo tests to be verified with
  JaCoCo.
</p>


<h2>Minimal Java Version</h2>

<p class="intro">
  JaCoCo requires Java 1.5.
</p>

<p>
  The Java agent mechanism used for on-the-fly instrumentation became available
  with Java 1.5 VMs.
  Coding and testing with Java 1.5 language level is more
  efficient, less error-prone, and more fun than with older versions.
  JaCoCo can still be run against Java code compiled for those older versions.
</p>


<h2>Byte Code Manipulation</h2>

<p class="intro">
  Instrumentation requires mechanisms to modify and generate Java byte code.
  JaCoCo uses the ASM library for this purpose internally.
</p>

<p>
  Implementing the Java byte code specification would be an extensive and
  error-prone task. Therefore an existing library should be used. The
  <a href="http://asm.objectweb.org/">ASM</a> library is lightweight, easy to
  use and very efficient in terms of memory and CPU usage. It is actively
  maintained and includes a huge regression test suite. Its simplified BSD
  license is approved by the Eclipse Foundation for usage with EPL products.
</p>

<h2>Java Class Identity</h2>

<p class="intro">
  Each class loaded at runtime needs a unique identity to associate coverage data with.
  JaCoCo creates such identities by computing a CRC64 hash code of the raw class definition.
</p>

<p>
  In multi-classloader environments the plain name of a class does not
  unambiguously identify a class. For example, OSGi allows different
  versions of the same class to be loaded within the same VM. In complex
  deployment scenarios the actual version of the test target might be different
  from the current development version. A code coverage report should guarantee
  that the presented figures are extracted from a valid test target. A hash code
  of the class definitions makes it possible to differentiate between classes
  and versions of classes. The CRC64 hash computation is simple and fast,
  resulting in a small 64 bit identifier.
</p>

<p>
  The same class definition might be loaded by several class loaders, which
  results in different classes for the Java runtime system.
  For coverage analysis this
  distinction should be irrelevant. Class definitions might be altered by other
  instrumentation-based technologies (e.g. AspectJ). In this case the hash code
  will change and the identity is lost. On the other hand, code coverage analysis
  based on classes that have been altered in some way will produce unexpected
  results. The CRC64 code might produce so-called <i>collisions</i>, i.e.
  the same hash code for two different classes. Although CRC64 is not
  cryptographically strong and collision examples can be easily computed, for
  regular class files the collision probability is very low.
</p>

<h2>Coverage Runtime Dependency</h2>

<p class="intro">
  Instrumented code typically gets a dependency on a coverage runtime which is
  responsible for collecting and storing execution data. JaCoCo uses JRE types
  only in generated instrumentation code.
</p>

<p>
  Making a runtime library available to all instrumented classes can be a
  painful or impossible task in frameworks that use their own class loading
  mechanisms. Since Java 1.6 <code>java.lang.instrument.Instrumentation</code>
  has an API to extend the bootstrap class loader. As our minimum target is
  Java 1.5, JaCoCo decouples the instrumented classes and the coverage runtime
  through official JRE API types only. The instrumented classes communicate
  with the runtime through the <code>Object.equals(Object)</code> method. An
  instrumented class can retrieve its probe array instance with the following
  code. Note that only JRE APIs are used:
</p>


<pre class="source lang-java linenums">
Object access = ...
// Retrieve instance

Object[] args = new Object[3];
args[0] = Long.valueOf(8060044182221863588L); // class id
args[1] = "com/example/MyClass";              // class name
args[2] = Integer.valueOf(24);                // probe count

access.equals(args);

// The runtime has stored the probe array in args[0]
boolean[] probes = (boolean[]) args[0];
</pre>

<p>
  The trickiest part takes place in line 1 and is not shown in the snippet
  above. The object instance providing access to the coverage runtime through
  its <code>equals()</code> method has to be obtained. Different approaches have
  been implemented and tested so far:
</p>

<ul>
  <li><b><code>SystemPropertiesRuntime</code></b>: This approach stores the
    object instance under a system property. This solution breaks the contract
    that system properties must only contain <code>java.lang.String</code>
    values and therefore causes trouble in applications that rely on this
    definition (e.g. Ant).</li>
  <li><b><code>LoggerRuntime</code></b>: Here we use a shared
    <code>java.util.logging.Logger</code> and communicate through the logging
    parameter array instead of an <code>equals()</code> method. The coverage
    runtime registers a custom <code>Handler</code> to receive the parameter
    array. This approach might break environments that install their own log
    managers (e.g. Glassfish).</li>
  <li><b><code>URLStreamHandlerRuntime</code></b>: This runtime registers a
    <code>URLStreamHandler</code> for a "jacoco-xxxxx" protocol. Instrumented
    classes open a connection on this protocol. The returned connection object
    is the one that provides access to the coverage runtime through its
    <code>equals()</code> method. However, to register the protocol the runtime
    needs to access internal members of the <code>java.net.URL</code> class.</li>
  <li><b><code>ModifiedSystemClassRuntime</code></b>: This approach adds a
    public static field to an existing JRE class through instrumentation.
    Unlike
    the other methods above, this is only possible in environments where a Java
    agent is active.</li>
</ul>

<p>
  The current JaCoCo Java agent implementation uses the
  <code>ModifiedSystemClassRuntime</code>, adding a field to the class
  <code>java.lang.UnknownError</code>. Versions 0.5.0 to 0.7.9 added the field
  to the class <code>java.util.UUID</code>, which had a higher chance of
  conflicts with other agents.
</p>


<h2>Memory Usage</h2>

<p class="intro">
  Coverage analysis for huge projects with several thousand classes or hundreds
  of thousands of lines of code should be possible. To allow this with
  reasonable memory usage the coverage analysis is based on streaming patterns
  and "depth first" traversals.
</p>

<p>
  The complete data tree of a huge coverage report is too big to fit into a
  reasonable heap memory configuration. Therefore the coverage analysis and
  report generation are implemented as "depth first" traversals, which means
  that at any point in time only the following data has to be held in working
  memory:
</p>

<ul>
  <li>The single class that is currently being processed.</li>
  <li>The summary information of all parents of this class (package, groups).</li>
</ul>

<h2>Java Element Identifiers</h2>

<p class="intro">
  The Java language and the Java VM use different String representation formats
  for Java elements. For example, while a type reference in Java reads like
  <code>java.lang.Object</code>, the VM references the same type as
  <code>Ljava/lang/Object;</code>. The JaCoCo API is based on VM identifiers only.
</p>

<p>
  Using VM identifiers directly does not cause any transformation overhead at
  runtime. There are several programming languages based on the Java VM that
  might use different notations.
  Specific transformations should therefore only
  happen at the user interface level, for example during report generation.
</p>

<h2>Modularization of the JaCoCo implementation</h2>

<p class="intro">
  JaCoCo is implemented in several modules providing different functionality.
  These modules are provided as OSGi bundles with proper manifest files, but
  there are no dependencies on OSGi itself.
</p>

<p>
  Using OSGi bundles allows well-defined dependencies at development time and
  at runtime in OSGi containers. As there are no dependencies on OSGi, the
  bundles can also be used like regular JAR files.
</p>

</div>
<div class="footer">
  <span class="right"><a href="@jacoco.home.url@">JaCoCo</a> @qualified.bundle.version@</span>
  <a href="license.html">Copyright</a> © @copyright.years@ Mountainminds GmbH &amp; Co. KG and Contributors
</div>

</body>
</html>