<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8">
  <title>llvmpipe</title>
  <link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>

<div class="header">
  <h1>The Mesa 3D Graphics Library</h1>
</div>

<iframe src="contents.html"></iframe>
<div class="content">

<h1>Introduction</h1>

<p>
The Gallium llvmpipe driver is a software rasterizer that uses LLVM to
do runtime code generation.
Shaders, point/line/triangle rasterization and vertex processing are
implemented with LLVM IR which is translated to x86 or x86-64 machine
code.
Also, the driver is multithreaded to take advantage of multiple CPU cores
(up to 8 at this time).
It's the fastest software rasterizer for Mesa.
</p>


<h1>Requirements</h1>

<ul>
<li>
   <p>An x86 or amd64 processor; 64-bit mode recommended.</p>
   <p>
   Support for SSE2 is strongly encouraged.  Support for SSSE3 and SSE4.1 will
   yield the most efficient code.  The fewer features the CPU has, the more
   likely it is that you will run into underperforming, buggy, or incomplete code.
   </p>
   <p>
   See /proc/cpuinfo to find out what your CPU supports.
   </p>
</li>
<li>
   <p>LLVM: version 3.4 recommended; 3.3 or later required.</p>
   <p>
   For Linux, on a recent Debian-based distribution do:
   </p>
<pre>
  aptitude install llvm-dev
</pre>
   <p>
   For an RPM-based distribution do:
   </p>
<pre>
  yum install llvm-devel
</pre>

   <p>
   For Windows you will need to build LLVM from source with MSVC or MinGW
   (either natively or through cross compilers) and CMake, and set the LLVM
   environment variable to the directory you installed it to.

   LLVM will be statically linked, so when building on MSVC it needs to be
   built with the same CRT as Mesa, and you'll need to pass
   <code>-DLLVM_USE_CRT_xxx=yyy</code> as described below.
   </p>

   <table border="1">
     <tr>
       <th rowspan="2">LLVM build-type</th>
       <th colspan="2" align="center">Mesa build-type</th>
     </tr>
     <tr>
       <th>debug,checked</th>
       <th>release,profile</th>
     </tr>
     <tr>
       <th>Debug</th>
       <td><code>-DLLVM_USE_CRT_DEBUG=MTd</code></td>
       <td><code>-DLLVM_USE_CRT_DEBUG=MT</code></td>
     </tr>
     <tr>
       <th>Release</th>
       <td><code>-DLLVM_USE_CRT_RELEASE=MTd</code></td>
       <td><code>-DLLVM_USE_CRT_RELEASE=MT</code></td>
     </tr>
   </table>

   <p>
   You can build only the x86 target by passing
   <code>-DLLVM_TARGETS_TO_BUILD=X86</code> to cmake.
   </p>
</li>

<li>
   <p>scons (optional)</p>
</li>
</ul>


<h1>Building</h1>

To build everything on Linux, invoke scons as:

<pre>
  scons build=debug libgl-xlib
</pre>

Alternatively, you can build it with GNU make, if you prefer, by invoking it as

<pre>
  make linux-llvm
</pre>

but the rest of these instructions assume that scons is used.

For Windows the procedure is similar except for the target:

<pre>
  scons platform=windows build=debug libgl-gdi
</pre>


<h1>Using</h1>

<h2>Linux</h2>

<p>On Linux, building will create a drop-in alternative for libGL.so in</p>

<pre>
  build/foo/gallium/targets/libgl-xlib/libGL.so
</pre>
or
<pre>
  lib/gallium/libGL.so
</pre>

<p>To use it, set the LD_LIBRARY_PATH environment variable accordingly.</p>

<p>For performance evaluation, pass build=release to scons and use the corresponding
lib directory without the "-debug" suffix.</p>


<h2>Windows</h2>

<p>
On Windows, building will create
<code>build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll</code>,
which is a drop-in alternative for the system's <code>opengl32.dll</code>.  To use
it, put it in the same directory as your application.  It can also be used by
replacing the native ICD driver, but that is quite advanced usage, so if you
need to ask, don't even try it.
</p>

<p>
There is, however, an easy way to replace the OpenGL software renderer that comes
with Microsoft Windows 7 (or later) with llvmpipe (that is, on systems without
any OpenGL drivers):
</p>

<ul>
  <li><p>Copy build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll to C:\Windows\SysWOW64\mesadrv.dll</p></li>
  <li><p>Load these registry settings:</p>
  <pre>REGEDIT4

; http://technet.microsoft.com/en-us/library/cc749368.aspx
; http://www.msfn.org/board/topic/143241-portable-windows-7-build-from-winpe-30/page-5#entry942596
[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\OpenGLDrivers\MSOGL]
"DLL"="mesadrv.dll"
"DriverVersion"=dword:00000001
"Flags"=dword:00000001
"Version"=dword:00000002
</pre>
  </li>
  <li>Ditto for 64-bit drivers if you need them.</li>
</ul>


<h1>Profiling</h1>

<p>
To profile llvmpipe you should build as
</p>
<pre>
  scons build=profile &lt;same-as-before&gt;
</pre>

<p>
This will ensure that frame pointers are used both in C and JIT functions, and
that no tail call optimizations are done by gcc.
</p>

<h2>Linux perf integration</h2>

<p>
On Linux, it is possible to have symbol resolution of JIT code with <a href="http://perf.wiki.kernel.org/">Linux perf</a>:
</p>

<pre>
  perf record -g /my/application
  perf report
</pre>

<p>
When run inside Linux perf, llvmpipe will create a /tmp/perf-XXXXX.map file with
the symbol address table.  It also dumps assembly code to /tmp/perf-XXXXX.map.asm,
which can be used by the bin/perf-annotate-jit script to produce disassembly of
the generated code annotated with the samples.
</p>

<p>You can obtain a call graph via
<a href="http://code.google.com/p/jrfonseca/wiki/Gprof2Dot#linux_perf">Gprof2Dot</a>.</p>


<h1>Unit testing</h1>

<p>
Building will also create several unit tests in
build/linux-???-debug/gallium/drivers/llvmpipe:
</p>

<ul>
<li> lp_test_blend: blending</li>
<li> lp_test_conv: SIMD vector conversion</li>
<li> lp_test_format: pixel unpacking/packing</li>
</ul>

<p>
Some of these tests can output results and benchmarks to a tab-separated file
for later analysis, e.g.:
</p>
<pre>
  build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv
</pre>


<h1>Development Notes</h1>

<ul>
<li>
  When looking at this code for the first time, start in lp_state_fs.c, and
  then skim through the lp_bld_* functions called in there, and the comments
  at the top of the lp_bld_*.c files.
</li>
<li>
  The driver-independent parts of the LLVM / Gallium code are found in
  src/gallium/auxiliary/gallivm/.  The filenames and function prefixes
  need to be renamed from "lp_bld_" to something else though.
</li>
<li>
  We use LLVM-C bindings for now. They are not documented, but follow the C++
  interfaces very closely, and appear to be complete enough for code
  generation. See
  <a href="http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html">
  this stand-alone example</a>.  See the llvm-c/Core.h file for reference.
</li>
</ul>

<h1 id="recommended_reading">Recommended Reading</h1>

<ul>
  <li>
    <p>Rasterization</p>
    <ul>
      <li><a href="http://www.cs.unc.edu/~olano/papers/2dh-tri/">Triangle Scan Conversion using 2D Homogeneous Coordinates</a></li>
      <li><a href="http://www.drdobbs.com/parallel/rasterization-on-larrabee/217200602">Rasterization on Larrabee</a> (<a href="http://devmaster.net/posts/2887/rasterization-on-larrabee">DevMaster copy</a>)</li>
      <li><a href="http://devmaster.net/posts/6133/rasterization-using-half-space-functions">Rasterization using half-space functions</a></li>
      <li><a href="http://devmaster.net/posts/6145/advanced-rasterization">Advanced Rasterization</a></li>
      <li><a href="http://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/">Optimizing Software Occlusion Culling</a></li>
    </ul>
  </li>
  <li>
    <p>Texture sampling</p>
    <ul>
      <li><a href="http://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping">Perspective Texture Mapping</a></li>
      <li><a href="http://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml">Texturing As In Unreal</a></li>
      <li><a href="http://www.gamasutra.com/view/feature/3301/runtime_mipmap_filtering.php">Run-Time MIP-Map Filtering</a></li>
      <li><a href="http://alt.3dcenter.org/artikel/2003/10-26_a_english.php">Will "brilinear" filtering persist?</a></li>
      <li><a href="http://ixbtlabs.com/articles2/gffx/nv40-rx800-3.html">Trilinear filtering</a></li>
      <li><a href="http://devmaster.net/posts/12785/texture-swizzling">Texture Swizzling</a></li>
    </ul>
  </li>
  <li>
    <p>SIMD</p>
    <ul>
      <li><a href="http://www.cdl.uni-saarland.de/projects/wfv/#header4">Whole-Function Vectorization</a></li>
    </ul>
  </li>
  <li>
    <p>Optimization</p>
    <ul>
      <li><a href="http://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807">Optimizing Pixomatic For Modern x86 Processors</a></li>
      <li><a href="http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html">Intel 64 and IA-32 Architectures Optimization Reference Manual</a></li>
      <li><a href="http://www.agner.org/optimize/">Software optimization resources</a></li>
      <li><a href="http://software.intel.com/en-us/articles/intel-intrinsics-guide">Intel Intrinsics Guide</a></li>
    </ul>
  </li>
  <li>
    <p>LLVM</p>
    <ul>
      <li><a href="http://llvm.org/docs/LangRef.html">LLVM Language Reference Manual</a></li>
      <li><a href="http://npcontemplation.blogspot.co.uk/2008/06/secret-of-llvm-c-bindings.html">The secret of LLVM C bindings</a></li>
    </ul>
  </li>
  <li>
    <p>General</p>
    <ul>
      <li><a href="http://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/">A trip through the Graphics Pipeline</a></li>
      <li><a href="http://msdn.microsoft.com/en-us/library/gg615082.aspx#architecture">WARP Architecture and Performance</a></li>
    </ul>
  </li>
</ul>

</div>
</body>
</html>