This document compares the D3D10/D3D11 device driver interface with Gallium.
It is written from the perspective of a developer implementing a D3D10/D3D11 driver as a Gallium state tracker.

Note that naming and other cosmetic differences are not listed, since they don't really matter and would severely clutter the document.
Gallium/OpenGL terminology is used in preference to D3D terminology.

NOTE: this document tries to be complete, but it is most likely neither fully complete nor fully correct: please submit patches if you spot anything incorrect.

Also note that this is specifically for the DirectX 10/11 Windows Vista/7 DDI interfaces.
DirectX 9 has both user-mode (for Vista) and kernel-mode (pre-Vista) interfaces, but they are significantly different from Gallium due to the presence of a lot of fixed-function functionality.

The user-visible DirectX 10/11 interfaces are distinct from the kernel DDI, but they match very closely.

* Accessing Microsoft documentation

See http://msdn.microsoft.com/en-us/library/dd445501.aspx ("D3D11DDI_DEVICEFUNCS") for D3D documentation.

Also see http://download.microsoft.com/download/f/2/d/f2d5ee2c-b7ba-4cd0-9686-b6508b5479a1/direct3d10_web.pdf ("The Direct3D 10 System" by David Blythe) for an introduction to Direct3D 10 and the rationale for its design.

The Windows Driver Kit contains the actual headers, as well as shader bytecode documentation.

To get the headers from Linux, run the following, in a dedicated directory:
	wget http://download.microsoft.com/download/4/A/2/4A25C7D5-EFBE-4182-B6A9-AE6850409A78/GRMWDK_EN_7600_1.ISO
	sudo mkdir -p /mnt/tmp
	sudo mount -o loop GRMWDK_EN_7600_1.ISO /mnt/tmp
	cabextract -x /mnt/tmp/wdk/headers_cab001.cab
	rename 's/^_(.*)_[0-9]*$/$1/' *
	sudo umount /mnt/tmp

d3d10umddi.h contains the DDI interface analyzed in this document: note that it is much easier to read this online on MSDN.
d3d{10,11}TokenizedProgramFormat.hpp contains the shader bytecode definitions: this is not available on MSDN.
d3d9types.h contains DX9 shader bytecode and DX9 types.
d3dumddi.h contains the DirectX 9 DDI interface.

* Glossary

BC1: DXT1
BC2: DXT3
BC3: DXT5
BC4: RGTC1
BC5: RGTC2
BC6H: BPTC float
BC7: BPTC
CS = compute shader: OpenCL-like shader
DS = domain shader: tessellation evaluation shader
HS = hull shader: tessellation control shader
IA = input assembler: primitive assembly
Input layout: vertex elements
OM = output merger: blender
PS = pixel shader: fragment shader
Primitive topology: primitive type
Resource: buffer or texture
Shader resource (view): sampler view
SO = stream out: transform feedback
Unordered access view: view supporting random read/write access (usually from compute shaders)

* Legend

-: features D3D11 has and Gallium lacks
+: features Gallium has and D3D11 lacks
!: differences between D3D11 and Gallium
*: possible improvements to Gallium
>: references to comparisons of special enumerations
#: comment

* Gallium functions with no direct D3D10/D3D11 equivalent

clear
	+ Gallium supports clearing both render targets and depth/stencil with a single call

fence_signalled
fence_finish
	+ D3D10/D3D11 don't appear to support explicit fencing; queries can often substitute though, and flushing is supported

set_clip_state
	+ Gallium supports fixed-function user clip planes, D3D10/D3D11 only support using the vertex shader for them

set_polygon_stipple
	+ Gallium supports polygon stipple

clearRT/clearDS
	+ Gallium supports subrectangle fills of surfaces, D3D10 only supports full clears of views

* DirectX 10/11 DDI functions and Gallium equivalents

AbandonCommandList (D3D11 only)
	- Gallium does not support deferred contexts

CalcPrivateBlendStateSize
CalcPrivateDepthStencilStateSize
CalcPrivateDepthStencilViewSize
CalcPrivateElementLayoutSize
CalcPrivateGeometryShaderWithStreamOutput
CalcPrivateOpenedResourceSize
CalcPrivateQuerySize
CalcPrivateRasterizerStateSize
CalcPrivateRenderTargetViewSize
CalcPrivateResourceSize
CalcPrivateSamplerSize
CalcPrivateShaderResourceViewSize
CalcPrivateShaderSize
CalcDeferredContextHandleSize (D3D11 only)
CalcPrivateCommandListSize (D3D11 only)
CalcPrivateDeferredContextSize (D3D11 only)
CalcPrivateTessellationShaderSize (D3D11 only)
CalcPrivateUnorderedAccessViewSize (D3D11 only)
	! D3D11 allocates private objects itself, using the size computed here
	* Gallium could do something similar to be able to put the private data inline into state tracker objects: this would allow them to fit in the same cacheline and improve performance

CheckDeferredContextHandleSizes (D3D11 only)
	- Gallium does not support deferred contexts

CheckFormatSupport -> screen->is_format_supported
	! Gallium passes usages to this function, D3D11 returns them
	- Gallium does not differentiate between blendable and non-blendable render targets
	! Gallium includes sample count directly, D3D11 uses an additional query

CheckMultisampleQualityLevels
	! is merged with is_format_supported

CommandListExecute (D3D11 only)
	- Gallium does not support command lists

CopyStructureCount (D3D11 only)
	- Gallium does not support unordered access views (views that can be written to arbitrarily from compute shaders)

ClearDepthStencilView -> clear_depth_stencil
ClearRenderTargetView -> clear_render_target
	# D3D11 is not totally clear about whether this applies to any view or only a "currently-bound view"
	+ Gallium allows clearing both depth/stencil and render target(s) in a single operation
	+ Gallium supports double-precision depth values (but not rgba values!)
	* May want to also support double-precision rgba or use "float" for "depth"

ClearUnorderedAccessViewFloat (D3D11 only)
ClearUnorderedAccessViewUint (D3D11 only)
	- Gallium does not support unordered access views (views that can be written to arbitrarily from compute shaders)

CreateBlendState (extended in D3D10.1) -> create_blend_state
	# D3D10 supports per-RT blend enables but not per-RT blend modes; only D3D10.1 supports the latter
	+ Gallium supports logic ops
	+ Gallium supports dithering
	+ Gallium supports using the broadcast alpha component of the blend constant color

CreateCommandList (D3D11 only)
	- Gallium does not support command lists

CreateComputeShader (D3D11 only)
	- Gallium does not support compute shaders

CreateDeferredContext (D3D11 only)
	- Gallium does not support deferred contexts

CreateDomainShader (D3D11 only)
	- Gallium does not support domain shaders

CreateHullShader (D3D11 only)
	- Gallium does not support hull shaders

CreateUnorderedAccessView (D3D11 only)
	- Gallium does not support unordered access views

CreateDepthStencilState -> create_depth_stencil_alpha_state
	! D3D11 has both a global stencil enable, and front/back enables; Gallium has only front/back enables
	+ Gallium has per-face writemask/valuemasks, D3D11 uses the same value for back and front
	+ Gallium supports the alpha test, which D3D11 lacks

CreateDepthStencilView -> create_surface
CreateRenderTargetView -> create_surface
	! Gallium merges depthstencil and rendertarget views into pipe_surface
	- lack of render-to-buffer support
	+ Gallium supports using 3D texture zslices as a depth/stencil buffer (in theory)

CreateElementLayout -> create_vertex_elements_state
	! D3D11 allows sparse vertex elements (via InputRegister); in Gallium they must be specified sequentially
	! D3D11 has an extra flag (InputSlotClass) that is the same as instance_divisor == 0

CreateGeometryShader -> create_gs_state
CreateGeometryShaderWithStreamOutput -> create_gs_state + create_stream_output_state
CreatePixelShader -> create_fs_state
CreateVertexShader -> create_vs_state
	> bytecode is different (see d3d10TokenizedProgramFormat.hpp)
	! D3D11 describes input/outputs separately from bytecode; Gallium has the tgsi_scan.c module to extract it from TGSI
	@ TODO: look into DirectX 10/11 semantics specification and bytecode

CheckCounter
CheckCounterInfo
CreateQuery -> create_query
	! D3D11 implements fences with "event" queries
	* others are performance counters, we may want them but they are not critical

CreateRasterizerState
	+ Gallium, like OpenGL, supports PIPE_POLYGON_MODE_POINT
	+ Gallium, like OpenGL, supports per-face polygon fill modes
	+ Gallium, like OpenGL, supports culling everything
	+ Gallium, like OpenGL, supports two-sided lighting; D3D11 only has the facing attribute
	+ Gallium, like OpenGL, supports per-fill-mode polygon offset enables
	+ Gallium, like OpenGL, supports polygon smoothing
	+ Gallium, like OpenGL, supports polygon stipple
	+ Gallium, like OpenGL, supports point smoothing
	+ Gallium, like OpenGL, supports point sprites
	+ Gallium supports specifying point quad rasterization
	+ Gallium, like OpenGL, supports per-point point size
	+ Gallium, like OpenGL, supports line smoothing
	+ Gallium, like OpenGL, supports line stipple
	+ Gallium supports line last pixel rule specification
	+ Gallium, like OpenGL, supports provoking vertex convention
	+ Gallium supports D3D9 rasterization rules
	+ Gallium supports fixed line width
	+ Gallium supports fixed point size

CreateResource -> texture_create or buffer_create
	! D3D11 passes the dimensions of all mipmap levels to the create call, while Gallium has an implicit floor(x/2) rule
	# Note that hardware often has the implicit rule, so the D3D11 interface seems to make little sense
	# Also, the D3D11 API does not allow the user to specify mipmap sizes, so this really seems a dubious decision on Microsoft's part
	- D3D11 supports specifying initial data to write in the resource
	- Gallium does not support unordered access buffers
	! D3D11 specifies mapping flags (i.e. read/write/discard): it's unclear what they are used for here
	- D3D11 supports odd things in the D3D10_DDI_RESOURCE_MISC_FLAG enum (D3D10_DDI_RESOURCE_MISC_DISCARD_ON_PRESENT, D3D11_DDI_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS, D3D11_DDI_RESOURCE_MISC_BUFFER_STRUCTURED)
	- Gallium does not support indirect draw call parameter buffers
	! D3D11 supports specifying hardware modes and other stuff here for scanout resources
	! D3D11 implements cube maps as 2D array textures

CreateSampler
	- D3D11 supports a monochrome convolution filter for "text filtering"
	+ Gallium supports non-normalized coordinates
	+ Gallium supports CLAMP, MIRROR_CLAMP and MIRROR_CLAMP_TO_BORDER
	+ Gallium supports setting min/max/mip filters and anisotropy independently

CreateShaderResourceView (extended in D3D10.1) -> create_sampler_view
	+ Gallium supports specifying a swizzle
	! D3D11 implements "cube views" as views into a 2D array texture

CsSetConstantBuffers (D3D11 only)
CsSetSamplers (D3D11 only)
CsSetShader (D3D11 only)
CsSetShaderResources (D3D11 only)
CsSetShaderWithIfaces (D3D11 only)
CsSetUnorderedAccessViews (D3D11 only)
	- Gallium does not support compute shaders

DestroyBlendState
DestroyCommandList (D3D11 only)
DestroyDepthStencilState
DestroyDepthStencilView
DestroyDevice
DestroyElementLayout
DestroyQuery
DestroyRasterizerState
DestroyRenderTargetView
DestroyResource
DestroySampler
DestroyShader
DestroyShaderResourceView
DestroyUnorderedAccessView (D3D11 only)
	# these are trivial

Dispatch (D3D11 only)
	- Gallium does not support compute shaders

DispatchIndirect (D3D11 only)
	- Gallium does not support compute shaders

Draw -> draw_vbo
	! D3D11 sets primitive modes separately with IaSetTopology: it's not obvious which is better

DrawAuto -> draw_auto

DrawIndexed -> draw_vbo
	! D3D11 sets primitive modes separately with IaSetTopology: it's not obvious which is better
	+ D3D11 lacks explicit range, which is required for OpenGL

DrawIndexedInstanced -> draw_vbo
	! D3D11 sets primitive modes separately with IaSetTopology: it's not obvious which is better

DrawIndexedInstancedIndirect (D3D11 only)
	# this allows using a hardware buffer to specify the parameters for multiple draw_vbo calls
	- Gallium does not support draw call parameter buffers and indirect draws

DrawInstanced -> draw_vbo
	! D3D11 sets primitive modes separately with IaSetTopology: it's not obvious which is better

DrawInstancedIndirect (D3D11 only)
	# this allows using a hardware buffer to specify the parameters for multiple draw_vbo calls
	- Gallium does not support draw call parameter buffers and indirect draws

DsSetConstantBuffers (D3D11 only)
DsSetSamplers (D3D11 only)
DsSetShader (D3D11 only)
DsSetShaderResources (D3D11 only)
DsSetShaderWithIfaces (D3D11 only)
	- Gallium does not support domain shaders

Flush -> flush
	! Gallium supports fencing, D3D11 just has a dumb glFlush-like function

GenMips
	- Gallium lacks a mipmap generation interface, and does this manually with the 3D engine
	* it may be useful to add a mipmap generation interface, since the hardware (especially older cards) may have a better way than using the 3D engine

GsSetConstantBuffers -> for(i = 0; i < NumBuffers; ++i) set_constant_buffer(PIPE_SHADER_GEOMETRY, StartBuffer + i, phBuffers[i])

GsSetSamplers
	- Gallium does not support sampling in geometry shaders

GsSetShader -> bind_gs_state

GsSetShaderWithIfaces (D3D11 only)
	- Gallium does not support shader interfaces

GsSetShaderResources
	- Gallium does not support sampling in geometry shaders

HsSetConstantBuffers (D3D11 only)
HsSetSamplers (D3D11 only)
HsSetShader (D3D11 only)
HsSetShaderResources (D3D11 only)
HsSetShaderWithIfaces (D3D11 only)
	- Gallium does not support hull shaders

IaSetIndexBuffer -> set_index_buffer
	+ Gallium supports 8-bit indices
	# the D3D11 interface allows index-size-unaligned byte offsets into the index buffer; most drivers will abort with an assertion

IaSetInputLayout -> bind_vertex_elements_state

IaSetTopology
	! Gallium passes the topology (i.e. the primitive type) to the draw calls
	* may want to add an interface for this
	- Gallium lacks support for DirectX 11 tessellated primitives
	+ Gallium supports line loops, triangle fans, quads, quad strips and polygons

IaSetVertexBuffers -> set_vertex_buffers
	- Gallium only allows setting all vertex buffers at once, while D3D11 supports setting a subset

OpenResource -> texture_from_handle

PsSetConstantBuffers -> for(i = 0; i < NumBuffers; ++i) set_constant_buffer(PIPE_SHADER_FRAGMENT, StartBuffer + i, phBuffers[i])
	* may want to split into fragment/vertex-specific versions

PsSetSamplers -> bind_fragment_sampler_states
	* may want to allow binding subsets instead of all at once

PsSetShader -> bind_fs_state

PsSetShaderWithIfaces (D3D11 only)
	- Gallium does not support shader interfaces

PsSetShaderResources -> set_fragment_sampler_views
	* may want to allow binding subsets instead of all at once

QueryBegin -> begin_query

QueryEnd -> end_query

QueryGetData -> get_query_result
	- D3D11 supports reading an arbitrary data chunk for query results, Gallium only supports reading a 64-bit integer
	+ D3D11 doesn't seem to support actually waiting for the query result (?!)
	- D3D11 supports optionally not flushing command buffers here and instead returning DXGI_DDI_ERR_WASSTILLDRAWING

RecycleCommandList (D3D11 only)
RecycleCreateCommandList (D3D11 only)
RecycleDestroyCommandList (D3D11 only)
	- Gallium does not support command lists

RecycleCreateDeferredContext (D3D11 only)
	- Gallium does not support deferred contexts

RelocateDeviceFuncs
	- Gallium does not support moving pipe_context, while D3D11 seems to, using this

ResetPrimitiveID (D3D10.1+ only, #ifdef D3D10PSGP)
	# used to do vertex processing on the GPU on Intel G45 chipsets when it is faster this way (see www.intel.com/Assets/PDF/whitepaper/322931.pdf)
	# presumably this resets the primitive id system value
	- Gallium does not support vertex pipeline bypass anymore

ResourceCopy
ResourceCopyRegion
ResourceConvert (D3D10.1+ only)
ResourceConvertRegion (D3D10.1+ only)
	-> resource_copy_region

ResourceIsStagingBusy
	- Gallium lacks this
	+ Gallium can use fences

ResourceReadAfterWriteHazard
	- Gallium lacks this

ResourceResolveSubresource -> resource_resolve

ResourceMap
ResourceUnmap
DynamicConstantBufferMapDiscard
DynamicConstantBufferUnmap
DynamicIABufferMapDiscard
DynamicIABufferMapNoOverwrite
DynamicIABufferUnmap
DynamicResourceMapDiscard
DynamicResourceUnmap
StagingResourceMap
StagingResourceUnmap
	-> transfer functions
	! Gallium and D3D have different semantics for transfers
	* D3D separates vertex/index buffers from constant buffers
	! D3D separates some buffer flags into specialized calls

ResourceUpdateSubresourceUP -> transfer functionality, transfer_inline_write in gallium-resources
DefaultConstantBufferUpdateSubresourceUP -> transfer functionality, transfer_inline_write in gallium-resources

SetBlendState -> bind_blend_state, set_blend_color and set_sample_mask
	! D3D11 fuses bind_blend_state, set_blend_color and set_sample_mask in a single function

SetDepthStencilState -> bind_depth_stencil_alpha_state and set_stencil_ref
	! D3D11 fuses bind_depth_stencil_alpha_state and set_stencil_ref in a single function

SetPredication -> render_condition
	# here both D3D11 and Gallium seem very limited (though hardware probably is too)
	# ideally, we should support nested conditional rendering, as well as more complex tests (checking for an arbitrary range, after an AND with an arbitrary mask)
	# of course, hardware support is probably as limited as OpenGL/D3D11
	+ Gallium, like NV_conditional_render, supports by-region and wait flags
	- D3D11 supports predication conditional on being equal to any value (along with occlusion predicates); Gallium only supports non-zero

SetRasterizerState -> bind_rasterizer_state

SetRenderTargets (extended in D3D11) -> set_framebuffer_state
	! Gallium passes a width/height here, D3D11 does not
	! Gallium lacks ClearTargets (but this is redundant and the driver can trivially compute this if desired)
	- Gallium does not support unordered access views
	- Gallium does not support geometry shader selection of texture array image / 3D texture zslice

SetResourceMinLOD (D3D11 only) -> pipe_sampler_view::tex::first_level

SetScissorRects
	- Gallium lacks support for the multiple geometry-shader-selectable scissor rectangles that D3D11 has

SetTextFilterSize
	- Gallium lacks support for text filters

SetVertexPipelineOutput (D3D10.1+ only)
	# used to do vertex processing on the GPU on Intel G45 chipsets when it is faster this way (see www.intel.com/Assets/PDF/whitepaper/322931.pdf)
	- Gallium does not support vertex pipeline bypass anymore

SetViewports
	- Gallium lacks support for the multiple geometry-shader-selectable viewports that D3D11 has

ShaderResourceViewReadAfterWriteHazard
	- Gallium lacks support for this
	+ Gallium has texture_barrier

SoSetTargets -> set_stream_output_buffers

VsSetConstantBuffers -> for(i = 0; i < NumBuffers; ++i) set_constant_buffer(PIPE_SHADER_VERTEX, StartBuffer + i, phBuffers[i])
	* may want to split into fragment/vertex-specific versions

VsSetSamplers -> bind_vertex_sampler_states
	* may want to allow binding subsets instead of all at once

VsSetShader -> bind_vs_state

VsSetShaderWithIfaces (D3D11 only)
	- Gallium does not support shader interfaces

VsSetShaderResources -> set_vertex_sampler_views
	* may want to allow binding subsets instead of all at once
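
* Example: mapping a ranged SetConstantBuffers call onto per-slot binds

The VsSetConstantBuffers, GsSetConstantBuffers and PsSetConstantBuffers entries map one ranged D3D call onto repeated single-slot Gallium calls. The following is only a rough sketch of that mapping, not code from any driver. It assumes that phBuffers holds NumBuffers entries indexed from zero and that the first entry goes to slot StartBuffer. All mock_* names and MOCK_* constants are hypothetical stand-ins for the real types in d3d10umddi.h and the Gallium headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Gallium shader stage enum and limits. */
enum mock_shader_type {
    MOCK_SHADER_VERTEX,
    MOCK_SHADER_FRAGMENT,
    MOCK_SHADER_GEOMETRY,
    MOCK_SHADER_TYPES
};
#define MOCK_MAX_CONSTANT_BUFFERS 16

/* Hypothetical stand-in for a buffer resource. */
struct mock_resource {
    int id;
};

/* Hypothetical stand-in for a context: one bound buffer per stage and slot. */
struct mock_context {
    struct mock_resource *cb[MOCK_SHADER_TYPES][MOCK_MAX_CONSTANT_BUFFERS];
};

/* Gallium-style entry point: binds exactly one slot per call. */
static void
mock_set_constant_buffer(struct mock_context *ctx,
                         enum mock_shader_type shader,
                         unsigned index,
                         struct mock_resource *buf)
{
    assert(index < MOCK_MAX_CONSTANT_BUFFERS);
    ctx->cb[shader][index] = buf;
}

/* D3D-style entry point: binds NumBuffers slots starting at StartBuffer.
 * Assumption: phBuffers is indexed from zero, so entry i lands in slot
 * StartBuffer + i. */
static void
mock_set_constant_buffers(struct mock_context *ctx,
                          enum mock_shader_type shader,
                          unsigned StartBuffer,
                          unsigned NumBuffers,
                          struct mock_resource *const *phBuffers)
{
    unsigned i;

    for (i = 0; i < NumBuffers; ++i)
        mock_set_constant_buffer(ctx, shader, StartBuffer + i, phBuffers[i]);
}
```

With these assumptions, calling mock_set_constant_buffers(&ctx, MOCK_SHADER_VERTEX, 3, 2, bufs) binds bufs[0] to slot 3 and bufs[1] to slot 4 of the vertex stage, and leaves every other slot untouched.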