      1 <!-- markdownlint-disable MD041 -->
      2 <!-- Copyright 2015-2019 LunarG, Inc. -->
      3 [![Khronos Vulkan][1]][2]
      4 
      5 [1]: https://vulkan.lunarg.com/img/Vulkan_100px_Dec16.png "https://www.khronos.org/vulkan/"
      6 [2]: https://www.khronos.org/vulkan/
      7 
      8 # GPU-Assisted Validation
      9 
     10 [![Creative Commons][3]][4]
     11 
     12 [3]: https://i.creativecommons.org/l/by-nd/4.0/88x31.png "Creative Commons License"
     13 [4]: https://creativecommons.org/licenses/by-nd/4.0/
     14 
     15 GPU-Assisted validation is implemented in the SPIR-V Tools optimizer and the `VK_LAYER_LUNARG_core_validation` layer.
     16 This document covers the design of the layer portion of the implementation.
     17 
     18 ## Basic Operation
     19 
GPU-Assisted Validation works by instrumenting shader code to perform run-time checks in shaders and
reporting any error conditions to the layer.
     22 The layer then reports the errors to the user via the same reporting mechanisms used by the rest of the validation system.
     23 
     24 The layer instruments the shaders by passing the shader's SPIR-V bytecode to the SPIR-V optimizer component and
     25 instructs the optimizer to perform an instrumentation pass to add the additional instructions to perform the run-time checking.
     26 The layer then passes the resulting modified SPIR-V bytecode to the driver as part of the process of creating a ShaderModule.
     27 
     28 As the shader is executed, the instrumented shader code performs the run-time checks.
     29 If a check detects an error condition, the instrumentation code writes an error record into the GPU's device memory.
This record is small, on the order of a dozen 32-bit words.
     31 Since multiple shader stages and multiple invocations of a shader can all detect errors, the instrumentation code
     32 writes error records into consecutive memory locations as long as there is space available in the pre-allocated block of device memory.
     33 
     34 The layer inspects this device memory block after completion of a queue submission.
     35 If the GPU had written an error record to this memory block,
     36 the layer analyzes this error record and constructs a validation error message
     37 which is then reported in the same manner as other validation messages.
     38 If the shader was compiled with debug information (source code and SPIR-V instruction mapping to source code lines), the layer
     39 also provides the line of shader source code that provoked the error as part of the validation error message.
     40 
     41 ## GPU-Assisted Validation Checks
     42 
     43 The initial release (Jan 2019) of GPU-Assisted Validation includes checking for out-of-bounds descriptor array indexing
     44 for image/texel descriptor types.
     45 
     46 Future releases are planned to add checking for other hazards such as proper population of descriptors when using the
     47 `descriptorBindingPartiallyBound` feature of the `VK_EXT_descriptor_indexing` extension.
     48 
### Out-of-Bounds (OOB) Descriptor Array Indexing
     50 
     51 Checking for correct indexing of descriptor arrays is sometimes referred to as "bind-less validation".
     52 It is called "bind-less" because a binding in a descriptor set may contain an array of like descriptors.
     53 And unless there is a constant or compile-time indication of which descriptor in the array is selected,
     54 the descriptor binding status is considered to be ambiguous, leaving the actual binding to be determined at run-time.
     55 
     56 As an example, a fragment shader program may use a variable to index an array of combined image samplers.
     57 Such a line might look like:
     58 
     59 ```glsl
     60 uFragColor = light * texture(tex[tex_ind], texcoord.xy);
     61 ```
     62 
     63 The array of combined image samplers is `tex` and has 6 samplers in the array.
     64 The complete validation error message issued when `tex_ind` indexes past the array is:
     65 
     66 ```terminal
     67 ERROR : VALIDATION - Message Id Number: 0 | Message Id Name: UNASSIGNED-Image descriptor index out of bounds
     68         Index of 6 used to index descriptor array of length 6.  Command buffer (CubeDrawCommandBuf)(0xbc24b0).
     69         Pipeline (0x45). Shader Module (0x43). Shader Instruction Index = 108.  Stage = Fragment.
     70         Fragment coord (x,y) = (419.5, 254.5). Shader validation error occurred in file:
     71         /home/user/src/Vulkan-ValidationLayers/external/Vulkan-Tools/cube/cube.frag at line 45.
     72 45:    uFragColor = light * texture(tex[tex_ind], texcoord.xy);
     73 ```
     74 
     75 ## GPU-Assisted Validation Options
     76 
     77 Here are the options related to activating GPU-Assisted Validation:
     78 
     79 1. Enable GPU-Assisted Validation - GPU-Assisted Validation is off by default and must be enabled.
     80 
     81     GPU-Assisted Validation is disabled by default because the shader instrumentation may introduce significant
     82     shader performance degradation and additional resource consumption.
     83     GPU-Assisted Validation requires additional resources such as device memory and descriptors.
     84     It is desirable for the user to opt-in to this feature because of these requirements.
     85     In addition, there are several limitations that may adversely affect application behavior,
     86     as described later in this document.
     87 
     88 2. Reserve a Descriptor Set Binding Slot - Modifies the value of the `VkPhysicalDeviceLimits::maxBoundDescriptorSets`
     89    property to return a value one less than the actual device's value to "reserve" a descriptor set binding slot for use by GPU validation.
     90 
     91    This option is likely only of interest to applications that dynamically adjust their descriptor set bindings to adjust for
     92    the limits of the device.
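
The effect of the binding slot reservation can be sketched as follows.
This is a minimal illustration, not the layer's code: the function name `reported_max_bound_descriptor_sets` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns the maxBoundDescriptorSets value that would be
 * reported to the application, given the device's actual limit. */
uint32_t reported_max_bound_descriptor_sets(uint32_t device_limit, int reserve_slot)
{
    assert(device_limit > 0);
    if (reserve_slot && device_limit > 1) {
        return device_limit - 1; /* hide one slot from the application */
    }
    return device_limit;
}
```

An application that sizes its pipeline layouts from the reported value then leaves the last slot free for GPU validation.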
     93 
     94 ### Enabling and Specifying Options with a Configuration File
     95 
     96 The existing layer configuration file mechanism can be used to enable GPU-Assisted Validation.
     97 This mechanism is described on the
     98 [LunarXchange website](https://vulkan.lunarg.com/doc/sdk/latest/windows/layer_configuration.html),
     99 in the "Layers Overview and Configuration" document.
    100 
    101 To turn on GPU validation, add the following to your layer settings file, which is often
    102 named `vk_layer_settings.txt`.
    103 
    104 ```code
    105 lunarg_core_validation.gpu_validation = all
    106 ```
    107 
    108 To turn on GPU validation and request to reserve a binding slot:
    109 
    110 ```code
    111 lunarg_core_validation.gpu_validation = all,reserve_binding_slot
    112 ```
    113 
    114 Some platforms do not support configuration of the validation layers with this configuration file.
    115 Programs running on these platforms must then use the programmatic interface.
    116 
    117 ### Enabling and Specifying Options with the Programmatic Interface
    118 
    119 The `VK_EXT_validation_features` extension can be used to enable GPU-Assisted Validation at CreateInstance time.
    120 
    121 Here is sample code illustrating how to enable it:
    122 
    123 ```C
// List the validation features to enable.
VkValidationFeatureEnableEXT enables[] = {VK_VALIDATION_FEATURE_ENABLE_GPU_ASSISTED_EXT};
VkValidationFeaturesEXT features = {};
features.sType = VK_STRUCTURE_TYPE_VALIDATION_FEATURES_EXT;
features.enabledValidationFeatureCount = 1;
features.pEnabledValidationFeatures = enables;

// Chain the features structure into the instance create info.
VkInstanceCreateInfo info = {};
info.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
info.pNext = &features;
    132 ```
    133 
    134 Use the `VK_VALIDATION_FEATURE_ENABLE_GPU_ASSISTED_RESERVE_BINDING_SLOT_EXT` enum to reserve a binding slot.
    135 
    136 ## GPU-Assisted Validation Limitations
    137 
    138 There are several limitations that may impede the operation of GPU-Assisted Validation:
    139 
    140 ### Vulkan 1.1
    141 
Vulkan 1.1 or later is required because the GPU instrumentation code uses SPIR-V 1.3 features.
Requiring Vulkan 1.1 ensures that SPIR-V 1.3 is available.
    144 
    145 ### Descriptor Types
    146 
    147 The current implementation works with image and texel descriptor types.
    148 A complete list appears later in this document.
    149 
    150 ### Descriptor Set Binding Limit
    151 
    152 This is probably the most important limitation and is related to the
    153 `VkPhysicalDeviceLimits::maxBoundDescriptorSets` device limit.
    154 
    155 When applications use all the available descriptor set binding slots,
    156 GPU-Assisted Validation cannot be performed because it needs a descriptor set to
    157 locate the memory for writing the error report record.
    158 
    159 This problem is most likely to occur on devices, often mobile, that support only the
    160 minimum required value for `VkPhysicalDeviceLimits::maxBoundDescriptorSets`, which is 4.
    161 Some applications may be written to use 4 slots since this is the highest value that
    162 is guaranteed by the specification.
    163 When such an application using 4 slots runs on a device with only 4 slots,
    164 then GPU-Assisted Validation cannot be performed.
    165 
    166 In this implementation, this condition is detected and gracefully recovered from by
    167 building the graphics pipeline with non-instrumented shaders instead of instrumented ones.
    168 An error message is also displayed informing the user of the condition.
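
The conflict check described above can be sketched in a few lines.
This is an illustrative model only: the function names are hypothetical, and it assumes the layer's descriptor set goes in the last slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the application's pipeline layout already occupies the
 * set index that the layer wants for its own descriptor set. */
bool layout_conflicts_with_debug_set(uint32_t app_set_layout_count, uint32_t debug_set_index)
{
    /* An application layout with N sets uses indices 0..N-1. */
    return app_set_layout_count > debug_set_index;
}

/* Decides whether instrumented shaders may be used for a pipeline.
 * Assumption: the layer's set is placed in the last available slot. */
bool use_instrumented_shaders(uint32_t app_set_layout_count, uint32_t max_bound_descriptor_sets)
{
    assert(max_bound_descriptor_sets > 0);
    uint32_t debug_set_index = max_bound_descriptor_sets - 1;
    return !layout_conflicts_with_debug_set(app_set_layout_count, debug_set_index);
}
```

With 4 application sets on a device whose limit is 4, the check fails and the pipeline is built with the original shaders.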
    169 
    170 Applications don't have many options in this situation and it is anticipated that
    171 changing the application to free a slot is difficult.
    172 
    173 ### Device Memory
    174 
GPU-Assisted Validation allocates device memory for the error report buffers.
    176 This can lead to a greater chance of memory exhaustion, especially in cases where
    177 the application is trying to use all of the available memory.
    178 The extra memory allocations are also not visible to the application, making it
    179 impossible for the application to account for them.
    180 
If GPU-Assisted Validation device memory allocations fail, the device could become
unstable because some previously-built pipelines may contain instrumented shaders.
The layer cannot cleanly undo this condition, so it just
prints an error message and refrains from any further allocations or instrumentations.
The application still has a reasonable chance of continuing to run correctly,
especially if the instrumentation does not write any error records.
    187 
    188 ### Descriptors
    189 
    190 This is roughly the same problem as the device memory problem mentioned above,
    191 but for descriptors.
    192 Any failure to allocate a descriptor set means that the instrumented shader code
    193 won't have a place to write error records, resulting in unpredictable device
    194 behavior.
    195 
    196 ### Other Device Limits
    197 
    198 This implementation uses additional resources that may count against the following limits,
    199 and possibly others:
    200 
    201 * `maxMemoryAllocationCount`
    202 * `maxBoundDescriptorSets`
    203 * `maxPerStageDescriptorStorageBuffers`
    204 * `maxPerStageResources`
    205 * `maxDescriptorSetStorageBuffers`
    206 * `maxFragmentCombinedOutputResources`
    207 
    208 The implementation does not take steps to avoid exceeding these limits
    209 and does not update the tracking performed by other validation functions.
    210 
    211 ### A Note About the `VK_EXT_buffer_device_address` Extension
    212 
    213 The recently introduced `VK_EXT_buffer_device_address` extension can be used
    214 to implement GPU-Assisted Validation without some of the limitations described above.
    215 This approach would use this extension to obtain a GPU device pointer to a storage
    216 buffer and make it available to the shader via a specialization constant.
    217 This technique removes the need to create descriptors, use a descriptor set slot,
    218 modify pipeline layouts, etc, and would relax some of the limitations listed above.
    219 
    220 This alternate implementation is under consideration.
    221 
    222 ## GPU-Assisted Validation Internal Design
    223 
This section may be of interest to readers who want to know how GPU-Assisted Validation is implemented.
    225 It isn't necessarily required for using the feature.
    226 
    227 ### General
    228 
In general, the implementation does the following:
    230 
    231 * For each draw call, allocate a block of device memory to hold a single debug output record written by the
    232     instrumented shader code.
    233     There is a device memory manager to handle this efficiently.
    234 
    235     There is probably little advantage in providing a larger buffer in order to obtain more debug records.
    236     It is likely, especially for fragment shaders, that multiple errors occurring near each other have the same root cause.
    237 
    238     A block is allocated on a per draw basis to make it possible to associate a shader debug error record with
    239     a draw within a command buffer.
    240     This is done partly to give the user more information in the error report, namely the command buffer handle/name and the draw within that command buffer.
    241     An alternative design allocates this block on a per-device or per-queue basis and should work.
    242     However, it is not possible to identify the command buffer that causes the error if multiple command buffers
    243     are submitted at once.
    244 * For each draw call, allocate a descriptor set and update it to point to the block of device memory just allocated.
    245     There is a descriptor set manager to handle this efficiently.
    246     Also make an additional call down the chain to create a bind descriptor set command to bind our descriptor set at the desired index.
    This has the effect of binding the device memory block belonging to this draw so that the GPU instrumentation
    writes into this buffer when the draw is executed.
    249     The end result is that each draw call has its own device memory block containing GPU instrumentation error
    250     records, if any occurred while executing that draw.
    251 * Determine the descriptor set binding index that is eventually used to bind the descriptor set just allocated and updated.
    252     Usually, it is `VkPhysicalDeviceLimits::maxBoundDescriptorSets` minus one.
    253     For devices that have a very high or no limit on this bound, pick an index that isn't too high, but above most other device
    254     maxima such as 32.
    255 * When creating a ShaderModule, pass the SPIR-V bytecode to the SPIR-V optimizer to perform the instrumentation pass.
    256     Pass the desired descriptor set binding index to the optimizer via a parameter so that the instrumented
    257     code knows which descriptor to use for writing error report data to the memory block.
    258     Use the instrumented bytecode to create the ShaderModule.
    259 * For all pipeline layouts, add our descriptor set to the layout, at the binding index determined earlier.
    260     Fill any gaps with empty descriptor sets.
    261 
    262     If the incoming layout already has a descriptor set placed at our desired index, the layer must not add its
    263     descriptor set to the layout, replacing the one in the incoming layout.
    264     Instead, the layer leaves the layout alone and later replaces the instrumented shaders with
    265     non-instrumented ones when the pipeline layout is later used to create a graphics pipeline.
    266     The layer issues an error message to report this condition.
    267 * When creating a GraphicsPipeline, check to see if the pipeline is using the debug binding index.
    268     If it is, replace the instrumented shaders in the pipeline with non-instrumented ones.
    269 * After calling QueueSubmit, perform a wait on the queue to allow the queue to finish executing.
    270     Then map and examine the device memory block for each draw that was submitted.
    271     If any debug record is found, generate a validation error message for each record found.
    272 
    273 The above describes only the high-level details of GPU-Assisted Validation operation.
    274 More detail is found in the discussion of the individual hooked functions below.
    275 
    276 ### Initialization
    277 
    278 When the core validation layer loads, it examines the user options from both the layer settings file and the
    279 `VK_EXT_validation_features` extension.
    280 Note that it also processes the subsumed `VK_EXT_validation_flags` extension for simple backwards compatibility.
    281 From these options, the layer sets instance-scope flags in the core validation layer tracking data to indicate if
    282 GPU-Assisted Validation has been requested, along with any other associated options.
    283 
    284 ### "Calling Down the Chain"
    285 
    286 Much of the GPU-Assisted Validation implementation involves making "application level" Vulkan API
    287 calls outside of the application's API usage to create resources and perform its required operations
    288 inside of the core validation layer.
    289 These calls are not routed up through the top of the loader/layer/driver call stack via the loader.
    290 Instead, they are simply dispatched via the core validation layer's dispatch table.
    291 
    292 These calls therefore don't pass through core validation or any other validation layers that may be
loaded/dispatched prior to core validation.
    294 This doesn't present any particular problem, but it does raise some issues:
    295 
    296 * The additional API calls are not fully validated
    297 
    298   This implies that this additional code may never be checked for validation errors.
    299   To address this, the code can "just" be written carefully so that it is "valid" Vulkan,
    300   which is hard to do.
    301 
    302   Or, this code can be checked by loading a core validation layer with
    303   GPU validation enabled on top of "normal" standard validation in the
    304   layer stack, which effectively validates the API usage of this code.
    305   This sort of checking is performed by layer developers to check that the additional
    306   Vulkan usage is valid.
    307 
    308   This validation can be accomplished by:
    309   
  * Building the core validation layer with a hack to force GPU-Assisted Validation to be enabled.
    The exposed mechanisms can't be used because that would probably turn it on twice.
  * Renaming this layer binary to something else like "core_validation2" to keep it apart from the
    "normal" core validation.
  * Creating a new JSON file with the new layer name.
  * Setting up the layer stack so that the "core_validation2" layer is on top of or before the standard validation
    layer.
  * Running tests and checking for validation errors pointing to API usage in the "core_validation2" layer.
    318 
    319   This should only need to be done after making any major changes to the implementation.
    320 
    321   Another approach involves capturing an application trace with `vktrace` and then playing
    322   it back with `vkreplay`.
    323 
    324 * The additional API calls are not state-tracked
    325 
    326   This means that things like device memory allocations and descriptor allocations are not
    327   tracked and do not show up in any of the bookkeeping performed by the validation layers.
    328   For example, any device memory allocation performed by GPU-Assisted Validation won't be
    329   counted towards the maximum number of allocations allowed by a device.
    330   This could lead to an early allocation failure that is not accompanied by a validation error.
    331 
  This shortcoming is left unaddressed in this implementation because it is anticipated that
  a later implementation of GPU-Assisted Validation using the `VK_EXT_buffer_device_address`
  extension will have less of a need to allocate these
  tracked resources, making this less of an issue.
    336 
    337 ### Code Structure and Relationship to the Core Validation Layer
    338 
    339 The GPU-Assisted Validation code is largely contained in one
    340 [file](https://github.com/KhronosGroup/Vulkan-ValidationLayers/blob/master/layers/gpu_validation.cpp), with "hooks" in
    341 the other core validation code that call functions in this file.
    342 These hooks in the core validation code look something like this:
    343 
    344 ```C
    345 if (GetEnables(dev_data)->gpu_validation) {
    346     GpuPreCallRecordDestroyPipeline(dev_data, pipeline_state);
    347 }
    348 ```
    349 
    350 The GPU-Assisted Validation code is linked into the shared library for the core validation layer.
    351 
    352 #### Review of Core Validation Code Structure
    353 
    354 Each function for a Vulkan API command intercepted in the core validation layer is usually split up
    355 into several decomposed functions in order to organize the implementation.
    356 These functions take the form of:
    357 
    358 * PreCallValidate&lt;foo&gt;: Perform validation steps before calling down the chain
    359 * PostCallValidate&lt;foo&gt;: Perform validation steps after calling down the chain
    360 * PreCallRecord&lt;foo&gt;: Perform state recording before calling down the chain
    361 * PostCallRecord&lt;foo&gt;: Perform state recording after calling down the chain
    362 
    363 The GPU-Assisted Validation functions follow this pattern not by hooking into the top-level core validation API shim, but
    364 by hooking one of these decomposed functions.
    365 In a few unusual cases, the GPU-Assisted Validation function "takes over" the call to the driver (down the chain) and so
    366 must hook the top-level API shim.
    367 These functions deviate from the above naming convention to make their purpose more evident.
    368 
    369 The design of each hooked function follows:
    370 
    371 #### GpuPreCallRecordCreateDevice
    372 
    373 * Modify the `VkPhysicalDeviceFeatures` to turn on two additional physical device features:
    374   * `fragmentStoresAndAtomics`
    375   * `vertexPipelineStoresAndAtomics`
    376 
    377 #### GpuPostCallRecordCreateDevice
    378 
    379 * Determine and record (save in device state) the desired descriptor set binding index.
    380 * Initialize device memory manager
    381   * Determine error record block size based on the maximum size of the error record and alignment limits of the device.
    382 * Initialize descriptor set manager
    383 * Make a descriptor set layout to describe our descriptor set
    384 * Make a descriptor set layout to describe a "dummy" descriptor set that contains no descriptors
    385   * This is used to "pad" pipeline layouts to fill any gaps between the used bind indices and our bind index
    386 * Record these objects in the per-device state
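
The first two steps above can be sketched as follows.
This is a rough model under stated assumptions: the cap of 32 on the set index and both function names are illustrative, not taken from the layer source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cap for devices with a very high limit. The design only says the
 * index should sit above most other device maxima, "such as 32". */
#define ASSUMED_MAX_DEBUG_SET_INDEX 32u

/* Picks the descriptor set binding index for the layer's descriptor set. */
uint32_t choose_debug_set_index(uint32_t max_bound_descriptor_sets)
{
    assert(max_bound_descriptor_sets > 0);
    uint32_t index = max_bound_descriptor_sets - 1; /* usually the last slot */
    return index > ASSUMED_MAX_DEBUG_SET_INDEX ? ASSUMED_MAX_DEBUG_SET_INDEX : index;
}

/* Rounds the maximum error record size up to a multiple of the device's
 * alignment requirement to get the per-draw block size. */
uint64_t aligned_block_size(uint64_t record_bytes, uint64_t alignment)
{
    assert(alignment > 0);
    return ((record_bytes + alignment - 1) / alignment) * alignment;
}
```

For example, a device limit of 4 yields index 3, and a 52-byte record with 16-byte alignment yields a 64-byte block.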
    387 
    388 #### GpuPreCallRecordDestroyDevice
    389 
    390 * Destroy descriptor set layouts created in CreateDevice
    391 * Clean up descriptor set manager
    392 * Clean up device memory manager
    393 * Clean up device state
    394 
    395 #### GpuAllocateValidationResources
    396 
    397 * For each Draw or Dispatch call:
    398   * Get a descriptor set from the descriptor set manager
    399   * Get a device memory block from the device memory manager
    400   * Update (write) the descriptor set with the memory info
    401   * Check to see if the layout for the pipeline just bound is using our selected bind index
    402   * If no conflict, add an additional command to the command buffer to bind our descriptor set at our selected index
* Record the above objects in the per-CB state

Note that the Draw and Dispatch calls include vkCmdDraw, vkCmdDrawIndexed, vkCmdDrawIndirect, vkCmdDrawIndexedIndirect, vkCmdDispatch, and vkCmdDispatchIndirect.
    405 
    406 #### GpuPreCallRecordFreeCommandBuffers
    407 
    408 * For each command buffer:
    409   * Give the memory blocks back to the device memory manager
    410   * Give the descriptor sets back to the descriptor set manager
    411   * Clean up CB state
    412 
    413 #### GpuOverrideDispatchCreateShaderModule
    414 
    415 This function is called from CreateShaderModule and can't really be called from one of the decomposed functions
    416 because it replaces the SPIR-V, which requires modifying the bytecode passed down to the driver.
    417 This routine sets up to call the SPIR-V optimizer to run the "BindlessCheckPass", replacing the original SPIR-V with the instrumented SPIR-V
    418 which is then used in the call down the chain to CreateShaderModule.
    419 
    420 This function generates a "unique shader ID" that is passed to the SPIR-V optimizer,
    421 which the instrumented code puts in the debug error record to identify the shader.
    422 This ID is returned by this function so it can be recorded in the shader module at PostCallRecord time.
It would have been convenient to use the shader module handle returned from the driver as this shader ID.
    424 But the shader needs to be instrumented before creating the shader module and therefore the handle is not available to use
    425 as this ID to pass to the optimizer.
    426 Therefore, the layer keeps a "counter" in per-device state that is incremented each time a shader is instrumented
    427 to generate unique IDs.
    428 This unique ID is given to the SPIR-V optimizer and is stored in the shader module state tracker after the shader module is created, which creates the necessary association between the ID and the shader module.
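
The counter-based ID scheme might look like the following sketch.
The structure and function names here are hypothetical stand-ins for the layer's per-device and shader module tracking state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device state holding the shader ID counter. */
struct device_state {
    uint32_t next_shader_id;
};

/* Hypothetical shader module tracking record. */
struct shader_module_state {
    uint64_t handle;    /* handle returned by the driver */
    uint32_t unique_id; /* ID passed to the optimizer before creation */
};

/* Returns a fresh ID for the shader about to be instrumented. */
uint32_t next_unique_shader_id(struct device_state *dev)
{
    assert(dev != NULL);
    return dev->next_shader_id++;
}

/* Associates the ID with the handle once the driver has created the module. */
void record_shader_module(struct shader_module_state *state, uint64_t handle, uint32_t unique_id)
{
    assert(state != NULL);
    state->handle = handle;
    state->unique_id = unique_id;
}
```

The ID is generated before the call down the chain and recorded afterwards, which is why it cannot be derived from the handle.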
    429 
    430 The process of instrumenting the SPIR-V also includes passing the selected descriptor set binding index
    431 to the SPIR-V optimizer which the instrumented
    432 code uses to locate the memory block used to write the debug error record.
    433 An instrumented shader is now "hard-wired" to write error records via the descriptor set at that binding
    434 if it detects an error.
    435 This implies that the instrumented shaders should only be allowed to run when the correct bindings are in place.
    436 
    437 The original SPIR-V bytecode is left stored in the shader module tracking data.
    438 This is important because the layer may need to replace the instrumented shader with the original shader if, for example,
    439 there is a binding index conflict.
    440 The application cannot destroy the shader module until it has used the shader module to create the pipeline.
    441 This ensures that the original SPIR-V bytecode is available if we need it to replace the instrumented shader.
    442 
    443 #### GpuOverrideDispatchCreatePipelineLayout
    444 
    445 This is another function that replaces the parameters and so can't be called from a decomposed function.
    446 
    447 * Check for a descriptor set binding index conflict.
    448   * If there is one, issue an error message and leave the pipeline layout unmodified
    449   * If no conflict, for each pipeline layout:
    450     * Create a new pipeline layout
    451     * Copy the original descriptor set layouts into the new pipeline layout
    452     * Pad the new pipeline layout with dummy descriptor set layouts up to but not including the last one
    453     * Add our descriptor set layout as the last one in the new pipeline layout
    454 * Create the pipeline layouts by calling down the chain with the original or modified create info
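
The padding steps above can be modeled with plain integers standing in for descriptor set layout handles.
This is a sketch only: `set_layout_handle` and `build_padded_set_layouts` are hypothetical names, and the real function works on `VkPipelineLayoutCreateInfo` data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t set_layout_handle; /* stand-in for VkDescriptorSetLayout */

/* Fills `out` (capacity debug_index + 1) with the application's set layouts,
 * then dummy layouts, then the layer's layout at debug_index.
 * Returns the new set layout count, or 0 on a binding index conflict. */
uint32_t build_padded_set_layouts(const set_layout_handle *app_layouts, uint32_t app_count,
                                  uint32_t debug_index, set_layout_handle dummy_layout,
                                  set_layout_handle debug_layout, set_layout_handle *out)
{
    assert(out != NULL);
    if (app_count > debug_index) {
        return 0; /* conflict: the caller leaves the original layout unmodified */
    }
    uint32_t i = 0;
    for (; i < app_count; ++i) {
        out[i] = app_layouts[i]; /* copy the original set layouts */
    }
    for (; i < debug_index; ++i) {
        out[i] = dummy_layout; /* pad the gap with empty set layouts */
    }
    out[debug_index] = debug_layout; /* the layer's set layout goes last */
    return debug_index + 1;
}
```

A return value of 0 corresponds to the conflict case, in which the layer reports an error and passes the original create info down the chain.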
    455 
    457 #### GpuPostCallQueueSubmit
    458 
    459 * Submit a command buffer containing a memory barrier to make GPU writes available to the host domain.
    460 * Call QueueWaitIdle.
    461 * For each primary and secondary command buffer in the submission:
    462   * Call a helper function to process the instrumentation debug buffers (described later)
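
Once the queue is idle, the per-draw check can be as simple as looking at the first word of each mapped block, which holds `DataWrittenLength` in the buffer format described later in this document.
The sketch below uses hypothetical function names and plain arrays in place of mapped device memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if the instrumented shaders attempted to write any words
 * into the debug output block for one draw. */
bool draw_reported_error(const uint32_t *mapped_block)
{
    assert(mapped_block != NULL);
    return mapped_block[0] != 0; /* word 0 is DataWrittenLength */
}

/* Counts the draws in a submission whose blocks contain error data. */
size_t count_draws_with_errors(const uint32_t *const *blocks, size_t draw_count)
{
    size_t errors = 0;
    for (size_t i = 0; i < draw_count; ++i) {
        if (draw_reported_error(blocks[i])) {
            ++errors;
        }
    }
    return errors;
}
```

A block that is still zero-initialized after the wait means the draw completed without tripping any instrumented check.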
    463 
    464 #### GpuPreCallValidateCmdWaitEvents
    465 
    466 * Report an error about a possible deadlock if CmdWaitEvents is recorded with VK_PIPELINE_STAGE_HOST_BIT set.
    467 
    468 #### GpuPreCallRecordCreateGraphicsPipelines
    469 
    470 * Examine the pipelines to see if any use the debug descriptor set binding index
    471 * For those that do:
    472   * Create non-instrumented shader modules from the saved original SPIR-V
    473   * Modify the CreateInfo data to use these non-instrumented shaders.
    474     * This prevents instrumented shaders from using the application's descriptor set.
    475 
    476 #### GpuPostCallRecordCreateGraphicsPipelines
    477 
    478 * For every shader in the pipeline:
    479   * Destroy the shader module created in GpuPreCallRecordCreateGraphicsPipelines, if any
    480     * These are found in the CreateInfo used to create the pipeline and not in the shader_module
    481   * Create a shader tracking record that saves:
    482     * shader module handle
    483     * unique shader id
    484     * graphics pipeline handle
    485     * shader bytecode if it contains debug info
    486 
    487 This tracker is used to attach the shader bytecode to the shader in case it is needed
    488 later to get the shader source code debug info.
    489 
    490 The current shader module tracker in core validation stores the bytecode,
    491 but this tracker has the same life cycle as the shader module itself.
    492 It is possible for the application to destroy the shader module after
creating the graphics pipeline and before submitting work that uses the shader,
    494 making the shader bytecode unavailable if needed for later analysis.
    495 Therefore, the bytecode must be saved at this opportunity.
    496 
    497 This tracker exists as long as the graphics pipeline exists,
    498 so the graphics pipeline handle is also stored in this tracker so that it can
    499 be looked up when the graphics pipeline is destroyed.
    500 At that point, it is safe to free the bytecode since the pipeline is never used again.
    501 
    502 #### GpuPreCallRecordDestroyPipeline
    503 
    504 * Find the shader tracker(s) with the graphics pipeline handle and free the tracker, along with any bytecode it has stored in it.
    505 
    506 ### Shader Instrumentation Scope
    507 
    508 The shader instrumentation process performed by the SPIR-V optimizer applies descriptor index bounds checking
    509 to descriptors of the following types:
    510 
    511     VK_DESCRIPTOR_TYPE_STORAGE_IMAGE
    512     VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE
    513     VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER
    514     VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER
    515     VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER
    516 
    517 Instrumentation is applied to the following SPIR-V operations:
    518 
    519     OpImageSampleImplicitLod
    520     OpImageSampleExplicitLod
    521     OpImageSampleDrefImplicitLod
    522     OpImageSampleDrefExplicitLod
    523     OpImageSampleProjImplicitLod
    524     OpImageSampleProjExplicitLod
    525     OpImageSampleProjDrefImplicitLod
    526     OpImageSampleProjDrefExplicitLod
    527     OpImageGather
    528     OpImageDrefGather
    529     OpImageQueryLod
    530     OpImageSparseSampleImplicitLod
    531     OpImageSparseSampleExplicitLod
    532     OpImageSparseSampleDrefImplicitLod
    533     OpImageSparseSampleDrefExplicitLod
    534     OpImageSparseSampleProjImplicitLod
    535     OpImageSparseSampleProjExplicitLod
    536     OpImageSparseSampleProjDrefImplicitLod
    537     OpImageSparseSampleProjDrefExplicitLod
    538     OpImageSparseGather
    539     OpImageSparseDrefGather
    540     OpImageFetch
    541     OpImageRead
    542     OpImageQueryFormat
    543     OpImageQueryOrder
    544     OpImageQuerySizeLod
    545     OpImageQuerySize
    546     OpImageQueryLevels
    547     OpImageQuerySamples
    548     OpImageSparseFetch
    549     OpImageSparseRead
    550     OpImageWrite
    551 
    552 ### Shader Instrumentation Error Record Format
    553 
    554 The instrumented shader code generates "error records" in a specific format.
    555 
    556 This description includes the support for future GPU-Assisted Validation features
    557 such as checking for uninitialized descriptors in the partially-bound scenario.
    558 These items are not used in the current implementation for descriptor array
    559 bounds checking, but are provided here to complete the description of the
    560 error record format.
    561 
    562 The format of this buffer is as follows:
    563 
    564 ```C
#include <stdint.h>

struct DebugOutputBuffer_t
{
   uint32_t DataWrittenLength; /* words the shaders have attempted to write */
   uint32_t Data[];            /* error records; runtime length */
};
    570 ```
    571 
`DataWrittenLength` is the number of uint32_t words that the shaders have attempted to write.
    573 It should be initialized to 0.
    574 
    575 The `Data` array is the uint32_t words written by the shaders of the pipeline to record bindless validation errors.
    576 All elements of `Data` should be initialized to 0.
    577 Note that the `Data` array has runtime length.
    578 The shader queries the length of the `Data` array to make sure that it does not write past the end of `Data`.
    579 The shader only writes complete records.
    580 The layer uses the length of `Data` to control the number of records written by the shaders.
    581 
The `DataWrittenLength` is atomically updated by the shaders so that shaders do not overwrite each other's data.
Each shader atomically adds its record length to `DataWrittenLength` and takes the value returned by the atomic update as the offset for its record.
If that value plus the record length is greater than the length of `Data`, it does not write the record.
    585 
    586 Given this protocol, the value in `DataWrittenLength` is not very meaningful if it is greater than the length of `Data`.
    587 However, the format of the written records plus the fact that `Data` is initialized to 0 should be enough to determine
    588 the records that were written.
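
The reservation protocol above can be sketched on the host with C11 atomics.
This is only an illustration and not the instrumented shader code: the `DebugOutputBuffer` struct and
the `try_write_record` function are hypothetical names, and the sketch assumes that the atomic update
returns the value held before the addition, which is then used as the write offset.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side stand-in for the buffer described above. */
typedef struct {
    _Atomic uint32_t data_written_length; /* DataWrittenLength */
    uint32_t data_length;                 /* length of Data, in words */
    uint32_t *data;                       /* Data, zero-initialized */
} DebugOutputBuffer;

/* Reserve space with an atomic add, then write only if the whole record fits. */
bool try_write_record(DebugOutputBuffer *buf, const uint32_t *record, uint32_t record_words)
{
    assert(buf != NULL && record != NULL && record_words > 0);
    /* atomic_fetch_add returns the value held before the addition. */
    uint32_t offset = atomic_fetch_add(&buf->data_written_length, record_words);
    if ((uint64_t)offset + record_words > buf->data_length) {
        return false; /* the record would overflow Data, so nothing is written */
    }
    memcpy(&buf->data[offset], record, record_words * sizeof(uint32_t));
    return true;
}
```

Because the length is advanced even when a record is rejected, `DataWrittenLength` can exceed the length of `Data`, which matches the behavior described above.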
    589 
    590 ### Record Format
    591 
    592 The format of an output record is the following:
    593 
    594     Word 0: Record size
    595     Word 1: Shader ID
    596     Word 2: Instruction Index
    597     Word 3: Stage
    598     <Stage-Specific Words>
    599     <Validation-Specific Words>
    600 
The Record Size is the number of words in this record, including the Record Size word itself.
    602 
    603 The Shader ID is a handle that was provided by the layer when the shader was instrumented.
    604 
    605 The Instruction Index is the instruction within the original function at which the error occurred.
    606 For bindless, this will be the instruction which consumes the descriptor in question,
    607 or the instruction that consumes the OpSampledImage that consumes the descriptor.
    608 
The Stage is the integer value that SPIR-V assigns to each of the following Execution Models:
    610 
    611 | Stage  | Value |
    612 |--------|:-----:|
    613 |Vertex  |0      |
    614 |TessCtrl|1      |
    615 |TessEval|2      |
    616 |Geometry|3      |
    617 |Fragment|4      |
    618 |Compute |5      |
    619 
    620 ### Stage Specific Words
    621 
    622 These are words that identify which "instance" of the shader the validation error occurred in.
    623 Here are words for each stage:
    624 
    625 | Stage  | Word 0           | Word 1     |
    626 |--------|------------------|------------|
    627 |Vertex  |VertexID          |InstanceID  |
    628 |Tess*   |InvocationID      |unused      |
    629 |Geometry|PrimitiveID       |InvocationID|
    630 |Fragment|FragCoord.x       |FragCoord.y |
    631 |Compute |GlobalInvocationID|unused      |
    632 
    633 "unused" means not relevant, but still present.
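
As an illustration of the table above, the following sketch turns the two stage-specific words into a short message.
The function name `describe_stage_words` and the enum names are hypothetical.
The words are printed as raw unsigned values, since this document does not specify how values such as `FragCoord.x` are encoded in the record.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stage values from the Stage table; the enum names are illustrative. */
enum {
    STAGE_VERTEX = 0, STAGE_TESS_CTRL = 1, STAGE_TESS_EVAL = 2,
    STAGE_GEOMETRY = 3, STAGE_FRAGMENT = 4, STAGE_COMPUTE = 5
};

/* Formats the stage-specific words into out.
   Returns the result of snprintf, or -1 for an unknown stage. */
int describe_stage_words(uint32_t stage, uint32_t word0, uint32_t word1,
                         char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    switch (stage) {
    case STAGE_VERTEX:
        return snprintf(out, out_size, "Vertex: VertexID=%u InstanceID=%u",
                        (unsigned)word0, (unsigned)word1);
    case STAGE_TESS_CTRL:
        return snprintf(out, out_size, "TessCtrl: InvocationID=%u", (unsigned)word0);
    case STAGE_TESS_EVAL:
        return snprintf(out, out_size, "TessEval: InvocationID=%u", (unsigned)word0);
    case STAGE_GEOMETRY:
        return snprintf(out, out_size, "Geometry: PrimitiveID=%u InvocationID=%u",
                        (unsigned)word0, (unsigned)word1);
    case STAGE_FRAGMENT:
        return snprintf(out, out_size, "Fragment: FragCoord.x=%u FragCoord.y=%u",
                        (unsigned)word0, (unsigned)word1);
    case STAGE_COMPUTE:
        return snprintf(out, out_size, "Compute: GlobalInvocationID=%u", (unsigned)word0);
    default:
        out[0] = '\0'; /* unknown stage: leave an empty message */
        return -1;
    }
}
```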
    634 
    635 ### Validation-Specific Words
    636 
These are words that are specific to the validation being done.
For bindless validation, they vary with the type of error.

The first word is the Error Code.

For the `*OutOfBounds` errors, two words will follow: Word0:DescriptorIndex, Word1:DescriptorArrayLength

For the `*Uninitialized` errors, one word will follow: Word0:DescriptorIndex
    645 
    646 | Error                       | Code | Word 0         | Word 1                |
    647 |-----------------------------|:----:|----------------|-----------------------|
    648 |IndexOutOfBounds             |0     |Descriptor Index|Descriptor Array Length|
    649 |DescriptorUninitialized      |1     |Descriptor Index|unused                 |
    650 
So the words written for an image descriptor bounds error in a fragment shader are:

    Word 0: Record size (9)
    Word 1: Shader ID
    Word 2: Instruction Index
    Word 3: Stage (4:Fragment)
    Word 4: FragCoord.x
    Word 5: FragCoord.y
    Word 6: Error (0: IndexOutOfBounds)
    Word 7: DescriptorIndex
    Word 8: DescriptorArrayLength
    662 
If another error is encountered, that record is written starting at Word 9, provided that the whole record will not overflow `Data`.
If overflow would happen, no words are written.

The validation layer can continue to read valid records until it sees a Record Size of 0 or the end of `Data` is reached.
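
The read loop described above might look like the following minimal sketch.
The offset names and the `for_each_record` function are hypothetical and are not the constants defined in `instrument.hpp`.
The extra check for a record size that runs past the end of `Data` is a defensive assumption, not something the protocol requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word offsets taken from the record layout above; the names are illustrative. */
enum {
    REC_SIZE = 0, REC_SHADER_ID = 1, REC_INSTRUCTION = 2, REC_STAGE = 3,
    REC_STAGE_WORD0 = 4, REC_STAGE_WORD1 = 5, REC_ERROR = 6,
    REC_DESC_INDEX = 7, REC_DESC_ARRAY_LEN = 8
};

typedef void (*RecordCallback)(const uint32_t *record, void *user);

/* Visits each complete record in data and returns the number of records visited. */
size_t for_each_record(const uint32_t *data, size_t data_words,
                       RecordCallback callback, void *user)
{
    assert(data != NULL);
    size_t count = 0;
    size_t offset = 0;
    while (offset < data_words) {
        uint32_t size = data[offset + REC_SIZE];
        if (size == 0) {
            break; /* zero-filled remainder: no more records */
        }
        if (size > data_words - offset) {
            break; /* defensive: size runs past the end of Data */
        }
        if (callback != NULL) {
            callback(&data[offset], user);
        }
        count++;
        offset += size; /* the next record starts right after this one */
    }
    return count;
}
```

Advancing by the Record Size rather than by a fixed stride keeps the loop valid if later validation types write records of a different length.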
    667 
    668 #### Programmatic interface
    669 
    670 The programmatic interface for the above informal description is codified in the
    671 [SPIRV-Tools](https://github.com/KhronosGroup/SPIRV-Tools) repository in file
    672 [`instrument.hpp`](https://github.com/KhronosGroup/SPIRV-Tools/blob/master/include/spirv-tools/instrument.hpp).
    673 It consists largely of integer constant definitions for the codes and values mentioned above and
    674 offsets into the record for locating each item.
    675 
    676 ## GPU-Assisted Validation Error Report
    677 
Generating the error report consists of mapping the debug report buffer associated with
each draw in the command buffer that was just submitted and checking whether the GPU instrumentation
code wrote anything.
    681 Each draw in the command buffer should have a corresponding result buffer in the command buffer's list of result buffers.
    682 The report generating code loops through the result buffers, maps each of them, checks for errors, and unmaps them.
    683 The layer clears the buffer to zeros when it is allocated and after processing any
    684 buffer that was written to.
    685 The instrumented shader code expects these buffers to be cleared to zeros before it
    686 writes to them.
    687 
    688 The layer then prepares a "common" validation error message containing:
    689 
* command buffer handle - This is easily obtained because the layer is looping over the command
  buffers just submitted.
* draw number - The layer keeps track of how many draws it has processed for a given command buffer.
* pipeline handle - The shader tracker discussed earlier contains this handle.
* shader module handle - The "Shader ID" (Word 1 in the record) is used to look up
  the shader tracker, which is then used to obtain the shader module and pipeline handles.
* instruction index - This is the SPIR-V instruction index where the invalid array access occurred.
  It is not that useful by itself, since the user would have to use it to locate a SPIR-V instruction
  in a SPIR-V disassembly and somehow relate it back to the shader source code.
  But it could still be useful to some and it is easy to report.
  The user can build the shader with debug information to get source-level information.
    701 
    702 For all objects, the layer also looks up the objects in the Debug Utils object name map in
    703 case the application used that extension to name any objects.
    704 If a name exists for that object, it is included in the error message.
    705 
    706 The layer then adds on error message text obtained from decoding the stage-specific and
    707 validation-specific data as described earlier.
    708 
    709 This completes the error report when there is no source-level debug information in the shader.
    710 
    711 ### Source-Level Debug Information
    712 
    713 This is one of the more complicated and code-heavy parts of the GPU-Assisted Validation feature
    714 and all it really does is display source-level information when the shader is compiled
    715 with debugging info (`-g` option in the case of `glslangValidator`).
    716 
    717 The process breaks down into two steps:
    718 
    719 #### OpLine Processing
    720 
    721 The SPIR-V generator (e.g., glslangValidator) places an OpLine SPIR-V instruction in the
    722 shader program ahead of code generated for each source code statement.
    723 The OpLine instruction contains the filename id (for an OpString),
    724 the source code line number and the source code column number.
    725 It is possible to have two source code statements on the same line in the source file,
    726 which explains the need for the column number.
    727 
    728 The layer scans the SPIR-V looking for the last OpLine instruction that appears before the instruction
    729 at the instruction index obtained from the debug report.
    730 This OpLine then contains the correct filename id, line number, and column number of the
    731 statement causing the error.
    732 The filename itself is obtained by scanning the SPIR-V again for an OpString instruction that
    733 matches the id from the OpLine.
    734 This OpString contains the text string representing the filename.
    735 This information is added to the validation error message.
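
The forward scan for the last OpLine can be sketched as follows.
The function name `find_last_op_line` and the `LineInfo` struct are hypothetical.
The sketch assumes that the instruction index counts instructions from the first instruction after the five-word SPIR-V module header.
It relies on the standard SPIR-V encoding, in which the first word of each instruction holds the word count in its high 16 bits and the opcode in its low 16 bits, and OpLine has opcode 8.
The OpString lookup for the filename is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { SPIRV_HEADER_WORDS = 5, OP_LINE = 8 };

typedef struct {
    uint32_t file_id; /* id of the OpString holding the filename */
    uint32_t line;
    uint32_t column;
} LineInfo;

/* Scans forward and records the last OpLine seen up to the target instruction.
   Returns true if an OpLine was found. */
bool find_last_op_line(const uint32_t *spirv, size_t word_count,
                       uint32_t target_index, LineInfo *out)
{
    assert(spirv != NULL && out != NULL);
    bool found = false;
    uint32_t index = 0;
    size_t pos = SPIRV_HEADER_WORDS;
    while (pos < word_count) {
        uint32_t length = spirv[pos] >> 16;     /* high half: word count */
        uint32_t opcode = spirv[pos] & 0xFFFFu; /* low half: opcode */
        if (length == 0 || length > word_count - pos) {
            break; /* malformed instruction: stop scanning */
        }
        if (opcode == OP_LINE && length >= 4) {
            out->file_id = spirv[pos + 1];
            out->line = spirv[pos + 2];
            out->column = spirv[pos + 3];
            found = true;
        }
        if (index == target_index) {
            break; /* reached the instruction named in the error record */
        }
        index++;
        pos += length;
    }
    return found;
}
```

A forward scan is used because the word count that delimits each instruction is only available at the start of that instruction.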
    736 
    737 For online compilation when there is no "file", only the line number information is reported.
    738 
    739 #### OpSource Processing
    740 
    741 The SPIR-V built with source-level debug info also contains OpSource instructions that
    742 have a string containing the source code, delimited by newlines.
Due to possible pre-processing, the layer cannot simply use the source file line number
from the OpLine to index into this set of source code lines.
    745 
    746 Instead, the correct source code line is found by first locating the "#line" directive in the
    747 source that specifies a line number closest to and less than the source line number reported
    748 by the OpLine located in the previous step.
    749 The correct "#line" directive must also match its filename, if specified,
    750 with the filename from the OpLine.
    751 
    752 Then the difference between the "#line" line number and the OpLine line number is added
    753 to the place where the "#line" was found to locate the actual line of source, which is
    754 then added to the validation error message.
    755 
    756 For example, if the OpLine line number is 15, and there is a "#line 10" on line 40
    757 in the OpSource source, then line 45 in the OpSource contains the correct source line.
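
The arithmetic above can be sketched in a few lines.
The function name `find_source_line` is hypothetical, and the sketch makes two simplifying assumptions: it ignores the filename matching described above, and it only recognizes a "#line" directive that starts in the first column of a line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns the 1-based line of source that holds the statement reported by
   the OpLine, or 0 if no usable "#line" directive is found. */
unsigned find_source_line(const char *source, unsigned op_line_number)
{
    assert(source != NULL);
    unsigned best_number = 0;   /* line number named by the best "#line" so far */
    unsigned best_position = 0; /* 1-based line where that directive sits */
    unsigned position = 1;
    for (const char *p = source; *p != '\0'; position++) {
        unsigned number;
        /* Keep the directive whose number is closest to, and less than,
           the OpLine line number. */
        if (sscanf(p, "#line %u", &number) == 1 &&
            number < op_line_number &&
            (best_position == 0 || number > best_number)) {
            best_number = number;
            best_position = position;
        }
        const char *newline = strchr(p, '\n');
        if (newline == NULL) {
            break; /* last line has no trailing newline */
        }
        p = newline + 1;
    }
    if (best_position == 0) {
        return 0;
    }
    /* Add the difference to the place where the directive was found. */
    return best_position + (op_line_number - best_number);
}
```

With a "#line 10" on line 40 and an OpLine line number of 15, this computes 40 + (15 - 10) = 45, matching the example above.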
    758 
    759 ## GPU-Assisted Validation Testing
    760 
    761 Validation Layer Tests (VLTs) exist for GPU-Assisted Validation.
    762 They cannot be run with the "mock ICD" in headless CI environments because they need to
    763 actually execute shaders.
    764 But they are still useful to run on real devices to check for regressions.
    765 
These tests are otherwise unremarkable.
They activate GPU-Assisted Validation via the programmatic
interface as described earlier.
    769 
    770 The tests exercise the extraction of source code information when the shader
    771 is built with debug info.
    772