      1 Name
      2 
      3     MESA_shader_integer_functions
      4 
      5 Name Strings
      6 
      7     GL_MESA_shader_integer_functions
      8 
      9 Contact
     10 
     11     Ian Romanick <ian.d.romanick@intel.com>
     12 
     13 Contributors
     14 
     15     All the contributors of GL_ARB_gpu_shader5
     16 
     17 Status
     18 
     19     Supported by all GLSL 1.30 capable drivers in Mesa 12.1 and later
     20 
     21 Version
     22 
     23     Version 2, July 7, 2016
     24 
     25 Number
     26 
     27     TBD
     28 
     29 Dependencies
     30 
     31     This extension is written against the OpenGL 3.2 (Compatibility Profile)
     32     Specification.
     33 
     34     This extension is written against Version 1.50 (Revision 09) of the OpenGL
     35     Shading Language Specification.
     36 
     37     GLSL 1.30 is required.
     38 
     39     This extension interacts with ARB_gpu_shader5.
     40 
     41     This extension interacts with ARB_gpu_shader_fp64.
     42 
     43     This extension interacts with NV_gpu_shader5.
     44 
     45 Overview
     46 
     47     GL_ARB_gpu_shader5 extends GLSL in a number of useful ways.  Much of this
     48     added functionality requires significant hardware support.  There are many
     49     aspects, however, that can be easily implemented on any GPU with "real"
     50     integer support (as opposed to simulating integers using floating point
     51     calculations).
     52 
     53     This extension provides a set of new features to the OpenGL Shading
     54     Language to support capabilities of these GPUs, extending the capabilities
     55     of version 1.30 of the OpenGL Shading Language.  Shaders
     56     using the new functionality provided by this extension should enable this
     57     functionality via the construct
     58 
     59       #extension GL_MESA_shader_integer_functions : require   (or enable)
     60 
     61     This extension provides a variety of new features for all shader types,
     62     including:
     63 
     64       * support for implicitly converting signed integer types to unsigned
     65         types, as well as more general implicit conversion and function
     66         overloading infrastructure to support new data types introduced by
     67         other extensions;
     68 
     69       * new built-in functions supporting:
     70 
     71         * splitting a floating-point number into a significand and exponent
     72           (frexp), or building a floating-point number from a significand and
     73           exponent (ldexp);
     74 
     75         * integer bitfield manipulation, including functions to find the
     76           position of the most or least significant set bit, count the number
     77           of one bits, and bitfield insertion, extraction, and reversal;
     78 
     79         * extended integer precision math, including add with carry, subtract
     80           with borrow, and extended multiplication;
     81 
     82     The resulting extension is a strict subset of GL_ARB_gpu_shader5.
     83 
     84 IP Status
     85 
     86     No known IP claims.
     87 
     88 New Procedures and Functions
     89 
     90     None
     91 
     92 New Tokens
     93 
     94     None
     95 
     96 Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification
     97 (OpenGL Operation)
     98 
     99     None.
    100 
    101 Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification
    102 (Rasterization)
    103 
    104     None.
    105 
    106 Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
    107 (Per-Fragment Operations and the Frame Buffer)
    108 
    109     None.
    110 
    111 Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification
    112 (Special Functions)
    113 
    114     None.
    115 
    116 Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
    117 (State and State Requests)
    118 
    119     None.
    120 
    121 Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
    122 Specification (Invariance)
    123 
    124     None.
    125 
    126 Additions to the AGL/GLX/WGL Specifications
    127 
    128     None.
    129 
    130 Modifications to The OpenGL Shading Language Specification, Version 1.50
    131 (Revision 09)
    132 
    133     Including the following line in a shader can be used to control the
    134     language features described in this extension:
    135 
    136       #extension GL_MESA_shader_integer_functions : <behavior>
    137 
    138     where <behavior> is as specified in section 3.3.
    139 
    140     New preprocessor #defines are added to the OpenGL Shading Language:
    141 
    142       #define GL_MESA_shader_integer_functions        1
    143 
    144 
    145     Modify Section 4.1.10, Implicit Conversions, p. 27
    146 
    147     (modify table of implicit conversions)
    148 
    149                                 Can be implicitly
    150         Type of expression        converted to
    151         ---------------------   -----------------
    152         int                     uint, float
    153         ivec2                   uvec2, vec2
    154         ivec3                   uvec3, vec3
    155         ivec4                   uvec4, vec4
    156 
    157         uint                    float
    158         uvec2                   vec2
    159         uvec3                   vec3
    160         uvec4                   vec4
    161 
    162     (modify second paragraph of the section) No implicit conversions are
    163     provided to convert from unsigned to signed integer types or from
    164     floating-point to integer types.  There are no implicit array or structure
    165     conversions.
    166 
    167     (insert before the final paragraph of the section) When performing
    168     implicit conversion for binary operators, there may be multiple data types
    169     to which the two operands can be converted.  For example, when adding an
    170     int value to a uint value, both values can be implicitly converted to uint
    171     and float.  In such cases, a floating-point type is chosen if either
    172     operand has a floating-point type.  Otherwise, an unsigned integer type is
    173     chosen if either operand has an unsigned integer type.  Otherwise, a
    174     signed integer type is chosen.
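
    As an illustration only (not part of this extension), the selection rule
    above can be modeled in C.  The enum and function names are hypothetical
    and cover scalar base types; vectors would follow the same pattern.

```c
#include <assert.h>

/* Hypothetical model of the three base types involved in the rule. */
typedef enum { BASE_INT, BASE_UINT, BASE_FLOAT } base_type;

/* Pick the type both operands of a binary operator are converted to. */
base_type common_base_type(base_type a, base_type b)
{
    if (a == BASE_FLOAT || b == BASE_FLOAT)
        return BASE_FLOAT;  /* floating point wins if either operand has it */
    if (a == BASE_UINT || b == BASE_UINT)
        return BASE_UINT;   /* otherwise unsigned wins */
    return BASE_INT;        /* both operands are signed */
}
```

    Under this model, adding an int to a uint selects uint, matching the
    example in the paragraph above.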
    175     
    176 
    177     Modify Section 5.9, Expressions, p. 57
    178 
    179     (modify bulleted list as follows, adding support for implicit conversion
    180     between signed and unsigned types)
    181 
    182     Expressions in the shading language are built from the following:
    183 
    184     * Constants of type bool, int, int64_t, uint, uint64_t, float, all vector
    185       types, and all matrix types.
    186 
    187     ...
    188 
    189     * The operator modulus (%) operates on signed or unsigned integer scalars
    190       or vectors.  If the fundamental types of the operands do not match, the
    191       conversions from Section 4.1.10 "Implicit Conversions" are applied to
    192       produce matching types.  ...
    193 
    194 
    195     Modify Section 6.1, Function Definitions, p. 63
    196 
    197     (modify description of overloading, beginning at the top of p. 64)
    198 
    199      Function names can be overloaded.  The same function name can be used for
    200      multiple functions, as long as the parameter types differ.  If a function
    201      name is declared twice with the same parameter types, then the return
    202      types and all qualifiers must also match, and it is the same function
    203      being declared.  For example,
    204 
    205        vec4 f(in vec4 x, out vec4  y);   // (A)
    206        vec4 f(in vec4 x, out uvec4 y);   // (B) okay, different argument type
    207        vec4 f(in ivec4 x, out uvec4 y);  // (C) okay, different argument type
    208 
    209        int  f(in vec4 x, out ivec4 y);  // error, only return type differs
    210        vec4 f(in vec4 x, in  vec4  y);  // error, only qualifier differs
    211        vec4 f(const in vec4 x, out vec4 y);  // error, only qualifier differs
    212 
    213      When function calls are resolved, an exact type match for all the
    214      arguments is sought.  If an exact match is found, all other functions are
    215      ignored, and the exact match is used.  If no exact match is found, then
    216      the implicit conversions in Section 4.1.10 (Implicit Conversions) will be
    217      applied to find a match.  Mismatched types on input parameters (in or
    218      inout or default) must have a conversion from the calling argument type
    219      to the formal parameter type.  Mismatched types on output parameters (out
    220      or inout) must have a conversion from the formal parameter type to the
    221      calling argument type.
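
     The direction-dependent matching rule can be sketched in C as follows.
     This is a simplified model under stated assumptions: only scalar base
     types are represented, and all type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: scalar base types and parameter directions. */
typedef enum { T_INT, T_UINT, T_FLOAT } base_type;
typedef enum { DIR_IN, DIR_OUT, DIR_INOUT } param_dir;

/* Implicit conversions from the modified table in Section 4.1.10. */
bool implicitly_converts(base_type from, base_type to)
{
    if (from == T_INT)
        return to == T_UINT || to == T_FLOAT;
    if (from == T_UINT)
        return to == T_FLOAT;
    return false;  /* float has no implicit conversion in this model */
}

/* True if a calling argument can be bound to a formal parameter. */
bool param_matches(param_dir dir, base_type arg, base_type formal)
{
    if (arg == formal)
        return true;  /* exact match */

    bool ok = true;
    if (dir == DIR_IN || dir == DIR_INOUT)   /* value flows arg -> formal */
        ok = ok && implicitly_converts(arg, formal);
    if (dir == DIR_OUT || dir == DIR_INOUT)  /* value flows formal -> arg */
        ok = ok && implicitly_converts(formal, arg);
    return ok;
}
```

     In this model an int argument matches a float input parameter, but not a
     float output parameter, because the output value would have to be
     converted from float back to int.  A mismatched inout parameter never
     matches here, since no pair of types converts in both directions.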
    222 
    223      If implicit conversions can be used to find more than one matching
    224      function, a single best-matching function is sought.  To determine a best
    225      match, the conversions between calling argument and formal parameter
    226      types are compared for each function argument and pair of matching
    227      functions.  After these comparisons are performed, each pair of matching
    228      functions is compared.  A function definition A is considered a better
    229      match than function definition B if:
    230 
    231        * for at least one function argument, the conversion for that argument
    232          in A is better than the corresponding conversion in B; and
    233 
    234        * there is no function argument for which the conversion in B is better
    235          than the corresponding conversion in A.
    236 
    237      If a single function definition is considered a better match than every
    238      other matching function definition, it will be used.  Otherwise, a
    239      semantic error occurs and the shader will fail to compile.
    240 
    241      To determine whether the conversion for a single argument in one match is
    242      better than that for another match, the following rules are applied, in
    243      order:
    244 
    245        1. An exact match is better than a match involving any implicit
    246           conversion.
    247 
    248        2. A match involving an implicit conversion from float to double is
    249           better than a match involving any other implicit conversion.
    250 
    251        3. A match involving an implicit conversion from either int or uint to
    252           float is better than a match involving an implicit conversion from
    253           either int or uint to double.
    254 
    255      If none of the rules above apply to a particular pair of conversions,
    256      neither conversion is considered better than the other.
    257 
    258      For the function prototypes (A), (B), and (C) above, the following
    259      examples show how the rules apply to different sets of calling argument
    260      types:
    261 
    262        f(vec4, vec4);        // exact match of vec4 f(in vec4 x, out vec4 y)
    263        f(vec4, uvec4);       // exact match of vec4 f(in vec4 x, out uvec4 y)
    264        f(vec4, ivec4);       // NOT matched.  No implicit conversion to ivec4
    265                              //   exists: (A) and (B) would need one from vec4
    266                              //   or uvec4 for the output parameter, (C) one
    267                              //   from vec4 for the input parameter.
    268        f(ivec4, vec4);       // NOT matched.  All three match by implicit
    269                              //   conversion.  (C) is better than (A) and (B)
    270                              //   on the first argument.  (A) is better than
    271                              //   (B) and (C) on the second argument.
    272 
    273 
    274     Modify Section 8.3, Common Functions, p. 84
    275 
    276     (add support for single-precision frexp and ldexp functions)
    277 
    278     Syntax:
    279 
    280       genType frexp(genType x, out genIType exp);
    281       genType ldexp(genType x, in genIType exp);
    282 
    283     The function frexp() splits each single-precision floating-point number in
    284     <x> into a binary significand, a floating-point number in the range [0.5,
    285     1.0), and an integral exponent of two, such that:
    286 
    287       x = significand * 2 ^ exponent
    288 
    289     The significand is returned by the function; the exponent is returned in
    290     the parameter <exp>.  For a floating-point value of zero, the significand
    291     and exponent are both zero.  For a floating-point value that is an
    292     infinity or is not a number, the results of frexp() are undefined.  
    293 
    294     If the input <x> is a vector, this operation is performed in a
    295     component-wise manner; the value returned by the function and the value
    296     written to <exp> are vectors with the same number of components as <x>.
    297 
    298     The function ldexp() builds a single-precision floating-point number from
    299     each significand component in <x> and the corresponding integral exponent
    300     of two in <exp>, returning:
    301 
    302       significand * 2 ^ exponent
    303 
    304     If this product is too large to be represented as a single-precision
    305     floating-point value, the result is considered undefined.
    306 
    307     If the input <x> is a vector, this operation is performed in a
    308     component-wise manner; the value passed in <exp> and returned by the
    309     function are vectors with the same number of components as <x>.
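
    For a single component, the C library functions frexpf() and ldexpf()
    follow the same convention (a significand with magnitude in [0.5, 1.0)
    and a power-of-two exponent), so they can serve as a rough model.  The
    wrapper names below are hypothetical, and the sketch does not claim to
    reproduce any driver's implementation.

```c
#include <assert.h>
#include <math.h>

/* Model of frexp() for one component; undefined for Inf/NaN inputs. */
float model_frexp(float x, int *exp_out)
{
    assert(isfinite(x));  /* the extension leaves Inf/NaN results undefined */
    return frexpf(x, exp_out);
}

/* Model of ldexp() for one component: significand * 2^exp. */
float model_ldexp(float significand, int exp)
{
    return ldexpf(significand, exp);
}
```

    For example, splitting 8.0 gives a significand of 0.5 and an exponent of
    4, and recombining those two values gives 8.0 again.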
    310 
    311 
    312     (add support for new integer built-in functions)
    313 
    314     Syntax:
    315 
    316       genIType bitfieldExtract(genIType value, int offset, int bits);
    317       genUType bitfieldExtract(genUType value, int offset, int bits);
    318 
    319       genIType bitfieldInsert(genIType base, genIType insert, int offset, 
    320                               int bits);
    321       genUType bitfieldInsert(genUType base, genUType insert, int offset, 
    322                               int bits);
    323 
    324       genIType bitfieldReverse(genIType value);
    325       genUType bitfieldReverse(genUType value);
    326 
    327       genIType bitCount(genIType value);
    328       genIType bitCount(genUType value);
    329 
    330       genIType findLSB(genIType value);
    331       genIType findLSB(genUType value);
    332 
    333       genIType findMSB(genIType value);
    334       genIType findMSB(genUType value);
    335 
    336     The function bitfieldExtract() extracts bits <offset> through
    337     <offset>+<bits>-1 from each component in <value>, returning them in the
    338     least significant bits of the corresponding component of the result.  For
    339     unsigned data types, the most significant bits of the result will be set
    340     to zero.  For signed data types, the most significant bits will be set to
    341     the value of bit <offset>+<bits>-1.  If <bits> is zero, the result will be
    342     zero.  The result will be undefined if <offset> or <bits> is negative, or
    343     if the sum of <offset> and <bits> is greater than the number of bits used
    344     to store the operand.  Note that for vector versions of bitfieldExtract(),
    345     a single pair of <offset> and <bits> values is shared for all components.
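
    A minimal C sketch of the per-component behavior, assuming 32-bit
    operands.  The function names are hypothetical; the assert marks the
    argument range outside which the result is undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned variant: the upper bits of the result are zero. */
uint32_t bitfield_extract_u(uint32_t value, int offset, int bits)
{
    assert(offset >= 0 && bits >= 0 && offset + bits <= 32);
    if (bits == 0)
        return 0;
    uint32_t mask = (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    return (value >> offset) & mask;
}

/* Signed variant: the upper bits replicate the top bit of the field. */
int32_t bitfield_extract_i(int32_t value, int offset, int bits)
{
    uint32_t field = bitfield_extract_u((uint32_t)value, offset, bits);
    if (bits == 0 || bits == 32)
        return (int32_t)field;
    uint32_t sign = 1u << (bits - 1);
    /* (field ^ sign) - sign sign-extends the field; the final cast assumes
     * the usual two's-complement conversion. */
    return (int32_t)((field ^ sign) - sign);
}
```

    For instance, extracting 4 bits at offset 8 from 0x00000F00 yields 15
    with the unsigned variant and -1 with the signed variant.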
    346 
    347     The function bitfieldInsert() inserts the <bits> least significant bits of
    348     each component of <insert> into the corresponding component of <base>.
    349     The result will have bits numbered <offset> through <offset>+<bits>-1
    350     taken from bits 0 through <bits>-1 of <insert>, and all other bits taken
    351     directly from the corresponding bits of <base>.  If <bits> is zero, the
    352     result will simply be <base>.  The result will be undefined if <offset> or
    353     <bits> is negative, or if the sum of <offset> and <bits> is greater than
    354     the number of bits used to store the operand.  Note that for vector
    355     versions of bitfieldInsert(), a single pair of <offset> and <bits> values
    356     is shared for all components.
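
    The insertion can be modeled in C along the same lines, again assuming
    32-bit operands and using a hypothetical function name.  The signed
    variant operates on the same bit pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Replace bits offset..offset+bits-1 of base with the low bits of insert. */
uint32_t bitfield_insert(uint32_t base, uint32_t insert, int offset, int bits)
{
    assert(offset >= 0 && bits >= 0 && offset + bits <= 32);
    if (bits == 0)
        return base;  /* nothing to insert */
    uint32_t low = (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    uint32_t mask = low << offset;
    return (base & ~mask) | ((insert << offset) & mask);
}
```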
    357 
    358     The function bitfieldReverse() reverses the bits of <value>.  The bit
    359     numbered <n> of the result will be taken from bit (<bits>-1)-<n> of
    360     <value>, where <bits> is the total number of bits used to represent
    361     <value>.
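
    A direct, unoptimized C rendering of this definition for a 32-bit
    component (the function name is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Bit n of the result is taken from bit 31-n of value. */
uint32_t bitfield_reverse(uint32_t value)
{
    uint32_t result = 0;
    for (int n = 0; n < 32; n++)
        result |= ((value >> (31 - n)) & 1u) << n;
    return result;
}
```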
    362 
    363     The function bitCount() returns the number of one bits in the binary
    364     representation of <value>.
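
    One simple way to model this in C for a 32-bit component is shown below.
    The function name is hypothetical; a signed input would be counted by
    its bit pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Count the one bits by repeatedly clearing the lowest set bit. */
int bit_count(uint32_t value)
{
    int count = 0;
    while (value != 0) {
        value &= value - 1;
        count++;
    }
    return count;
}
```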
    365 
    366     The function findLSB() returns the bit number of the least significant one
    367     bit in the binary representation of <value>.  If <value> is zero, -1 will
    368     be returned.
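
    A C sketch of the same behavior for a 32-bit component (hypothetical
    function name):

```c
#include <assert.h>
#include <stdint.h>

/* Bit number of the lowest one bit, or -1 when no bit is set. */
int find_lsb(uint32_t value)
{
    if (value == 0)
        return -1;
    int n = 0;
    while ((value & 1u) == 0) {
        value >>= 1;
        n++;
    }
    return n;
}
```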
    369 
    370     The function findMSB() returns the bit number of the most significant bit
    371     in the binary representation of <value>.  For positive integers, the
    372     result will be the bit number of the most significant one bit.  For
    373     negative integers, the result will be the bit number of the most
    374     significant zero bit.  For a <value> of zero or negative one, -1 will be
    375     returned.
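
    The signed and unsigned cases can be modeled in C as follows, assuming
    32-bit components.  The function names are hypothetical.  The signed
    variant inverts negative inputs so that the search for the most
    significant zero bit becomes a search for a one bit.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned: bit number of the highest one bit, or -1 for zero. */
int find_msb_u(uint32_t value)
{
    int n = -1;
    while (value != 0) {
        value >>= 1;
        n++;
    }
    return n;
}

/* Signed: highest one bit if non-negative, highest zero bit if negative. */
int find_msb_i(int32_t value)
{
    uint32_t bits = (uint32_t)value;
    return find_msb_u(value < 0 ? ~bits : bits);
}
```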
    376 
    377 
    378     (support for unsigned integer add/subtract with carry-out)
    379 
    380     Syntax:
    381 
    382       genUType uaddCarry(genUType x, genUType y, out genUType carry);
    383       genUType usubBorrow(genUType x, genUType y, out genUType borrow);
    384 
    385     The function uaddCarry() adds 32-bit unsigned integers or vectors <x> and
    386     <y>, returning the sum modulo 2^32.  The value <carry> is set to zero if
    387     the sum was less than 2^32, or one otherwise.
    388 
    389     The function usubBorrow() subtracts the 32-bit unsigned integer or vector
    390     <y> from <x>, returning the difference if non-negative, or 2^32 plus the
    391     difference otherwise.  The value <borrow> is set to zero if x >= y, or
    392     one otherwise.
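
    For one component, both functions map directly onto wrapping 32-bit
    unsigned arithmetic.  The C sketch below uses hypothetical function
    names and returns the extra output through a pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Sum modulo 2^32; *carry is 1 when the true sum does not fit in 32 bits. */
uint32_t uadd_carry(uint32_t x, uint32_t y, uint32_t *carry)
{
    uint32_t sum = x + y;  /* unsigned addition wraps modulo 2^32 */
    *carry = (sum < x) ? 1u : 0u;
    return sum;
}

/* Difference modulo 2^32; *borrow is 1 when x < y. */
uint32_t usub_borrow(uint32_t x, uint32_t y, uint32_t *borrow)
{
    *borrow = (x >= y) ? 0u : 1u;
    return x - y;  /* wraps to 2^32 + (x - y) when x < y */
}
```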
    393 
    394 
    395     (support for signed and unsigned multiplies, with 32-bit inputs and a
    396      64-bit result spanning two 32-bit outputs)
    397 
    398     Syntax:
    399 
    400       void umulExtended(genUType x, genUType y, out genUType msb, 
    401                         out genUType lsb);
    402       void imulExtended(genIType x, genIType y, out genIType msb,
    403                         out genIType lsb);
    404 
    405     The functions umulExtended() and imulExtended() multiply 32-bit unsigned
    406     or signed integers or vectors <x> and <y>, producing a 64-bit result.  The
    407     32 least significant bits are returned in <lsb>; the 32 most significant
    408     bits are returned in <msb>.
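
    Where a 64-bit integer type is available, the per-component result can
    be modeled in C by widening the operands first.  The function names are
    hypothetical, and the casts back to int32_t assume the usual
    two's-complement conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 32x32 -> 64-bit multiply, split into two 32-bit halves. */
void umul_extended(uint32_t x, uint32_t y, uint32_t *msb, uint32_t *lsb)
{
    uint64_t product = (uint64_t)x * (uint64_t)y;
    *msb = (uint32_t)(product >> 32);
    *lsb = (uint32_t)product;
}

/* Signed 32x32 -> 64-bit multiply, split into two 32-bit halves. */
void imul_extended(int32_t x, int32_t y, int32_t *msb, int32_t *lsb)
{
    int64_t product = (int64_t)x * (int64_t)y;
    uint64_t bits = (uint64_t)product;  /* split the raw bit pattern */
    *msb = (int32_t)(uint32_t)(bits >> 32);
    *lsb = (int32_t)(uint32_t)bits;
}
```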
    409 
    410 
    411 GLX Protocol
    412 
    413     None.
    414 
    415 Dependencies on ARB_gpu_shader_fp64
    416 
    417     This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
    418     of implicit conversions supported in the OpenGL Shading Language.  If more
    419     than one of these extensions is supported, an expression of one type may
    420     be converted to another type if that conversion is allowed by any of these
    421     specifications.
    422 
    423     If ARB_gpu_shader_fp64 or a similar extension introducing new data types
    424     is not supported, the function overloading rule in the GLSL specification
    425     preferring promotion of an input parameter from a smaller type to a larger
    426     type is never applicable, as all data types are of the same size.  That
    427     rule and the example referring to "double" should be removed.
    428 
    429 
    430 Dependencies on NV_gpu_shader5
    431 
    432     This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
    433     of implicit conversions supported in the OpenGL Shading Language.  If more
    434     than one of these extensions is supported, an expression of one type may
    435     be converted to another type if that conversion is allowed by any of these
    436     specifications.
    437 
    438     If NV_gpu_shader5 is supported, integer data types are supported with four
    439     different precisions (8-, 16-, 32-, and 64-bit) and floating-point data
    440     types are supported with three different precisions (16-, 32-, and
    441     64-bit).  The extension adds the following rule for output parameters,
    442     which is similar to the one present in this extension for input
    443     parameters:
    444 
    445        5. If the formal parameters in both matches are output parameters, a
    446           conversion from a type with a larger number of bits per component is
    447           better than a conversion from a type with a smaller number of bits
    448           per component.  For example, a conversion from an "int16_t" formal
    449           parameter type to "int"  is better than one from an "int8_t" formal
    450           parameter type to "int".
    451 
    452     Such a rule is not provided in this extension because there is no
    453     combination of types in this extension and ARB_gpu_shader_fp64 where this
    454     rule has any effect.
    455 
    456 
    457 Errors
    458 
    459     None
    460 
    461 
    462 New State
    463 
    464     None
    465 
    466 New Implementation Dependent State
    467 
    468     None
    469 
    470 Issues
    471 
    472     (1) What should this extension be called?
    473 
    474       UNRESOLVED.  This extension borrows from GL_ARB_gpu_shader5, so creating
    475       some sort of a play on that name would be viable.  However, nothing in
    476       this extension should require SM5 hardware, so such a name would be a
    477       little misleading and weird.
    478 
    479       Since the primary purpose is to add integer related functions from
    480       GL_ARB_gpu_shader5, call this extension GL_MESA_shader_integer_functions
    481       for now.
    482 
    483     (2) Why is some of the formatting in this extension weird?
    484 
    485       RESOLVED: This extension is formatted to minimize the differences (as
    486       reported by 'diff --side-by-side -W180') with the GL_ARB_gpu_shader5
    487       specification.
    488 
    489     (3) Should ldexp and frexp be included?
    490 
    491       RESOLVED: Yes.  Few GPUs have native instructions to implement these
    492       functions.  These are generally implemented using existing GLSL built-in
    493       functions and the other functions provided by this extension.
    494 
    495     (4) Should umulExtended and imulExtended be included?
    496 
    497       RESOLVED: Yes.  These functions should be implementable on any GPU that
    498       can support the rest of this extension, but the implementation may be
    499       complex.  The implementation on a GPU that only supports 32bit x 32bit =
    500       32bit multiplication would be quite expensive.  However, many GPUs
    501       (including OpenGL 4.0 GPUs that already support this function) have a
    502       32bit x 16bit = 48bit multiplier.  The implementation there is only
    503       trivially more expensive than regular 32bit multiplication.
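
      To illustrate the cost on hardware limited to 32bit x 32bit = 32bit
      multiplication, the C sketch below builds the unsigned 64-bit product
      from four 16bit x 16bit partial products.  It is one textbook
      decomposition with a hypothetical function name, not a description of
      what any particular driver emits.

```c
#include <assert.h>
#include <stdint.h>

/* umulExtended-style multiply using only 32-bit operations. */
void umul_extended_32bit_only(uint32_t x, uint32_t y,
                              uint32_t *msb, uint32_t *lsb)
{
    uint32_t x_lo = x & 0xFFFFu, x_hi = x >> 16;
    uint32_t y_lo = y & 0xFFFFu, y_hi = y >> 16;

    /* Each 16x16 partial product fits in 32 bits. */
    uint32_t ll = x_lo * y_lo;
    uint32_t lh = x_lo * y_hi;
    uint32_t hl = x_hi * y_lo;
    uint32_t hh = x_hi * y_hi;

    /* Middle column (bits 16..31 of the product) plus its carry. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    *lsb = (ll & 0xFFFFu) | (mid << 16);
    *msb = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
}
```

      This takes four multiplies plus a handful of shifts, masks, and adds,
      compared with a single multiply for the ordinary 32-bit product.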
    504 
    505     (5) Should the pack and unpack functions be included?
    506 
    507       RESOLVED: No.  These functions are already available via
    508       GL_ARB_shading_language_packing.
    509 
    510     (6) Should the "BitsTo" functions be included?
    511 
    512       RESOLVED: No.  These functions are already available via
    513       GL_ARB_shader_bit_encoding.
    514 
    515 Revision History
    516 
    517     Rev.      Date     Author    Changes
    518     ----  -----------  --------  -----------------------------------------
    519      2     7-Jul-2016  idr       Fix typo in #extension line
    520      1    20-Jun-2016  idr       Initial version based on GL_ARB_gpu_shader5.
    521