MESA_shader_integer_functions

GL_MESA_shader_integer_functions

Ian Romanick <ian.d.romanick@intel.com>

All the contributors of GL_ARB_gpu_shader5

Supported by all GLSL 1.30 capable drivers in Mesa 12.1 and later

Version 3, March 31, 2017

This extension is written against the OpenGL 3.2 (Compatibility Profile)
Specification.

This extension is written against Version 1.50 (Revision 09) of the OpenGL
Shading Language Specification.

GLSL 1.30 (OpenGL) or GLSL ES 3.00 (OpenGL ES) is required.

This extension interacts with ARB_gpu_shader5.

This extension interacts with ARB_gpu_shader_fp64.

This extension interacts with NV_gpu_shader5.

GL_ARB_gpu_shader5 extends GLSL in a number of useful ways. Much of this
added functionality requires significant hardware support. There are many
aspects, however, that can be easily implemented on any GPU with "real"
integer support (as opposed to simulating integers using floating point).

This extension provides a set of new features to the OpenGL Shading
Language to support capabilities of these GPUs, extending the
capabilities of version 1.30 of the OpenGL Shading Language and version
3.00 of the OpenGL ES Shading Language. Shaders using the new
functionality provided by this extension should enable this
functionality via the construct

    #extension GL_MESA_shader_integer_functions : require (or enable)

This extension provides a variety of new features for all shader types,
including:

* support for implicitly converting signed integer types to unsigned
  types, as well as more general implicit conversion and function
  overloading infrastructure to support new data types introduced by
  other extensions;

* new built-in functions supporting:

  * splitting a floating-point number into a significand and exponent
    (frexp), or building a floating-point number from a significand and
    exponent (ldexp);

  * integer bitfield manipulation, including functions to find the
    position of the most or least significant set bit, count the number
    of one bits, and bitfield insertion, extraction, and reversal;

  * extended integer precision math, including add with carry, subtract
    with borrow, and extended multiplication.

The resulting extension is a strict subset of GL_ARB_gpu_shader5.

New Procedures and Functions

Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification

Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification

Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
(Per-Fragment Operations and the Frame Buffer)

Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification

Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
(State and State Requests)

Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
Specification (Invariance)

Additions to the AGL/GLX/WGL Specifications

Modifications to The OpenGL Shading Language Specification, Version 1.50

Including the following line in a shader can be used to control the
language features described in this extension:

    #extension GL_MESA_shader_integer_functions : <behavior>

where <behavior> is as specified in section 3.3.

New preprocessor #defines are added to the OpenGL Shading Language:

    #define GL_MESA_shader_integer_functions 1

Modify Section 4.1.10, Implicit Conversions, p. 27

(modify table of implicit conversions)

    Type of expression       converted to
    ---------------------    -----------------

(modify second paragraph of the section) No implicit conversions are
provided to convert from unsigned to signed integer types or from
floating-point to integer types. There are no implicit array or structure
conversions.

(insert before the final paragraph of the section) When performing
implicit conversion for binary operators, there may be multiple data types
to which the two operands can be converted. For example, when adding an
int value to a uint value, both values can be implicitly converted to uint
and float. In such cases, a floating-point type is chosen if either
operand has a floating-point type. Otherwise, an unsigned integer type is
chosen if either operand has an unsigned integer type. Otherwise, a
signed integer type is chosen.

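The selection rule for binary operators can be sketched in C as follows. This is only an illustrative model of the three scalar base types; the enum and the function name binary_operand_type are hypothetical and are not part of GLSL or of this extension.

```c
/* Hypothetical tags for the base types the rule talks about. */
typedef enum { BASE_INT, BASE_UINT, BASE_FLOAT } base_type;

/* Picks the common type for a binary operator whose operands have the
 * given base types: float wins over uint, and uint wins over int. */
base_type binary_operand_type(base_type a, base_type b)
{
    if (a == BASE_FLOAT || b == BASE_FLOAT)
        return BASE_FLOAT;
    if (a == BASE_UINT || b == BASE_UINT)
        return BASE_UINT;
    return BASE_INT;
}
```

Under this model, adding an int value to a uint value resolves to uint: binary_operand_type(BASE_INT, BASE_UINT) yields BASE_UINT.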
Modify Section 5.9, Expressions, p. 57

(modify bulleted list as follows, adding support for implicit conversion
between signed and unsigned types)

Expressions in the shading language are built from the following:

* Constants of type bool, int, int64_t, uint, uint64_t, float, all vector
  types, and all matrix types.

* The operator modulus (%) operates on signed or unsigned integer scalars
  or vectors. If the fundamental types of the operands do not match, the
  conversions from Section 4.1.10 "Implicit Conversions" are applied to
  produce matching types. ...

Modify Section 6.1, Function Definitions, p. 63

(modify description of overloading, beginning at the top of p. 64)

Function names can be overloaded. The same function name can be used for
multiple functions, as long as the parameter types differ. If a function
name is declared twice with the same parameter types, then the return
types and all qualifiers must also match, and it is the same function
being declared. For example,

    vec4 f(in vec4 x, out vec4 y);        // (A)
    vec4 f(in vec4 x, out uvec4 y);       // (B) okay, different argument type
    vec4 f(in ivec4 x, out uvec4 y);      // (C) okay, different argument type

    int  f(in vec4 x, out ivec4 y);       // error, only return type differs
    vec4 f(in vec4 x, in vec4 y);         // error, only qualifier differs
    vec4 f(const in vec4 x, out vec4 y);  // error, only qualifier differs

When function calls are resolved, an exact type match for all the
arguments is sought. If an exact match is found, all other functions are
ignored, and the exact match is used. If no exact match is found, then
the implicit conversions in Section 4.1.10 (Implicit Conversions) will be
applied to find a match. Mismatched types on input parameters (in or
inout or default) must have a conversion from the calling argument type
to the formal parameter type. Mismatched types on output parameters (out
or inout) must have a conversion from the formal parameter type to the
calling argument type.

If implicit conversions can be used to find more than one matching
function, a single best-matching function is sought. To determine a best
match, the conversions between calling argument and formal parameter
types are compared for each function argument and pair of matching
functions. After these comparisons are performed, each pair of matching
functions is compared. A function definition A is considered a better
match than function definition B if:

* for at least one function argument, the conversion for that argument
  in A is better than the corresponding conversion in B; and

* there is no function argument for which the conversion in B is better
  than the corresponding conversion in A.

If a single function definition is considered a better match than every
other matching function definition, it will be used. Otherwise, a
semantic error occurs and the shader will fail to compile.

To determine whether the conversion for a single argument in one match is
better than that for another match, the following rules are applied, in
order:

1. An exact match is better than a match involving any implicit
   conversion.

2. A match involving an implicit conversion from float to double is
   better than a match involving any other implicit conversion.

3. A match involving an implicit conversion from either int or uint to
   float is better than a match involving an implicit conversion from
   either int or uint to double.

If none of the rules above apply to a particular pair of conversions,
neither conversion is considered better than the other.

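The ranking rules and the "better match" test can be modeled in a few lines of C. This is a minimal sketch, not how any GLSL compiler is known to implement overload resolution: the type tags, the conversion struct, and both function names are hypothetical, only scalar base types are modeled, and every conversion passed in is assumed to be legal already.

```c
/* Hypothetical scalar type tags; T_DOUBLE is only meaningful when an
 * extension such as ARB_gpu_shader_fp64 is supported. */
typedef enum { T_INT, T_UINT, T_FLOAT, T_DOUBLE } scalar_type;

/* One argument conversion, assumed to be legal. */
typedef struct { scalar_type from, to; } conversion;

/* Returns 1 if conversion a is better than conversion b, else 0. */
int conversion_is_better(conversion a, conversion b)
{
    int a_exact = (a.from == a.to);
    int b_exact = (b.from == b.to);
    int a_f2d = (a.from == T_FLOAT && a.to == T_DOUBLE);
    int b_f2d = (b.from == T_FLOAT && b.to == T_DOUBLE);

    if (a_exact != b_exact)
        return a_exact;                 /* rule 1: exact match wins */
    if (a_exact)
        return 0;                       /* both exact: no winner */
    if (a_f2d != b_f2d)
        return a_f2d;                   /* rule 2: float -> double wins */
    if ((a.from == T_INT || a.from == T_UINT) && a.to == T_FLOAT &&
        (b.from == T_INT || b.from == T_UINT) && b.to == T_DOUBLE)
        return 1;                       /* rule 3: to float beats to double */
    return 0;                           /* no rule applies */
}

/* Returns 1 if candidate A (conversions a[0..n-1]) is a better match than
 * candidate B: better on at least one argument and worse on none. */
int match_is_better(const conversion *a, const conversion *b, int n)
{
    int better_somewhere = 0;
    for (int i = 0; i < n; i++) {
        if (conversion_is_better(b[i], a[i]))
            return 0;
        if (conversion_is_better(a[i], b[i]))
            better_somewhere = 1;
    }
    return better_somewhere;
}
```

If neither match_is_better(a, b, n) nor match_is_better(b, a, n) holds for two candidates, the call is ambiguous in this model, which corresponds to the compile error described above.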
For the function prototypes (A), (B), and (C) above, the following
examples show how the rules apply to different sets of calling argument
types:

    f(vec4, vec4);    // exact match of vec4 f(in vec4 x, out vec4 y)
    f(vec4, uvec4);   // exact match of vec4 f(in vec4 x, out uvec4 y)
    f(vec4, ivec4);   // matched to vec4 f(in vec4 x, out vec4 y)
                      // (C) not relevant, can't convert vec4 to
                      // ivec4. (A) better than (B) for 2nd
                      // argument (rule 2), same on first argument.
    f(ivec4, vec4);   // NOT matched. All three match by implicit
                      // conversion. (C) is better than (A) and (B)
                      // on the first argument. (A) is better than

Modify Section 8.3, Common Functions, p. 84

(add support for single-precision frexp and ldexp functions)

    genType frexp(genType x, out genIType exp);
    genType ldexp(genType x, in genIType exp);

The function frexp() splits each single-precision floating-point number in
<x> into a binary significand, a floating-point number in the range [0.5,
1.0), and an integral exponent of two, such that:

    x = significand * 2 ^ exponent

The significand is returned by the function; the exponent is returned in
the parameter <exp>. For a floating-point value of zero, the significand
and exponent are both zero. For a floating-point value that is an
infinity or is not a number, the results of frexp() are undefined.

If the input <x> is a vector, this operation is performed in a
component-wise manner; the value returned by the function and the value
written to <exp> are vectors with the same number of components as <x>.

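The per-component behavior of frexp() can be illustrated with a small C model that works directly on the bit pattern. This is a sketch under stated assumptions: the name ref_frexp is hypothetical, the host float is assumed to be IEEE 754 binary32, and only zero and normalized inputs are handled (denormals, infinities, and NaNs are out of scope here).

```c
#include <stdint.h>
#include <string.h>

/* Splits x into a significand with magnitude in [0.5, 1.0) and a power-of-two
 * exponent so that x == significand * 2^(*exp). Assumes IEEE 754
 * binary32 and a zero or normalized input. */
float ref_frexp(float x, int *exp)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);

    if ((bits & 0x7FFFFFFFu) == 0u) {   /* +0.0 or -0.0 */
        *exp = 0;
        return x;
    }

    /* A biased exponent e encodes 2^(e - 127) with a significand in
     * [1, 2). Rescaling the significand to [0.5, 1) raises the exponent
     * by one, hence the bias of 126 here. */
    *exp = (int)((bits >> 23) & 0xFFu) - 126;

    /* Keep sign and mantissa, force the exponent field to 126 (2^-1). */
    bits = (bits & 0x807FFFFFu) | (126u << 23);
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For example, ref_frexp(8.0f, &e) returns 0.5f with e set to 4, since 8 = 0.5 * 2^4.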
The function ldexp() builds a single-precision floating-point number from
each significand component in <x> and the corresponding integral exponent
of two in <exp>, returning:

    significand * 2 ^ exponent

If this product is too large to be represented as a single-precision
floating-point value, the result is considered undefined.

If the input <x> is a vector, this operation is performed in a
component-wise manner; the value passed in <exp> and returned by the
function are vectors with the same number of components as <x>.

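A matching C sketch of ldexp() for one component builds the scale factor 2^exp from its bit pattern and multiplies. The name ref_ldexp is hypothetical, IEEE 754 binary32 is assumed, and the exponent is assumed to lie in [-126, 127] so that 2^exp is itself a normalized float; a full implementation would need to handle a wider range.

```c
#include <stdint.h>
#include <string.h>

/* Returns x * 2^exp. Assumes IEEE 754 binary32 and -126 <= exp <= 127,
 * so the scale factor is representable as a normalized float. */
float ref_ldexp(float x, int exp)
{
    /* Exponent field = exp + bias (127), mantissa = 0, sign = 0. */
    uint32_t bits = (uint32_t)(exp + 127) << 23;
    float scale;
    memcpy(&scale, &bits, sizeof scale);
    return x * scale;
}
```

For example, ref_ldexp(0.5f, 4) returns 8.0f, which reverses the split 8 = 0.5 * 2^4.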
(add support for new integer built-in functions)

    genIType bitfieldExtract(genIType value, int offset, int bits);
    genUType bitfieldExtract(genUType value, int offset, int bits);

    genIType bitfieldInsert(genIType base, genIType insert, int offset,
                            int bits);
    genUType bitfieldInsert(genUType base, genUType insert, int offset,
                            int bits);

    genIType bitfieldReverse(genIType value);
    genUType bitfieldReverse(genUType value);

    genIType bitCount(genIType value);
    genIType bitCount(genUType value);

    genIType findLSB(genIType value);
    genIType findLSB(genUType value);

    genIType findMSB(genIType value);
    genIType findMSB(genUType value);

The function bitfieldExtract() extracts bits <offset> through
<offset>+<bits>-1 from each component in <value>, returning them in the
least significant bits of the corresponding component of the result. For
unsigned data types, the most significant bits of the result will be set
to zero. For signed data types, the most significant bits will be set to
the value of bit <offset>+<bits>-1. If <bits> is zero, the result will be
zero. The result will be undefined if <offset> or <bits> is negative, or
if the sum of <offset> and <bits> is greater than the number of bits used
to store the operand. Note that for vector versions of bitfieldExtract(),
a single pair of <offset> and <bits> values is shared for all components.

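The zero-extension and sign-extension behavior can be checked against a scalar C model. This is a sketch with hypothetical function names, covering only 32-bit scalars and assuming <offset> and <bits> are in the defined range (non-negative, with a sum of at most 32).

```c
#include <stdint.h>

/* Unsigned variant: upper bits of the result are zero. */
uint32_t ref_bitfield_extract_u(uint32_t value, int offset, int bits)
{
    if (bits == 0)
        return 0u;
    /* Avoid shifting by 32, which is undefined in C. */
    uint32_t mask = (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    return (value >> offset) & mask;
}

/* Signed variant: upper bits replicate bit offset+bits-1 of value. */
int32_t ref_bitfield_extract_i(int32_t value, int offset, int bits)
{
    if (bits == 0)
        return 0;
    uint32_t field = ref_bitfield_extract_u((uint32_t)value, offset, bits);
    uint32_t sign = 1u << (bits - 1);
    /* (field ^ sign) - sign sign-extends a bits-wide two's complement field. */
    return (int32_t)((field ^ sign) - sign);
}
```

For example, extracting 4 bits at offset 4 from 0xF0 gives 15 with the unsigned variant and -1 with the signed variant, because the top bit of the extracted field is set.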
The function bitfieldInsert() inserts the <bits> least significant bits of
each component of <insert> into the corresponding component of <base>.
The result will have bits numbered <offset> through <offset>+<bits>-1
taken from bits 0 through <bits>-1 of <insert>, and all other bits taken
directly from the corresponding bits of <base>. If <bits> is zero, the
result will simply be <base>. The result will be undefined if <offset> or
<bits> is negative, or if the sum of <offset> and <bits> is greater than
the number of bits used to store the operand. Note that for vector
versions of bitfieldInsert(), a single pair of <offset> and <bits> values
is shared for all components.

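A scalar C model of bitfieldInsert() makes the masking explicit. The function name is hypothetical, only the 32-bit unsigned case is shown, and <offset> and <bits> are assumed to be in the defined range.

```c
#include <stdint.h>

/* Replaces bits offset .. offset+bits-1 of base with the low bits of insert. */
uint32_t ref_bitfield_insert(uint32_t base, uint32_t insert,
                             int offset, int bits)
{
    if (bits == 0)
        return base;
    /* Avoid shifting by 32, which is undefined in C. */
    uint32_t mask = (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    mask <<= offset;
    return (base & ~mask) | ((insert << offset) & mask);
}
```

For example, inserting 8 zero bits at offset 8 into 0xFFFFFFFF yields 0xFFFF00FF. The signed overload would produce the same bit pattern, reinterpreted as a signed integer.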
The function bitfieldReverse() reverses the bits of <value>. The bit
numbered <n> of the result will be taken from bit (<bits>-1)-<n> of
<value>, where <bits> is the total number of bits used to represent
<value>.

The function bitCount() returns the number of one bits in the binary
representation of <value>.

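Both functions have straightforward scalar models in C. The names below are hypothetical and the sketch covers 32-bit values only; it favors clarity over speed, so it does not reflect how hardware or a driver would compute the result.

```c
#include <stdint.h>

/* Bit n of the result is taken from bit (32 - 1) - n of value. */
uint32_t ref_bitfield_reverse(uint32_t value)
{
    uint32_t result = 0u;
    for (int n = 0; n < 32; n++)
        result |= ((value >> (31 - n)) & 1u) << n;
    return result;
}

/* Counts the one bits; each iteration clears the lowest set bit. */
int ref_bit_count(uint32_t value)
{
    int count = 0;
    while (value != 0u) {
        value &= value - 1u;
        count++;
    }
    return count;
}
```

For signed inputs the same models apply to the two's complement bit pattern, so a signed -1 has a bit count of 32.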
The function findLSB() returns the bit number of the least significant one
bit in the binary representation of <value>. If <value> is zero, -1 will
be returned.

The function findMSB() returns the bit number of the most significant bit
in the binary representation of <value>. For positive integers, the
result will be the bit number of the most significant one bit. For
negative integers, the result will be the bit number of the most
significant zero bit. For a <value> of zero or negative one, -1 will be
returned.

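The signed case of findMSB() is the least obvious of these, so a scalar C model may help. The function names are hypothetical and only 32-bit scalars are covered. The signed variant relies on the observation that the most significant zero bit of a negative value is the most significant one bit of its bitwise complement.

```c
#include <stdint.h>

/* Bit number of the least significant one bit, or -1 for zero. */
int ref_find_lsb(uint32_t value)
{
    if (value == 0u)
        return -1;
    int n = 0;
    while ((value & 1u) == 0u) {
        value >>= 1;
        n++;
    }
    return n;
}

/* Bit number of the most significant one bit, or -1 for zero. */
int ref_find_msb_u(uint32_t value)
{
    int n = -1;
    while (value != 0u) {
        value >>= 1;
        n++;
    }
    return n;
}

/* Signed variant: for negative inputs, look for the most significant
 * zero bit by complementing first. Returns -1 for 0 and for -1. */
int ref_find_msb_i(int32_t value)
{
    uint32_t v = (uint32_t)value;
    return ref_find_msb_u(value < 0 ? ~v : v);
}
```

For example, ref_find_msb_i(-256) returns 7, because -256 is 0xFFFFFF00 and bit 7 is its highest zero bit.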
(support for unsigned integer add/subtract with carry-out)

    genUType uaddCarry(genUType x, genUType y, out genUType carry);
    genUType usubBorrow(genUType x, genUType y, out genUType borrow);

The function uaddCarry() adds 32-bit unsigned integers or vectors <x> and
<y>, returning the sum modulo 2^32. The value <carry> is set to zero if
the sum was less than 2^32, or one otherwise.

The function usubBorrow() subtracts the 32-bit unsigned integer or vector
<y> from <x>, returning the difference if non-negative or 2^32 plus the
difference, otherwise. The value <borrow> is set to zero if x >= y, or
one otherwise.

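In C, unsigned 32-bit arithmetic already wraps modulo 2^32, so both functions reduce to a wrapped operation plus a comparison. The sketch below uses hypothetical function names and models one scalar component.

```c
#include <stdint.h>

/* Returns (x + y) mod 2^32; *carry is 1 if the true sum needed a 33rd bit. */
uint32_t ref_uadd_carry(uint32_t x, uint32_t y, uint32_t *carry)
{
    uint32_t sum = x + y;               /* wraps modulo 2^32 */
    *carry = (sum < x) ? 1u : 0u;       /* wrapped iff result fell below x */
    return sum;
}

/* Returns x - y, or 2^32 + (x - y) when y > x; *borrow is 1 in that case. */
uint32_t ref_usub_borrow(uint32_t x, uint32_t y, uint32_t *borrow)
{
    *borrow = (x >= y) ? 0u : 1u;
    return x - y;                       /* wraps modulo 2^32 */
}
```

A typical use is multi-word arithmetic: the carry from adding the low words of two 64-bit values is added into the sum of the high words.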
(support for signed and unsigned multiplies, with 32-bit inputs and a
64-bit result spanning two 32-bit outputs)

    void umulExtended(genUType x, genUType y, out genUType msb,
                      out genUType lsb);
    void imulExtended(genIType x, genIType y, out genIType msb,
                      out genIType lsb);

The functions umulExtended() and imulExtended() multiply 32-bit unsigned
or signed integers or vectors <x> and <y>, producing a 64-bit result. The
32 least significant bits are returned in <lsb>; the 32 most significant
bits are returned in <msb>.

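On a host with 64-bit integers the expected results are easy to compute, which makes a C model useful for checking shader output. The function names are hypothetical, and the signed variant assumes the usual two's complement conversion when narrowing to int32_t.

```c
#include <stdint.h>

/* Unsigned 32 x 32 -> 64 multiply, split into high and low words. */
void ref_umul_extended(uint32_t x, uint32_t y, uint32_t *msb, uint32_t *lsb)
{
    uint64_t product = (uint64_t)x * (uint64_t)y;
    *msb = (uint32_t)(product >> 32);
    *lsb = (uint32_t)product;
}

/* Signed 32 x 32 -> 64 multiply, split into high and low words. */
void ref_imul_extended(int32_t x, int32_t y, int32_t *msb, int32_t *lsb)
{
    /* Split the two's complement bit pattern as an unsigned value to
     * avoid right-shifting a negative number. */
    uint64_t product = (uint64_t)((int64_t)x * (int64_t)y);
    *msb = (int32_t)(uint32_t)(product >> 32);
    *lsb = (int32_t)(uint32_t)product;
}
```

For example, multiplying 0xFFFFFFFF by itself gives <msb> = 0xFFFFFFFE and <lsb> = 1, while the signed product of -1 and 2 gives <msb> = -1 and <lsb> = -2.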
Dependencies on ARB_gpu_shader_fp64

This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
of implicit conversions supported in the OpenGL Shading Language. If more
than one of these extensions is supported, an expression of one type may
be converted to another type if that conversion is allowed by any of these
extensions.

If ARB_gpu_shader_fp64 or a similar extension introducing new data types
is not supported, the function overloading rule in the GLSL specification
preferring promotion of an input parameter from a smaller type to a larger
type is never applicable, as all data types are of the same size. That rule
and the example referring to "double" should be removed.

Dependencies on NV_gpu_shader5

This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
of implicit conversions supported in the OpenGL Shading Language. If more
than one of these extensions is supported, an expression of one type may
be converted to another type if that conversion is allowed by any of these
extensions.

If NV_gpu_shader5 is supported, integer data types are supported with four
different precisions (8-, 16-, 32-, and 64-bit) and floating-point data
types are supported with three different precisions (16-, 32-, and
64-bit). The extension adds the following rule for output parameters,
which is similar to the one present in this extension for input
parameters:

5. If the formal parameters in both matches are output parameters, a
   conversion from a type with a larger number of bits per component is
   better than a conversion from a type with a smaller number of bits
   per component. For example, a conversion from an "int16_t" formal
   parameter type to "int" is better than one from an "int8_t" formal
   parameter type to "int".

Such a rule is not provided in this extension because there is no
combination of types in this extension and ARB_gpu_shader_fp64 where this

New Implementation Dependent State

(1) What should this extension be called?

UNRESOLVED. This extension borrows from GL_ARB_gpu_shader5, so creating
some sort of a play on that name would be viable. However, nothing in
this extension should require SM5 hardware, so such a name would be a
little misleading and weird.

Since the primary purpose is to add integer-related functions from
GL_ARB_gpu_shader5, call this extension GL_MESA_shader_integer_functions.

(2) Why is some of the formatting in this extension weird?

RESOLVED: This extension is formatted to minimize the differences (as
reported by 'diff --side-by-side -W180') with the GL_ARB_gpu_shader5
specification.

(3) Should ldexp and frexp be included?

RESOLVED: Yes. Few GPUs have native instructions to implement these
functions. These are generally implemented using existing GLSL built-in
functions and the other functions provided by this extension.

496
(4) Should umulExtended and imulExtended be included?
498
RESOLVED: Yes. These functions should be implementable on any GPU that
499
can support the rest of this extension, but the implementation may be
500
complex. The implementation on a GPU that only supports 32bit x 32bit =
501
32bit multiplication would be quite expensive. However, many GPUs
502
(including OpenGL 4.0 GPUs that already support this function) have a
503
32bit x 16bit = 48bit multiplier. The implementation there is only
504
trivially more expensive than regular 32bit multiplication.
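To see why a 32 x 16 multiplier keeps the cost low, consider splitting <y> into two 16-bit halves so that the full product needs only two partial products, a shift, and an add. The C sketch below is an assumption about how such a decomposition could look, not a description of any actual driver or hardware; the function name is hypothetical and 64-bit host integers stand in for the 48-bit partial products.

```c
#include <stdint.h>

/* Computes the full 64-bit product x * y from two 32 x 16 partial
 * products: x * y = x * y_lo + ((x * y_hi) << 16). */
uint64_t mul32_from_32x16(uint32_t x, uint32_t y)
{
    uint64_t low_part  = (uint64_t)x * (y & 0xFFFFu); /* at most 48 bits */
    uint64_t high_part = (uint64_t)x * (y >> 16);     /* at most 48 bits */
    return low_part + (high_part << 16);
}
```

The upper and lower 32 bits of the returned value correspond to the <msb> and <lsb> outputs of umulExtended().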
(5) Should the pack and unpack functions be included?

RESOLVED: No. These functions are already available via
GL_ARB_shading_language_packing.

(6) Should the "BitsTo" functions be included?

RESOLVED: No. These functions are already available via
GL_ARB_shader_bit_encoding.

Rev.  Date         Author     Changes
----  -----------  ---------  -----------------------------------------
3     31-Mar-2017  Jon Leech  Add ES support (OpenGL-Registry/issues/3)
2     7-Jul-2016   idr        Fix typo in #extension line
1     20-Jun-2016  idr        Initial version based on GL_ARB_gpu_shader5.