Name
KHR_memory_scope_semantics
Name Strings
GL_KHR_memory_scope_semantics
Contact
Jeff Bolz (jbolz 'at' nvidia.com), NVIDIA
Contributors
Notice
Copyright (c) 2018 The Khronos Group Inc. Copyright terms at
http://www.khronos.org/registry/speccopyright.html
Status
Approved by Vulkan working group 10-Jul-2018.
Ratified by the Khronos Board of Promoters 24-Aug-2018.
Version
Last Modified Date: 13-Jun-2019
Revision: 2
Number
TBD.
Dependencies
This extension can be applied to OpenGL GLSL versions 4.20
(#version 420) and higher.
This extension can be applied to OpenGL ES ESSL versions 3.10
(#version 310) and higher.
This extension is written against revision 3 of the OpenGL Shading Language
version 4.60, dated July 23, 2017.
This extension interacts with GL_KHR_shader_atomic_int64.
Overview
This extension document modifies GLSL to add scopes and memory semantics
for atomic operations and barriers. For atomics, these are added as
optional parameters to the existing atomic built-in functions. For
barriers, two new built-in functions are added that correspond to SPIR-V's
OpMemoryBarrier and OpControlBarrier.
The meanings of scopes and semantics are defined in detail in the Vulkan
Memory Model section of the Vulkan specification, and those definitions are
not reproduced here. But in short:
* "Scope" can be used to limit the set of invocations that an atomic
operation is atomic with respect to, and to limit the set of invocations
that a barrier synchronizes with. On some implementations, a smaller
scope may be more efficient.
* "Storage class semantics" can be used to limit the set of memory
storage classes that are synchronized by a release or acquire
operation. On some implementations, synchronizing fewer storage
classes may be more efficient (e.g. synchronizing shared memory
is often cheaper than buffer or image memory).
* "Release/Acquire semantics" are used to guarantee ordering between
an atomic or barrier and other memory operations that occur before
or after it in program order, as observed by other invocations.
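As an informal illustration of the three ideas above, the following compute shader sketches a message-passing pattern between two workgroups. It is only a sketch under stated assumptions: the Mailbox block, its member names, and the binding number are hypothetical, the host is assumed to clear the buffer to zero before the dispatch, and the dispatch is assumed to contain at least two workgroups.

```glsl
#version 450
#extension GL_KHR_memory_scope_semantics : require

layout(local_size_x = 1) in;

// Hypothetical buffer, assumed to be zero-initialized by the host.
// devicecoherent gives plain reads and writes automatic visibility
// and availability operations at device scope.
layout(std430, binding = 0) devicecoherent buffer Mailbox {
    uint payload;   // data being published
    uint flag;      // 0 = not published, 1 = published
    uint observed;  // what the consumer saw, if anything
} mailbox;

void main()
{
    if (gl_WorkGroupID.x == 0u) {
        // Producer: write the payload, then publish it with a release store.
        mailbox.payload = 42u;
        atomicStore(mailbox.flag, 1u, gl_ScopeDevice,
                    gl_StorageSemanticsBuffer, gl_SemanticsRelease);
    } else if (gl_WorkGroupID.x == 1u) {
        // Consumer: a single acquire load, with no spinning.
        uint f = atomicLoad(mailbox.flag, gl_ScopeDevice,
                            gl_StorageSemanticsBuffer, gl_SemanticsAcquire);
        if (f == 1u) {
            mailbox.observed = mailbox.payload;
        }
    }
}
```

Under the Vulkan Memory Model, if the consumer's acquire load observes the value 1, it synchronizes with the producer's release store, so the following read of payload is expected to return 42. If it observes 0, nothing is guaranteed and the sketch does nothing. The scope is gl_ScopeDevice because the two invocations are in different workgroups; a smaller scope would not cover both.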
Mapping to SPIR-V
-----------------
For informational purposes (non-specification), the following is an
expected way for an implementation to map GLSL constructs to SPIR-V
constructs:
gl_Scope* values are equal to the corresponding SPIR-V Scope enums
and can simply be passed through
gl_StorageSemantics* and gl_Semantics* values should be bitwise
ORed together to generate the SPIR-V Semantics enums
atomicLoad -> OpAtomicLoad
atomicStore -> OpAtomicStore
controlBarrier -> OpControlBarrier
memoryBarrier -> OpMemoryBarrier
coherent -> equivalent to queuefamilycoherent
devicecoherent -> NonPrivate{Pointer,Texel}KHR + Make{Pointer,Texel}{Available,Visible}KHR with scope = Device
queuefamilycoherent -> NonPrivate{Pointer,Texel}KHR + Make{Pointer,Texel}{Available,Visible}KHR with scope = QueueFamilyKHR
workgroupcoherent -> NonPrivate{Pointer,Texel}KHR + Make{Pointer,Texel}{Available,Visible}KHR with scope = Workgroup
subgroupcoherent -> NonPrivate{Pointer,Texel}KHR + Make{Pointer,Texel}{Available,Visible}KHR with scope = Subgroup
nonprivate -> NonPrivate{Pointer,Texel}KHR
volatile -> Volatile memory operand (for buffers) or VolatileTexelKHR image operand
(for image access) or Volatile memory semantic (for atomics)
Modifications to the OpenGL Shading Language Specification, Version 4.60
Including the following line in a shader can be used to control the
language features described in this extension:
#extension GL_KHR_memory_scope_semantics : <behavior>
where <behavior> is as specified in section 3.3.
A new preprocessor #define is added:
#define GL_KHR_memory_scope_semantics 1
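A minimal sketch of how a shader might use the directive and the preprocessor define together. The Counter block and its binding are hypothetical. The sketch assumes the usual section 3.3 behavior of "enable", where the define is only present when the extension is supported, so the shader can fall back to the legacy overload.

```glsl
#version 450
#extension GL_KHR_memory_scope_semantics : enable

layout(local_size_x = 64) in;

// Hypothetical counter buffer.
layout(std430, binding = 0) buffer Counter {
    uint count;
} counter;

void main()
{
#ifdef GL_KHR_memory_scope_semantics
    // Extension available: state the scope and semantics explicitly.
    atomicAdd(counter.count, 1u, gl_ScopeDevice,
              gl_StorageSemanticsNone, gl_SemanticsRelaxed);
#else
    // Fallback: the legacy two-argument overload.
    atomicAdd(counter.count, 1u);
#endif
}
```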
Additions to Chapter 4 of the OpenGL Shading Language Specification
(Variables and Types)
Modify Section 4.3.8, Shared Variables
Replace "Shared variables are implicitly coherent" with
"Shared variables are implicitly workgroupcoherent".
Modify Section 4.10, Memory Qualifiers
(Add new rows to the table, and update the row for "coherent")
coherent: alias of queuefamilycoherent
devicecoherent: memory variable where writes have automatic
availability operations and reads have automatic visibility
operations to the shader domain
queuefamilycoherent: memory variable where writes have automatic
availability operations and reads have automatic visibility
operations to the queue family instance domain
workgroupcoherent: memory variable where writes have automatic
availability operations and reads have automatic visibility
operations to the workgroup instance domain
subgroupcoherent: memory variable where writes have automatic
availability operations and reads have automatic visibility
operations to the subgroup instance domain
nonprivate: memory variable that is not coherent, but still obeys
inter-invocation ordering requirements
(Replace the two paragraphs about "coherent")
Memory accesses to image variables declared using the devicecoherent
qualifier have implicit availability or visibility operations which allow
them to be used to share data with other shader invocations. In
particular, when reading a variable declared as devicecoherent, the
values returned will reflect the results of previously completed writes
performed by other shader invocations. When writing a variable declared
as devicecoherent, the values written will be reflected in subsequent
devicecoherent reads performed by other shader invocations.
Similarly, variables declared using the queuefamilycoherent qualifier
have implicit availability or visibility operations which allow them to
be used to share data with other shader invocations in the same
queue family. However, relative to accesses via invocations in other
queue families, such variables behave the same as unqualified (non-coherent)
variables.
Similarly, variables declared using the workgroupcoherent qualifier
have implicit availability or visibility operations which allow them to
be used to share data with other shader invocations in the same
workgroup. However, relative to accesses via invocations in other
workgroups, such variables behave the same as unqualified (non-coherent)
variables.
Similarly, variables declared using the subgroupcoherent qualifier
have implicit availability or visibility operations which allow them to
be used to share data with other shader invocations in the same
subgroup. However, relative to accesses via invocations in other
subgroups, such variables behave the same as unqualified (non-coherent)
variables.
It is a compile-time error to decorate a variable with more than one of
devicecoherent, queuefamilycoherent, workgroupcoherent, and
subgroupcoherent.
As described in section 7.11 "Shader Memory Access" of
the OpenGL Specification, shader memory reads and writes complete in a
largely undefined order. The built-in function memoryBarrier() can be used
if needed to guarantee the completion and relative ordering of memory
accesses performed by a single shader invocation.
When accessing memory using variables not declared as devicecoherent,
queuefamilycoherent, workgroupcoherent, or subgroupcoherent, the
memory accessed by a shader may be cached by the implementation to service
future accesses to the same address. Memory stores may be cached in such a
way that the values written might not be visible to other shader
invocations accessing the same memory. The implementation may cache the
values fetched by memory reads and return the same values to any shader
invocation accessing the same memory, even if the underlying memory has
been modified since the first memory read. While variables not declared as
coherent might not be useful for communicating between shader invocations,
using non-coherent accesses may result in higher performance.
Variables declared using the nonprivate qualifier obey the ordering
requirements defined by barriers and atomics. devicecoherent,
queuefamilycoherent, workgroupcoherent, and subgroupcoherent
variables are all implicitly nonprivate. Accesses to noncoherent private
variables can be reordered across barriers and atomics. nonprivate
noncoherent variables are primarily useful to avoid write-after-read
hazards, where a non-private read must occur before a barrier.
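The qualifiers described above could be used in declarations as in the following sketch. The block, member, and image names and the binding numbers are hypothetical and chosen only for illustration.

```glsl
#version 450
#extension GL_KHR_memory_scope_semantics : require

layout(local_size_x = 64) in;

// Hypothetical storage block showing one member per qualifier.
layout(std430, binding = 0) buffer Data {
    devicecoherent    uint deviceShared[64];    // shared across the device
    workgroupcoherent uint groupShared[64];     // shared within a workgroup
    subgroupcoherent  uint subgroupShared[64];  // shared within a subgroup
    nonprivate        uint ordered[64];         // not coherent, but ordered by barriers/atomics
    uint                   scratch[64];         // unqualified: private, may be cached
    // devicecoherent workgroupcoherent uint bad;  // compile-time error: two scope qualifiers
} data;

// Images accept the same qualifiers.
layout(binding = 1, r32ui) queuefamilycoherent uniform uimage2D img;

void main()
{
    uint i = gl_LocalInvocationIndex;
    data.groupShared[i] = i;            // write with workgroup-scope availability
    data.scratch[i] = data.ordered[i];  // nonprivate read, private write
    imageStore(img, ivec2(int(i), 0), uvec4(i));
}
```

Choosing the narrowest qualifier that still covers every invocation sharing the data is the intended way to benefit from the performance note above; the sketch does not measure or guarantee any such benefit.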
...
The memory qualifiers subgroupcoherent, workgroupcoherent,
queuefamilycoherent, devicecoherent, volatile, restrict, readonly,
writeonly, and nonprivate may be used in the declaration of buffer
variables (i.e., members of shader storage blocks). ...
...
Variables qualified with subgroupcoherent, workgroupcoherent,
queuefamilycoherent, devicecoherent, volatile, readonly, writeonly,
or nonprivate may not be passed to functions whose formal parameters
lack such qualifiers. ...
(Add to the end of the section)
Sampler variables and uniform blocks and block members can be decorated
with nonprivate, which causes them to obey inter-invocation ordering
requirements. It is a compile-time error to use any other memory
qualifiers on sampler variables or uniform blocks or block members.
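A short sketch of nonprivate on a sampler and on a uniform block. All names and bindings are hypothetical, and the texture is assumed to be at least as large as the destination image.

```glsl
#version 450
#extension GL_KHR_memory_scope_semantics : require

layout(local_size_x = 8, local_size_y = 8) in;

// nonprivate is the only memory qualifier permitted on these declarations.
layout(binding = 0) nonprivate uniform sampler2D tex;
layout(binding = 1) nonprivate uniform Params {
    vec4 tint;
} params;

layout(binding = 2, rgba8) writeonly uniform image2D dst;

void main()
{
    ivec2 p = ivec2(gl_GlobalInvocationID.xy);
    if (any(greaterThanEqual(p, imageSize(dst)))) {
        return;  // outside the destination image
    }
    vec4 c = texelFetch(tex, p, 0) * params.tint;
    imageStore(dst, p, c);
}
```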
Additions to Chapter 8 of the OpenGL Shading Language Specification
(Built-in Functions)
Modify Section 8.11, Atomic Memory Functions
(Add devicecoherent, queuefamilycoherent, workgroupcoherent,
subgroupcoherent, nonprivate)
All the built-in functions in this section accept arguments with
combinations of restrict, subgroupcoherent, workgroupcoherent,
queuefamilycoherent, devicecoherent, volatile, and nonprivate
memory qualification, despite not having them listed in the prototypes.
(Add new variants of atomic built-in functions with additional
scope/semantics parameters)
uint atomicAdd (inout uint mem, uint data, int scope, int storage, int sem)
int atomicAdd (inout int mem, int data, int scope, int storage, int sem)
uint atomicMin (inout uint mem, uint data, int scope, int storage, int sem)
int atomicMin (inout int mem, int data, int scope, int storage, int sem)
uint atomicMax (inout uint mem, uint data, int scope, int storage, int sem)
int atomicMax (inout int mem, int data, int scope, int storage, int sem)
uint atomicAnd (inout uint mem, uint data, int scope, int storage, int sem)
int atomicAnd (inout int mem, int data, int scope, int storage, int sem)
uint atomicOr (inout uint mem, uint data, int scope, int storage, int sem)
int atomicOr (inout int mem, int data, int scope, int storage, int sem)
uint atomicXor (inout uint mem, uint data, int scope, int storage, int sem)
int atomicXor (inout int mem, int data, int scope, int storage, int sem)
uint atomicExchange (inout uint mem, uint data, int scope, int storage, int sem)
int atomicExchange (inout int mem, int data, int scope, int storage, int sem)
uint atomicCompSwap (inout uint mem, uint compare, uint data, int scope,
int storageEqual, int semEqual,
int storageUnequal, int semUnequal)
int atomicCompSwap (inout int mem, int compare, int data, int scope,
int storageEqual, int semEqual,
int storageUnequal, int semUnequal)
uint64_t atomicAdd (inout uint64_t mem, uint64_t data, int scope, int storage, int sem)
int64_t atomicAdd (inout int64_t mem, int64_t data, int scope, int storage, int sem)
uint64_t atomicMin (inout uint64_t mem, uint64_t data, int scope, int storage, int sem)
int64_t atomicMin (inout int64_t mem, int64_t data, int scope, int storage, int sem)
uint64_t atomicMax (inout uint64_t mem, uint64_t data, int scope, int storage, int sem)
int64_t atomicMax (inout int64_t mem, int64_t data, int scope, int storage, int sem)
uint64_t atomicAnd (inout uint64_t mem, uint64_t data, int scope, int storage, int sem)
int64_t atomicAnd (inout int64_t mem, int64_t data, int scope, int storage, int sem)
uint64_t atomicOr (inout uint64_t mem, uint64_t data, int scope, int storage, int sem)
int64_t atomicOr (inout int64_t mem, int64_t data, int scope, int storage, int sem)
uint64_t atomicXor (inout uint64_t mem, uint64_t data, int scope, int storage, int sem)
int64_t atomicXor (inout int64_t mem, int64_t data, int scope, int storage, int sem)
uint64_t atomicExchange (inout uint64_t mem, uint64_t data, int scope, int storage, int sem)
int64_t atomicExchange (inout int64_t mem, int64_t data, int scope, int storage, int sem)
uint64_t atomicCompSwap (inout uint64_t mem, uint64_t compare, uint64_t data, int scope,
int storageEqual, int semEqual,
int storageUnequal, int semUnequal)
int64_t atomicCompSwap (inout int64_t mem, int64_t compare, int64_t data, int scope,
int storageEqual, int semEqual,
int storageUnequal, int semUnequal)
(Add new built-in functions)
// Atomically loads the value from <mem> and returns it
uint atomicLoad (in uint mem, int scope, int storage, int sem)
int atomicLoad (in int mem, int scope, int storage, int sem)
uint64_t atomicLoad (in uint64_t mem, int scope, int storage, int sem)
int64_t atomicLoad (in int64_t mem, int scope, int storage, int sem)
// Atomically stores the value of <data> to <mem>
void atomicStore (out uint mem, uint data, int scope, int storage, int sem)
void atomicStore (out int mem, int data, int scope, int storage, int sem)
void atomicStore (out uint64_t mem, uint64_t data, int scope, int storage, int sem)
void atomicStore (out int64_t mem, int64_t data, int scope, int storage, int sem)
The values passed as scope, storage, and sem parameters must all be
integer constant expressions. Valid values are listed in the Scope and
Semantics section. scope must be a gl_Scope* value, sem* must be a
gl_Semantics* value, and storage* must be a combination of
gl_StorageSemantics* values.
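The following sketch shows how the new overloads might be called, including the two storage/semantics pairs taken by atomicCompSwap. The State block and its members are hypothetical, and the host is assumed to clear the buffer to zero before the dispatch.

```glsl
#version 450
#extension GL_KHR_memory_scope_semantics : require

layout(local_size_x = 64) in;

// Hypothetical buffer, assumed zero-initialized by the host.
layout(std430, binding = 0) devicecoherent buffer State {
    uint owner;       // 0 = unclaimed, otherwise the claiming invocation's ID + 1
    uint hits;        // number of invocations that ran
    uint ownerData;   // written only by the invocation that won the claim
} state;

void main()
{
    // A plain counter needs atomicity only, so relaxed semantics with
    // no storage classes is enough.
    atomicAdd(state.hits, 1u, gl_ScopeDevice,
              gl_StorageSemanticsNone, gl_SemanticsRelaxed);

    // Try to claim ownership. The "equal" pair applies when the swap
    // succeeds, the "unequal" pair when the comparison fails.
    uint me = gl_GlobalInvocationID.x + 1u;
    uint prev = atomicCompSwap(state.owner, 0u, me, gl_ScopeDevice,
                               gl_StorageSemanticsBuffer, gl_SemanticsAcquireRelease,
                               gl_StorageSemanticsBuffer, gl_SemanticsAcquire);
    if (prev == 0u) {
        state.ownerData = me;  // only one invocation reaches this line
    }
}
```

All scope, storage, and semantics arguments are built from the gl_* constants, so they are integer constant expressions as required. The unequal semantics uses acquire only, since a failed comparison performs no write.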
(Add a new subsection to the end of this section)
Scope and Semantics
gl_Scope*, gl_Semantics*, and gl_StorageSemantics* are constant integer
values which can be used for the scope, storage, and sem parameters to
atomic built-in functions. For scope and semantics, only the listed
values are valid. For storage semantics, any bitwise combination of the
listed values is valid.
const int gl_ScopeDevice = 1;
const int gl_ScopeWorkgroup = 2;
const int gl_ScopeSubgroup = 3;
const int gl_ScopeInvocation = 4;
const int gl_ScopeQueueFamily = 5;
const int gl_SemanticsRelaxed = 0x0;
const int gl_SemanticsAcquire = 0x2;
const int gl_SemanticsRelease = 0x4;
const int gl_SemanticsAcquireRelease = 0x8;
const int gl_SemanticsMakeAvailable = 0x2000;
const int gl_SemanticsMakeVisible = 0x4000;
const int gl_SemanticsVolatile = 0x8000;
const int gl_StorageSemanticsNone = 0x0;
const int gl_StorageSemanticsBuffer = 0x40;
const int gl_StorageSemanticsShared = 0x100;
const int gl_StorageSemanticsImage = 0x800;
const int gl_StorageSemanticsOutput = 0x1000;
The meaning of these values is defined in the Vulkan Memory Model.
The following error checks are applied to commands that accept these
values. Each results in a compile-time error.
* gl_SemanticsAcquire must not be used with atomicStore or
imageAtomicStore.
* gl_SemanticsRelease must not be used with atomicLoad or
imageAtomicLoad.
* gl_SemanticsAcquireRelease must not be used with atomicLoad,
imageAtomicLoad, atomicStore, or imageAtomicStore.
* Semantics operands must only have gl_Semantics* bits set.
* Storage class semantics operands must only have
gl_StorageSemantics* bits set.
* Semantics must not include more than one of gl_SemanticsRelease,
gl_SemanticsAcquire, or gl_SemanticsAcquireRelease.
* memoryBarrier must use exactly one of gl_SemanticsRelease,
gl_SemanticsAcquire, or gl_SemanticsAcquireRelease.
* memoryBarrier must not use storage class semantics of zero.
* If controlBarrier uses non-zero semantics, then it must not use
storage class semantics of zero.
* atomicCompSwap and imageAtomicCompSwap semUnequal must not use
gl_SemanticsRelease or gl_SemanticsAcquireRelease.
* If semantics includes gl_SemanticsMakeAvailable it must also
include gl_SemanticsRelease or gl_SemanticsAcquireRelease.
* If semantics includes gl_SemanticsMakeVisible it must also
include gl_SemanticsAcquire or gl_SemanticsAcquireRelease.
* gl_SemanticsVolatile must not be used with memoryBarrier or
controlBarrier.
* atomicCompSwap and imageAtomicCompSwap must either include
gl_SemanticsVolatile in both semEqual and semUnequal or in
neither.
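The checks above can be illustrated with a few calls. This is a sketch based on a reading of the list: the block B is hypothetical, and the calls that are expected to be rejected are left in comments so that the shader itself remains valid.

```glsl
#version 450
#extension GL_KHR_memory_scope_semantics : require

layout(local_size_x = 64) in;

// Hypothetical buffer used only to have something to operate on.
layout(std430, binding = 0) devicecoherent buffer B {
    uint word;
} b;

void main()
{
    // Expected to be accepted.
    uint v = atomicLoad(b.word, gl_ScopeDevice,
                        gl_StorageSemanticsBuffer, gl_SemanticsAcquire);
    atomicStore(b.word, v + 1u, gl_ScopeDevice,
                gl_StorageSemanticsBuffer, gl_SemanticsRelease);
    // MakeAvailable is accompanied by Release, as the list requires.
    atomicStore(b.word, v, gl_ScopeDevice, gl_StorageSemanticsBuffer,
                gl_SemanticsRelease | gl_SemanticsMakeAvailable);
    memoryBarrier(gl_ScopeWorkgroup, gl_StorageSemanticsBuffer,
                  gl_SemanticsAcquireRelease);

    // Expected to be compile-time errors:
    // atomicStore(b.word, 0u, gl_ScopeDevice, gl_StorageSemanticsBuffer,
    //             gl_SemanticsAcquire);            // acquire on a store
    // atomicLoad(b.word, gl_ScopeDevice, gl_StorageSemanticsBuffer,
    //            gl_SemanticsRelease);             // release on a load
    // memoryBarrier(gl_ScopeDevice, gl_StorageSemanticsNone,
    //               gl_SemanticsRelease);          // barrier with no storage class
    // memoryBarrier(gl_ScopeDevice, gl_StorageSemanticsBuffer,
    //               gl_SemanticsRelaxed);          // barrier without an ordering
    // atomicAdd(b.word, 1u, gl_ScopeDevice, gl_StorageSemanticsBuffer,
    //           gl_SemanticsMakeAvailable);        // MakeAvailable without release
}
```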
Modify Section 8.12, Image Functions
(Add devicecoherent, queuefamilycoherent, workgroupcoherent,
subgroupcoherent, nonprivate)
All the built-in functions in this section accept arguments with
combinations of restrict, subgroupcoherent, workgroupcoherent,
queuefamilycoherent, devicecoherent, volatile, and nonprivate
memory qualification, despite not having them listed in the prototypes.
(Add new variants of atomic built-in functions with additional
scope/semantics parameters)
uint imageAtomicAdd (IMAGE_PARAMS, uint data, int scope, int storage, int sem)
int imageAtomicAdd (IMAGE_PARAMS, int data, int scope, int storage, int sem)
uint imageAtomicMin (IMAGE_PARAMS, uint data, int scope, int storage, int sem)
int imageAtomicMin (IMAGE_PARAMS, int data, int scope, int storage, int sem)
uint imageAtomicMax (IMAGE_PARAMS, uint data, int scope, int storage, int sem)
int imageAtomicMax (IMAGE_PARAMS, int data, int scope, int storage, int sem)
uint imageAtomicAnd (IMAGE_PARAMS, uint data, int scope, int storage, int sem)
int imageAtomicAnd (IMAGE_PARAMS, int data, int scope, int storage, int sem)
uint imageAtomicOr (IMAGE_PARAMS, uint data, int scope, int storage, int sem)
int imageAtomicOr (IMAGE_PARAMS, int data, int scope, int storage, int sem)
uint imageAtomicXor (IMAGE_PARAMS, uint data, int scope, int storage, int sem)
int imageAtomicXor (IMAGE_PARAMS, int data, int scope, int storage, int sem)
uint imageAtomicExchange (IMAGE_PARAMS, uint data, int scope, int storage, int sem)
int imageAtomicExchange (IMAGE_PARAMS, int data, int scope, int storage, int sem)
float imageAtomicExchange (IMAGE_PARAMS, float data, int scope, int storage, int sem)
uint imageAtomicCompSwap (IMAGE_PARAMS, uint compare, uint data, int scope,
int storageEqual, int semEqual,
int storageUnequal, int semUnequal)
int imageAtomicCompSwap (IMAGE_PARAMS, int compare, int data, int scope,
int storageEqual, int semEqual,
int storageUnequal, int semUnequal)
(Add new built-in functions)
// Atomically loads the value from the image and returns it
uint imageAtomicLoad (IMAGE_PARAMS, int scope, int storage, int sem)
int imageAtomicLoad (IMAGE_PARAMS, int scope, int storage, int sem)
// Atomically stores the value of <data> to the image
void imageAtomicStore (IMAGE_PARAMS, uint data, int scope, int storage, int sem)
void imageAtomicStore (IMAGE_PARAMS, int data, int scope, int storage, int sem)
The values passed as scope, storage, and sem parameters must all be
integer constant expressions. Valid values are listed in the Scope and
Semantics section. scope must be a gl_Scope* value, sem* must be a
gl_Semantics* value, and storage* must be a combination of
gl_StorageSemantics* values.
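A sketch of the image variants, assuming a 2D image so that IMAGE_PARAMS expands to the image and an ivec2 coordinate. The image names and bindings are hypothetical.

```glsl
#version 450
#extension GL_KHR_memory_scope_semantics : require

layout(local_size_x = 8, local_size_y = 8) in;

// Hypothetical per-texel counter and a second image that receives a snapshot.
layout(binding = 0, r32ui) devicecoherent uniform uimage2D counts;
layout(binding = 1, r32ui) devicecoherent uniform uimage2D snapshot;

void main()
{
    ivec2 p = ivec2(gl_GlobalInvocationID.xy);
    if (any(greaterThanEqual(p, imageSize(counts)))) {
        return;  // outside the image
    }

    // Relaxed increment: atomicity only, no ordering of other accesses.
    imageAtomicAdd(counts, p, 1u, gl_ScopeDevice,
                   gl_StorageSemanticsNone, gl_SemanticsRelaxed);

    // Acquire load of the current count, then a release store of that
    // value into the second image.
    uint now = imageAtomicLoad(counts, p, gl_ScopeDevice,
                               gl_StorageSemanticsImage, gl_SemanticsAcquire);
    imageAtomicStore(snapshot, p, now, gl_ScopeDevice,
                     gl_StorageSemanticsImage, gl_SemanticsRelease);
}
```

The snapshot image is assumed to have at least the dimensions of counts; the sketch only checks bounds against counts.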
Modify Section 8.16, Shader Invocation Control Functions
(Replace the table and the second paragraph with the following)
The following built-in function performs a control barrier as defined in
the Vulkan Memory Model:
void controlBarrier(int execution, int memory, int storage, int sem);
Informally, a control barrier can be used in conjunction with memory
barriers (including those memory barriers optionally performed by
controlBarrier itself) to synchronize memory accesses between shader
invocations.
The values passed as execution, memory, storage, and sem parameters must
all be integer constant expressions. Valid values are listed in the Scope
and Semantics section. execution and memory must be gl_Scope* values, sem*
must be a gl_Semantics* value, and storage* must be a combination of
gl_StorageSemantics* values.
The built-in function "void barrier()" in tessellation control shaders can
be used to control-barrier-order accesses to output variables. Informally,
this means it synchronizes accesses to those variables before the barrier
against those after the barrier.
In compute shaders, barrier() is equivalent to controlBarrier() with
execution and memory scope equal to gl_ScopeWorkgroup, storage semantics
equal to gl_StorageSemanticsShared, and sem equal to
gl_SemanticsAcquireRelease. Informally, this means it synchronizes
accesses to shared memory between invocations in the same workgroup.
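The equivalence for compute shaders can be seen in a small workgroup reduction where barrier() is written out as the corresponding controlBarrier call. This is a sketch: the Out block and its binding are hypothetical.

```glsl
#version 450
#extension GL_KHR_memory_scope_semantics : require

layout(local_size_x = 64) in;

shared uint partial[64];

// Hypothetical output buffer with one slot per workgroup.
layout(std430, binding = 0) buffer Out {
    uint sum[];
} outBuf;

void main()
{
    uint i = gl_LocalInvocationIndex;
    partial[i] = i;

    // Written-out form of barrier() for compute shaders.
    controlBarrier(gl_ScopeWorkgroup, gl_ScopeWorkgroup,
                   gl_StorageSemanticsShared, gl_SemanticsAcquireRelease);

    // Tree reduction over the 64 shared elements. The loop bounds are
    // uniform, so every invocation reaches every barrier.
    for (uint stride = 32u; stride > 0u; stride >>= 1u) {
        if (i < stride) {
            partial[i] += partial[i + stride];
        }
        controlBarrier(gl_ScopeWorkgroup, gl_ScopeWorkgroup,
                       gl_StorageSemanticsShared, gl_SemanticsAcquireRelease);
    }

    if (i == 0u) {
        outBuf.sum[gl_WorkGroupID.x] = partial[0];
    }
}
```

Each workgroup's slot should end up holding 2016, the sum 0 + 1 + ... + 63. Replacing every controlBarrier call with barrier() should give the same behavior, per the equivalence stated above.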
Replace Section 8.17, Shader Memory Control Functions
Shaders of all types can read and write the contents of textures and
buffer objects using image and buffer variables. The relative order of
reads and writes from multiple shader invocations is largely undefined.
Memory barrier functions can be used to synchronize memory accesses
between invocations as described in the Vulkan Memory Model.
The following built-in function is a memory barrier as defined in the
Vulkan Memory Model:
void memoryBarrier(int memory, int storage, int sem);
The values passed as memory, storage, and sem parameters must all be
integer constant expressions. Valid values are listed in the Scope and
Semantics section. memory must be a gl_Scope* value, sem* must be a
gl_Semantics* value, and storage* must be a combination of
gl_StorageSemantics* values.
The above function is a general memory barrier that can synchronize with
any supported scope or semantics. Legacy built-in functions that
synchronize a subset of memory or with particular scopes are also
supported:
void memoryBarrier()
// equivalent to:
// memoryBarrier(gl_ScopeQueueFamily,
// gl_StorageSemanticsBuffer |
// gl_StorageSemanticsShared |
// gl_StorageSemanticsImage,
// gl_SemanticsAcquireRelease)
void memoryBarrierBuffer()
// equivalent to:
// memoryBarrier(gl_ScopeQueueFamily,
// gl_StorageSemanticsBuffer,
// gl_SemanticsAcquireRelease)
void memoryBarrierShared()
// equivalent to:
// memoryBarrier(gl_ScopeQueueFamily,
// gl_StorageSemanticsShared,
// gl_SemanticsAcquireRelease)
void memoryBarrierImage()
// equivalent to:
// memoryBarrier(gl_ScopeQueueFamily,
// gl_StorageSemanticsImage,
// gl_SemanticsAcquireRelease)
void groupMemoryBarrier()
// equivalent to:
// memoryBarrier(gl_ScopeWorkgroup,
// gl_StorageSemanticsBuffer |
// gl_StorageSemanticsShared |
// gl_StorageSemanticsImage,
// gl_SemanticsAcquireRelease)
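Where the legacy functions are broader than needed, the general form can name a single scope, storage class, and ordering. The sketch below pairs a release barrier and an acquire barrier with relaxed atomics on a flag. The Mailbox block is hypothetical, the host is assumed to clear it to zero, and the dispatch is assumed to contain at least two workgroups.

```glsl
#version 450
#extension GL_KHR_memory_scope_semantics : require

layout(local_size_x = 1) in;

// Hypothetical buffer, assumed zero-initialized by the host.
layout(std430, binding = 0) devicecoherent buffer Mailbox {
    uint payload;
    uint flag;
    uint observed;
} mailbox;

void main()
{
    if (gl_WorkGroupID.x == 0u) {
        mailbox.payload = 7u;
        // Release barrier for buffer memory only, instead of the
        // acquire-release over buffer, shared, and image memory that
        // the legacy memoryBarrier() performs.
        memoryBarrier(gl_ScopeDevice, gl_StorageSemanticsBuffer,
                      gl_SemanticsRelease);
        atomicStore(mailbox.flag, 1u, gl_ScopeDevice,
                    gl_StorageSemanticsNone, gl_SemanticsRelaxed);
    } else if (gl_WorkGroupID.x == 1u) {
        uint f = atomicLoad(mailbox.flag, gl_ScopeDevice,
                            gl_StorageSemanticsNone, gl_SemanticsRelaxed);
        // Acquire barrier matching the producer's release barrier.
        memoryBarrier(gl_ScopeDevice, gl_StorageSemanticsBuffer,
                      gl_SemanticsAcquire);
        if (f == 1u) {
            mailbox.observed = mailbox.payload;
        }
    }
}
```

If the consumer observes the flag as 1, the two barriers are expected to synchronize under the Vulkan Memory Model, so the read of payload should return 7. Whether the narrower barriers are actually cheaper depends on the implementation.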
Interactions with GL_KHR_shader_atomic_int64
If GL_KHR_shader_atomic_int64 is not supported, the atomic built-in
functions with 64-bit integer parameters and return types are not
supported.
Issues
1. Should we extend atomic/barrier built-in functions by adding new
parameters with default values to existing functions, or adding
new/separate overloads?
RESOLVED: New overloads, to avoid introducing a new language feature.
2. How hard should we try to informally document how the memory model works
and how to use scope and semantics in this extension, vs. just leaving it
to the Vulkan Memory Model spec?
RESOLVED: It's not practical to repeat all the details of how these
work, and it is hard to summarize without writing something that's
strictly incorrect. Currently, this extension mostly leaves it to the
other spec to give meaning to these parameters.
3. Should we add scope/semantics to all the RMW atomics?
RESOLVED: Yes. It is nice to have these for completeness. Implementation-wise
it should not be a significant burden since all the atomics behave similarly.
But it is potentially a lot of test writing.
4. Should we add workgroup coherence, i.e. an analog of the "coherent"
decoration that only makes guarantees at workgroup scope?
RESOLVED: Yes. This is a feature that exists in other APIs and can give
better performance. It will also be added in a SPIR-V extension. We also
add "subgroupcoherent" which only makes guarantees at subgroup scope.
Revision History
Rev.  Date         Author  Changes
----  -----------  ------  -------------------------------------------
1     28-Feb-2018  jbolz   Initial revision.
2     13-Jun-2019  jbolz   Add gl_SemanticsVolatile.