The ARB_sync extension provides a synchronization framework for OpenGL that coordinates the CPU and the GPU. Its central primitive is the Fence Synchronization Object, which gives partial-finish semantics (waiting only for the commands issued before a fence) and can be shared to synchronize multiple OpenGL contexts. The framework also supports polling and waiting on synchronization state, and leaves room for future additions such as waiting on multiple sync objects in a single command.
Async Workgroup Update
Barthold Lichtenbelt

Goals
• Provide a synchronization framework for OpenGL
  • Provide base functionality as defined in NV_fence and GL2_async_core
  • Build a framework for future, more complex functionality, some of which is discussed in GL2_async_core
  • Initially support CPU <-> GPU synchronization
  • Support synchronization across multiple OpenGL contexts
• Resulted in GL_ARB_sync spec
  • Finished April 2006
  • Posted draft to opengl.org for feedback
  • Not quite an official ARB extension yet

Functionality overview
• ARB_sync provides synchronization primitives
  • Can be tested, set and waited upon
  • Specifically, a “Fence Synchronization Object” and corresponding Fence command
• Fence completion allows for partial glFinish
  • All commands prior to the fence are forced to complete before control is returned to caller
• Fence Sync Objects can be shared across contexts
  • Allows for synchronization of OpenGL command streams across contexts
• New data type: GLtime represents intervals in nanoseconds
  • 64 bit integer, same encoding as UST counter in OpenML
  • Accuracy implementation dependent, precision in nanoseconds
If you have used the Windows Event model, this will feel familiar
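To make the GLtime bullet concrete, here is a minimal C sketch of a 64-bit nanosecond interval type. The names sync_time_ns and ms_to_ns are hypothetical, and the choice of a signed integer is an assumption: the slide only says "64 bit integer".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the draft's GLtime: a 64-bit count of
   nanoseconds. Signedness is an assumption, not stated on the slide. */
typedef int64_t sync_time_ns;

#define NS_PER_MS INT64_C(1000000)

/* Convert whole milliseconds to nanoseconds, rejecting negative input
   and values that would overflow the 64-bit range. */
static sync_time_ns ms_to_ns(int64_t ms)
{
    assert(ms >= 0 && ms <= INT64_MAX / NS_PER_MS);
    return ms * NS_PER_MS;
}
```

A signed 64-bit nanosecond count spans roughly 292 years, so overflow is only a concern for sentinel values or arithmetic on them.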

Synchronization model in ARB_sync 1/2
• A “sync object” is a primitive used for synchronization between CPU and GPU, CPU, or ‘something else’.
  • Sync object has state: type, condition, status
  • A sync object’s status can be signaled or non-signaled
    • When created, status is signaled unless a flag is set, in which case it is non-signaled
• A “fence sync object” is a specific type of sync object
  • Provides partial finish semantics
  • Only type of sync object currently defined
• A “fence” is a token inserted in the GL command stream
  • A sync object is not inserted into the command stream
  • Fence has no state
• A fence is associated with a fence sync object.
  • Multiple fences can be associated with the same sync object
  • When a fence is inserted in the command stream, the status of its sync object is set to non-signaled
  • A fence, once completed, will set the status of its sync object to signaled

Synchronization model in ARB_sync 2/2
• A wait function waits on a sync object, not on a fence
• A poll function polls a sync object, not a fence
• A wait function called on a sync object in the non-signaled state will block. It unblocks when the sync object transitions to the signaled state.
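The state rules on these two slides can be sketched as a small state machine in plain C. This is a toy model for illustration only: every type and function name below is hypothetical and none of them is a GL entry point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: illustrative names, not part of any GL API. */
typedef enum { MODEL_UNSIGNALED = 0, MODEL_SIGNALED = 1 } model_status;

typedef struct {
    model_status status;  /* the only state that changes in this model */
} model_sync;

/* A sync object is created signaled unless the caller sets the flag
   asking for the non-signaled start state. */
static model_sync model_create_sync(bool start_unsignaled)
{
    model_sync s = { start_unsignaled ? MODEL_UNSIGNALED : MODEL_SIGNALED };
    return s;
}

/* Inserting a fence sets the associated sync object to non-signaled.
   The fence itself carries no state, so it is not represented here. */
static void model_insert_fence(model_sync *s)
{
    assert(s != NULL);
    s->status = MODEL_UNSIGNALED;
}

/* When a fence completes, its sync object becomes signaled. */
static void model_fence_completed(model_sync *s)
{
    assert(s != NULL);
    s->status = MODEL_SIGNALED;
}

/* A poll inspects the sync object, never the fence. */
static bool model_poll(const model_sync *s)
{
    assert(s != NULL);
    return s->status == MODEL_SIGNALED;
}
```

Because the fence has no state of its own, several fences can point at the same model_sync; each insertion resets it and each completion signals it.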

Example – RTT with two contexts

Context A
    sync_objectA = glCreateSync(attrib);
    <render to texture that context B needs>
    glFence(sync_objectA);
    glFlush(); // prevent deadlock

Context B
    glClientWaitSync(sync_objectA, 0, GL_FOREVER);
    glBindTexture(….); // Just rendered
    <render using texture>
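The glFlush() comment in this example can be illustrated with a toy model. The sketch below assumes the deadlock risk comes from context A's fence sitting in an unsubmitted command buffer while context B waits forever; that reading is an assumption, and all names are hypothetical rather than GL calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { bool signaled; } toy_sync;

/* Toy context: a fence stays in a client-side buffer until flushed. */
typedef struct {
    toy_sync *buffered_fence;   /* queued, not yet submitted to the GPU */
    toy_sync *submitted_fence;  /* submitted, the GPU will reach it */
} toy_context;

/* Insert a fence: the sync object becomes non-signaled and the fence
   waits in the context's buffer. */
static void toy_fence(toy_context *ctx, toy_sync *s)
{
    assert(ctx != NULL && s != NULL);
    s->signaled = false;
    ctx->buffered_fence = s;
}

/* Flush: hand buffered commands to the GPU. */
static void toy_flush(toy_context *ctx)
{
    assert(ctx != NULL);
    if (ctx->buffered_fence != NULL) {
        ctx->submitted_fence = ctx->buffered_fence;
        ctx->buffered_fence = NULL;
    }
}

/* Let the GPU make progress: only submitted fences can complete. */
static void toy_gpu_run(toy_context *ctx)
{
    assert(ctx != NULL);
    if (ctx->submitted_fence != NULL) {
        ctx->submitted_fence->signaled = true;
        ctx->submitted_fence = NULL;
    }
}
```

In this model, a waiter in another context can only ever see the sync object become signaled if toy_flush() ran after toy_fence(); without it, no amount of GPU progress completes the fence.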

OS specific functionality
• Convert sync object to the window system native event primitive
  • Allows applications to synchronize all events in a system using one API
  • All operations on <sync> are reflected in OS event and vice-versa
  • Both <sync> and the OS event are valid to use in your code
• On Windows, convert to an Event
    HANDLE wglConvertSyncToEvent(object sync);
  • Need to specify, when sync object is created, that it can be converted to OS event
  • Separate extension: WGL_ARB_sync_event
• On Unix, convert to a file-descriptor, x-event or semaphore?
  • Still TBD

Possible future functionality
• Add a WaitForMultipleSync(uint *sync_objects, ….) command
  • Synchronize with multiple sync objects at once
• Add a “payload” to a fence
  • For example, the time it completed
• Allow one GPU stream to wait for another GPU stream
  • WaitSync(sync_object);
• A sync object whose status will pulse with every vblank
• A sync object that can signal when data binding has completed
  • As opposed to when rendering has completed using the data

Example – Streaming video processing
• Loop
    Draw frame 1 // To an FBO, for example
    glFence(sync_object1); // Inserts a fence in the command stream
    Draw frame 2
    glFence(sync_object2);
    while (glClientWaitSync(sync_object1, 0, 0) != GL_ALREADY_SIGNALED)
        <Do some useful work> // App uses CPU cycles instead of blocking
    Read back data in frame 1
    while (glClientWaitSync(sync_object2, 0, 0) != GL_ALREADY_SIGNALED)
        <Do some useful work> // App uses CPU cycles instead of blocking
    Read back data in frame 2

Variation with asynchronous read back
• Loop
    Draw frame 1 // To an FBO, for example
    Read back frame 1 into PBO 1 // Asynchronous readback
    glFence(sync_object1); // Inserts a fence in the command stream
    Draw frame 2
    Read back frame 2 into PBO 2
    glFence(sync_object2);
    while (glClientWaitSync(sync_object1, 0, 0) != GL_ALREADY_SIGNALED)
        <Do some useful work> // App uses CPU cycles instead of blocking
    glMapBuffer(…); // Access the data of frame 1 in PBO 1
    while (glClientWaitSync(sync_object2, 0, 0) != GL_ALREADY_SIGNALED)
        <Do some useful work> // App uses CPU cycles instead of blocking
    glMapBuffer(…); // Access the data of frame 2 in PBO 2
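The poll-and-work pattern in the two loops above can be tried without a GPU. The sketch below replaces the sync object with a hypothetical fake_sync that reports signaled after a fixed number of polls; fake_poll stands in for a zero-timeout wait and is not a GL call.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { POLL_ALREADY_SIGNALED, POLL_TIMEOUT_EXPIRED } poll_result;

/* Simulated sync object: reports signaled once the given number of
   unsuccessful polls has been used up. */
typedef struct { int polls_until_signaled; } fake_sync;

/* Stand-in for a zero-timeout wait: never blocks, only reports status. */
static poll_result fake_poll(fake_sync *s)
{
    assert(s != NULL);
    if (s->polls_until_signaled > 0) {
        s->polls_until_signaled--;
        return POLL_TIMEOUT_EXPIRED;
    }
    return POLL_ALREADY_SIGNALED;
}

/* Poll until signaled, doing one unit of CPU work per unsuccessful
   poll. Returns the number of work units completed while waiting. */
static int work_until_signaled(fake_sync *s)
{
    int work_units = 0;
    while (fake_poll(s) != POLL_ALREADY_SIGNALED)
        work_units++;  /* placeholder for <Do some useful work> */
    return work_units;
}
```

Each frame needs its own sync object in this pattern: polling the first frame's object a second time returns immediately and says nothing about the second frame.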

Differences with GL_NV_fence
• No separation of sync objects and fences in NV_fence
  • NV version only has fence objects
  • Fence object has state
• Creation of sync object and inserting a fence in one command
  • SetFenceNV creates and inserts a fence (old object model)
• NV_fence objects are not shared across contexts

API Overview 1/2
• Create a sync attribute object
    object CreateSyncAttrib();
  • SYNC_TYPE has to be FENCE
  • SYNC_CONDITION has to be SYNC_PRIOR_COMMANDS_COMPLETE
  • SYNC_STATUS SIGNALED or UNSIGNALED
• Create the sync object
    object CreateSync(object attrib);
• Insert a fence, associated with a sync object, into command stream
    void Fence(object sync);

API Overview 2/2
• Wait or test the status of a fence sync object
    enum ClientWaitSync(object sync, uint flags, time timeout);
  • Blocks until sync is signaled or timeout expired
  • If timeout == 0, does not block, returns the status of sync
  • If timeout == FOREVER, call does not time out
  • Optionally will flush before blocking
  • Returns one of three values: ALREADY_SIGNALED, TIMEOUT_EXPIRED, CONDITION_SATISFIED
• Signal or unsignal a sync object
    void SignalSync(object sync, enum mode);
  • If status transitions from unsignaled to signaled, ClientWaitSync will unblock
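The timeout rules for ClientWaitSync can be sketched as a pure function. The slide lists three return values without saying when each applies, so the mapping below is an assumption inferred from their names. model_client_wait, WAIT_FOREVER and the parameters are all hypothetical, and real blocking is replaced by a "time until signaled" argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    WAIT_ALREADY_SIGNALED,    /* signaled at the moment of the call */
    WAIT_TIMEOUT_EXPIRED,     /* still unsignaled when the timeout ran out */
    WAIT_CONDITION_SATISFIED  /* became signaled while waiting */
} wait_result;

#define WAIT_FOREVER INT64_MAX  /* hypothetical encoding of FOREVER */

/* signaled_now:     status of the sync object when the call is made.
   signals_after_ns: simulated delay until the object would be signaled
                     (ignored when signaled_now is true).
   timeout_ns:       0 means poll only; WAIT_FOREVER never expires. */
static wait_result model_client_wait(bool signaled_now,
                                     int64_t signals_after_ns,
                                     int64_t timeout_ns)
{
    assert(signals_after_ns >= 0 && timeout_ns >= 0);
    if (signaled_now)
        return WAIT_ALREADY_SIGNALED;
    if (timeout_ns == 0)
        return WAIT_TIMEOUT_EXPIRED;  /* does not block, just reports */
    if (timeout_ns == WAIT_FOREVER || signals_after_ns <= timeout_ns)
        return WAIT_CONDITION_SATISFIED;
    return WAIT_TIMEOUT_EXPIRED;
}
```

Under this reading, a zero-timeout call can only return ALREADY_SIGNALED or TIMEOUT_EXPIRED, which matches the polling loops that compare against GL_ALREADY_SIGNALED.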