We’ve mentioned in a recent blog post how maintaining presence is key in virtual reality systems. Rendering applications at high frame rates (60, 90 or 120 Hz, depending on the Head-Mounted Display’s (HMD’s) maximum refresh rate) with low motion-to-photon latency is an important part of achieving it.
In this article, I’ll explain how the OVR_multiview extension can be used to reduce the CPU and GPU overhead of rendering a VR application.
Rendering without OVR_multiview
In a standard, well-optimized VR application, the scene will be rendered to a Framebuffer Object (FBO) twice – once for the left eye, once for the right. To issue the renders, an application will do the following (sketched in code after the list):
- Bind the FBO
- Left eye
  - Set viewport to the left half of the FBO
  - Draw all objects in the scene using the left eye camera projection matrix
- Right eye
  - Set viewport to the right half of the FBO
  - Draw all objects in the scene using the right eye camera projection matrix
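In simplified OpenGL ES code, the per-eye loop looks roughly like the sketch below. The side-by-side FBO layout, the `draw_scene()` helper and the matrix parameters are assumptions for illustration rather than part of any particular engine.

```c
#include <GLES3/gl3.h>

/* Hypothetical helper provided elsewhere: issues the draw calls for every
 * object in the scene using the supplied view-projection matrix. */
extern void draw_scene(const float view_proj[16]);

/* Render one stereo frame into a side-by-side FBO, one half per eye. */
void render_stereo_frame(GLuint fbo, GLsizei eye_width, GLsizei eye_height,
                         const float left_vp[16], const float right_vp[16])
{
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, fbo);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    /* Left eye: left half of the FBO. */
    glViewport(0, 0, eye_width, eye_height);
    draw_scene(left_vp);

    /* Right eye: right half of the FBO. */
    glViewport(eye_width, 0, eye_width, eye_height);
    draw_scene(right_vp);
}
```

Note that every draw call inside `draw_scene()` is issued twice per frame – once per eye.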
Once the scene is rendered for each eye, the FBO contents are barrel distorted to correct pincushion distortion introduced by the HMD lenses.
In this solution the application has to submit two almost identical streams of GL calls, even though the only difference between the renders is the set of matrix transformations applied to the vertices. This wastes application CPU time submitting the calls once per eye. It also wastes GPU driver time validating API calls and generating a GPU command buffer per eye when a single shared command buffer would do.
With the OVR_multiview extension (and the OVR_multiview2 and OVR_multiview_multisampled_render_to_texture extensions layered on top of it), an application can bind a texture array to an FBO and have each draw instanced to every element. This enables graphics drivers to prepare a single GPU command buffer and reuse it for each instanced render. When the extension is active, the gl_ViewID_OVR built-in can be read in vertex shaders to identify the array element the current instance is being rendered to.
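Setting up such a layered render target takes only a few calls. The sketch below assumes an OpenGL ES 3.0 context in which the OVR_multiview entry point is available (on some platforms it must be fetched with eglGetProcAddress); the depth attachment and the framebuffer-completeness check are omitted for brevity.

```c
#include <GLES3/gl3.h>
#include <GLES2/gl2ext.h>  /* OVR_multiview tokens; define GL_GLEXT_PROTOTYPES
                              or load the entry point via eglGetProcAddress */

/* Create a two-layer colour texture array and attach both layers to an FBO,
 * so that every draw issued while it is bound is instanced to both views. */
GLuint create_multiview_fbo(GLsizei width, GLsizei height, GLuint *out_colour)
{
    GLuint colour, fbo;

    glGenTextures(1, &colour);
    glBindTexture(GL_TEXTURE_2D_ARRAY, colour);
    glTexStorage3D(GL_TEXTURE_2D_ARRAY, 1, GL_RGBA8, width, height, 2);

    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, fbo);

    /* Layer 0 receives view 0 (left eye), layer 1 receives view 1 (right). */
    glFramebufferTextureMultiviewOVR(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                                     colour, 0 /* level */,
                                     0 /* baseViewIndex */, 2 /* numViews */);

    *out_colour = colour;
    return fbo;
}
```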
On a tile-based GPU architecture, tiling must be performed once per instance (view). Once the tiling process completes, the per-element pixel rendering tasks are kicked off.
OVR_multiview: Optimizing draw submission
A simple use case for OVR_multiview is to create a texture array consisting of two elements that represent the left and right eye images. Each frame, the application can render to both elements by performing the steps below (a shader and submission sketch follows the list):
- Bind the FBO (texture array attached)
- Pass an array of transformation matrices to shaders as a uniform
  - The array consists of two elements – one transformation for the left eye, one for the right
- Draw all objects in the scene
  - During vertex shader execution, use gl_ViewID_OVR to determine which matrix should be used for transformations
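On the shader side this is essentially a one-line change: index the matrix array with gl_ViewID_OVR. The sketch below uses illustrative names (u_view_proj, a_position, draw_scene_objects) and assumes the vertex shader has been compiled and linked into `program` at start-up (that code is omitted).

```c
#include <GLES3/gl3.h>

/* Multiview vertex shader (GLSL ES 3.00): one uniform array holds both eye
 * matrices and gl_ViewID_OVR selects the right one for the current view. */
static const char *multiview_vs =
    "#version 300 es\n"
    "#extension GL_OVR_multiview : require\n"
    "layout(num_views = 2) in;\n"
    "layout(location = 0) in vec4 a_position;\n"
    "uniform mat4 u_view_proj[2]; // [0] = left eye, [1] = right eye\n"
    "void main()\n"
    "{\n"
    "    gl_Position = u_view_proj[gl_ViewID_OVR] * a_position;\n"
    "}\n";

/* Hypothetical helper: issues the draw calls for every object in the scene. */
extern void draw_scene_objects(void);

/* Per frame: one matrix upload, one pass over the scene for both eyes. */
void render_multiview_frame(GLuint multiview_fbo, GLuint program,
                            GLsizei width, GLsizei height,
                            const float view_proj[2][16])
{
    glBindFramebuffer(GL_DRAW_FRAMEBUFFER, multiview_fbo);
    glViewport(0, 0, width, height);
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    glUseProgram(program);
    glUniformMatrix4fv(glGetUniformLocation(program, "u_view_proj"),
                       2, GL_FALSE, &view_proj[0][0]);
    draw_scene_objects(); /* instanced by the driver to both array layers */
}
```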
With this simple change, an application can halve the number of OpenGL calls submitted to the driver!
OVR_multiview: Reducing fragment processing
Lenses that increase a user’s field of view are an essential part of an immersive VR system. To counter the pincushion distortion introduced by the lenses, barrel distortion must be applied before the image is displayed.
Unfortunately, modern GPUs are not designed to render barrel-distorted images natively. VR applications must render a non-distorted image in a first pass and then barrel distort it in a second pass. This wastes GPU cycles and bandwidth in the first pass colouring texels that make only a minimal contribution to the outer regions of the barrel, where the texel-to-pixel density is high in the second pass.
As shown in the diagram above, OVR_multiview can be used to sub-divide the render into regions that better represent the pixel density of the barrel area they occupy. A simple implementation of this method would (per-eye) render a high-resolution, narrow field-of-view image for the centre of the barrel and a lower-resolution, wide field-of-view image for the outer regions of the barrel. During the barrel distortion pass, a fragment shader can be used to mix the high-resolution and low-resolution images based on the pixel coordinate within the barrel.
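One possible shape for that compositing step is sketched below: a fragment shader for the barrel-distortion pass that, for simplicity, does a hard switch rather than a blend between the two images. The texture and uniform names, and the scale/bias mapping of the inner region, are illustrative assumptions rather than a description of a specific implementation.

```c
/* Barrel-distortion / composite fragment shader sketch (GLSL ES 3.00). */
static const char *composite_fs =
    "#version 300 es\n"
    "precision mediump float;\n"
    "uniform sampler2D u_inner_hi;  // narrow-FOV, full-resolution image\n"
    "uniform sampler2D u_outer_lo;  // wide-FOV, reduced-resolution image\n"
    "uniform vec2 u_inner_scale;    // maps wide-FOV UVs onto the inner image\n"
    "uniform vec2 u_inner_bias;\n"
    "in vec2 v_distorted_uv;        // barrel-distorted sample coordinate\n"
    "out vec4 frag_colour;\n"
    "void main()\n"
    "{\n"
    "    vec2 inner_uv = v_distorted_uv * u_inner_scale + u_inner_bias;\n"
    "    bool in_centre = all(greaterThanEqual(inner_uv, vec2(0.0))) &&\n"
    "                     all(lessThanEqual(inner_uv, vec2(1.0)));\n"
    "    frag_colour = in_centre ? texture(u_inner_hi, inner_uv)\n"
    "                            : texture(u_outer_lo, v_distorted_uv);\n"
    "}\n";
```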
In a render where the narrow field-of-view, full-resolution image covers 25% of the scene and the wide field-of-view image is rendered at half resolution in each dimension (and so contains 25% of the full pixel count), the GPU only needs to colour half as many pixels as a full-resolution render – a huge reduction in fragment shader calculations and in the associated bandwidth. Of course, the savings made will depend on how small you can make the narrow field of view without introducing artefacts.
With the OVR_multiview extensions and a few simple application changes, VR applications can submit work to graphics drivers much more efficiently and reduce GPU overhead by rendering fewer pixels. If you want to know more about the work Imagination is doing to optimize VR rendering, I’d highly recommend reading Christian Pötzsch’s excellent blog post on reducing the latency of asynchronous time warping with strip rendering.