The importance of fine-grained GPU preemption support for VR

In my blog post about single buffered strip rendering I talked about reducing latency by shortening the graphics pipeline. In the second post I described one way of speeding up the barrel distortion render by moving the distortion transformation out of the fragment shader. Both methods increase throughput and therefore help to reduce latency for the VR use case. There are other techniques that accelerate VR content creation even further. For example, one interesting approach is to move the distortion transformation into the content generation step, which makes a second presentation render basically obsolete. Although this is a huge win, it needs direct support from the VR application and therefore isn’t usable in a generic VR framework.

Another technique is to reproject the final content shortly before presenting it to the user and also to decouple the content generation from the VR post-processing. As with single buffered strip rendering, correct timing is important here.

In this blog post I will explain why a GPU with fine-grained preemption support is essential when using those advanced techniques in VR.

CPU pinning

Before we look at the GPU side of things let’s talk about the CPU and Linux scheduling first.

A minimum frame rate of 60 fps is required to maintain presence in a VR environment. High-end desktop PC VR platforms target even higher numbers. It is more important to have a smooth and steady frame rate than, for example, high resolution content. Developers should always adjust their content to achieve those frame rates. However, even if they do so, there might be times when the content creation can’t keep up and falls below 60 fps. It is important to remember that Android/Linux is a multi-process operating system and realtime behaviour of specific processes/threads is not guaranteed. A process can be interrupted at any time, and moving a process from one CPU to another is an expensive operation because, for example, some caches are per CPU and their contents are no longer available after the move.

One way to partly overcome this is to pin a specific thread to a specific CPU. Using sched_setaffinity improves the scheduling behaviour of a VR application considerably. Furthermore, Android has started to use cpusets to assign specific processes (such as background services) to selected CPU groups. There is even a top-app cpuset which includes one exclusive CPU to be used by the foreground application only.
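As a minimal sketch of the pinning described above, the following C helpers wrap sched_setaffinity and sched_getaffinity on a Linux/glibc target. The function names pin_current_thread_to_cpu and first_allowed_cpu are made up for this illustration, and which CPU to pick for which thread is left to the caller; on Android the cpuset of the process may further restrict the CPUs that are actually allowed.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Pin the calling thread to a single CPU.
 * Returns 0 on success and -1 on failure. */
int pin_current_thread_to_cpu(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);

    /* A pid of 0 selects the calling thread. */
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Return the lowest-numbered CPU the calling thread is currently
 * allowed to run on, or -1 if the affinity mask cannot be read. */
int first_allowed_cpu(void)
{
    cpu_set_t set;
    int cpu;

    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;

    for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;

    return -1;
}
```

Each thread would call pin_current_thread_to_cpu once at start-up, for example the content creation thread with CPU 2 and the VR post-processing thread with CPU 3. The call fails if the requested CPU is not available to the process, so the return value should be checked.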

Systrace of Linux scheduling with no CPU pinning. The two threads freely change CPUs at any point.

Systrace of Linux scheduling with CPU pinning. The content creation thread is pinned to CPU 2 and the VR post-processing thread is pinned to CPU 3.

Pinning an important thread to a specific CPU, or even giving it exclusive access to one CPU, is an important step towards a consistent rendering experience. Nowadays SoCs usually consist of four or more CPUs, so this isn’t a huge hurdle anymore.

However, even with these precautions in place realtime behaviour can’t be guaranteed.

GPU Preemption

One of the more advanced techniques in VR is asynchronous time warping. In the context of VR this was first implemented by Oculus. It is used to achieve two main goals:

  1. reducing latency
  2. increasing frame rate

I will not explain how it achieves those goals; you can read about it in the Oculus blog post or watch this very informative video on the topic. The key takeaway is that the algorithm delays the start of the final render until very shortly before the vsync. It does this to be able to query the sensors again and therefore reduce the motion-to-photon latency. This final render has a fixed workload, and an SoC vendor can approximate the time it normally takes to finish this task.
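The timing side of this can be sketched with a little arithmetic. The helper below is only an illustration, not how any particular VR runtime does it: the name next_warp_start_ns and all of its parameters are hypothetical, it assumes that all timestamps come from the same monotonic clock, that now_ns is not earlier than last_vsync_ns, and that the render budget is shorter than one vsync period.

```c
#include <stdint.h>

/* Return the absolute time (in nanoseconds) at which the post-processing
 * thread should wake up, re-read the sensors and submit the final render.
 *
 * last_vsync_ns: timestamp of any past vsync
 * period_ns:     vsync period (about 16.7 ms at 60 fps)
 * budget_ns:     time reserved for the final render (an assumed value)
 * now_ns:        current time on the same clock */
int64_t next_warp_start_ns(int64_t last_vsync_ns, int64_t period_ns,
                           int64_t budget_ns, int64_t now_ns)
{
    /* Whole vsync periods that have elapsed since the reference vsync. */
    int64_t elapsed = (now_ns - last_vsync_ns) / period_ns;
    int64_t next_vsync = last_vsync_ns + (elapsed + 1) * period_ns;
    int64_t start = next_vsync - budget_ns;

    /* If the start point for the upcoming vsync has already passed,
     * aim for the following vsync instead. */
    if (start <= now_ns)
        start += period_ns;

    return start;
}
```

With a period of 16,666,667 ns and an assumed budget of 3 ms, a thread that asks at 5 ms after a vsync would be told to wake up at 13,666,667 ns. On Linux the thread could then sleep until that point with clock_nanosleep using CLOCK_MONOTONIC and TIMER_ABSTIME. The smaller the budget, the fresher the sensor data, but also the less room there is for anything else to delay the render.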

Furthermore, to increase the frame rate, the content creation is decoupled from the VR post-processing by using two distinct threads which operate independently.

Where does GPU preemption fit in here? As I said, the GPU has very little time to process the VR post-processing render. An SoC vendor would always choose the smallest possible value to get the most out of the late reading of the sensors. The problem is that, at the same time, an (asynchronous) content render (or anything else in the system) may have submitted other work too. How can we guarantee that our important post-processing render finishes within our target time?

Context Priority

First of all, we need to understand that a modern GPU also has a scheduler for distributing multiple render tasks to only one or a small number of hardware units. The scheduler takes into account whether a task is ready to run or has unfulfilled dependencies, in which case another task can run in the meantime. Furthermore, the scheduler is able to interrupt running tasks to switch to a task with a higher priority. The PowerVR Rogue hardware architecture is designed to do this interruption at a very fine-grained level: a task can be interrupted between finishing one tile and starting the next. A tile size of 32×32 pixels, for example, allows for multiple interruption points while processing a fullscreen render.
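To get a feeling for how fine-grained this is, the number of tiles in a fullscreen render can be worked out directly. The sketch below is plain arithmetic with a hypothetical helper name (tile_count); the 1920×1080 render target used in the example is an assumption and not a figure from the hardware documentation.

```c
/* Number of tiles needed to cover a render target of the given size.
 * Partially covered tiles at the right and bottom edges count as whole tiles. */
unsigned tile_count(unsigned width, unsigned height, unsigned tile_size)
{
    unsigned tiles_x = (width + tile_size - 1) / tile_size;
    unsigned tiles_y = (height + tile_size - 1) / tile_size;

    return tiles_x * tiles_y;
}
```

For an assumed 1920×1080 target and 32×32 tiles this gives 60 × 34 = 2040 tiles, so there are on the order of two thousand tile boundaries in a single fullscreen render at which a switch to a higher priority task could take place.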

To make this advanced GPU scheduling control mechanism available to OpenGL ES (or compute) developers, Imagination Technologies proposed the EGL_IMG_context_priority extension back in 2009. This extension was ratified by Khronos in the same year. It defines three priority levels to differentiate between individual workload requirements. In our VR use case we obviously choose EGL_CONTEXT_PRIORITY_HIGH_IMG for the post-processing thread and EGL_CONTEXT_PRIORITY_MEDIUM_IMG for the content render (which is also the default).
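Requesting a priority comes down to one extra attribute pair when the context is created. The sketch below only builds the attribute list so that it stays self-contained; the helper name build_context_attribs is made up for this illustration, and the token values are repeated here on the assumption that the EGL headers are not included (in a real project they come from EGL/egl.h and EGL/eglext.h).

```c
/* Token values as published in the EGL headers and the extension text.
 * They are only defined here if the real headers have not been included. */
#ifndef EGL_NONE
#define EGL_NONE                        0x3038
#endif
#ifndef EGL_CONTEXT_CLIENT_VERSION
#define EGL_CONTEXT_CLIENT_VERSION      0x3098
#endif
#ifndef EGL_CONTEXT_PRIORITY_LEVEL_IMG
#define EGL_CONTEXT_PRIORITY_LEVEL_IMG  0x3100
#define EGL_CONTEXT_PRIORITY_HIGH_IMG   0x3101
#define EGL_CONTEXT_PRIORITY_MEDIUM_IMG 0x3102
#define EGL_CONTEXT_PRIORITY_LOW_IMG    0x3103
#endif

/* Fill attribs (room for five entries) with a context attribute list that
 * requests an OpenGL ES context of the given version and priority level. */
void build_context_attribs(int attribs[5], int es_version, int priority)
{
    attribs[0] = EGL_CONTEXT_CLIENT_VERSION;
    attribs[1] = es_version;
    attribs[2] = EGL_CONTEXT_PRIORITY_LEVEL_IMG;
    attribs[3] = priority;
    attribs[4] = EGL_NONE;
}
```

The post-processing thread would build its list with EGL_CONTEXT_PRIORITY_HIGH_IMG and pass it as the last argument to eglCreateContext, while the content render can keep the default. It is worth checking the EGL extension string for EGL_IMG_context_priority first, and, as far as I understand the extension, the requested level is a hint that the implementation may not honour for every application.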

The effect of this can be seen in the following systrace:

Systrace highlighting GPU preemption.

I artificially increased the workload of the content render to see how it influences the post-processing render task. We can see how the content creation thread submits a render roughly every 32 ms. The VR post-processing thread submits a render every 16.7 ms to keep a steady frame rate of 60 fps. The green content render tasks get interrupted by the blue post-processing tasks, as highlighted in the “GPU: 3D” row. At one point, the green “Content Creation Task 2” is interrupted three times by multiple blue tasks. This ensures the blue post-processing tasks are able to finish in time for our target frame rate.


Fine-grained GPU preemption makes techniques like single buffered strip rendering or asynchronous time warping possible. It helps in balancing the creation of rich content and VR post-processing, even at times when the CPU or GPU can’t keep up with their tasks in a timely fashion. Obviously there might be other use cases for GPU preemption, and developers are free to make use of the EGL_IMG_context_priority extension for their own applications. With VR being such a demanding task for a portable device, developers should also keep an eye on the CPU scheduling and make use of all the profiling tools available.
