It’s been a while since we first showed off our Vulkan* driver for PowerVR Rogue GPUs. Since then, our PowerVR driver and graphics demo teams have been working hard to synchronize with the spec as it evolves towards its final form.

Today we are excited to show you a new demo we have been working on that better highlights the specific benefits we believe this API should bring to developers and devices.

Vulkan and OpenGL ES in Gnome Horde

This new demo is called Gnome Horde and runs under Android on the Intel-based Nexus Player, a consumer device integrating a PowerVR G6430 GPU; it uses the latest prototype Vulkan API driver for PowerVR GPUs (final performance may differ).

On the left-hand side of the video, we are showing Vulkan and on the right we have OpenGL® ES 3.0. We have attempted to ensure both versions run equivalent code and both run without extensions. The demos are not using instancing either: each draw call could be a different piece of geometry with a different material or texture, and the CPU performance would be very similar.

Before reading any further, please note that this is an exaggerated scenario that is intended to highlight and amplify Vulkan’s strengths. It is not intended to show OpenGL ES in a bad light – we are deliberately using OpenGL ES in a way that it was not designed for. We are also aiming to be GPU bound using the Vulkan API; this means the GPU and CPU are being used as effectively as possible, which is a great thing for developers and vendors alike.

The implementation details

Using Vulkan we batch draw calls into tiles and render a tile at a time. Each time a tile goes out of view, comes into view or changes its level of detail, we regenerate a command buffer (more on this later). By avoiding changes in the command buffer, we reduce overall CPU usage significantly compared to OpenGL ES.

This is explained in more detail below.

Tiled rendering

In OpenGL ES, all draw calls are submitted dynamically according to the tiles in view, with no opportunity to cache draw calls that have already been executed.
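
The difference between the two submission models can be sketched in a few lines of Python. This is an illustrative model only, not the demo's code: `TileCache`, `record_tile` and the tuple-based "command buffer" are hypothetical stand-ins, and no real Vulkan or OpenGL ES calls are made. It simply counts how many draw commands the CPU has to build when tiles are cached versus rebuilt every frame.

```python
from dataclasses import dataclass, field

DRAWS_PER_TILE = 45  # per-tile figure quoted later in the article


def record_tile(tile_id, lod):
    """Stand-in for recording a command buffer: build an immutable list of draws."""
    return tuple(("draw", tile_id, lod, i) for i in range(DRAWS_PER_TILE))


@dataclass
class TileCache:
    """Keeps one recorded command list per visible tile, keyed by tile id."""
    buffers: dict = field(default_factory=dict)   # tile_id -> (lod, commands)
    draws_recorded: int = 0                       # CPU recording work so far

    def frame(self, visible):
        """`visible` maps tile_id -> level of detail for the current view."""
        # Drop tiles that left the view so memory use follows the frustum.
        for tile_id in list(self.buffers):
            if tile_id not in visible:
                del self.buffers[tile_id]
        # Re-record only tiles that are new or whose level of detail changed.
        for tile_id, lod in visible.items():
            cached = self.buffers.get(tile_id)
            if cached is None or cached[0] != lod:
                self.buffers[tile_id] = (lod, record_tile(tile_id, lod))
                self.draws_recorded += DRAWS_PER_TILE
        # "Submit": hand over the cached lists without rebuilding them.
        return [self.buffers[t][1] for t in visible]


def dynamic_frame(visible):
    """Model of the uncached path: every draw is rebuilt every frame."""
    return [record_tile(t, lod) for t, lod in visible.items()]


cache = TileCache()
view = {t: 0 for t in range(300)}          # 300 tiles in view, all at LOD 0
for _ in range(30):                        # camera holds still for 30 frames
    cache.frame(view)
cached_cost = cache.draws_recorded         # recorded once
dynamic_cost = 30 * sum(len(b) for b in dynamic_frame(view))  # rebuilt 30 times

view[0] = 1                                # one tile changes level of detail
cache.frame(view)
extra_cost = cache.draws_recorded - cached_cost   # only that tile is re-recorded
```

With a static camera the cached path records each tile once, while the uncached path repeats the same work every frame; a single level-of-detail change costs one tile's worth of recording.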

Lower CPU usage

As you can see from the CPU usage graph in the bottom left of the video, CPU usage in the first mode is very low for this many draw calls. At the highest zoom level we are drawing around 400,000 gnomes (and other objects) per second. Each object has a different transformation, and there are many different materials, textures, blend modes and shaders being used.

The OpenGL ES API struggles with these tasks because it requires many calls into kernel mode to change the state of the driver, along with validating that state and any extra work that goes on behind the scenes, all during an application’s render loop.

This is in contrast to Vulkan where we can pre-generate these commands. Executing pre-generated commands in Vulkan is very fast, with little CPU overhead and no need for the driver to validate or compile anything inside the render loop. These pre-generated commands are called command buffers.

Vulkan CPU usage (left) and OpenGL ES CPU usage (right) for Gnome Horde

The lower line is the process CPU usage and the top line represents system CPU usage. Both are reduced in Vulkan due to the ability to process command buffers before submission.

Command buffer re-use

Being able to re-use command buffers proves useful in some circumstances. This feature will not be a panacea, but it will be possible to use it to a great extent in many games and applications. In this specific instance we decided that being able to re-use command buffers for each tile would reduce the overall CPU usage.

Before drawing in both APIs, the driver needs to compile a set of commands for the GPU to execute, validate those commands, and do other work – all before actually starting the GPU. With OpenGL ES, this needs to be performed for each draw call, during the render loop. In Vulkan we can compile and validate this list of commands ahead of time, and then have the GPU execute these pre-generated commands.
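
As a rough illustration of where the validation cost lands in each model, here is a minimal Python sketch. The `Driver` class and its `validate` method are hypothetical stand-ins for the hidden driver work described above; they do not correspond to any real OpenGL ES or Vulkan entry point.

```python
class Driver:
    """Toy driver that counts how often it has to validate state."""

    def __init__(self):
        self.validations = 0

    def validate(self, state):
        # Stand-in for state checking and other behind-the-scenes work.
        self.validations += 1
        if state["blend"] not in ("opaque", "alpha"):
            raise ValueError("unsupported blend mode")
        return ("gpu_cmd", state["mesh"], state["blend"])


def immediate_mode(driver, draws, frames):
    """Per-draw model: validation runs for every draw in every frame."""
    for _ in range(frames):
        for state in draws:
            driver.validate(state)


def prebuilt_mode(driver, draws, frames):
    """Command-buffer model: validate once up front, then replay each frame."""
    command_buffer = [driver.validate(state) for state in draws]
    submitted = 0
    for _ in range(frames):
        submitted += len(command_buffer)  # replay only, no validation
    return submitted


# 45 draws with a mix of blend modes (made-up data for the sketch).
draws = [{"mesh": i, "blend": "alpha" if i % 5 == 0 else "opaque"} for i in range(45)]

gl_like = Driver()
immediate_mode(gl_like, draws, frames=100)

vk_like = Driver()
submitted = prebuilt_mode(vk_like, draws, frames=100)
```

Both paths issue the same number of draws over 100 frames, but the pre-built path pays the validation cost once per draw rather than once per draw per frame.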

Vulkan in action: Gnome Horde demo screenshot

In this screenshot there are 300 tiles with a total of 13,500 draw calls being run at roughly 30fps with very little CPU usage; this is approaching half a million draw calls per second without instancing.
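
The figures above can be checked with a quick calculation. The per-tile count of 45 draw calls is the one quoted in the parallel command buffer generation section, and the frame rate is only approximate, so the result is an estimate.

```python
tiles = 300
draws_per_tile = 45      # per-command-buffer figure quoted in the article
frames_per_second = 30   # "roughly 30fps", so the result is approximate

draws_per_frame = tiles * draws_per_tile
draws_per_second = draws_per_frame * frames_per_second
```

That gives 13,500 draw calls per frame and about 405,000 per second.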

Parallel command buffer generation

In the next demo modes, the CPU graph shows usage going from very little to nearly the whole of every CPU core. What’s happening here is that the camera is moving much faster, so command buffers need to be regenerated more frequently (a slightly unrealistic use case). In OpenGL ES we are CPU bound and cannot feed the GPU with enough commands. With Vulkan, however, we have the opportunity to distribute the regeneration of the tiles’ command buffers across different threads. This is not possible with OpenGL ES, which was designed before multi-threading was widely available. In a real application, the workload will be somewhere between the two extremes of dynamic draw calls and static draw calls.

In this case we are trading CPU usage for memory usage. We could store all of the command buffers for the entire scene in memory. However, on mobile devices memory is often limited, so we only store the command buffers that are in the viewable frustum instead. With Vulkan we are purposefully bound by GPU performance, which goes to show that we are using the CPU effectively and feeding the GPU with enough commands.

Vulkan CPU vs OpenGL ES CPU: Note how OpenGL ES cannot do multi-threading

For equivalent performance, the Vulkan demo could have the CPU run at a much lower clock frequency, increasing efficiency and battery life compared to OpenGL ES.

In this mode there are roughly 80 command buffers being re-created each frame, distributed between the cores of the CPU. Each command buffer has 45 draw calls and other state-setting information. With all this work going on, it is good to see the frame rate stays the same as in the previous mode.
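
The structure of this mode can be sketched with a thread pool. This is a hedged illustration, not the demo's implementation: `record_tile` and `regenerate` are hypothetical names, and the "command buffer" is just a Python list. In CPython the global interpreter lock means these threads will not really run the recording in parallel, so the sketch shows the shape of the work (independent per-tile recording, results gathered for submission), not the speed-up.

```python
import os
from concurrent.futures import ThreadPoolExecutor

DRAWS_PER_BUFFER = 45   # figure from the article
DIRTY_TILES = 80        # roughly 80 buffers re-created per frame


def record_tile(tile_id):
    """Stand-in for recording one tile's command buffer on a worker thread."""
    # Each call builds its own list, so no locking is needed while recording.
    return tile_id, [("draw", tile_id, i) for i in range(DRAWS_PER_BUFFER)]


def regenerate(dirty_tiles, workers):
    """Record all dirty tiles on a pool and return them keyed by tile id."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, which keeps submission deterministic.
        return dict(pool.map(record_tile, dirty_tiles))


workers = os.cpu_count() or 1
buffers = regenerate(range(DIRTY_TILES), workers)
draws_rerecorded = sum(len(cmds) for cmds in buffers.values())
```

With 80 buffers of 45 draws each, about 3,600 draw calls are re-recorded per frame in this mode. Because each tile is recorded independently, the work divides cleanly between cores.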

Memory allocation strategies

One advantage of Vulkan over OpenGL ES is that the developer has more visibility of the memory that needs to be allocated. With OpenGL ES, the driver handles most of the allocation and hides it away from the developer. With Vulkan, the driver itself allocates very little memory, and the developer can use different memory allocation strategies. For example, if an image is not in use by the GPU, the developer could decide to use that memory for other purposes, like uploading a texture.
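
One such strategy is for the application to sub-allocate from a single large block and recycle ranges itself. The Python sketch below models that idea with a first-fit free list; the `Heap` class and the resource names are hypothetical, and a real application would also have to make sure the GPU has finished with a range before reusing it.

```python
class Heap:
    """Toy device-memory block that the application sub-allocates itself."""

    def __init__(self, size):
        self.size = size
        self.free = [(0, size)]        # list of (offset, length) free ranges
        self.live = {}                 # name -> (offset, length)

    def alloc(self, name, length):
        # First-fit search through the free ranges.
        for i, (offset, span) in enumerate(self.free):
            if span >= length:
                remainder = span - length
                if remainder:
                    self.free[i] = (offset + length, remainder)
                else:
                    del self.free[i]
                self.live[name] = (offset, length)
                return offset
        raise MemoryError(f"no room for {name}")

    def release(self, name):
        # Return the range to the free list (no coalescing in this sketch).
        self.free.append(self.live.pop(name))
        self.free.sort()


heap = Heap(size=64)
shadow_map = heap.alloc("shadow_map", 32)
scene_colour = heap.alloc("scene_colour", 32)

# The shadow map is no longer in use, so its range is recycled as the
# destination for a texture upload instead of asking for more memory.
heap.release("shadow_map")
texture_upload = heap.alloc("texture_upload", 32)
```

Here the texture upload lands at the offset the shadow map used to occupy, so the block never grows beyond its original size.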

Render pass – pixel local storage

In Vulkan there is a structure called a render pass; each render pass has one or more subpasses. These subpasses can be exploited to utilise pixel local storage to store intermediate values for shaders between subpasses.

Being a tile-based deferred renderer, PowerVR can execute multiple shaders for the same pixel in an FBO, effectively using fast on-chip memory. This is useful in rendering techniques such as deferred rendering. The benefit of doing this is that it avoids wastefully writing intermediate values back to main memory, saving bandwidth and therefore power. However, this functionality is an extension in OpenGL ES, requiring more code to check if the extension exists.

In Vulkan this functionality is a core feature that will benefit battery life and the efficiency of applications and devices. Vulkan also allows the driver to handle out-of-memory issues gracefully with respect to deferred renderers and the transient memory they use.
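
To give a feel for the bandwidth argument, the following Python sketch counts main-memory traffic for a deferred-style frame rendered two ways. All numbers are assumptions chosen for illustration (a 32x32 tile, three intermediate values per pixel, traffic counted in values rather than bytes); they are not PowerVR specifications, and the function names are hypothetical.

```python
TILE_PIXELS = 32 * 32      # assumed tile size, for illustration only
GBUFFER_VALUES = 3         # assumed intermediates per pixel (e.g. albedo, normal, depth)


def separate_passes(tiles):
    """Two passes: intermediates go out to main memory and are read back."""
    traffic = 0
    for _ in range(tiles):
        traffic += TILE_PIXELS * GBUFFER_VALUES   # pass 1 writes intermediates
        traffic += TILE_PIXELS * GBUFFER_VALUES   # pass 2 reads them back
        traffic += TILE_PIXELS                    # pass 2 writes final colour
    return traffic


def subpasses_on_chip(tiles):
    """One render pass, two subpasses: intermediates stay in tile memory."""
    traffic = 0
    for _ in range(tiles):
        # Intermediates are produced and consumed on chip, so they add nothing here.
        traffic += TILE_PIXELS                    # only final colour is written
    return traffic


tiles = 100
off_chip = separate_passes(tiles)
on_chip = subpasses_on_chip(tiles)
saving = off_chip / on_chip
```

Under these assumptions the on-chip path moves a seventh of the data; the real ratio depends on how many intermediate values a technique needs and how large they are.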

Finally

All of the features above require implementation in code, so the use of Vulkan does come with added code complexity compared to OpenGL ES. However, Imagination is committed to continuing full support for OpenGL ES for a long time to come alongside developing a new Vulkan API driver for PowerVR Rogue GPUs.

Devices with the new Vulkan API should bring new optimisation opportunities and increased efficiency to application developers.

If you are heading to SIGGRAPH 2015 this week, drop by the Khronos Group BoF meeting on Wednesday to see this demo in action and get an explanation of what is going on.

Stay tuned to our blog as we will bring you more details after the BoF.

Remember to also follow us on Twitter (@ImaginationPR, @PowerVRInsider) for the latest news and announcements from the PowerVR Insider team.

Editor’s Note

* The prototype Vulkan driver for PowerVR Rogue GPUs is based on an internal draft Khronos Specification, which may change prior to final release. Conformance criteria for this Specification have not yet been established.

PowerVR Rogue GPUs are based on published Khronos specifications, and are expected to pass the Khronos Conformance Testing Process. Multiple PowerVR Rogue GPU cores have already achieved OpenGL ES conformance. Current conformance status can be found at www.khronos.org/conformance.

OpenGL is a registered trademark and the OpenGL ES logo is a trademark of Silicon Graphics Inc. used by permission by Khronos.

Comments

  • sas

    Neat, I would like to know if you are planning on open-sourcing the code and helping with the integration for OpenGL next gen?

    • Not at the moment since the Vulkan spec isn’t final. We will see what we can do when that happens.

  • Alexander von Gluck

    Yeah, i’d also really like to see this demo open sourced… We need more open demos in the ecosystem that test the capabilities of API’s

    • I’d love to open source it. Unfortunately we’d have to wait until the Vulkan spec has been released. I might be able to show you the code from the OpenGL side of things only.

    • I will send your feedback to the PowerVR graphics demo team; perhaps we can make certain parts available – or all of it when the spec is finalized.

    • Steven Szennai

      I’m working on an open source implementation (desktop for now). However it seems that the performance depends heavily on the polygon count of the rendered objects and/or the actual hardware.

  • it’s possible to try the demo?

    • Pavel Sikun

      well, if you have gpu with support of vulkan – then you could possibly search for this demo.

      But afaik we won’t see any commercial device with supported mobile gpu until Q2 2016

      • The gpu in the video is a simple 6430 from 2013..

        • pj

          Shimo – you need the gpu *and* the OS/driver support for Vulkan. Nobody has those last two in the wild as far as I know.

          • That’s correct, this is a prototype driver that we’ve flashed on an experimental version of Android.

    • Once the Vulkan spec is finalized and the driver achieves conformance, you will be able to run Vulkan-compatible apps on PowerVR GPUs.

  • Michael DeGuzis

    This is crazy awesome to see these kinds of results and metrics.

    • Thank you, we hope to achieve even better results when the spec is finalized and the PowerVR driver achieves conformance.

  • NEKO WORKING

    Can’t wait for next handheld console :3

    • Matthew Smit

      Consoles (and the console developers) have always had the ability to directly access hardware, which is the main reason why the xbox 360 and PS3 could have such good graphics on hardware that is very old, and PCs required much better GPUs to run at an equivalent quality.

      Since Vulkan is coming to PCs and mobiles, this will add the ability for developers to release more optimized games for Android.

  • LDM

    May I ask a stupid question? Since there is such a huge difference in performance between these 2 APIs, why is there still the need to support OpenGL ES? The latter seems not efficient at all..

    • That’s not a stupid question at all. Vulkan is not a replacement for OpenGL (ES), but a complementary API for developers who want to get close to the metal (e.g. game engine guys).
      OpenGL ES will still be available for a lot of high-level, quick-and-easy rendering. For example, simple UIs or 2D/3D games will likely not see any benefit from Vulkan.

      • LDM

        Thanks Alex 🙂

  • Giuseppe Barbieri

    How many triangles per gnome? How many different materials, textures, blend modes and shaders are being used?

    • Hey, each object has various levels of detail. From 13K vertices to ~300. In the zoomed out screen shot there are about 1M triangles. Each type of object has a different texture, so ~10 different textures including the shadows. There is alpha-testing for the plants and alpha blending for the shadows and each object type is using a different shader.

  • really nice article.

  • Aaron Sarrafoğlu

    Are you willing to support older designs such as PowerVR SGX544 or something like it?

    • Vulkan mandates the OpenGL ES 3.1 feature set. PowerVR SGX GPUs support up to OpenGL ES 2.0 so we are not able to support Vulkan on older generation hardware.

      • LDM

        Is that applicable for PS Vita GPU as well ?

        • verdantchile

          The Vita GPU’s functionality falls short of Vulkan’s standards, but the custom console API(s) Sony provides can still enable low-level GPU access for many of the benefits.

          • LDM

            Thanks!

  • johnBas5

    Interesting test and comparison.

    I’m hoping a future version of Gnome Hordes comes out with other optimization features and CPU-saving tricks (post Vulkan 1.0, say Vulkan 1.1 or something).

    Being able to see the impact of those features on the different API’s would be very interesting.

  • Are these kinds of improvements available for geometry which changes position, or only static (I notice the gnomes do not move)

  • Gabriel Chiarelli Noble

    I don’t know if this is the right place to ask this but… OpenGL has things like glGenBuffers, glGenFramebuffers, etc. Vulkan will keep this kind of API call or it will use a completely different approach regarding data storage into the GPU?

  • Kazami Yuji

    The download link for the gnome demo doesn’t work, it redirects me to https://www.imgtec.com/

    • If you are talking about http://www.imgtec.com/vulkan, the download link works for me (see image below)

      I’ll check in with our website team tomorrow, perhaps there is a permission issue related to your account.

      Regards,
      Alex.

      • Daniel Fries

        This is probably a dumb question but which application do you use to run this demo in Windows?

        • The source code can be used on Android and we currently don’t have a plan to provide a version for Windows.

          • Meli

            The link doesn’t work for me either :/