PowerVR PerfDoc – Performance Validation for PowerVR

It’s time for some exciting news! As usual, our engineers here in the Developer Technology team have been hard at work making your life a little bit easier when developing for PowerVR devices.

Today, we’re releasing PVRPerfDoc, a custom version of the Vulkan® validation layer PerfDoc, tailored to be more useful for developers working with PowerVR hardware. If that’s all you need to hear, why not check out PVRPerfDoc on GitHub?

However, for the uninitiated…

What is PerfDoc?

PerfDoc is a Vulkan validation layer produced by our friends at ARM. Validation layers are an important component of the Vulkan API, and they are part of the reason it has a much lower CPU overhead than older APIs, like OpenGL® ES. Vulkan has very little built-in error checking. Instead, a developer can enable a layer, like the LunarG validation layer, which checks that the API is being used correctly when the application is run.

So, what makes PerfDoc different from any other validation layer? Instead of checking whether Vulkan is being used correctly (in line with the Vulkan specification), it focuses on whether the API is being used efficiently. When a Vulkan application is run with the PerfDoc layer active, any issues it finds are reported to the application through callbacks or printed to an available console. This means inefficient uses of the API, such as creating a pipeline without a pipeline cache or using too many instanced vertex buffers, can be caught and fixed during development. Addressing the issues raised by PerfDoc can improve application performance and help you get the best out of Vulkan.

Unfortunately, since PerfDoc validates API usage against ARM’s performance recommendations for Mali, some of these recommendations may actually hurt performance on PowerVR hardware.

This is where we come in: we’ve forked PerfDoc to validate API usage against our own performance recommendations for PowerVR hardware.

What have we done?

We’ve customised PerfDoc for PowerVR in two ways:

  1. We’ve made all of the standard checks toggle-able, so you can select which of the original performance recommendations your application will be checked against. By default, we’ve enabled the checks which tell you how to improve performance on PowerVR.
  2. Next, we’ve added lots of extra checks which are based on our own PowerVR Performance Recommendations. This set of recommendations was developed by our experienced team of engineers here in the Developer Technology team, to help you get the most out of your applications running on PowerVR devices.

Whether you’re a Vulkan newbie or a more experienced developer, our customised layer is a great way to ensure you’re always thinking about performance when developing for PowerVR with Vulkan.

For the rest of this post we’ll take a look at some of the performance recommendations that this new layer checks against. This will give you an idea of how this new layer can help you.

Remember to use mipmapped and compressed textures

Using both compressed textures, such as PVRTC, ASTC, and ETC2, and mipmapping can help reduce the strain on memory bandwidth as textures being loaded will be smaller in size. Memory bandwidth issues are common bottlenecks for graphics-heavy applications, so reducing bandwidth usage in any application will generally give a nice performance boost.

Figure: an example mipmap chain.

The layer will output an error when a texture uses an uncompressed format or doesn’t have any mipmap levels specified in its associated image view. The exception is when the image view is a render target.

An easy way to get mipmapped and compressed textures is with our texture processing tool, PVRTexTool. This powerful tool allows texture encoding into a variety of compressed and uncompressed formats including PVRTC, ASTC, and ETC. You can also generate a full mipmap chain with just one click.
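To give a feel for what a full mipmap chain involves, here is a minimal Python sketch that lists the dimensions of every level for a given base texture and totals the texel count. The function names are our own illustration and are not part of PVRTexTool or the layer. It assumes each level halves the previous one, rounding down and never dropping below one texel.

```python
def mip_chain(width, height):
    """Return the (width, height) of every level in a full mipmap chain."""
    levels = [(width, height)]
    while width > 1 or height > 1:
        # Each level is half the size of the previous one, clamped to 1 texel.
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels.append((width, height))
    return levels


def chain_texels(width, height):
    """Total number of texels across all levels of the chain."""
    return sum(w * h for w, h in mip_chain(width, height))


if __name__ == "__main__":
    for w, h in mip_chain(256, 256):
        print(f"{w} x {h}")
    extra = chain_texels(256, 256) / (256 * 256) - 1
    print(f"extra storage for the full chain: {extra:.1%}")
```

For a 256 x 256 texture this gives nine levels and roughly a third more storage than the base level alone. In exchange, distant surfaces sample from the much smaller levels instead of the full-size image.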

Try to use indexed draw calls when a call involves a lot of geometry

Indexed draw calls can help to reduce the amount of geometry by eliminating redundant vertices in the vertex buffer. PowerVR GPUs are also optimised for indexed draw calls. This means you can really notice the difference when rendering scenes with complex geometry.
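As a rough illustration of how indexing removes redundant vertices, the Python sketch below turns a flat list of triangle vertices into a list of unique vertices plus an index list. The helper name and the quad data are made up for this example. A real application would do the same thing with full vertex attributes and upload the results as vertex and index buffers.

```python
def build_index_buffer(triangle_vertices):
    """Deduplicate a flat vertex list into (unique_vertices, indices)."""
    unique_vertices = []
    index_of = {}  # maps a vertex to its slot in unique_vertices
    indices = []
    for vertex in triangle_vertices:
        if vertex not in index_of:
            index_of[vertex] = len(unique_vertices)
            unique_vertices.append(vertex)
        indices.append(index_of[vertex])
    return unique_vertices, indices


# A quad drawn as two triangles: six vertices, two of them repeated.
quad = [
    (0.0, 0.0), (1.0, 0.0), (1.0, 1.0),
    (0.0, 0.0), (1.0, 1.0), (0.0, 1.0),
]

if __name__ == "__main__":
    vertices, indices = build_index_buffer(quad)
    print(len(quad), "vertices before,", len(vertices), "after")
    print("indices:", indices)
```

Here six vertices shrink to four, plus six small indices. The saving grows with mesh complexity, because each vertex in a typical closed mesh is shared by several triangles.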

If you really want to push your performance further, you can also sort both the index and vertex buffers to improve cache efficiency. PVRGeoPOD, our scene exporter and optimisation tool, can do this automatically for you.

Ensure framebuffer compression is used as much as possible

PowerVR GPUs use PowerVR Image Compression (PVRIC) which is Imagination’s proprietary, lossless framebuffer compression and decompression (FBCDC) algorithm. This compression scheme helps to reduce the demands on memory bandwidth by shrinking image sizes by around 50%. The layer will advise when framebuffer compression should be used.

Figure: PVRIC lossless vs lossy.
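To put the "around 50%" figure in context, here is a back-of-the-envelope Python sketch of the memory traffic generated by writing one colour framebuffer per frame. The resolution, pixel size and frame rate are assumptions chosen for illustration, and the 0.5 ratio is simply the approximate figure quoted above. Real savings from a lossless scheme depend on the image content.

```python
def framebuffer_traffic_mb_per_s(width, height, bytes_per_pixel, fps,
                                 compression_ratio=1.0):
    """Estimate write traffic in MB/s for one framebuffer written once per frame.

    compression_ratio is compressed size / uncompressed size (1.0 = none).
    """
    bytes_per_frame = width * height * bytes_per_pixel * compression_ratio
    return bytes_per_frame * fps / 1e6


if __name__ == "__main__":
    # Assumed workload: 1920x1080, 4 bytes per pixel, 60 frames per second.
    raw = framebuffer_traffic_mb_per_s(1920, 1080, 4, 60)
    compressed = framebuffer_traffic_mb_per_s(1920, 1080, 4, 60, 0.5)
    print(f"uncompressed: {raw:.1f} MB/s")
    print(f"with ~50% compression: {compressed:.1f} MB/s")
```

This only counts a single write per frame. Any later reads of the same image, for example by a post-processing pass or the display, add to the traffic and benefit in the same way.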

Try to use optimal subpass dependency flags when creating subpasses

The layer can check which subpass dependency flag has been set during the creation of a subpass and output an error if the flag selected is not optimal for PowerVR architectures. For PowerVR and other tile-based architectures, VK_DEPENDENCY_BY_REGION_BIT is the best flag for performance, as whole tiles can be kept in fast on-chip memory.
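The kind of check described here can be modelled in a few lines of Python. This is only a sketch of the idea, not the layer’s actual code. The constant uses the value from the Vulkan headers, while the function name and the plain list of flag values are assumptions made for illustration.

```python
# Value of VK_DEPENDENCY_BY_REGION_BIT in the Vulkan headers.
VK_DEPENDENCY_BY_REGION_BIT = 0x00000001


def dependencies_missing_by_region(dependency_flags):
    """Return the indices of subpass dependencies that lack BY_REGION.

    dependency_flags is a list of integer dependencyFlags values, one per
    subpass dependency, in the order they were declared.
    """
    return [
        index
        for index, flags in enumerate(dependency_flags)
        if not flags & VK_DEPENDENCY_BY_REGION_BIT
    ]


if __name__ == "__main__":
    flags = [VK_DEPENDENCY_BY_REGION_BIT, 0x0]
    for index in dependencies_missing_by_region(flags):
        print(f"dependency {index}: consider VK_DEPENDENCY_BY_REGION_BIT")
```

Note that a by-region dependency is only appropriate when the later subpass reads the same pixel locations the earlier one wrote, so any flagged dependency still needs a quick review rather than a blind change.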

Avoid using partial clears of the framebuffer

Clear commands which do not clear the entire framebuffer can be detected with this layer. Partial clears are generally bad for performance as they result in overdraw.
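As a simple illustration of what counts as a partial clear, the Python sketch below compares a clear rectangle against the framebuffer extent. The function and the tuple layouts are hypothetical, and the layer’s own detection logic may differ.

```python
def is_partial_clear(clear_rect, framebuffer_extent):
    """Return True if clear_rect does not cover the whole framebuffer.

    clear_rect is (x, y, width, height) and framebuffer_extent is
    (width, height), both in pixels.
    """
    x, y, width, height = clear_rect
    fb_width, fb_height = framebuffer_extent
    covers_everything = (
        x <= 0 and y <= 0
        and x + width >= fb_width
        and y + height >= fb_height
    )
    return not covers_everything


if __name__ == "__main__":
    extent = (1920, 1080)
    print(is_partial_clear((0, 0, 1920, 1080), extent))  # full clear
    print(is_partial_clear((0, 0, 960, 1080), extent))   # left half only
```

Where possible, a full clear at the start of the renderpass is the safer choice, since it avoids touching pixels that are then drawn over again.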

Keep pipeline optimisation enabled

When creating a pipeline in Vulkan, you can set the flags parameter to VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT. This specifies that the created pipeline will not be optimised, which may help to reduce pipeline creation time. However, it should only be set in debug or development builds, not release builds, as it can harm overall performance.
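One way to keep this flag out of release builds is to derive the pipeline creation flags from the build type in a single place. The Python sketch below models that idea. The constant uses the value from the Vulkan headers, while the helper name and the debug_build switch are assumptions for illustration.

```python
# Value of VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT in the Vulkan headers.
VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT = 0x00000001


def pipeline_create_flags(debug_build, base_flags=0):
    """Return pipeline creation flags suited to the build type."""
    if debug_build:
        # Faster pipeline creation while iterating during development.
        return base_flags | VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT
    # Release builds always clear the bit so pipelines are optimised.
    return base_flags & ~VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT


if __name__ == "__main__":
    print(hex(pipeline_create_flags(debug_build=True)))
    print(hex(pipeline_create_flags(debug_build=False)))
```

Clearing the bit explicitly in the release path means a flag left over from debugging cannot slip through unnoticed.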

Split multiple renderpasses into subpasses wherever possible

If you’re still getting used to Vulkan, it can be easy to forget about subpasses. In Vulkan, renderpasses can be divided into different phases of rendering called subpasses. Each of these subpasses has its own specified dependencies. This allows the driver to perform various optimisations, as it knows exactly what you’re trying to do. This layer can detect when multiple renderpasses are set to act on the same render target and will warn you. In these cases, it’s best to combine the renderpasses into multiple subpasses within a single renderpass.
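To show the pattern being warned about, here is a small Python sketch that scans a frame’s renderpasses in submission order and groups consecutive passes that write to the same render target. The pass names, target names and function are invented for this example. The exact criteria the layer uses are not described here, so treat this as one plausible reading.

```python
def find_merge_candidates(passes):
    """Group consecutive renderpasses that share a render target.

    passes is a list of (pass_name, render_target) tuples in submission
    order. Returns a list of runs, each a list of two or more pass names.
    """
    runs = []
    current = []
    current_target = None
    for name, target in passes:
        if current and target != current_target:
            # The target changed, so close the run we were building.
            if len(current) > 1:
                runs.append(current)
            current = []
        current.append(name)
        current_target = target
    if len(current) > 1:
        runs.append(current)
    return runs


if __name__ == "__main__":
    frame = [
        ("gbuffer", "scene"),
        ("lighting", "scene"),
        ("shadow", "shadow_map"),
        ("ui", "backbuffer"),
        ("post", "backbuffer"),
    ]
    for run in find_merge_candidates(frame):
        print("could become subpasses of one renderpass:", run)
```

Whether a given run can really be merged depends on how the later passes read earlier results. Subpasses suit cases where each pixel only needs data from the same pixel location.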

But there’s much more…

We’ve added 15 new checks in total. These include:

  • Checking whether textures are using linear or optimal tiling layout
  • Ensuring the workgroup size is optimal for PowerVR
  • Advising on texture formats
  • And more…

So… what are you waiting for?

You can take a look at PVRPerfDoc on GitHub. Here’s a direct link to our first release, PVRPerfDoc v1.0.

We’d love to hear feedback about this or any other tool we produce.

To get in touch or keep up with the latest Developer Technology news, follow @tom_devtech on Twitter.

If you need any help, check out our forums or ticketing systems.

Tom Lewis

Tom Lewis is a graduate technical author in the PowerVR Developer Technology team at Imagination. He is responsible for producing documentation to support the PowerVR SDK and Tools, including user manuals and guides. Outside of this, you will probably find him cycling up a hill that is far too steep or catching up on the latest PC game releases.
