Understanding PowerVR Series5XT: PowerVR, TBDR and architecture efficiency

In the previous three sections, I discussed the need to match market requirements, our rapid adoption of multiple graphics APIs, and our uncompromising support for GPU compute across all our PowerVR GPU cores.

In this section, I will dive more deeply into the underlying hardware architecture, highlighting a selection of the patented unique features, as well as the flexibility of our modern efficiency-focussed GPU architecture, which made it possible to support today’s requirements on our Series5XT architecture.

Mobile graphics are all about efficiency

Imagination’s PowerVR GPUs have always been different: from day one, our hardware architects realised that brute force solutions will sooner rather than later run into problems, and with mobile GPUs these problems all too easily include power consumption and/or bandwidth limitations.

By building our whole architecture around efficiency, we aim to stay ahead of our competition, whether that is measured as performance versus power consumption or, equally important, performance versus an always-limited bandwidth budget.

This focus on efficiency made it natural for us to target the mobile market in 2001, as power consumption is critical in this market. Battery capacity is always limited and, linked to this, heat is an increasing problem as thin form factors grow in popularity.

Equally important, bandwidth in mobile is limited: for power reasons the memory infrastructure is unified and shared, so the bandwidth available to the GPU is restricted as well.

PowerVR is the only Tile Based Deferred Rendering (TBDR) implementation

Driven by our focus on efficiency, our architects rejected the brute force Immediate Mode Rendering (IMR) approach early on, as it is extremely wasteful in both bandwidth usage and processing, as illustrated below:

[Figure: The Immediate Mode Rendering (IMR) graphics pipeline]

We considered Tile Based Rendering (TBR), which ensures 100% cache efficiency for the depth and colour buffers (keeping much of the bandwidth on-chip, a significant efficiency gain), but found that this approach still wastes many cycles and much bandwidth on texturing and shading invisible pixels, as illustrated below:

[Figure: The Tile Based Rendering (TBR) graphics pipeline]
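As a rough illustration of the on-chip bandwidth point, the sketch below counts framebuffer accesses for one frame under IMR and under TBR. It is a toy model under stated assumptions, not a description of any real GPU: there are no caches or compression, every depth read, depth write and colour write costs one access, geometry is opaque, and a smaller depth value is closer to the camera. The function names and the `full_screen_layers` helper are hypothetical and exist only for this example.

```python
# Toy model: where do framebuffer accesses land for IMR versus TBR?
# Assumptions (not from the article): no caches, no compression, opaque
# geometry, one access per depth read / depth write / colour write.


def full_screen_layers(width, height, depths):
    """Hypothetical helper: one full-screen layer of (x, y, z) fragments
    per entry in `depths`, in submission order."""
    return [(x, y, z) for z in depths for y in range(height) for x in range(width)]


def depth_test_accesses(fragments, width, height):
    """Count depth/colour buffer accesses made while depth-testing fragments."""
    depth = [[float("inf")] * width for _ in range(height)]
    accesses = 0
    for x, y, z in fragments:
        accesses += 1              # depth read for the depth test
        if z < depth[y][x]:        # fragment is the closest seen so far
            depth[y][x] = z
            accesses += 2          # depth write + colour write
    return accesses


def imr_traffic(fragments, width, height):
    """IMR in this model: depth and colour buffers live in external memory."""
    return {"external": depth_test_accesses(fragments, width, height), "on_chip": 0}


def tbr_traffic(fragments, width, height):
    """TBR in this model: the same accesses hit on-chip tile memory, and each
    pixel's final colour is written to external memory once."""
    return {
        "external": width * height,
        "on_chip": depth_test_accesses(fragments, width, height),
    }


if __name__ == "__main__":
    # Three opaque full-screen layers submitted back to front on a 4x4 screen.
    frags = full_screen_layers(4, 4, [3.0, 2.0, 1.0])
    print("IMR:", imr_traffic(frags, 4, 4))
    print("TBR:", tbr_traffic(frags, 4, 4))
```

In this model the number of depth and colour accesses is identical for both approaches; what changes is that TBR moves them into tile memory and leaves only the final colour write-out in external memory.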

So ultimately our architects designed what is today known as true Tile Based Deferred Rendering (TBDR), a rendering approach which aims to ‘delay’ or ‘defer’ all texturing and shading operations until their visibility is known. This is illustrated below:

[Figure: The Tile Based Deferred Rendering (TBDR) graphics pipeline (PowerVR Series5XT)]
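To make the effect of deferring concrete, here is a minimal sketch that counts how many fragments get textured and shaded with and without deferral. It is only a model of the idea, assuming opaque geometry, a fragment represented as `(x, y, depth)` and a smaller depth meaning closer; the function names are hypothetical and say nothing about how the hardware is actually implemented.

```python
# Toy model: shaded fragment counts with and without deferred shading.
# Assumptions (not from the article): opaque geometry only, fragments are
# (x, y, depth) tuples, smaller depth = closer to the camera.


def layers(width, height, depths):
    """Hypothetical helper: one full-screen layer per depth, in submission order."""
    return [(x, y, z) for z in depths for y in range(height) for x in range(width)]


def shaded_without_deferral(fragments):
    """IMR or plain TBR in this model: a fragment is shaded as soon as it
    passes the depth test, even if a later fragment will cover it."""
    nearest = {}
    shaded = 0
    for x, y, z in fragments:
        if z < nearest.get((x, y), float("inf")):
            nearest[(x, y)] = z
            shaded += 1            # shading work that may later be overwritten
    return shaded


def shaded_with_deferral(fragments):
    """TBDR in this model: resolve visibility for every pixel first, then
    shade each covered pixel exactly once."""
    nearest = {}
    for x, y, z in fragments:
        if z < nearest.get((x, y), float("inf")):
            nearest[(x, y)] = z    # visibility only, no shading yet
    return len(nearest)            # one shading operation per visible pixel


if __name__ == "__main__":
    back_to_front = layers(4, 4, [3.0, 2.0, 1.0])
    print("without deferral:", shaded_without_deferral(back_to_front))
    print("with deferral:   ", shaded_with_deferral(back_to_front))
```

When the same layers are submitted front to back, both functions report the same count, which is why non-deferred renderers depend on the application sorting its geometry while a deferred renderer does not.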

By now, you’re probably wondering why I feel the need to add the term ‘true’ to my references to TBDR. The reason is simple: competitor marketing hype machines have been trying to redefine what we have been calling TBDR for the past 20 years.

Their reasoning is that TBR ‘defers’ the processing of pixels until after the geometry has been tiled, and hence they feel the need to claim that ‘TBR’ = ‘TBDR’. Unfortunately, this ignores our 20-year-old definition of ‘true’ TBDR, where the ‘deferring’ refers to deferring the texturing and shading work. So when you see a claim that a GPU has TBDR, check that it really is ‘true’ TBDR.

If you’d like to read more about TBDR and its benefits, have a look at the excellent whitepaper we have available on our PowerVR Insider developer support portal.

To date, TBDR continues to be the most advanced and efficient approach to rendering. Over time, ‘brute force’ solutions have adopted an ever-increasing number of TBDR properties: tiled rasterisation, tiled rendering approaches for efficient usage of eDRAM, and software-assisted ‘deferred rendering’, where the application submits depth pre-passes to try to approximate the inherent benefits of the hardware-based deferred rendering of our PowerVR TBDR architecture.
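The software depth pre-pass mentioned above can be sketched in the same spirit. This is a simplified model under assumptions of its own: opaque geometry, fragments as `(x, y, depth)` tuples with smaller depth being closer, and an application that submits the whole scene twice. The function name and the returned counters are hypothetical.

```python
# Toy model of an application-side depth pre-pass.
# Assumptions (not from the article): opaque geometry, fragments are
# (x, y, depth) tuples, smaller depth = closer, scene submitted twice.


def render_with_depth_prepass(fragments):
    """Return (shaded, rasterised) fragment counts for a two-pass render."""
    # Pass 1: depth only. No texturing or shading, just find the nearest depth.
    nearest = {}
    for x, y, z in fragments:
        if z < nearest.get((x, y), float("inf")):
            nearest[(x, y)] = z

    # Pass 2: the same geometry is submitted again with an "equal" depth
    # test, so only the visible fragment at each pixel is shaded.
    shaded = sum(1 for x, y, z in fragments if z == nearest[(x, y)])

    # Every fragment was rasterised once per pass.
    rasterised = 2 * len(fragments)
    return shaded, rasterised


if __name__ == "__main__":
    # Three opaque full-screen layers on a 4x4 screen, back to front.
    frags = [(x, y, z) for z in (3.0, 2.0, 1.0) for y in range(4) for x in range(4)]
    shaded, rasterised = render_with_depth_prepass(frags)
    print("shaded:", shaded, "rasterised:", rasterised)
```

In this model the pre-pass brings shading down to one fragment per pixel, but only by processing and rasterising all geometry twice, which is the cost that resolving visibility in hardware avoids.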

Our efficiency through TBDR ensures the lowest possible bandwidth usage and the fewest processing cycles per frame, which together lead to high performance efficiency and the lowest power consumption per frame.

In the next blog post, I will continue to discuss how PowerVR SGX puts efficiency first through a series of unique architectural features like multithreading and multitasking that allow Series5XT cores to deliver uncompromising performance while keeping power consumption at a minimum.

If you have any questions or feedback about Imagination’s graphics IP, please use the comments box below. To keep up to date with the latest developments on PowerVR, follow us on Twitter (@GPUCompute, @PowerVRInsider and @ImaginationTech) and subscribe to our blog feed.

‘Understanding PowerVR’ is an on-going, multi-part series of blog posts from Kristof Beets, Imagination’s Senior Business Development Manager for PowerVR. These articles not only focus on the features that make PowerVR GPUs great, but also provide a detailed look at graphics hardware architectures and software ecosystems in mobile markets.

2 thoughts on “Understanding PowerVR Series5XT: PowerVR, TBDR and architecture efficiency”

  1. This is a bit late (and I have moved this post from another blog-post):
    I notice that the PowerVR is able to do far more draw calls than the
    competition. This seems to make sense with your TBDR solution. In a
    typical TBR, each call is transformed, rasterized, filled and written
    out to the FB (albeit in small tiles). If sorted front to back, the
    typical TBR can just ignore filling occluded fragments. But for your
    TBDR, the driver would have to defer the rasterization/filling to the
    very last step in order to sort fragments, meaning that each draw call
    would contend with far less work — it would more-or-less submit
    information for future processing. When you’re ready for the
    framebuffer, the bulk of the work would be done in batches: eg. sorting,
    rasterizing, fragment shading based on materials/assets, framebuffer
    writing.
    Is this true? If it is, it’s quite a clever implementation as it
    efficiently batches workloads which should be far more efficient than
    running a complete render cycle for each draw call.
    I guess the big challenge is ensuring that the operations can fit as well as possible in on-chip memory to avoid swapping.
    Let me know if I’m off base!
