Understanding PowerVR Series5XT: PowerVR, TBDR and architecture efficiency

In the previous three sections, I discussed the need to match market requirements, our rapid adoption of multiple graphics APIs, and our uncompromising support for GPU compute across all our PowerVR GPU cores.

In this section, I will dive more deeply into the underlying hardware architecture, highlighting a selection of its unique, patented features as well as the flexibility of our efficiency-focussed GPU design, which is what made it possible to support today’s requirements on our Series5XT architecture.

Mobile graphics are all about efficiency

Imagination’s PowerVR GPUs have always been different: from day one, our hardware architects realised that brute force solutions will sooner rather than later run into problems, and with mobile GPUs these problems all too easily include power consumption and/or bandwidth limitations.

By focussing our whole architectural design on ‘efficiency’, we set out to come out ahead of our competition on two measures: performance versus power consumption and, equally important, performance versus an always limited bandwidth budget.

This efficiency focus made it easy for us to target the mobile market in 2001, because power consumption is critical in this market. Battery capacity is always limited and, linked to this, heat is an increasing problem as thin form factors grow in popularity.

Equally important, bandwidth in mobile is limited: for power reasons, the memory infrastructure is unified and shared, and hence equally restricted.

PowerVR is the only Tile Based Deferred Rendering (TBDR) implementation

Driven by our focus on efficiency, our architects rejected the brute force Immediate Mode Rendering (IMR) approach early on, as it is extremely wasteful of both bandwidth and processing cycles, as illustrated below:

The Immediate Mode Rendering (IMR) graphics pipeline
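To make the waste concrete, here is a minimal sketch of how shading work in an immediate mode renderer depends on submission order. It is a toy model, not a description of any real GPU: every surface is assumed to be a full-screen opaque layer at a constant depth, an early depth test is assumed, and the function name `imr_shaded_fragments` is made up for this illustration.

```python
# Toy model (assumption): each surface is an opaque full-screen layer at one depth.
def imr_shaded_fragments(layer_depths, pixels):
    """Count fragments shaded by an immediate mode renderer with an early depth test.

    Layers are processed strictly in submission order; a smaller depth is closer.
    """
    nearest = float("inf")  # depth buffer value (identical for every pixel in this model)
    shaded = 0
    for depth in layer_depths:
        if depth < nearest:  # passes the depth test, so it is textured and shaded now
            shaded += pixels
            nearest = depth
    return shaded


pixels = 64 * 64
back_to_front = imr_shaded_fragments([0.9, 0.6, 0.3], pixels)  # worst-case ordering
front_to_back = imr_shaded_fragments([0.3, 0.6, 0.9], pixels)  # best-case ordering
print(back_to_front, front_to_back)
```

In this model the back-to-front ordering shades every layer even though only the closest one is visible, so the amount of wasted work depends entirely on how the application sorts its draw calls.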

We considered Tile Based Rendering (TBR), which ensures 100% cache efficiency for depth buffers and colour buffers (keeping much of that traffic on-chip, a significant efficiency gain), but found that this approach still wasted many cycles and much bandwidth on texturing and shading invisible pixels, as illustrated below:

The Tile Based Rendering (TBR) graphics pipeline
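The tiling step that both TBR and TBDR rely on can be sketched as a simple binning pass: each triangle is assigned to every screen tile its bounding box touches, so that each tile can later be processed entirely in on-chip memory. This is only an illustration; the 16-pixel tile size, the bounding-box test and the name `bin_triangles` are assumptions, not details of the PowerVR hardware.

```python
from collections import defaultdict

TILE = 16  # assumed tile size in pixels, chosen for illustration only


def bin_triangles(triangles, width, height, tile=TILE):
    """Map each tile (tx, ty) to the indices of triangles whose bounding box touches it."""
    bins = defaultdict(list)
    max_tx = (width - 1) // tile
    max_ty = (height - 1) // tile
    for index, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        # Clamp the bounding box to the screen, then convert it to tile coordinates.
        tx0 = max(0, int(min(xs)) // tile)
        tx1 = min(max_tx, int(max(xs)) // tile)
        ty0 = max(0, int(min(ys)) // tile)
        ty1 = min(max_ty, int(max(ys)) // tile)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins[(tx, ty)].append(index)
    return dict(bins)


triangles = [
    [(1, 1), (10, 2), (4, 12)],    # fits inside the top-left tile
    [(5, 5), (30, 8), (12, 28)],   # spans all four tiles of a 32x32 screen
]
bins = bin_triangles(triangles, width=32, height=32)
print(bins)
```

A bounding-box test is conservative: it may list a triangle in a tile it does not actually cover, which costs a little extra work per tile but never drops geometry.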

So ultimately our architects designed what is today known as true Tile Based Deferred Rendering (TBDR), a rendering approach which aims to ‘delay’ or ‘defer’ all texturing and shading operations until their visibility is known. This is illustrated below:

The Tile Based Deferred Rendering (TBDR) graphics pipeline (PowerVR Series5XT)
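The deferral idea can be sketched for a single tile as two passes: first resolve which fragment is visible at each pixel, then texture and shade only that fragment. This is a simplified model under stated assumptions: only opaque fragments are handled, and `shade_tile_deferred` and the `shade` callback are hypothetical names invented for this example rather than anything from the PowerVR hardware or drivers.

```python
def shade_tile_deferred(fragments_per_pixel, shade):
    """Resolve visibility for every pixel of a tile first, then shade each pixel once.

    fragments_per_pixel: dict mapping (x, y) -> list of (depth, material) tuples
    in arbitrary submission order. Only opaque fragments are modelled.
    """
    # Pass 1: hidden surface removal. No texturing or shading happens here.
    visible = {}
    for pixel, fragments in fragments_per_pixel.items():
        if fragments:
            visible[pixel] = min(fragments, key=lambda f: f[0])  # closest fragment wins
    # Pass 2: shade only the surviving fragment of each pixel.
    return {pixel: shade(material) for pixel, (_, material) in visible.items()}


calls = []  # records every shading invocation so the work can be counted


def shade(material):
    calls.append(material)
    return material.upper()  # stand-in for real texturing and shading


tile = {
    (0, 0): [(0.9, "sky"), (0.5, "wall"), (0.2, "crate")],
    (1, 0): [(0.9, "sky"), (0.5, "wall")],
    (2, 0): [(0.9, "sky")],
}
colours = shade_tile_deferred(tile, shade)
print(colours, len(calls))
```

Six fragments are submitted in this example, but `shade` runs only three times, once per pixel, and the result does not depend on the order in which the fragments arrived.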

By now, you’re probably wondering why I feel the need to add the term ‘true’ to my references to TBDR. The reason for this is simple: competitor marketing hype machines have been trying to redefine what we have been calling TBDR for the past 20 years.

Their reasoning is that TBR ‘defers’ the processing of pixels until after the geometry has been tiled, and hence they feel the need to claim that ‘TBR’ = ‘TBDR’. Unfortunately, this ignores our 20-year-standing definition of ‘true’ TBDR, where the ‘deferring’ is about deferring the texturing/shading processing. Basically, when you see a claim that a GPU has TBDR, make sure that it really is ‘true’ TBDR.

If you’d like to read more about TBDR and its benefits, have a look at the excellent whitepaper we have available on our PowerVR Insider developer support portal.

To date, TBDR continues to be the most advanced and efficient approach to rendering. Over time, ‘brute force’ solutions have adopted an ever-increasing number of TBDR properties: tiled rasterisation; tiled rendering approaches for efficient usage of eDRAM; and software-assisted ‘deferred rendering’, where the application submits depth pre-passes to try to approximate the inherent benefits of the hardware-based ‘deferred rendering’ of our PowerVR TBDR architecture.
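The software depth pre-pass mentioned above can be sketched as follows. The application submits its geometry twice: once writing depth only, and once more for colour with the depth test set to ‘equal’. This is a toy model under loud assumptions: draws are flat, opaque, constant-depth sets of pixels, and `render_with_depth_prepass` is a name invented for this example.

```python
def render_with_depth_prepass(draws, pixel_count):
    """Model a software depth pre-pass on a renderer without hardware deferral.

    draws: list of (depth, pixels) tuples, where pixels is a set of pixel indices
    covered by an opaque draw call at a constant depth (assumption of this model).
    Returns (shaded_fragments, geometry_submissions).
    """
    depth_buffer = [float("inf")] * pixel_count
    submissions = 0

    # Pass 1: depth only. Every draw is submitted, but nothing is shaded.
    for depth, pixels in draws:
        submissions += 1
        for p in pixels:
            depth_buffer[p] = min(depth_buffer[p], depth)

    # Pass 2: colour. The same draws are submitted again with an 'equal' depth test,
    # so only the fragment that won pass 1 is textured and shaded.
    shaded = 0
    for depth, pixels in draws:
        submissions += 1
        for p in pixels:
            if depth == depth_buffer[p]:
                shaded += 1
    return shaded, submissions


draws = [
    (0.8, set(range(8))),  # background covering all 8 pixels
    (0.4, {2, 3, 4}),      # mid-ground object
    (0.1, {3}),            # foreground object
]
result = render_with_depth_prepass(draws, pixel_count=8)
print(result)
```

Shading drops to one fragment per pixel, but at the cost of submitting and processing every draw call twice, which is the overhead that hardware-based deferral avoids.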

Our efficiency through TBDR ensures the lowest possible bandwidth usage and the lowest number of processing cycles per frame, and all of this leads to high performance efficiency and the lowest power consumption per frame.

In the next blog post, I will continue to discuss how PowerVR SGX puts efficiency first through a series of unique architectural features like multithreading and multitasking that allow Series5XT cores to deliver uncompromising performance while keeping power consumption at a minimum.


‘Understanding PowerVR’ is an on-going, multi-part series of blog posts from Kristof Beets, Imagination’s Senior Business Development Manager for PowerVR. These articles not only focus on the features that make PowerVR GPUs great, but also provide a detailed look at graphics hardware architectures and software ecosystems in mobile markets.


2 thoughts on “Understanding PowerVR Series5XT: PowerVR, TBDR and architecture efficiency”

  1. This is a bit late (and I have moved this post from another blog-post):
    I notice that the PowerVR is able to do far more draw calls than the
    competition. This seems to make sense with your TBDR solution. In a
    typical TBR, each call is transformed, rasterized, filled and written
    out to the FB (albeit in small tiles). If sorted front to back, the
    typical TBR can just ignore filling occluded fragments. But for your
    TBDR, the driver would have to defer the rasterization/filling to the
    very last step in order to sort fragments, meaning that each draw call
    would contend with far less work — it would more-or-less submit
    information for future processing. When you’re ready for the
    framebuffer, the bulk of the work would be done in batches: eg. sorting,
    rasterizing, fragment shading based on materials/assets, framebuffer
    writing.
    Is this true? If it is, it’s quite a clever implementation as it
    efficiently batches workloads which should be far more efficient than
    running a complete render cycle for each draw call.
    I guess the big challenge is ensuring that the operations can fit as well as possible in on-chip memory to avoid swapping.
    Let me know if I’m off base!

