PowerVR Series8XT: pushing power efficiency to the next level

Last year, Imagination announced Furian, our newest graphics architecture and the first major update for PowerVR since 2012. This introduction marks a significant step in satisfying the ever increasing demands for more device performance, especially for new use cases such as AR, VR and AI, while keeping PowerVR at the top of the pile for power efficiency and performance in embedded GPUs.
Furian was designed with scalability in mind, ensuring that performance and area efficiency remain consistent regardless of the number of scalable processing units (SPUs) that are laid down. In this post, we will be focussing specifically on the power efficiency of the architecture and how this ensures that it remains ahead of rivals designs for performance per watt.
Furian IguanaFollowing the arrival of the Furian architecture, we announced the first Furian-based core available for licence – the PowerVR GT8525 and we have now announced the PowerVR GT8540 – a four cluster variant aimed at the premium mobile and automotive markets.
Compared to the multi-core designs currently shipping in high-end smartphones and tablets, the single scalable processing unit (SPU) – containing what we describe as two-clusters, in the PowerVR GT8525 could be considered a fairly modest option to launch as the first GPU available to licence. However, it should not be underestimated. The performance of this single SPU design is a testament to how far Furian has moved the game on.
This is because by far the bulk of the potential smartphone market is in the category that some parts of the industry refer to as ‘super-mid’. As technology enthusiasts that make industry-leading graphics designs, we at Imagination are always going to be fans of premium products. Indeed, many of us carry high-end devices in our pockets featuring no-compromise performance. However, these types of devices are undoubtedly expensive and increasingly it can be argued that for most people lower-cost, more sensibly priced devices, offer enough capability for almost everyone. In large part this precisely down to GPUs such as the PowerVR GT8525.
For many, a sensibly priced device offers enough power to comfortably handle day-to-day tasks, such as browsing the web, checking social media, and also playing visually complex games smoothly. They are even good enough for users to dabble in new types of experiences such as AR, VR and AI-based applications. What’s more, the SoCs in mid-range devices don’t traditionally use leading-edge manufacturing processes, which will limit the power budget they can spend, and thus the use of a smaller GPU will be preferable. That being the case, this makes the release of the PowerVR GT8525 an ideal GPU for SoC vendors to license, thanks to its ideal balance between efficiency and performance.
Architected for efficiency
What really enables us to stay ahead of the curve when it comes to power efficiency, though, are the changes and enhancements in Furian that build on the work already done with Rogue. These changes were made to optimise internal efficiency, which inherently has the benefits of offering more performance for a given power consumption point.
Furian overall architecture
 
More efficient SPU
One of the changes is inside the scalable processing units, which have been re-architected to be more area efficient with much higher utilisation of the Arithmetic Logic Unit (ALU). As the block diagram above indicates, the texture unit now has its own cache, ensuring that it doesn’t have to compete with the Unified Shading Clusters (USCs) for access to the data – the result is higher throughput efficiency with lower power. The texture unit now delivers double the fill-rate over the previous design at eight pixels per clock, for only a modest increase in area.
The 2D Data Master
Another enhancement, first introduced with Series7XT, and enhanced further here, concerns the 2D Data Master. It’s now fully asynchronous, enabling better utilisation of the core, also delivering better power efficiency. This enables the independent submission of 2D workloads, bypassing all tiling overhead (e.g. when doing 2D using 3D calls) making it much more efficient for things such as creating UIs – and again this helps to reduce power consumption.
Doubling the pipe flow
And yet another major enhancement to overall efficiency stems from the changes made to the primary ALU pipelines (see the diagram below). This has doubled in width from 16 to 32 pipelines – delivering twice the throughput per clock. What’s key though is that it does not take up twice the silicon area thanks to streamlining of the internal design and more shared control logic. In Rogue, the pipelines contained two Multiply Adds function blocks (MADs). However, after close analysis of the shader and kernel code typically written by developers, it became apparent that these two MADs were rarely being fully utilised, as using two together proved difficult for the compiler.
Furian ALU efficiencyTherefore, with Furian we instead implemented a MAD and a MUL, offering more performance in the real world, while keeping silicon area cost in check. And in scenarios where two MAD operations are actually required, Furian’s double-width pipeline enables it to match Rogue’s performance (16×2 MADs = 32×1 MAD), so in that sense at worst, nothing has been lost over the previous approach and most of the time, throughput is significantly increased.
Reducing latency
The changes don’t end there. The function calls between the GPU driver and the GPU for compute, now no longer have to go through the OS kernel layer. They now use something called ‘user mode queues’ to communicate directly, reducing overhead and latency, in turn again reducing power consumption.
Furian user command queues
Additionally, the GPU now supports improved simultaneous access to more local memory addresses, which means each ALU pipeline can directly access the memory areas they need without stalls.
Furian multi-threaded memory access
The result
So what are the results of all these changes? As we can see from the graph below, in the popular industry standard Kishonti GFXBench Manhattan 3.0 benchmark, the Series8XT GT8525 offers significantly more frames per second (fps) per watt, or fps/W, than our PowerVR GT7450, our previous generation equivalent GPU hitting around 35fps compared to 15fps previously. (‘Frames per second’ refers to how smoothly a game might run on a device, so the higher number the better for the end-user experience). That it does so in a smaller silicon area (as indicated by the smaller circle) is an additional benefit.
PowerVR GT8525 vs GT7450 Manhattan performance per milliwatt

PowerVR GT8525 vs GT7450  – Manhattan 3.0 FPS vs FPS/W and Area 

While power efficiency is an important aggregate measurement metric of a GPU, what matters equally is the absolute power consumption. Mobile devices are limited to an SoC thermal envelope of 3-3.5W. Of that number, the GPU accounts for 30-50% of the power consumption. At an estimated 35fps/W, a Series8XT GT8525 in a mid-range SoC manufactured on TSMC’s 10FF process falls in the sweet-spot of under 1.5W of power for the GPU.
To put it into perspective, against our previous generation Series7XT, we expect that for the same performance target (iso-performance), a Series8XT GT8525 will use up to 60% less power against a comparable Series7XT GT7450, meaning the performance/W goes up to an incredible 75%! For the end user, this means that they will be able to enjoy a device that will last for longer, especially when under load, such as when gaming.
Of course, our industry-leading power efficiency is built on the use of our legendary tile-based deferred rendering (TDBR) technique, where we only render the pixels that will be seen on the screen and as ever, this is what Furian is based on. If you’d like to know more, do check out our older blog posts that go in-depth on TBDR.
Conclusion
As you can see then from all these changes introduced in the Furian architecture, the PowerVR Series8XT GT8525 offers highly effective performance at cost-effective prices, raising the bar for devices in a wide range of markets. We have already licensed our first Series8XT core to key customers, and we look forward to adding to that list in 2018.
Remember to keep up to date you can follow us on social media on Twitter @ImaginationTech and on LinkedInFacebook and Google+.

2 thoughts on “PowerVR Series8XT: pushing power efficiency to the next level”

  1. 1. what’s different between GT7450 and GT7400 series 7XT plus or just rebranded name?
    2. any improvement on tesselation performance if furian is compared with rogue? because rogue on GFXBenchmark Tesselation Offscreen @Helio X30 look weaker than meizu pro 6 plus with mali grahics(13.3fps vs. 41.7fps) and very weak against adreno 530 or 540(13.3fps vs 69.2fps or 80.2fps).
    3. what’s different between fps/watt and performance/watt? according to your figures fps/watt is over 2.3x(35 vs. 15) but only 75% for performance/watt?
    4. at iso-perfomance 800MHz GT7450(204.8GFLOPs), GT8525 need clock frequency around 1066MHz both on 10nmFF, but GT8525 still achieve great efficiency performance/watt on those graph, so it safe to assume that powervr furian GT8525 can maintaining high clock above 1GHz and still low power?

  2. it’s weird when suddenly GT7450 came up without any announcement, i have 2 questions.
    1. what’s different between GT7450 and GT7400 series XT+?
    2. any improvement on tesselation performance if furian is compared with rogue? because rogue on Helio X30 look little weaker than mali and very weaker against adreno 530 or 540.

Leave a Comment

Search by Tag

Search for posts by tag.

Search by Author

Search for posts by one of our authors.

Featured posts
Popular posts

Blog Contact

If you have any enquiries regarding any of our blog posts, please contact:

United Kingdom

benny.har-even@imgtec.com
Tel: +44 (0)1923 260 511

Related blog articles

Right or wrong? Looking back at our 2018 predictions

When we gazed into our crystal ball at the end of 2017, we made several predictions for 2018. I’m pleased to say that we weren’t far off the mark on most of them. In this post, we revisit our 2018

What is PowerVR Automotive? Register NOW to hear our webinar.

The automotive industry is going through many changes and that is having a huge impact on the semiconductor IP industry. The vehicle will move from being predominantly mechanical to primarily a computer on wheels enabling a future of self-driving cars,

Image-based lighting

PowerVR Tools and SDK 2018 Release 2 now available

Here’s an early Christmas present for graphics developers – the release of the latest version of our PowerVR Tools and SDK! The headline features for this release include some exciting new examples demonstrating new techniques in our SDK, and some very

on stage in China

PVRIC4 a hit at ICCAD 2018 in China

Imagination’s PVRIC4 image compression tech garnered plenty of attention at the recent ICCAD China 2018 symposium, which took place on 29th and 30th November at the Zhuhai International Convention & Exhibition Centre, China. The annual event focusses on integrated circuit

Stay up-to-date with Imagination

Sign up to receive the latest news and product updates from Imagination straight to your inbox.

  • This field is for validation purposes and should be left unchanged.