The ultimate embedded GPUs for the latest applications


Introducing PowerVR Series9XEP, Series9XMP, and Series9XTP

As Benjamin Franklin once said, only three things in life are certain: death, taxes and the ongoing rapid advancement of GPUs for embedded applications*. Proving his point, this week, Imagination has once again pushed the boundaries of graphics and compute performance for power-constrained devices with the introduction of three new GPU families: the PowerVR Series9XEP, Series9XMP, and Series9XTP.

These three families cover the entry-level, mid-range and high-end segments and represent our best-ever GPU line-up, with optimisations and enhancements that enable it to provide outstanding performance density and low power consumption. Along with our new PowerVR Series3NX for neural network inferencing, not to mention our new Ensigma location IP, this represents a comprehensive product offering for 2019.

PowerVR 9Series second-gen covers all markets

PowerVR Series9XEP: power to the people

It makes sense to start at the beginning, so let’s start with the entry-level product: the PowerVR Series9XEP. This is built on our Rogue architecture and is aimed at markets where the physical size of the GPU, its cost, or both are of paramount importance. Think set-top boxes and low-end smartphones, where the smooth presentation of complex, high-resolution GUIs is the priority rather than full 3D gaming. That’s not to say you can’t game on Series9XEP. It takes our already impressive low-end part and pushes things up a notch with an overall performance improvement of up to 10% over last year’s Series9XE, achieved through microarchitectural tweaks. On top of that, timing tweaks (that is, optimisations of the die size) have enabled nominal clock speeds to be increased by an additional 10%.

PowerVR Series9XEP and 9XMP core architecture improvements

A key enhancement inside the Series9XEP is the presence of PVRIC4, our new image compression technique that we announced earlier this year. This has the benefit of guaranteeing a 50% reduction in system bandwidth and memory footprint, which means further cost savings for SoC designers and helps reduce battery consumption. This is, of course, a feature that appears right across the range of our new Series9 GPUs. While PVRIC4 has been added, the improvements and area reductions elsewhere ensure the Series9XEP maintains the fillrate leadership position of the XE family.
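
To put the 50% figure in concrete terms, here is a rough back-of-the-envelope sketch. It is illustrative only: the 1080p RGBA8 surface, the 60 fps refresh and the one-write-per-frame traffic model are assumptions chosen for the example, not PVRIC4 specifics, and the function names are hypothetical.

```python
# Back-of-the-envelope estimate of what a guaranteed 50% reduction means
# for one framebuffer. All inputs are illustrative assumptions.

REDUCTION = 0.5  # the 50% reduction quoted in the text


def frame_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Uncompressed size of one surface (RGBA8 assumed by default)."""
    return width * height * bytes_per_pixel


def with_reduction(size_bytes: int, reduction: float = REDUCTION) -> int:
    """Size after applying a fixed fractional reduction."""
    return int(size_bytes * (1.0 - reduction))


def bytes_per_second(size_bytes: int, fps: int) -> int:
    """Traffic if the surface is written once per frame (simplified model)."""
    return size_bytes * fps


raw = frame_bytes(1920, 1080)   # assumed 1080p RGBA8 surface
packed = with_reduction(raw)

print(f"footprint: {raw} B -> {packed} B")
print(f"traffic at 60 fps: {bytes_per_second(raw, 60)} B/s "
      f"-> {bytes_per_second(packed, 60)} B/s")
```

Real workloads read and write surfaces several times per frame, so the absolute numbers will differ; the point is simply that both footprint and traffic halve.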

PVRIC4 compression stats

PowerVR Series9XEP will be available in a range of configurations depending on the needs and silicon budget of the SoC designer, ranging from 1 to 8 pixels per clock and 16 to 64 FP32 ops/clock.

PowerVR Series9XMP: the mid-range master

Also based on the Rogue architecture, the changes to the PowerVR Series9XMP over the Series9XM are more significant and relate specifically to density improvements. For example, a part featuring dual 16-pipeline-wide ALUs and two texture processing units (TPUs) can now be achieved with a single 32-pipeline-wide ALU and TPU, providing a significant area reduction. Overall, Series9XMP offers a 45% increase in cluster density – essentially the same performance for a lot less area, which is hugely valuable in terms of cost and power savings.
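
As a quick worked example of what a density gain implies for area, the sketch below assumes that "cluster density" means performance per unit area, which is our reading for illustration rather than a formal definition; the function name is hypothetical.

```python
# If density = performance / area, then for the same performance the
# required area scales as 1 / (1 + gain). This definition is an assumption.


def relative_area(density_gain: float) -> float:
    """Area needed for equal performance, relative to the older design."""
    return 1.0 / (1.0 + density_gain)


area = relative_area(0.45)   # the 45% density increase quoted above
saving = 1.0 - area

print(f"relative area: {area:.3f}, area saving: {saving:.1%}")
```

Under that assumption, a 45% density gain works out to roughly 69% of the original area for the same performance, or a saving of about 31%.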

PowerVR Series9XMP cluster density improvement

Image quality also gets a big boost with two times the anisotropic filtering performance, helped by the addition of a dedicated texture cache. Better anisotropic filtering means images look a lot sharper as you look deeper into the scene, and gamers will tell you how much of a difference it makes to the image.

PowerVR Series9XMP anisotropic filtering

There are also several other improvements, such as reduced overhead and bandwidth in the core management unit and a doubling in the size of the system level cache. Further enhancements cover data paths in caching, atomic operations, data master setup rates, and compute overlap with other workloads.

Naturally, PVRIC4 technology is also present. In terms of configurations, the Series9XMP has a more compute-focused design, offering up to 128 FP32 ops/clock at four pixels per clock.

This makes it an ideal fit for mid-range devices such as affordable smartphones, which need powerful gaming and AI capabilities at low cost.

Top-end performance

At the high end, we have PowerVR Series9XTP, built on the newer Furian architecture, which included a number of significant changes over Rogue to deliver enhanced scalability, making it easier to raise the ceiling on performance with better power efficiency. You can head over to our in-depth blog post on the Furian architecture to find out what it brought to the table, but suffice to say that the Series9XTP has been further enhanced in a couple of key areas.

To recap quickly, in Furian, the shader processing unit (SPU) contains two unified shader clusters (USC). Inside these USCs are redesigned arithmetic logic units (ALU), the logic that performs the complex mathematical magic at the heart of a GPU. As with the Series8XT, this ALU can perform a MAD and a MUL operation through each pipeline per clock and featured a 32-wide pipeline for the first time.

There are two key enhancements for Series9XTP. First, there’s an option for 40-pipeline-wide ALUs, so more work can be done at the same time. Secondly, it’s now possible to specify a part with three USCs per SPU, giving more raw FLOPS. This configuration delivers up to 360 FP32 FLOPS/clock combined with a fillrate of 8PPC, fulfilling the needs of high-end markets. By putting down two SPUs, this can be doubled to a very powerful, but still power-efficient, core offering 16PPC with 720 FP32 FLOPS/clock.
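
For anyone who wants to check the arithmetic, the sketch below shows one way the FLOPS/clock figures fall out of the configuration. The counting convention is an assumption made for illustration: a MAD is counted as two floating-point operations and a MUL as one. The function name is hypothetical.

```python
# One way to reproduce the quoted FP32 FLOPS/clock figures.
# Assumed convention: MAD = 2 FLOPs (multiply + add), MUL = 1 FLOP.

FLOPS_PER_MAD = 2
FLOPS_PER_MUL = 1


def fp32_flops_per_clock(alu_width: int, uscs_per_spu: int, spus: int = 1) -> int:
    """Peak FP32 FLOPs per clock for a given core configuration."""
    per_pipeline = FLOPS_PER_MAD + FLOPS_PER_MUL  # one MAD and one MUL per clock
    return alu_width * per_pipeline * uscs_per_spu * spus


one_spu = fp32_flops_per_clock(alu_width=40, uscs_per_spu=3)
two_spus = fp32_flops_per_clock(alu_width=40, uscs_per_spu=3, spus=2)

print(one_spu, two_spus)
```

With 40-wide ALUs and three USCs per SPU, this convention gives 40 × 3 × 3 = 360 FLOPS/clock for one SPU and 720 for two, matching the figures above.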

The various optimisations in Series9XTP, such as cache tuning, microarchitectural enhancements and the timing improvements for a 10% higher clock speed, all add up. PowerVR Series9XTP is faster than Series8XT for the same area by as much as 50%, with better FLOPS density, better compute density and better Manhattan performance.

PowerVR Series9XTP core architecture

PowerVR Series9XTP core architecture improvements

Further cost savings can also be made over Series8XT thanks to the 50% reduction in system bandwidth and memory footprint delivered by PVRIC4.

PowerVR Series9XTP also carries over core features from Series8XT, including support for the latest relevant APIs such as OpenGL ES 3.2 and Vulkan 1.1. Our unique virtualization support, also available on the Series9XEP and Series9XMP, is present and correct, enabling multiple OSs to be run in isolation for true security, with fast context switching between them ensuring no loss of performance.

It all adds up

The second-generation PowerVR Series9 offering represents a comprehensive line-up of cores that provides SoC designers with a wide range of options depending on their design needs. The perfect balance of fillrate and FLOPS performance, in a given area with the desired power consumption, can be achieved by selecting a suitable core. Whether it’s an affordable smartphone, a powerhouse for your pocket or a design for in-vehicle entertainment, PowerVR’s new Series9 GPUs have you covered. Benjamin Franklin would no doubt have approved.

For more news and announcements related to PowerVR, keep coming back to our blog and follow us on Twitter @ImaginationTech and @PowerVRInsider, Facebook, and LinkedIn.

*OK, I’m paraphrasing…

Benny Har-Even

With a background in technology journalism stretching back to the late 90s, Benny Har-Even has written for many of the top UK technology publications, across both consumer and B2B and has appeared as an expert on BBC World Business News and BBC Radio Five Live. He is now Content Manager at Imagination Technologies.

4 thoughts on “The ultimate embedded GPUs for the latest applications”

  1. No problem! Also there’s been articles about Nvidia and AMD implementing elements of tile-based rendering in the form of tile-based caching. Is it possible that we can get an article that discusses the pros and cons between these methods?

    Reply
  2. Thanks Mark. You’re right – and erroneous G so changed to FLOPS/clock. And the figure should have been 360 too! All corrected now.

    Reply
  3. “This configuration will deliver a core that delivers up to 480 FP32 GFLOPS/clock combined”

    Surely you meant GFLOPs not GFLOPS per clock.

    Reply


