PowerVR Series2NX: Raising the bar for embedded AI

Back in September, Imagination changed the game for embedded AI acceleration with the introduction of the Series2NX neural network accelerator (NNA), and today we are proud to announce two cores in the range: the PowerVR AX2185, aimed at the premium market, and the AX2145 for the mid-range. (We go into more detail about each below.)

As we explained in our post at the time, neural networks are becoming increasingly prevalent as a tool for solving problems and powering new types of applications across a wide range of industries. Do check out our blog post if you’d like a simple explanation of what neural networks are and how they work.

key market mobile

Today, many of the devices you use every day make use of a neural network. They power the search function in the photo application on your smartphone and are used to identify you so you can unlock your phone with your face. They are how your number plate is recognised when you enter or leave a congestion zone, and they are clever enough to detect a person’s body language – ideal for surveillance systems. They could be in a camera-enabled AV system that identifies you and offers personalised content, or in automotive, performing tasks such as lane departure warning and driver alertness monitoring, to name but two. Neural networks have even been shown to outperform humans at detecting certain skin cancers – the uses are wide and varied.

key market smart camera

While neural networks are trained offline, when it comes to inference – running these networks to identify and process objects in real time – there is a need to have this technology available in edge devices, rather than having the processing take place in the cloud. Take just one example: drones. They can fly in excess of 150 mph, and a neural network can power the collision detection system. However, without dedicated hardware to process the images, the drone would need to see objects 10-15 metres ahead to have a chance of avoiding them. The cloud is not a suitable solution because of the latency involved in sending and receiving the data and the bandwidth required. A drone equipped with a dedicated PowerVR NNA could travel at 150 mph and avoid obstacles under one metre away, improving responsiveness and safety and enhancing the creative possibilities for their use.
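A quick back-of-the-envelope calculation shows why the round trip to the cloud matters. The Python sketch below works out how far a drone travels while waiting for an inference result; the latency figures are illustrative assumptions rather than measured values.

```python
# How far does a drone travel while waiting for an inference result?
# The latency values below are illustrative assumptions, not measurements.

MPH_TO_MPS = 0.44704  # metres per second per mile per hour

def distance_during_latency(speed_mph: float, latency_s: float) -> float:
    """Metres travelled before the obstacle-avoidance result arrives."""
    return speed_mph * MPH_TO_MPS * latency_s

speed = 150  # mph, as in the drone example above

for label, latency_s in [("cloud round trip, ~200 ms", 0.200),
                         ("on-device inference, ~5 ms", 0.005)]:
    metres = distance_during_latency(speed, latency_s)
    print(f"{label}: {metres:.1f} m travelled before the drone can react")
```

At 150 mph, even a modest 200 ms round trip means the drone covers more than 13 metres before it can react; processing on the device itself brings that figure down to well under a metre.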

We see dedicated neural network hardware as the next step in SoC integration. In the 1980s a maths co-processor was available as an add-on to supplement the early desktop CPU, but it soon became a standard part of the design. While many compute tasks have since moved from the CPU to the GPU, neither is the most design-efficient tool for neural network inference, and placing the work on dedicated local hardware is the logical solution. The performance advantages are clear: searching 1,000 photos takes 60 seconds on a premium GPU, but just two on our NNA. And for the same 1% of battery life spent sorting images, the NNA can process 428,000. That’s pretty compelling.

typical SoC

Our Series2NX accelerator, launched in September, was built from the ground up for neural network inferencing and is the highest-performing solution in the industry. We also support all the major convolutional neural networks, such as Inception and ResNet, and frameworks including TensorFlow, Caffe2 and PyTorch.
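As an illustration of what this framework support means in practice, a trained model is typically exported into a portable format before being handed to an inference toolchain. The sketch below uses the generic PyTorch-to-ONNX path; it is not Imagination’s own tooling, just a common interchange step.

```python
# Generic sketch: export a trained PyTorch ResNet for inference deployment.
# This illustrates the usual framework hand-off, not Imagination's toolchain.
import torch
import torchvision

model = torchvision.models.resnet50(pretrained=True)
model.eval()  # switch batch norm and dropout to inference behaviour

dummy_input = torch.randn(1, 3, 224, 224)  # one NCHW image-sized tensor
torch.onnx.export(model, dummy_input, "resnet50.onnx",
                  input_names=["input"], output_names=["logits"])
```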

A key differentiator of our solution is that it offers flexible precision. Running a trained network with 16-bit data, it can achieve very high levels of accuracy. However, training a network at lower precision means that, for many purposes, high accuracy can still be maintained, with the advantage of significantly reduced power consumption and bandwidth. In the real world, this makes it cost-effective and practical to integrate an NNA into embedded devices such as smart cameras and smartphones. Read our blog post to find out more about the process and benefits of training a network for efficient inference.
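To make the precision trade-off concrete, here is a minimal sketch of symmetric linear quantisation, mapping floating-point weights onto a small signed integer grid. It is a generic illustration of the idea rather than the NNA’s actual quantisation scheme.

```python
import numpy as np

def quantise(weights: np.ndarray, bits: int):
    """Symmetric linear quantisation of a weight tensor to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1               # 7 for 4-bit, 127 for 8-bit
    scale = np.abs(weights).max() / qmax     # one scale for the whole tensor
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantise(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

weights = np.random.randn(64, 64).astype(np.float32)
q4, s4 = quantise(weights, bits=4)
print("mean absolute error at 4-bit:", np.abs(weights - dequantise(q4, s4)).mean())
# Each weight now needs 4 bits instead of 16, a 4x saving in storage and bandwidth.
```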

The chart below demonstrates the benefits of this precision flexibility. In a nutshell, with 4-bit precision, you can slash the power and bandwidth (memory) requirements, with a mere 1% drop in accuracy, which in most cases will have no appreciable impact on device effectiveness in the real world.

4-bit vs 8-bit

The PowerVR AX2185
Our Series2NX has already found success with multiple licensees, and today we are expanding the choice in the market. First, we are enhancing our original core in the form of the PowerVR AX2185, and second, we are releasing another version, the PowerVR AX2145. A key new feature in both cores is hardware support for the Android NN API, enabling developers to bring deep learning-based applications to the large market of Android-based devices.
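For developers, a common route onto the Android NN API is to convert a trained model to TensorFlow Lite and let the runtime’s NNAPI delegate dispatch supported operations to whatever accelerator driver the device provides. The sketch below shows that generic conversion step, with the model path as a placeholder; it is not a description of Imagination’s drivers.

```python
# Generic sketch: convert a trained TensorFlow model to TensorFlow Lite.
# On an Android device, the TFLite NNAPI delegate can then hand supported
# operations to an Android NN API driver. "saved_model_dir" is a placeholder.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # post-training quantisation
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```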

PowerVR AX2185

The PowerVR AX2185 is aimed at the higher end of the embedded market in terms of raw performance. With eight full-width compute engines, the AX2185 delivers up to 4.1 Tera Operations Per Second (TOPS), offering the highest performance per mm² in the market. In fact, this is 3.5 times the performance of recent desktop GPUs, making it a much more enticing prospect for companies currently using power-hungry desktop GPUs for neural network inferencing, particularly in the automotive space. Running a network based on 4-bit data, the AX2185 requires just 50% of the bandwidth of competing hardware solutions, and 75% less than DSP-based competitors.

bandwidth comparison

While premium devices already sport features such as face unlock, we are seeing that these capabilities are rapidly moving down the food chain. Device manufacturers are looking to offer this functionality at lower price points. Obviously, consumers want to pay less for their devices but still enjoy the benefits such as smart cameras for home security and face unlock on their smartphone. Yes, they want to have their cake and eat it.

The PowerVR AX2145
So how do you maintain that experience in memory-constrained devices? The answer is the PowerVR AX2145, which trades overall performance for performance efficiency. Its highly tuned tensor processing and convolution engines, combined with an optimised core memory infrastructure, deliver a maximum performance of 1 TOPS, with 50% better inference performance per mm² than the AX2185 in systems with low peak memory bandwidth (measured in GB/s). We’re proud of this balance of performance and efficiency, and we think it’s a smart choice for cost-sensitive devices.

PowerVR AX2145

Of course, hardware alone does not make a solution. We offer a PowerVR AI toolkit, which simplifies our customers’ deployment workflow, provides easy debugging and network analysis, and enables the optimisation of trained networks. Our API support includes IMG DNN and the Android NN API, providing seamless interoperability between the GPU and NNA.

Software support

 
Conclusion
With the introduction of these two new cores, we are bringing new levels of intelligent applications to consumers by adding low-cost AI capabilities to edge devices. With these cores, developers will have access to devices offering all the performance they need to create applications that fulfil the promise of the AI revolution, and we look forward to the improvements they are sure to bring across markets such as smartphone, smart camera and automotive. The possibilities are endless and the only thing needed is a little Imagination.
