Increasing performance and power efficiency in heterogeneous software

Heterogeneous architectures in embedded computing are fast becoming a reality – we indeed see many leading IP and semiconductor companies today building heterogeneous computing hardware.

In the article below, I’m going to describe one typical use case for heterogeneous computing and the challenges that result from moving to a heterogeneous programming model.

Running a beautification algorithm on a modern SoC

The diagram below illustrates how a video recording application that performs beautification might be implemented using a number of heterogeneous hardware and software components. In this example, input frames captured by the ISP/camera are first inspected by the GPU to determine the position of a face and its individual features (i.e. eyes, lips, nose and possibly others), passing these coordinates along to the CPU which tracks and automatically adjusts the camera focus and exposure to maintain high quality video. The CPU also determines which parts of the face contain skin colour, and the GPU applies a bilinear filter which smooths these textures, removing artefacts that represent blemishes and wrinkles, while preserving sharpness around the edges of the face.

A vision software pipeline implemented on top of heterogeneous architectures

The sequence of transformed images is output to both the hardware encoder for recording to disk and to the display subsystem for rendering in a preview window. As an additional optimization, the CPU could instruct the hardware encoder to encode the face coordinates at higher-fidelity than the background, optimizing both video quality and file size. In this scenario, at least five different hardware components require access to the image data in memory.

Memory bandwidth constraints

A key characteristic of many SoCs is the presence of a single unified system memory such as an off-chip DDR DRAM, which is shared between all hardware components. These components typically communicate with other components and with memory using a shared bus or interconnect, the bandwidth of which is tightly constrained to limit implementation area and cost. SoC bandwidth is frequently an order of ten times less than is common on desktop-class machines with PCI Express buses, and is a common performance bottleneck–particularly in cases where multiple hardware components attempt to access memory and other I/O at the same time.

GPU memory bandwidth

Furthermore, when an application passes ownership of data between different hardware components, the underlying operating system may create a duplicate copy of the data in memory. In some cases this may be due to hardware limitations, for example where the GPU requires access to data allocated by the CPU in virtual memory that CPU can page to disk at will. In other cases this may be related to image formats; for example, the ISP produces image sensor data in YUV format but the CPU or GPU needs to filter this data in RGB colour space.

Conversely, some operating system such as Android automatically convert images from YUV to RGB format before presenting the data to developers (for example, as OGLES_TEXTURE_2D textures); this can reduce the efficiency of many vision algorithms that only need to process image luminance data. The inefficiencies introduced by these behind-the-scene copies can be quickly compounded when processing high-resolution image data at video rate.

Join us next time to see how our engineering team has addressed the SoC bandwidth issues above by creating an innovative PowerVR Imaging Framework.

Further reading

Here is a menu to help you navigate through every article published in this heterogeneous compute series:


Please let us know if you have any feedback on the materials published on the blog and leave a comment on what you’d like to see next. Make sure you also follow us on Twitter (@ImaginationTech, @GPUCompute and @PowerVRInsider) for more news and announcements from Imagination.

Leave a Comment

Search by Tag

Search for posts by tag.

Search by Author

Search for posts by one of our authors.

Featured posts
Popular posts

Blog Contact

If you have any enquiries regarding any of our blog posts, please contact:

United Kingdom
Tel: +44 (0)1923 260 511

Related blog articles

Powering the automotive revolution

Since the advent of the semiconductor, the industry has been all about making electronics faster and more powerful. While that process is ongoing, the focus these days is on making things smarter and more efficient, and nowhere is that truer

Partner interview series: OPPO – rising up

In the first of a new ‘quick chat’ interview series, we speak to some of our leading partners to learn about their perspective on the industry and how they are looking to challenge the status quo. We start with China-based

Stay up-to-date with Imagination

Sign up to receive the latest news and product updates from Imagination straight to your inbox.

  • This field is for validation purposes and should be left unchanged.
Contact Us

Contact Us