A primer on mobile systems used for heterogeneous computing

In the mobile and embedded market, the design constraints on electronic products are tight and often contradictory: the market demands higher performance yet lower power consumption, and lower cost yet shorter time-to-market.

These constraints have created a trend for more specialized hardware designs that fit a particular application; if each task is well matched to a functional unit, fewer transistors are wasted and power efficiency is better. As a result, application processors have become increasingly heterogeneous over time, integrating multiple components into a single System-on-Chip (SoC).

The diagram below presents the architecture of a modern SoC. Such a chip typically includes a CPU (with optional multi-core and SIMD capabilities), a GPU for both 3D graphics acceleration and high-performance vector computation, an ISP (Image Signal Processor) for acquiring image sensor data, a VDE (Video Decoder and Encoder) for codec acceleration, and an RPU (Radio Processing Unit) for connectivity. Each of these components has its own strengths, and they can be combined to implement many applications efficiently.

Heterogeneous computing SoC architecture

Today many application developers rely on the CPU to meet the performance requirements of their advanced computational photography and computer vision algorithms. However, these CPU-centric solutions frequently struggle to deliver sustained video-rate processing of high-resolution images, largely due to the thermal limits of the devices.

As shown in more detail in the figure below, a CPU combines a small number of cores with a large data cache, optimized for efficient execution of general-purpose control code with low memory latency. The GPU, on the other hand, dedicates most of its transistors to ALUs (arithmetic logic units) rather than to data caches and control flow logic. This arrangement of hardware enables efficient processing of large data sets that involve little branching and many repetitive arithmetic calculations, such as an image processing algorithm operating on many pixels.
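To make the idea of a data-parallel workload with little branching concrete, here is a minimal Python sketch of a per-pixel brightness adjustment. The function names (`adjust_pixel`, `adjust_image`) and the gain and offset values are illustrative assumptions, not part of any real API. Each output pixel depends only on its own input pixel, which is the property that would let a GPU compute many of them at once; the Python loop only models the pattern and runs serially.

```python
def adjust_pixel(value, gain=1.5, offset=10):
    """Hypothetical per-pixel kernel: scale, offset, clamp to the 8-bit range.

    The clamp is written with min/max rather than if/else, so every pixel
    follows the same sequence of operations regardless of its value.
    """
    return max(0, min(255, int(value * gain + offset)))


def adjust_image(pixels):
    """Apply the kernel to every pixel independently.

    No iteration reads another iteration's result, so the work could be
    spread across many ALUs. This serial loop only models the pattern.
    """
    return [adjust_pixel(p) for p in pixels]


# A tiny 8-bit grayscale "image" stored as a flat list (illustrative data).
image = [0, 64, 128, 200, 255]
result = adjust_image(image)
```

A workload with this shape is the kind the text describes as well suited to the GPU, whereas code dominated by data-dependent branching is usually a better fit for the CPU.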

Heterogeneous computing: CPU vs GPU architecture

Furthermore, because the GPU is designed to run at lower clock speeds than a CPU, offloading image processing workloads from the CPU to the GPU can both increase performance and reduce power consumption and generated heat. The resulting implementation is also likely to be more balanced and more responsive, as the CPU has more free cycles to respond to the demands of the operating system and user interface.

In the context of mobile and embedded software, heterogeneous computing is the practice of combining different types of processing units to meet an application’s performance requirements within a limited power and thermal budget. The application is partitioned into multiple workloads that are distributed across the available hardware units, so that each workload runs on the unit capable of executing it most efficiently. Done well, this can improve both the overall performance and the power efficiency of the implementation.

When partitioning an application, serial tasks should usually be allocated to the CPU, whereas data-parallel tasks are good candidates for offloading to the GPU. If the SoC provides dedicated hardware accelerators such as an ISP or VDE, related tasks such as image de-noising and video playback should usually be allocated to these accelerators in order to maximize power efficiency.

However, in some cases it may be desirable to implement these tasks in software instead, for example using GPU compute, to trade efficiency for a higher-quality algorithm than may be provided by the fixed-function accelerator. The use of GPU compute is particularly common in the field of computer vision where active research is continually leading to refinements of existing algorithms as well as entirely new vision algorithms. Fast deployment of these algorithms into products requires both programmability and a high-performance compute capability.
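The partitioning rules of thumb above can be sketched as a small dispatcher. This is a minimal Python illustration under stated assumptions, not a real scheduler or vendor API: the `Workload` fields, the task-type strings, the `ACCELERATED_TASKS` table and the `choose_unit` function are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Unit(Enum):
    CPU = "CPU"
    GPU = "GPU"
    ISP = "ISP"
    VDE = "VDE"


# Hypothetical mapping from task type to its fixed-function accelerator.
ACCELERATED_TASKS = {
    "image_denoise": Unit.ISP,
    "video_playback": Unit.VDE,
}


@dataclass
class Workload:
    name: str
    task_type: str                        # e.g. "image_denoise", "ui_logic"
    data_parallel: bool                   # True if elements are independent
    needs_custom_algorithm: bool = False  # True to favor quality over efficiency


def choose_unit(workload, available_units):
    """Pick a processing unit using the rules of thumb from the text."""
    accelerator = ACCELERATED_TASKS.get(workload.task_type)
    # Prefer a dedicated accelerator when the SoC has one, unless the
    # application wants a higher-quality algorithm implemented in software.
    if accelerator in available_units and not workload.needs_custom_algorithm:
        return accelerator
    # Data-parallel work is a good candidate for GPU compute.
    if workload.data_parallel and Unit.GPU in available_units:
        return Unit.GPU
    # Serial work, and anything else, falls back to the CPU.
    return Unit.CPU


# Illustrative SoC and workloads (all names are assumptions).
soc = {Unit.CPU, Unit.GPU, Unit.ISP, Unit.VDE}
workloads = [
    Workload("UI event handling", "ui_logic", data_parallel=False),
    Workload("Feature tracking", "feature_tracking", data_parallel=True),
    Workload("Standard de-noise", "image_denoise", data_parallel=True),
    Workload("Custom de-noise", "image_denoise", data_parallel=True,
             needs_custom_algorithm=True),
]
plan = {w.name: choose_unit(w, soc) for w in workloads}
```

In this sketch the standard de-noise task goes to the ISP, while the custom variant is routed to the GPU, mirroring the trade of efficiency for algorithm quality described above. A real system would also need to weigh data-transfer costs and the current load on each unit, which this example deliberately ignores.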

Join us next time for an example use case of heterogeneous computing and a look at the bandwidth constraints that SoCs currently face.

Please let us know if you have any feedback on the materials published on the blog and leave a comment on what you’d like to see next. Make sure you also follow us on Twitter (@ImaginationTech, @GPUCompute and @PowerVRInsider) for more news and announcements from Imagination.

 

  • Balancing the design of a system with enough specialized hardware for the range of common workloads it’s projected to have to do in its lifetime would seem like a common sense priority for the designer of the system, yet such a basic principle has eluded the lead developers of even major projects on occasion in the past.
