The automotive market is one of the most exciting sectors in the technology ecosystem, and the industry is constantly evolving in order to stay at the forefront of new technology. This means the underlying hardware that powers in-car systems must not only deliver great performance but must also be versatile. The automotive industry is a key area for Imagination Technologies, as our PowerVR graphics cores are used throughout the automotive segment to enable cutting-edge technologies.
In this blog post, we are going to describe how some of the features of the PowerVR architecture make it an extremely powerful choice for driving the latest graphically demanding applications within the car. We will also discuss how a modern graphics API such as Vulkan® can bring benefits to automotive applications in the future.
Tile Based Deferred Rendering (TBDR)
All PowerVR graphics cores employ the well-established TBDR architecture. It is important to understand the key concepts in order to realise the benefits of the architecture, so we will briefly explain them here.
TBDR enables our graphics cores to utilise the system memory bus more efficiently by keeping all of the data used for rasterisation of primitives and the shading of each pixel in a single tile (typically 32×32 pixels) in on-chip buffers until rendering of that tile is complete. Only once the rendering task for a tile has finished is the result written out to the framebuffer, which is held in system main memory. This approach significantly reduces memory transfers between the graphics core and main memory, which helps to reduce power consumption for the entire System on a Chip (SoC).
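As a rough illustration of the tiling idea described above, the following Python sketch divides a render target into 32×32 tiles and lists the tiles that a primitive's bounding box overlaps. It is a simplified model under stated assumptions, not a description of how PowerVR hardware bins geometry: the function names `tile_grid` and `tiles_for_bbox` are hypothetical, and binning by bounding box is chosen only because it is the simplest approach to show.

```python
import math

TILE_SIZE = 32  # pixels per tile edge; the text cites 32x32 as typical


def tile_grid(width, height, tile=TILE_SIZE):
    """Return (columns, rows) of tiles needed to cover a render target."""
    return math.ceil(width / tile), math.ceil(height / tile)


def tiles_for_bbox(x_min, y_min, x_max, y_max, width, height, tile=TILE_SIZE):
    """List the (column, row) tiles overlapped by a primitive's bounding box.

    Coordinates are inclusive pixel positions. The box is clamped to the
    render target, so a box that lies entirely off-screen yields no tiles.
    """
    x0 = max(0, x_min) // tile
    y0 = max(0, y_min) // tile
    x1 = min(width - 1, x_max) // tile
    y1 = min(height - 1, y_max) // tile
    return [(c, r) for r in range(y0, y1 + 1) for c in range(x0, x1 + 1)]
```

For a 1920×1080 target, `tile_grid(1920, 1080)` gives a grid of 60 by 34 tiles, and a small primitive covering pixels (30, 30) to (70, 40) touches only six of them. Each of those tiles can then be rendered entirely from on-chip storage before its result is written out once.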
The “deferred” part of the architecture enables our graphics cores to discard, before fragment processing begins, much of the geometry that would otherwise be hidden behind other fragments in the final image. This approach significantly reduces the time spent on processing fragments that will inevitably be overwritten, and can therefore significantly reduce the number of clock cycles spent on processing the frame. By utilising this type of architecture, PowerVR graphics cores provide excellent scalable performance while being extremely power efficient.
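To make the saving concrete, here is a minimal Python sketch that counts how many fragments get shaded at a single pixel. It assumes opaque geometry only and compares a plain depth test applied in submission order with a scheme that resolves visibility before shading. Both functions are hypothetical illustrations and are far simpler than real hardware.

```python
def shaded_fragments_immediate(depths_in_submission_order):
    """Count fragments shaded at one pixel using a plain depth test.

    A fragment is shaded if it is nearer than everything drawn so far
    (smaller depth = nearer). Fragments shaded early may be overwritten later.
    """
    shaded = 0
    nearest = float("inf")
    for depth in depths_in_submission_order:
        if depth < nearest:
            nearest = depth
            shaded += 1
    return shaded


def shaded_fragments_deferred(depths_in_submission_order):
    """Count fragments shaded at one pixel when visibility is resolved first.

    With opaque geometry only the nearest fragment is visible, so at most
    one fragment is shaded regardless of submission order.
    """
    return 1 if depths_in_submission_order else 0
```

With four overlapping opaque surfaces submitted back to front, for example depths `[0.9, 0.7, 0.5, 0.2]`, the plain depth test shades all four fragments while the deferred approach shades one. Submitting front to back narrows the gap, but the deferred result does not depend on the application sorting its geometry.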
For a more in-depth look at our tile based deferred architecture, you should take a look at this blog post.
In-car GPU accelerated displays
The automotive market demands graphics processors that can deliver a high pixel fill rate in order to power various high-resolution displays which can be used for informational and/or entertainment purposes. Some of these are as follows:
- Infotainment systems are a central part of the modern car. They are transforming the dashboard into an all-round multimedia centre that can communicate with other smart devices such as smartphones, or even with the cloud over mobile networks. As well as presenting a user interface that needs to be clear and responsive, they can display images and video.
- Instrument clusters are another popular cutting-edge technology which is designed to provide real-time information to the driver, potentially on multiple high-resolution screens.
- Navigation is a technology which has been around for a long time but is now becoming ever more sophisticated. One of the improvements is the increased visual fidelity of the navigation applications deployed on the embedded devices in the vehicle. This is thanks in part to the advancements made in graphics technology.
- Finally, head-up displays, a technology pioneered by the aerospace industry, are now being used in automotive applications to project useful information onto the windshield of the vehicle. This enables the driver to keep their eyes on the road at all times while still being able to receive crucial information.
Most modern vehicles already come equipped with at least some of these features and systems. In the future, these features will not only become more ubiquitous but will also improve in visual quality as manufacturers push towards higher resolution displays such as 4K and beyond. This will put more strain on the underlying hardware.
As display technology improves, systems will inevitably require more graphics horsepower and much more memory bandwidth in order to push the millions of pixels that high resolution displays demand. This provides even more reason to look towards tile based deferred architectures such as PowerVR, because memory bandwidth on embedded and mobile devices has historically been quite limited and TBDR allows the processing to be performed on-chip. As we mentioned before, this architecture significantly reduces the number of unnecessary data transfers over the memory bus, which in turn significantly improves the efficiency of the entire system.
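A back-of-the-envelope calculation shows how quickly framebuffer traffic grows with resolution. The Python sketch below is illustrative only: the 4 bytes per pixel, the 60 frames per second and the `overdraw` factor are assumptions chosen for the example rather than measured figures, and the model ignores caches, compression and texture traffic.

```python
def framebuffer_write_rate(width, height, bytes_per_pixel=4, fps=60, overdraw=1.0):
    """Estimate bytes per second written to a framebuffer in main memory.

    overdraw is the average number of times each pixel is written per frame.
    A value of 1.0 models a renderer that writes each finished pixel once;
    larger values model every overlapping fragment reaching main memory.
    """
    return width * height * bytes_per_pixel * fps * overdraw


def to_gigabytes(byte_count):
    """Convert a byte count to gigabytes (10**9 bytes)."""
    return byte_count / 1e9
```

Under these assumptions a single 3840×2160 display needs roughly 2 GB/s just for final colour writes, and an average overdraw of three would triple that if every fragment were written to main memory. Keeping intermediate results on-chip until a tile is finished is what keeps the figure close to the lower bound.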
In addition to the reduced memory transfers that the tile based architecture permits, early depth testing and hidden surface removal (the “deferred” part of the architecture) significantly reduce the number of wasted clock cycles. This is especially important as display resolution and application complexity increase. With vast numbers of fragments to process at extreme resolutions, this technology could allow a properly optimised application to run at stable interactive frame rates, which would improve the user experience for things such as sat-nav and infotainment.
For these applications, meeting a target frame rate is desirable. When it comes to running an in-car dashboard, it is essential, as providing information to the driver consistently and reliably is vital for the safety of the system.
On the subject of new technologies, the Vulkan graphics API is likely to play a major role in the automotive industry in the coming years as it is gradually adopted by platform manufacturers. This shift will have a profound impact on future automotive applications such as satellite navigation.
Some of the issues and techniques surrounding navigation applications have previously been discussed in a blog post written by Robin Britton, which looks at some of the techniques used for efficient rendering of a navigation application using both the OpenGL ES and Vulkan graphics APIs. It also discusses the merits of Vulkan and the improved rendering efficiency that the Vulkan graphics API can deliver when used optimally.
One excellent reason to move to a modern API such as Vulkan is that it is extremely efficient. Vulkan requires much less work to be carried out by the CPU in order to instruct the graphics core. This results in reduced CPU usage, which is a particularly important area for mobile/embedded devices, as reduced CPU usage can greatly reduce thermal output and power consumption.
A second important advantage of the Vulkan API over OpenGL ES is that it maps extremely well to tile-based architectures. Very briefly, some of the measurable benefits include:
- Vulkan allows for fine-grained control over synchronisation. This means that the graphics driver is more aware of the dependencies between objects and memory, so only the caches (on-chip storage) that need to be flushed are flushed, which can help to reduce memory bandwidth.
- Vulkan ensures that all dependencies are explicitly declared ahead of time, which removes the need for the driver to guess about the state at draw time. This makes it simpler for the driver to package the incoming work into tiler and rasteriser tasks which can be consumed directly by the hardware, leading to more efficient execution of the work by the graphics core.
- Vulkan provides API objects (render passes) which specifically disallow any operations that would cause a mid-frame flush during rasterisation, which would cause the graphics device to stall. Furthermore, these API objects allow the graphics core to use on-chip storage more effectively, because intermediate framebuffer attachments that do not need to be stored are never written back to main memory. Again, this reduces memory bandwidth significantly, even more so at higher resolutions, and consequently reduces power consumption for the entire SoC.
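The effect of the last point can be estimated with a small model. The Python sketch below is not Vulkan code: `Attachment` and `bytes_written_back` are hypothetical names, and the `store` flag only stands in for the choice between `VK_ATTACHMENT_STORE_OP_STORE` and `VK_ATTACHMENT_STORE_OP_DONT_CARE` in a render pass description. The attachment formats and sizes are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Attachment:
    name: str
    bytes_per_pixel: int
    # True stands in for VK_ATTACHMENT_STORE_OP_STORE,
    # False for VK_ATTACHMENT_STORE_OP_DONT_CARE.
    store: bool


def bytes_written_back(attachments, width, height):
    """Bytes written to main memory per frame when the render pass ends.

    Only attachments marked for storing contribute; the others can live
    purely in on-chip tile storage in this simplified model.
    """
    return sum(a.bytes_per_pixel * width * height for a in attachments if a.store)


# Assumed example: 32-bit colour that is presented, 32-bit depth that is
# only needed while the tile is being rendered.
colour = Attachment("colour", 4, store=True)
depth_discarded = Attachment("depth", 4, store=False)
depth_stored = Attachment("depth", 4, store=True)
```

At 1920×1080, storing both attachments writes 16,588,800 bytes per frame in this model, while discarding the depth attachment halves that to 8,294,400 bytes. The saving scales with resolution, which is why it matters more as displays move towards 4K.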
If you would like to go even deeper into the benefits of Vulkan for tile-based architectures such as PowerVR graphics cores, take a look at this article by Tobias Hector.
Whether you are using OpenGL ES or Vulkan, there are always some relatively simple tricks that you can use to improve the performance of your applications. Enabling back face culling and setting it up correctly may seem insignificant on the surface, but it noticeably reduces the load on the graphics hardware. The tiling hardware and rasteriser benefit in particular, because you are reducing the number of polygons that are fed into these fixed function stages. This can help performance significantly on highly complex workloads with many hundreds of thousands or millions of polygons.
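For readers unfamiliar with how facing is decided, the test can be sketched in a few lines of Python. This is an illustration of the standard winding-order test on screen-space vertices, not the hardware's implementation. It assumes a y-up coordinate system with counter-clockwise front faces by default, and the function names are hypothetical. In a real application the equivalent is configured through the API, for example `glEnable(GL_CULL_FACE)` in OpenGL ES or the `cullMode` and `frontFace` fields of the rasterisation state in Vulkan.

```python
def signed_area_2x(a, b, c):
    """Twice the signed area of triangle a, b, c given as 2D (x, y) tuples.

    Positive for counter-clockwise winding when the y axis points up.
    """
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])


def is_back_facing(a, b, c, front_is_ccw=True):
    """Return True if the triangle should be culled as back-facing.

    Zero-area triangles produce no fragments, so they are culled too.
    """
    area = signed_area_2x(a, b, c)
    if area == 0:
        return True
    return (area > 0) != front_is_ccw


def cull_back_faces(triangles, front_is_ccw=True):
    """Keep only the front-facing triangles from a list of vertex triples."""
    return [t for t in triangles if not is_back_facing(*t, front_is_ccw=front_is_ccw)]
```

For a closed opaque mesh, roughly half of the triangles face away from the viewer at any time, so they can be dropped before they reach the tiling and rasterisation stages.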
In addition, using compressed textures (such as PVRTC) in your application not only reduces the amount of memory required to store them but also significantly reduces the amount of memory bandwidth required for texture sampling. This is because compressed textures are transferred in their compressed format and only decompressed by the hardware on-chip. A compressed texture can also significantly improve the cache hit rate: because it is tightly packed, more of the texture can fit into the cache. This can reduce memory latency, as the hardware can fetch the data from the on-chip cache rather than main memory. Requests are therefore serviced much faster, potentially reducing the time the Unified Shader Cluster (USC) is stalled waiting for data, and lessening the number of power-sapping memory transfer operations.
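The size difference is easy to quantify. The Python sketch below compares an uncompressed 32 bits per pixel RGBA8888 texture with the 4 bits per pixel and 2 bits per pixel modes of PVRTC. It ignores mipmaps and any minimum-size rules of the format, and the 64-byte cache line used in the second function is an assumption made for illustration, not a statement about any particular core.

```python
def texture_bytes(width, height, bits_per_pixel):
    """Storage for the top mip level of a texture, in bytes."""
    return width * height * bits_per_pixel // 8


def texels_per_cache_line(bits_per_pixel, line_bytes=64):
    """Texels that fit in one cache line (line size is an assumed value)."""
    return line_bytes * 8 // bits_per_pixel


RGBA8888_BPP = 32
PVRTC_4BPP = 4
PVRTC_2BPP = 2

uncompressed = texture_bytes(1024, 1024, RGBA8888_BPP)  # 4 MiB
compressed = texture_bytes(1024, 1024, PVRTC_4BPP)      # 512 KiB
ratio = uncompressed // compressed                      # 8
```

For a 1024×1024 texture, the 4 bits per pixel mode needs one eighth of the memory of RGBA8888, and under the assumed 64-byte line each cache line covers 128 texels instead of 16. That is the mechanism behind the improved hit rate described above.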
With the ever increasing demand for higher quality applications in our vehicles, it is no surprise that in-vehicle display systems, and by extension the underlying graphics hardware, are considered to be crucial technologies for further innovation and development by automotive manufacturers. The efficiency of PowerVR graphics will be critical for any car manufacturer looking to obtain as much performance as they can from a limited area and power budget.
As well as the hardware being up to the task, it is important that the software is as efficient as possible. Identifying and removing bottlenecks allows applications to take full advantage of the hardware and enables ever more complex applications to run at interactive frame rates.