PowerVR Graphics Architecture
The PowerVR Tile Based Deferred Rendering (TBDR) Architecture
All PowerVR GPUs are based on Imagination’s unique Tile Based Deferred Rendering (TBDR) architecture; the only true deferred rendering GPU architecture in the world. TBDR combines two complementary architectural features to provide the very highest levels of efficiency and performance:
The PowerVR architecture splits the screen into a number of ‘tiles’, which are then processed individually (in parallel to other tiles). Since the GPU only needs to work on a subset of the complete scene data at any given time, this data (such as colour and depth buffers) is small enough to be stored in internal GPU memory, significantly reducing the required number of accesses to system level memory. This results in lower energy and bandwidth consumption and also higher performance.
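The tiling step can be pictured as a simple binning pass. The sketch below is illustrative only (not PowerVR's actual implementation, whose tile size and data structures are not given here): triangles are assigned to the screen tiles their bounding boxes overlap, so each tile can later be rendered using only its own small working set.

```python
TILE = 32  # hypothetical tile size in pixels, chosen for illustration

def bin_triangles(triangles, width, height):
    """Map each triangle to every tile its bounding box overlaps."""
    tiles_x = (width + TILE - 1) // TILE
    tiles_y = (height + TILE - 1) // TILE
    bins = {(tx, ty): [] for ty in range(tiles_y) for tx in range(tiles_x)}
    for tri in triangles:
        xs = [v[0] for v in tri]
        ys = [v[1] for v in tri]
        # Clamp the bounding box to the screen, then record the triangle
        # in every tile the box touches.
        x0 = max(0, int(min(xs)) // TILE)
        x1 = min(tiles_x - 1, int(max(xs)) // TILE)
        y0 = max(0, int(min(ys)) // TILE)
        y1 = min(tiles_y - 1, int(max(ys)) // TILE)
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                bins[(tx, ty)].append(tri)
    return bins

# A triangle entirely inside the top-left tile lands in exactly one bin,
# so rendering any other tile never has to consider it.
bins = bin_triangles([[(2, 2), (10, 4), (6, 12)]], 128, 128)
```

Because each tile's colour and depth data stay small, they can live in fast on-chip memory for the duration of that tile's processing, which is the source of the bandwidth saving described above.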
PowerVR deferred rendering uses a unique, patented method of Hidden Surface Removal which defers all texturing and shading operations until the visibility of each pixel in the tile is known: only the pixels that will actually be seen by the end user consume processing resources. This means that unnecessary processing of hidden pixels is eliminated, which further ensures the lowest possible bandwidth usage and number of processing cycles per frame, resulting in the highest performance levels and the lowest power consumption.
The PowerVR USC (Universal Shading Cluster) Engine
PowerVR GPUs employ a multi-threaded, multi-tasking, pipeline cluster architecture called the Universal Shading Cluster (USC). The USC is optimised for the operations used in vertex and pixel shaders and also practical GPU compute tasks (video and image processing, for example). The USC architecture is highly scalable, meaning that multiple clusters can be implemented with very little overhead (performance scales almost linearly with additional clusters). Because the USC is a scalar-based engine, it benefits from all the advantages of scalar processing such as higher compute density and ease of software development.
This unified architecture, combined with intelligent task scheduling and hardware load balancing, ensures that processing resources achieve maximum utilisation and provide latency tolerance. Optimised data paths and local caching enable further increases in performance efficiency whilst reducing power consumption.
High Performance, Highly Flexible Microkernel
All PowerVR GPUs are managed by firmware which controls all higher-level GPU events. This approach offers numerous advantages including full offloading of virtually all interrupt handling from the main host CPU while maintaining maximum flexibility.
PowerVR GPUs feature a dedicated multi-threaded microcontroller to run the microkernel, which allows full debugging functionality of the GPU. The software-based management of the GPU ensures the ability to adapt to future market requirements as well as providing optimal performance through priority-based execution of GPU tasks. The microkernel also has the ability to help SoC designers implement advanced power management features by, for example, signalling workload information to DVFS and power-gating logic within the SoC.
Virtualisation and Security
From Series7 onward, all PowerVR GPUs include hardware support for virtualisation, enabling OEMs to implement secure SoC platforms for safety-critical applications such as automotive, as well as for everyday secure devices.
Comprehensive, Powerful Compression Schemes
PowerVR GPUs feature PVR3C Triple Compression, a suite of three compression technologies to ensure the most efficient use of memory bandwidth. Image compression (PVRIC), texture compression (PVRTC, ASTC) and geometry compression (PVRGC) can significantly reduce system-level memory accesses required by the GPU. Benefits of reduced memory bandwidth consumption include lower power consumption, better overall system efficiency and reduced system-level memory costs.
PVRIC (PowerVR Image Compression)*
One of the largest consumers of memory bandwidth in modern, high-definition graphics applications is the set of intermediate render target reads and writes that contribute to each frame (surfaces such as shadow maps, reflection maps and so on). PVRIC is a highly efficient, lossless compression scheme that typically provides a 50% reduction in the size of these memory accesses, automatically compressing the data before it is written out of the GPU and decompressing the data as it is read back in from memory.
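The scale of the saving is easy to quantify. The figures below are a back-of-the-envelope illustration of the typical 2:1 (50%) ratio quoted above, using an assumed 1920x1080 RGBA8 render target; the resolution and format are examples, not PowerVR measurements.

```python
# Assumed surface: 1920x1080, RGBA8 (4 bytes per pixel).
width, height = 1920, 1080
bytes_per_pixel = 4

uncompressed = width * height * bytes_per_pixel  # 8294400 bytes (~7.9 MiB)
compressed = uncompressed // 2                   # typical 2:1 lossless ratio

saved_per_pass = uncompressed - compressed       # ~4 MiB saved per read or write
```

At 60 frames per second, with several such intermediate surfaces written and read back every frame, savings on this scale quickly add up to hundreds of megabytes per second of avoided traffic.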
For further bandwidth savings, the PVRIC decompression logic can be integrated into the SoC-level display pipeline which then allows the GPU to also compress the final frame buffer image before it is written to memory, again with a typical compression ratio of 2:1.
PVRIC is the only image compression scheme that works with the block-based memory access patterns of a video decoder, enabling highly efficient, system-wide image compression.
Texture Compression (PVRTC and ASTC)
Enabling the compression of textures can significantly reduce application file size and download times; it also dramatically improves runtime performance and power consumption by keeping bandwidth usage to an absolute minimum. PowerVR GPUs provide support for a number of industry-standard texture compression formats.
The highly acclaimed PVRTC lossy texture compression format is one of the most widely used texture compression formats in the mobile industry today, having been implemented in over a billion devices. It is fully accelerated in hardware on all PowerVR SGX and subsequent PowerVR GPUs. PVRTC enables both RGB and RGBA formats to be compressed into 2 or 4 bits per pixel versus standard 32-bit formats, resulting in compression ratios from 8:1 up to 16:1.
PVRTC2 is a major upgrade and builds on the many strengths of PVRTC, adding a wide range of additional features including:
- Improved image quality, especially for textures with high contrast, large areas of significant colour discontinuity, or boundaries of non-tiling textures
- Better support for pre-multiplied textures
- Support for arbitrary sized NPOT (Non Power Of Two) textures
ASTC* is an efficient texture compression technology which allows encoding of a wide variety of texture formats at bit-rates ranging from 8 bits per pixel to <1 bit per pixel. ASTC was developed under the cooperative process at Khronos and supports monochrome, luminance-alpha, RGB and RGBA formats, as well as X+Y and XY+Z formats for surface normals, and provides the flexibility for any format to be encoded at any bit rate. Uniquely, the encoding method is chosen independently for each block of pixels in the image, so that the coding adapts dynamically to most efficiently represent the image region-by-region.
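ASTC's range of bit rates comes from its fixed block size: every compressed block occupies 128 bits regardless of how many pixels it covers, so the bit rate is simply 128 divided by the block footprint. The footprints below are among the 2D block sizes defined by the ASTC specification.

```python
# Every ASTC block is 128 bits; the block footprint sets the bit rate.
BLOCK_BITS = 128

def astc_bpp(block_w, block_h):
    """Bits per pixel for a given ASTC 2D block footprint."""
    return BLOCK_BITS / (block_w * block_h)

rate_4x4 = astc_bpp(4, 4)      # 8.0 bpp  -- highest-quality end
rate_8x8 = astc_bpp(8, 8)      # 2.0 bpp
rate_12x12 = astc_bpp(12, 12)  # ~0.89 bpp -- below 1 bit per pixel
```

The 4x4 footprint gives the 8 bits per pixel upper end quoted above, while 12x12 blocks drop below 1 bit per pixel, matching the sub-1 bpp lower end of the range.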
PVRGC Geometry Compression*
To help manage the increasing geometry complexity of 3D scenes, PowerVR GPUs can include PVRGC (PowerVR Geometry Compression). PVRGC minimises memory usage by automatically compressing the intermediate geometry parameter data that is written to memory as part of the tiling process. The data is then automatically decompressed as it is read back into the GPU later in the pipeline, resulting in a significant reduction in the required memory bandwidth.
* Support is optional on some variants
Low Power, High Performance, Ultimate Efficiency with PowerGearing
From the outset PowerVR GPUs have been designed with high performance and low-power consumption in mind. The patented PowerVR technologies such as TBDR and Hidden Surface Removal (HSR) mean that our GPUs are inherently efficient at an architectural level. These fundamental efficiency advantages are complemented by Imagination’s PowerGearing™ advanced power management features such as multi-level clock gating, support for power islands and flexible power control mechanisms that can seamlessly interact with SoC-level power management / DVFS control blocks.
GPU Compute
GPU Compute is an increasingly important requirement for all GPU-based systems, from entry-level right up to high-end devices. In recognition of this, all PowerVR GPUs support industry-standard compute APIs such as Khronos OpenCL, Android Renderscript and OpenGL ES 3.1 compute shaders. The PowerVR architecture was designed from the outset to be a highly capable compute engine, with highly efficient, multi-level workload scheduling hardware working alongside shading engines that have been optimised not just for graphics, but for general-purpose compute workloads too.