
MIPS® P5600 Superscalar Multiprocessor Core Family


The MIPS P5600 processor IP core is the first member of the Warrior generation of MIPS CPU cores from Imagination Technologies. It delivers industry-leading 32-bit performance with class-leading low power characteristics, in a silicon footprint up to 30% smaller than comparable alternatives. The P5600 CPU core was designed to provide the performance and features required for tomorrow’s mainstream connected consumer electronics, including smartphones, tablets, connected TVs and set-top boxes. Its rich and broad feature set also extends its applicability to a variety of networking applications, from residential gateways to network appliances and microservers.

The MIPS P5600 CPU is based on a wide-issue, deeply out-of-order (OoO) implementation of the MIPS32 architecture, supporting up to six cores in a single cluster with high-performance cache coherency. Complementing this raw horsepower, the core is the first in the MIPS CPU lineup to include 128-bit integer and floating-point SIMD processing, hardware virtualization, and enhancements to physical and virtual addressing.

P5600 CPU Summary

The MIPS P5600 processor IP core delivers top-line performance while being the most efficient CPU core in its class, making it ideal for both mobile and digital home applications in the rapidly growing connected consumer electronics market.

It builds upon the existing proAptiv family microarchitecture, adding 128-bit SIMD, hardware virtualization with hardware table walk, 40-bit eXtended Physical Addressing (XPA), and substantial gains in performance on system-oriented software workloads. The P5600 CPU also exceeds 5 CoreMark/MHz per core, significantly higher than any published score for licensable IP cores in the industry, and achieves 3.5 DMIPS/MHz, matching or exceeding other high-end IP cores.

The P5600 processor delivers this performance in a much smaller silicon footprint than leading IP core alternatives: up to 30% less silicon area, given a common process geometry and similar configurations and synthesis techniques. SoC designers can use this efficiency advantage for significant cost and power savings, or to implement additional cores to deliver a performance advantage against competing silicon.

P5600 Applications

Market Applications and Target Performance

Mobile
  • High-end tablet / smartphone application processor
  • 1.0 to 1.7 GHz [1]
  • Single to quad cores

Digital Home
  • High-end connected DTV / STB application processor
  • 1.0 to 2.0 GHz [1]
  • Single to quad cores

Networking
  • 802.11ac routers, residential gateways, CPE modems, 3G/4G cell infrastructure control plane, network appliances and microservers
  • 1.0 to > 2.0 GHz [1]
  • Single to hex cores

[1] Preliminary metrics based on pre-production RTL and comparable results on the proAptiv CPU family. Higher-end frequencies are readily achievable with the use of more aggressive implementation techniques and physical libraries.

P5600 Benefits

  • 128-bit SIMD - accelerates execution of audio, video, graphics, imaging, speech and other DSP-oriented software algorithms, with an instruction set designed for development in high-level languages such as C and OpenCL
  • Hardware virtualization – supports multiple software environments running independently, securely, efficiently and in complete isolation from each other
  • Advanced addressing extensions for Enhanced Virtual Address (EVA) and eXtended Physical Address (XPA)
    • EVA enables 3GB+ Linux (and similar OS) implementations without the use and overhead of HIGHMEM
    • XPA extends physical addressing up to 1 Terabyte (40 bits)
  • Multiple context security platform for enterprise/consumer partitioning, secure content access, payments/transactions, and isolating secure schemes from numerous content sources
  • Sophisticated branch prediction for maximizing utilization and performance on a deeply pipelined CPU
  • Load/Store bonding for optimum data movement performance
  • Industry-leading benchmark and real-world performance at smaller area and power than competing solutions
  • Broad software and ecosystem support and mature toolchain
  • Available as synthesizable IP, for implementation in any process node, with standard cells and memories
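The XPA figure in the list above can be checked with a little arithmetic. The Python sketch below is only an illustration: it assumes "1 Terabyte" is meant in the binary sense (2^40 bytes), and the helper name `addressable_bytes` is hypothetical, not part of any MIPS tool or API.

```python
# Address-space sizes implied by the bit widths quoted above.
# Assumption: "1 Terabyte" is read in the binary sense (2**40 bytes).

def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits

GIB = 2 ** 30
TIB = 2 ** 40

base_32bit = addressable_bytes(32)  # plain 32-bit physical addressing
xpa_40bit = addressable_bytes(40)   # 40-bit XPA

print(base_32bit // GIB, "GiB")         # 4 GiB
print(xpa_40bit // TIB, "TiB")          # 1 TiB
print(xpa_40bit // base_32bit, "x")     # 256 x
```

Under that reading, 40-bit XPA addresses 256 times the physical memory reachable with a plain 32-bit physical address.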

P5600 Features

Base Core Features

  • 32-bit MIPS32® Release 5 Instruction Set Architecture
  • High-performance, 16-stage, wide issue, out-of-order (OoO) pipeline
    • Quad instruction fetch per cycle
    • Triple bonded dispatch per cycle
    • Instruction peak issue of 4 integer and 2 SIMD operations per cycle
    • Sophisticated branch prediction scheme, plus L0/L1/L2 branch target buffers (BTBs), Return Prediction Stack (RPS), Jump Register Cache (JRC)
    • Instruction bonding – merges two 32-bit integer accesses into one 64-bit access, or two 64-bit floating-point accesses into one 128-bit access, for up to a 2x performance increase on memory-intensive data movement routines
  • L1 cache size for Instruction and Data of 32KB or 64KB each, 4-way set associative
  • New high-performance dual-issue 128-bit SIMD Unit - optional
    • 32 x 128-bit register set, 128-bit loads/stores to/from SIMD unit
    • Native data types:
      • 8-/16-/32-bit integer and fixed point, 16-/32-/64-bit floating point
    • IEEE-754 2008 compliant
    • Runs at full speed with CPU core
  • Full hardware virtualization
    • Provides root and guest privilege levels for kernel and user space
    • Supports multiple guests, with a full virtual CPU per guest, so guest OSs run unmodified
    • Separate TLBs and COP0 contexts for root and guests, giving full isolation, fast context switching, and exception and interrupt handling by root
    • HW table walk support in TLB for optimal performance
    • Complete SoC virtualization support (IOMMU and interrupt handling – see multi-core features)
  • Programmable Memory Management Unit (MMU)
    • Enhanced Virtual Address (EVA) - Programmable kernel and user segment sizes
    • eXtended Physical Address (XPA) – Extends the physical address to 40 bits (1 TB)
    • 1st level micro TLBs (uTLBs) – 16 entry instruction TLB, 32 entry data TLB
    • 2nd level TLBs – simultaneous access, variable and fixed page sizes
      • 64x2 entry VTLB, 512x2 entry 4-way set associative FTLB
    • Hardware table walk for fast page refills
  • Power Management Features
    • Multi-core cluster power controller (CPC):
      • Register-based, visible to/controllable by operating system
      • Per CPU voltage domain gating; per CPU clock gating
      • Cluster level DVFS capable
    • Core level
      • Coarse and fine-grained clock gating throughout core
      • Way prediction on data and instruction L1 caches
      • Instruction and register-based sleep modes
  • EJTAG/PDtrace debug blocks and interface
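The bonding feature listed above (two 32-bit accesses merged into one 64-bit access, or two 64-bit accesses into one 128-bit access) can be pictured with a toy model. The Python sketch below is not a description of the P5600 hardware: the pairing rule (same width, second access starting exactly where the first ends) and the names `bond` and `BONDABLE_WIDTHS` are assumptions made purely for illustration.

```python
# Toy model of access bonding: pair up neighbouring same-size accesses.
# Assumption: two accesses bond only when they have the same width and the
# second starts exactly where the first ends. The real hardware conditions
# are not given in the text above.

from typing import List, Tuple

Access = Tuple[int, int]  # (byte address, width in bits)

BONDABLE_WIDTHS = (32, 64)  # 32+32 -> 64 and 64+64 -> 128, as in the feature list

def bond(accesses: List[Access]) -> List[Access]:
    """Merge consecutive adjacent accesses of equal width into one of double width."""
    out: List[Access] = []
    i = 0
    while i < len(accesses):
        addr, width = accesses[i]
        if width in BONDABLE_WIDTHS and i + 1 < len(accesses):
            next_addr, next_width = accesses[i + 1]
            if next_width == width and next_addr == addr + width // 8:
                out.append((addr, width * 2))  # one bonded access replaces two
                i += 2
                continue
        out.append((addr, width))
        i += 1
    return out

# Copying 32 bytes with 32-bit loads: eight accesses before bonding.
loads = [(0x1000 + 4 * k, 32) for k in range(8)]
bonded = bond(loads)
print(len(loads), "->", len(bonded))  # 8 -> 4
```

In this idealised case every access finds a partner, so the access count halves, which is the "up to 2x" upper bound; accesses that are not adjacent stay unbonded and see no gain.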

Coherent Multi-Core Processor Features

  • Superscalar, deeply OoO multi-core processor
  • Complete multi-core system designed for maximum cluster-level bandwidth
    • Coherence manager – supports multi-core configurations up to six cores in a single cluster
    • High-bandwidth 256-bit internal data paths and external system interface
    • Integrated L2 cache (L2$): 4-way set associative, up to 8MB of memory
      • ECC option on L2$ RAM for higher data reliability
      • Configurable wait states to RAM for optimal L2$ design
      • L2$ hardware pre-fetch for higher throughput and performance
    • Up to two IO Coherence Units (IOCU) per coherent processing system
    • Cluster Power Controller (CPC) for voltage/clock gating per-CPU
    • 256-interrupt Global Interrupt Controller (GIC)
    • Virtualization support at system level – IOCUs have IO MMU, and GIC has virtualized interrupts
  • Advanced debug capabilities – PDtrace subsystem allows visibility to core- and cluster-level trace information

P5600 Specifications

Target: TSMC 28HPM
Frequency: 1 GHz - 2+ GHz*
CoreMark/MHz (per core): > 5
Total CoreMark @ 1.5 GHz: > 7500 per core
DMIPS/MHz (per core): 3.5
Total DMIPS @ 1.5 GHz: > 5250 per core

Notes: Frequencies indicated are based on pre-production P5600 RTL, compared with results for a fully floorplanned dual-core proAptiv implementation, and range from 12T SVt area-optimized in the worst-case silicon corner to 12T MVt speed-optimized in typical-corner silicon. Final production RTL results may vary.
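The per-core totals quoted at 1.5 GHz follow directly from the per-MHz scores. The short Python sketch below reproduces that arithmetic; the function name `total_score` is hypothetical, and the CoreMark input of 5.0 is the stated lower bound ("> 5"), so the CoreMark result is likewise a lower bound.

```python
# Per-core totals = per-MHz score x clock frequency in MHz.

def total_score(per_mhz: float, freq_mhz: float) -> float:
    """Scale a per-MHz benchmark score to a given clock frequency."""
    return per_mhz * freq_mhz

FREQ_MHZ = 1500.0  # 1.5 GHz, the frequency used in the specification list

coremark_total = total_score(5.0, FREQ_MHZ)  # lower bound, since the score is "> 5"
dmips_total = total_score(3.5, FREQ_MHZ)

print(coremark_total)  # 7500.0
print(dmips_total)     # 5250.0
```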

Each base core configuration:

  • 32KB Data/Inst L1 caches with parity, BIST
  • New high-speed Integer + Floating Point (SP and DP) SIMD unit
  • Fully-featured MMU, using multi-level TLB (I/D uTLBs + 128 entry VTLB + 1024 entry FTLB)
  • PDtrace™ debug
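The TLB sizes in this configuration (128-entry VTLB, 1024-entry FTLB) line up with the "64x2" and "512x2" figures in the feature list. The Python sketch below shows the correspondence; reading "Nx2" as N dual entries, each covering a pair of pages, is an assumption about the notation rather than something stated in the text.

```python
# Relating the "Nx2" TLB figures to the entry counts quoted above.
# Assumption: "Nx2" means N dual entries, each mapping a pair of pages.

PAGES_PER_DUAL_ENTRY = 2

vtlb_entries = 64 * PAGES_PER_DUAL_ENTRY    # "64x2 entry VTLB"
ftlb_entries = 512 * PAGES_PER_DUAL_ENTRY   # "512x2 entry ... FTLB"

print(vtlb_entries)  # 128
print(ftlb_entries)  # 1024
```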

Multi-core cluster configuration:

  • Dual fully-configured P5600 cores per above
  • Coherence Manager + integrated 1MB L2$ w/ECC
  • One hardware IO Coherence Unit (IOCU) port
  • Cluster level PDtrace

Implementation libraries/parameters – speed optimized, based on:

  • TSMC 28HPM 12T standard cells + Synopsys memories
  • Worst case, slow-slow corner silicon (zero temp, WCZ) with 10% OCV + 25ps clock jitter margins, except where noted at typical silicon

