If you’ve been following me on Twitter, you might have noticed a few recent posts focusing on past and present MIPS64 CPUs that have set new records in high performance and power efficiency.

Today I’d like to highlight a new MIPS64-based, high-performance architecture and two corresponding processors coming out of China courtesy of Loongson Technology. The company organized a swanky launch event where its chief architect introduced a 64-bit processor architecture called GS464E, and Loongson-3A2000 and 3B2000, two quad-core processors based on GS464E.

Building ultra-high performance MIPS64 CPUs at low power consumption

Both Loongson-3A2000 and 3B2000 are 4-way superscalar processors built on a 9-stage, super-pipelined architecture with in-order execution units, two floating-point units, a memory management unit, and an innovative crossbar interconnect. Several reports from China position 3A2000 as a flagship CPU aimed at the high-performance consumer electronics market (e.g. desktop computers and laptops, 64-bit embedded and DSP applications, and network routers), while 3B2000 will be used in a number of home-grown eight- and 16-core server systems.

A die shot of the new MIPS64-based Loongson processors

The 3B series also features Loongson-3B1500, a MIPS64-based superscalar processor clocked at 1.5GHz; platforms integrating an octa-core 3B1500 configuration can deliver up to 192 GFLOPS of peak performance at only 30W.
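The peak figure above can be sanity-checked with simple arithmetic. The sketch below uses only the numbers quoted in this post (eight cores, 1.5GHz, 192 GFLOPS, 30W). The per-core, per-cycle throughput is derived from those numbers rather than taken from a datasheet, so treat it as an implied value; the function names are illustrative.

```python
def implied_flops_per_cycle(peak_gflops, cores, clock_ghz):
    """FLOPs each core would need to complete per cycle to reach the quoted peak."""
    return peak_gflops / (cores * clock_ghz)


def gflops_per_watt(peak_gflops, watts):
    """Peak throughput divided by the quoted power figure."""
    return peak_gflops / watts


# Figures quoted in the post for an octa-core 3B1500 platform.
PEAK_GFLOPS = 192.0
CORES = 8
CLOCK_GHZ = 1.5
POWER_W = 30.0

per_cycle = implied_flops_per_cycle(PEAK_GFLOPS, CORES, CLOCK_GHZ)
efficiency = gflops_per_watt(PEAK_GFLOPS, POWER_W)

print(f"Implied FLOPs per core per cycle: {per_cycle:.1f}")
print(f"Peak GFLOPS per watt: {efficiency:.1f}")
```

Peak numbers like these describe a theoretical ceiling; sustained throughput on real workloads is typically lower.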

Loongson-3B1500 is a new high-performance MIPS64 chip

Loongson chief architect Hu Weiwu also exclusively confirmed to Imagination that his team plans to release two new chips in 2016: 3A3000 and 3B3000. These processors will be built on a leading 28nm process node and clocked closer to 2GHz, therefore increasing the company’s competitive advantage.

For now, benchmarking data released last week for Loongson-3A2000 shows the MIPS64-based powerhouse CPU surpassing several competing processors in performance efficiency:

Loongson-3A2000 offers competitive performance per GHz
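A per-GHz figure is a benchmark score divided by clock frequency, so multiplying it back by the clock recovers the absolute score. The following is a minimal sketch of that normalization; the chip names and numbers are placeholders chosen for illustration, not values from the chart.

```python
def score_per_ghz(score, clock_ghz):
    """Normalize an absolute benchmark score by clock frequency."""
    return score / clock_ghz


def absolute_score(per_ghz, clock_ghz):
    """Recover the absolute score from a per-GHz figure."""
    return per_ghz * clock_ghz


# Hypothetical (score, clock in GHz) pairs, NOT taken from the chart.
chips = {
    "chip_a": (1000.0, 1.0),
    "chip_b": (1800.0, 2.5),
}

for name, (score, ghz) in chips.items():
    normalized = score_per_ghz(score, ghz)
    print(f"{name}: {normalized:.0f} per GHz, {score:.0f} absolute at {ghz} GHz")
```

In this made-up example, chip_a leads per GHz while chip_b leads in absolute score, which is why both views are worth reading together.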

Thanks to a series of significant microarchitectural enhancements, performance figures for the new chips show a 2.7x improvement over the previous generation cores (Loongson-3A1000 and Loongson-3B1000, respectively).

Loongson GS464E and extending the MIPS64 architecture

The new GS464E architecture counts over 1,400 instructions grouped under a set called LoongISA; these instructions include:

  • MIPS64 Release 3 instructions for high-performance, general-purpose computing
  • LoongBT, a binary translation technology that enables developers to run x86 and ARM code
  • LoongVM instructions for custom virtual machines
  • LoongSIMD instructions for 128- and 256-bit vector arithmetic operations

All Loongson CPUs run LOONGNIX, an open source distribution of Linux optimized for the GS464E architecture.

Innovation in China’s semiconductor market

Originally launched in 2010 as the result of a public–private partnership, Loongson focused initially on R&D and academic activities in the field of 64-bit computing.

Presently, Loongson is gearing up to become a solid contender in the Asian embedded market. This new push was visible during the launch event where multiple partners showcased products using Loongson CPUs, including Sugon Information Industry, China Aerospace Science and Industry, Tsinghua Tongfang Co., Neusoft Corp. and Ruijie Networks.

A Loongson-based laptop running Linux

China is often seen as the world’s factory, making many of the devices and platforms we use on a daily basis. However, it imports well over 80% of the semiconductors it consumes, both for end products that are exported and for products that stay in China to be used by domestic consumers.

For that reason, the Chinese government sees semiconductors as a strategic industry, with domestic players having the potential to increase their market share versus overseas vendors. Therefore, investing in semiconductor design and manufacturing has become an important national strategy for China – and Loongson is capitalizing on this new wave of investment in Chinese semiconductor companies. After ten years of accumulated R&D and market exploration in the field of CPU architecture and design, Loongson has been steadily building a strong ecosystem of customers and gearing up for its next phase of expansion.

For more news and updates from MIPS, follow us on Twitter (@ImaginationTech, @MIPSguru) and come back to our blog.

 

Comments

  • tangey

    Your title states it “…breaks performance barrier”, but the body of your article does not identify the barrier, nor how it was broken ?

    • The statement is related to the performance barrier set by the previous generation. The new figures show a 2.7x improvement over the 3A/3B1000 family which is a remarkable engineering achievement.

      Regards,
      Alex.

      • tangey

        Hmmm, well ok, “new generation performs better than previous generation” isn’t exactly remarkable in the industry, especially considering the wikipedia entry indicates there has been a 4-5 year gap, and a switch from 65nm to 28nm.

        Apple often gets a x2 overall SoC performance in its year-on-year SoC improvements.

        • I think the performance you are talking about is graphics, not CPU. Increases in CPU performance tend to occur in increments; a 30-50% increase in CPU performance from one generation to the next is considered to be quite competitive.

          In this case we are talking about a much more substantial gain which deserves the praise in my opinion.

          • Mark Hahn

            I think the point here is that the chip is merely competitive. The fact that it’s significantly faster than previous iterations which used trailing edge processes – well, is that really something to brag about?

            also, specCPU-per-clock is not a measure anyone takes seriously.

          • Raj

            Then what is the “measure” that everyone takes seriously — marketing bull!

          • Mark Hahn

            spec-per-dollar. spec-per-watt. just spec. spec-per-clock is sorta like judging a vehicle by speed-per-RPM.

          • valerij

            how much about cpu performance will say your ‘spec-per-dollar’? you have said a stupid thing above, next comments only make it more sounding. per-clock measure shows exactly how fast cpu is disregarding clock speed, thus how good its internal microarchitecture is in respect of performance – how much instructions it execute at once (in average). this is clearly a measure, and it is very representative of cpu internals sophisticatedness. as it says in my land – you farted in the tub. 😀

          • handleym

            (1) “specCPU-per-clock”, ie IPC, is interesting insofar as it tells us something about the sophistication of the micro-architecture, and thus of the design team and their priorities. For comparison I’d expect the Apple numbers to be slightly below the Intel numbers. (Apple has a better IPC core than Intel, but their uncore is not yet as good, and SPEC stresses the uncore).

            (2) Apple got two great leaps in performance each of which could be considered around 2x if you look at the right benchmarks; the first going from the A5 to the A6 with their custom Swift core, the second going from the A6 to the A7 with the advantages of the ARMv8 ISA, 64 bit, and 6 wide rather than 3 wide.
            The A7 to A8 transition was more like 20% overall, with substantial variation depending on the exact benchmark. (And with an additional 8% or so boost just from compiler improvements from 2013 at A7 introduction to 2014 at A8 introduction.)

            The most realistic expectations for the A9 are the same sort of jump, of maybe 20 to 25% improvement, perhaps half from frequency going up to 1500 to 1600 MHz (from 1400MHz), the other half from improved micro-architecture. (Though the A9 may pick up a third core, and the A9X a fourth core.)

            This is not to say that the party is over for Apple (or other mobile vendors). There remain many interesting ideas for how to gain 5% here, 10% there, while maintaining energy budget. There is no reason to imagine we will have to settle soon for Intel style 2% a year improvements. But the easiest and lowest risk options have been used up. So far it’s just been copying what Intel, IBM (and, yes, MIPS) did years ago. From now on out it’s more novel ideas (various types of heterogeneous processing, low-voltage circuits, better cache-miss latency tolerance) and each of these ideas has to be simulated exhaustively before it can be concluded that it’s feasible and offers a good enough performance/energy tradeoff.

            Getting back to Loongson, what I’m seeing here is that their sophistication is around that of ARM a few years ago — higher IPC than A57, but that’s easy if you don’t have a tight power constraint; and not as sophisticated as A72 which should pretty much match the Loongson IPC at lower power.

            So, good for them, but still not quite world-class. But on track… Let’s see what they release next year.

          • Mark Hahn

            spec/clock is certainly not IPC. the real point is that clock is a design constraint, just like area or transistor delay: you produce a different design if you aim for 1.3 GHz rather than 2.6 GHz, even if you hold everything else constant. it’s no more informative than spec-per-layers-of-metal.

          • Obtaining the final SPEC CPU2000 scores means multiplying the value displayed above with the frequency of each processor.

        • Alex Alexandrewitsch

          Lol

  • TimBob

    I’m impressed by these CPUs. It shows just how far China has come in processor design capability. These chips will not be competing against Intel, AMD or even ARM; these chips are for national consumption only, so they basically have to be ‘good enough’. I think they are beyond ‘good enough’ and so represent a very nice home-grown (therefore more secure) alternative for the Chinese government to install into government servers, desktop PCs and laptops. Key features are increased performance, virtualisation and improved compatibility with x86 and ARM architectures.

  • LDM

    Hi Alex,
    This cpu seems to be targeting laptops and small desktop rather than mobiles and tablets.
    Will you have a mobile/tablet version in your plans?

    L

    • Loongson licensed the MIPS architecture with the direct goal of building a family of desktop and embedded-focused processors. These chips indeed have more breathing room in terms of power consumption since they target high-performance applications like laptops or servers.

      Our current Warrior designs are indeed focused on mobile where performance/watt and area efficiency are very important metrics. This is where our 64-bit I6400 CPU fits the bill perfectly.

      Regards,
      Alex.

      • Alex Alexandrewitsch

        Okay, do you have any plans for the high end? I want independence and security. I want to play games without compromises.

  • Alex Alexandrewitsch

    Please, we need a high-end GPU that supports Vulkan! It’s a great development. Thank you, I want more news.

  • Mehmed

    Great. Now we need a 2016 version of this and an ARM cortex a72 to see which one is better.

  • Ed Pell

    Do you have contact info for Loongson? I would like to apply for a job with them. My area is physical design verification so far down to 10nm. edpell aatt optonline doot net

    • Fred Bosick

      You might be very good at your job, but this chip is a flagship of Chinese industry. They want it to be totally homegrown (never mind that it’s derived from an open-sourced MIPS design from the ’90s). Besides, they might have all sorts of TPM technology embedded and they don’t want Westerners to see how deeply ingrained it really is.

  • Fred Bosick

    The last in-order processor Intel offered was the Pentium. There is no way Loongson beats AMD unless the benchmark was handwritten machine code.