ARM to take multicore mainstream with A5
ARM intends to bring multiprocessing to standard cellphone handsets with the launch of a processor core that is much simpler than the high-end Cortex A9 but which is code compatible down to the firmware level.
Eric Schorn, vice president of marketing for ARM’s processor division, said the Cortex A5 is roughly the same size in terms of die area as the existing ARM 926 but runs code faster than the ARM 11, which is used in many of today’s handsets and portable devices.
“This particular processor is targeted at the really high-volume, high-value sweetspot that is currently occupied by the ARM 9 and ARM 11,” said Schorn. “It will be used in handsets but also in digital photoframes, still cameras, networking gear and Kindle-type devices. And it approaches the microcontroller category.”
Schorn said that, as the A5 is compatible with the A9 down to the firmware level and uses the same multiprocessor model, it would ultimately supplant the existing A8.
Unlike the A9, the A5 does not have a full superscalar pipeline and processes instructions entirely in order. However, it can despatch two instructions in parallel if they target different parts of the core. For example, branches can be processed in parallel with arithmetic instructions.
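The pairing rule can be illustrated with a toy model. The sketch below is not ARM's actual issue logic: it assumes only that two adjacent instructions can go out in the same cycle when they use different units, and it ignores data dependencies altogether. The unit names "alu" and "branch" and the function name are hypothetical labels chosen for the example.

```python
def count_issue_cycles(units):
    """Count issue cycles for an in-order instruction stream.

    Toy assumption: two adjacent instructions issue in the same cycle
    only if they target different units. Data dependencies are ignored.
    """
    cycles = 0
    i = 0
    while i < len(units):
        # Pair the current instruction with the next one if their units differ.
        if i + 1 < len(units) and units[i] != units[i + 1]:
            i += 2
        else:
            i += 1
        cycles += 1
    return cycles


# Hypothetical stream: unit used by each instruction, in program order.
stream = ["alu", "branch", "alu", "alu", "alu", "branch"]
print(count_issue_cycles(stream))  # 4 cycles for 6 instructions
```

In this model a run of arithmetic instructions issues one per cycle, while an arithmetic instruction followed by a branch shares a cycle, which is the kind of overlap described above.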
Four instances of the A5 would deliver aggregate performance exceeding that of one or two A9 processors. “They do overlap but that overlap provides choice,” said Schorn.
ARM estimates that a single A5 core with its first-level cache will consume a little under 1sq mm of die area on a 45nm-class process. Support for multiprocessor operation adds 5 per cent to the overall area. Schorn claimed that, implemented on TSMC’s 40G process, the clock speed of the A5 could hit 1GHz. This drops to around 500MHz on the low-power version of TSMC’s 40nm process.
“If you put down four cores, you are still half the size of the Intel Atom and you would outperform it. And power would still be less than half that of the Atom,” Schorn claimed, conceding that to get the full performance, parallel threads need to be running. “You also have the advantage that you can turn each processor on and off as necessary and you can lower the voltage and frequency dynamically. For example, if you have a browser with four threads providing an 80 per cent loading on a single processor, if you distribute each thread to a different core, you could then lower your voltage and frequency to save power on each one. You should get a 50 per cent saving in power.
“Often, people think ‘wow, four cores, that’s four times the power’, but in the cases where it does work you can cut power, which is kind of counter-intuitive,” Schorn added.
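Schorn's arithmetic can be sanity-checked with the textbook model of dynamic CMOS power, in which power is proportional to capacitance times voltage squared times frequency. The sketch below is an illustration under that assumption, not ARM's own calculation. It ignores leakage and the overhead of the multiprocessor logic, and the function names and the 0.7 voltage scale are hypothetical values chosen for the example.

```python
def dynamic_power(voltage, frequency, capacitance=1.0):
    """Textbook dynamic power model: P = C * V^2 * f (leakage ignored)."""
    return capacitance * voltage ** 2 * frequency


def multicore_power_ratio(cores, voltage_scale):
    """Power of `cores` cores relative to one core at full voltage and clock.

    Assumes the work splits evenly, so each core runs at 1/cores of the
    original frequency, and that the voltage can be scaled by
    `voltage_scale` at that lower frequency.
    """
    baseline = dynamic_power(1.0, 1.0)
    per_core = dynamic_power(voltage_scale, 1.0 / cores)
    return cores * per_core / baseline


# Four cores at a quarter of the clock with no voltage reduction: no saving.
print(multicore_power_ratio(4, 1.0))

# Four cores at a quarter of the clock and 70 per cent of the voltage.
print(round(multicore_power_ratio(4, 0.7), 2))
```

Under these assumptions the saving comes entirely from the voltage reduction: spreading the threads across four cores at the same voltage saves nothing, while dropping the voltage to about 70 per cent roughly halves the power, in line with the 50 per cent figure quoted. Whether the A5 can run at that voltage is not something the model can say.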
Although some customers have already signed up for the A5 and have access to the current design files, the core will be made generally available at the end of the year. Products using the core will probably arrive after 2011, said Schorn.