The Full Monte-cito (part 6)
Beyond Multicore
Just as the era of increasing the frequency came to an end, so the era of multicore will come to an end eventually. We don’t know when, however.
The Itanium Solutions Alliance thinks that this process of shrinking and adding cores shall continue for some time to come. Multi-core holds “the promise of ongoing performance scaling through this decade and beyond” (p. 5).
The number of cores in a CPU should continue to increase for as long as die sizes shrink. Transistors, however, cannot shrink forever. At some point the current rate of shrinking has to end, and the CPU industry will have hit its second wall. It won’t be possible to keep doubling the number of cores at the same rate as in the past.
Maybe this won’t happen anytime soon. Maybe the era of multicore has a long life in front of it. However, maybe the end will come sooner than everyone thinks. The gigahertz era came to an untimely end, and no one seemed to see the end coming.
Whenever the end comes, the computing industry will have to turn to alternative methods of increasing performance if it is to keep up the pace of the past.
By then, the current 32-bit Intel Architecture (IA-32) should have run its course. There’s only so much additional performance that can be squeezed out of IA-32 through microarchitectural innovations and other techniques.
The EPIC architecture of Itanium and Itanium 2, however, was designed with the limits of clock frequency and transistor size in mind.
Back when the chip was under development, CPU architects could see that transistors could not shrink forever, and that the days of increasing the CPU frequency were numbered, too.
The Itanium family therefore strives to increase performance by increasing the number of things that a processor does at a time. Sound familiar?
The Itanium 2 microarchitecture is in many ways the antithesis of the NetBurst microarchitecture. NetBurst was designed primarily to increase performance by increasing the CPU clock.
Itanium 2, however, is engineered to increase performance through increasing the number of instructions that can be carried out during a given clock cycle, that is, by increasing the instructions per clock (IPC).
Compare Itanium 2 with Intel’s latest 32-bit design, the Core microarchitecture, which is also optimized for greater instructions per clock.
One of the advantages of the Core microarchitecture is its ability to execute up to four instructions per clock cycle. That’s an improvement over the Pentium D and Core Duo, which could carry out only three instructions at a time.
Itanium 2, however, can execute up to 6 instructions for every clock cycle.
And that’s just the beginning. The architecture was designed to scale using microarchitectural innovations. To this end, the Itanium 2 CPU contains 128 general-purpose registers.
Ever wonder why the Itanium CPUs are clocked so low? Clock frequency is less important to Itanium 2 in part because it can perform more work for each clock cycle.
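The trade-off between clock speed and instructions per clock can be put into rough numbers. The sketch below is only a back-of-the-envelope illustration: the issue widths (3, 4 and 6) come from the figures quoted above, but the clock frequencies are hypothetical values chosen for the example, not measured specifications, and the function names are made up for this sketch. It computes a theoretical ceiling (instructions per clock multiplied by clocks per second), which real workloads rarely reach.

```python
def peak_instructions_per_second(issue_width, clock_hz):
    """Theoretical ceiling: instructions per clock times clocks per second."""
    return issue_width * clock_hz


def break_even_clock_hz(issue_width, rival_issue_width, rival_clock_hz):
    """Clock a design needs in order to match a rival's theoretical ceiling."""
    return rival_issue_width * rival_clock_hz / issue_width


# Issue widths are taken from the article; the clock speeds are
# HYPOTHETICAL and exist only to make the arithmetic concrete.
designs = {
    "3-wide design (Pentium D / Core Duo class)": (3, 3_000_000_000),
    "4-wide design (Core microarchitecture)": (4, 2_600_000_000),
    "6-wide design (Itanium 2)": (6, 1_600_000_000),
}

for name, (width, clock_hz) in designs.items():
    peak = peak_instructions_per_second(width, clock_hz)
    print(f"{name}: {peak / 1e9:.1f} billion instructions/s at best")

# How slowly could a 6-wide chip run and still match a 3-wide chip
# at the hypothetical 3 GHz?
needed = break_even_clock_hz(6, 3, 3_000_000_000)
print(f"Break-even clock for the 6-wide design: {needed / 1e9:.1f} GHz")
```

Under these assumptions, a chip that issues twice as many instructions per cycle needs only half the clock to reach the same ceiling. Whether it sustains that rate depends on how well the compiler and the workload keep all of the issue slots busy.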
Maybe when the size of transistors has become so small that they can shrink no more, and the number of cores can no longer be doubled, and the frequency can be pushed no further, maybe then the IT industry will turn to the Itanium design to continue its onward march in performance. Intel would certainly like that.