Original Link: https://www.anandtech.com/show/819
Intel Developer Forum - Fall 2001: CPU Report
by Anand Lal Shimpi on August 28, 2001 1:56 PM EST - Posted in CPUs
While we were at the show yesterday, today is really the first official day of IDF. And in the usual IDF tradition, there were a number of announcements and demonstrations that required us to give you a report directly from the show floor.
Northwood at 3.5GHz
If you weren't too impressed by the Pentium 4 2.0GHz launch yesterday, then today's demo of a 0.13-micron Northwood Pentium 4 processor running at 3.5GHz should pique your interest.
Although Intel clocked their Northwood at 3.5GHz, the pretty intensive demo they ran through had the CPU clocked at 3.0GHz. The presenters seemed surprised that they were actually up and running at 3.5GHz, indicating an unexpectedly high yield on the part. We were slightly disappointed to find out that the system was supercooled, but it does show off the potential of the chip. Later today Intel will be showing off an air-cooled 4GHz double-pumped ALU (effectively 8GHz) from a future Intel processor. The interesting thing about this demo is that it will be a 32-bit ALU that's being shown off…
The 3.0GHz demo had the machine running Quake III Arena in a window while serving as a video management server for other computers in the same household. The digital video feeds encoded and sent out by the 3.0GHz system, combined with the run of Quake III Arena, kept the CPU utilization at 100%. The fact that the demo system did not crash while running at 3GHz with 100% utilization indicates that the yield on the CPU being used was very high.
A 2.2GHz Northwood Pentium 4 was also used during a separate demonstration of some of the new features of Windows XP.
With Northwood's 512KB L2 cache and very high clock speeds, AMD will find it very difficult to compete using their current line of Athlon processors. Luckily for AMD, the Pentium 4 will only hit 2.2GHz this year with the Northwood core.
Now that the Pentium 4 is finally shedding its low introductory clock speeds, the Athlon vs. Pentium 4 debate is going to get a lot more interesting over the next few months.
With Northwood, Intel will also be introducing mobile Pentium 4 devices in the next year. Although AMD's Athlon 4 has brought AMD a high-performance mobile solution, they have yet to compete with Intel in the mobility facet of notebook computing. A repackaged, cooler running 0.13-micron Athlon 4 will be necessary for AMD to be taken seriously in the mobile market. It's coming, but it's not here yet.
A little on Banias
As you may already know, the Intel Timna design team out of Israel has been working on an entirely new microprocessor known as Banias. Banias has been designed from the ground up to be a low power, high-performance solution primarily targeted at the mobile market.
Intel revealed a bit about how Banias achieves its low power and high performance characteristics.
One of the most common ways of reducing power with mobile CPUs is by clock gating. The idea of clock gating is simple: areas/units of the CPU that aren't being used have their clock signal shut off, so they stop switching and stop burning power. According to Intel, Banias will have even more aggressive clock gating, enabling many more areas of the CPU to be shut off when not being utilized.
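To make the clock gating idea concrete, here is a minimal toy model in Python. It is only a sketch: the unit names, the power figures and the `cycle_power` helper are all made up for illustration, and none of them come from Intel.

```python
# Toy model of clock gating. All unit names and power figures are
# hypothetical and chosen only for illustration.
from dataclasses import dataclass


@dataclass
class Unit:
    name: str
    clock_power_mw: float  # power the unit burns while it is being clocked
    busy: bool             # whether the unit has work this cycle


def cycle_power(units, gating_enabled):
    """Sum the power of all clocked units for one cycle."""
    total = 0.0
    for unit in units:
        if gating_enabled and not unit.busy:
            continue  # idle unit: its clock is gated off, so it adds nothing
        total += unit.clock_power_mw
    return total


UNITS = [
    Unit("integer ALU", 400.0, busy=True),
    Unit("FPU", 600.0, busy=False),
    Unit("SIMD", 500.0, busy=False),
    Unit("load/store", 300.0, busy=True),
]

ungated = cycle_power(UNITS, gating_enabled=False)
gated = cycle_power(UNITS, gating_enabled=True)
print(f"without gating: {ungated:.0f} mW, with gating: {gated:.0f} mW")
```

The finer the granularity at which units can be marked idle, the more often the `continue` branch is taken, which is roughly what "more aggressive" gating would mean in this model.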
When AMD released the Palomino core they claimed a 20% decrease in power consumption by use of more specialized transistors and design optimizations; the Banias processor implements similar transistor choice and sizing techniques but on a much more extreme level.
The final architectural feature of Banias that was discussed today was what Intel calls "Micro Ops Fusion." You'll remember from our previous explanations that Micro Ops are the decoded instructions that the CPU's execution units actually work with. Micro Ops Fusion takes these Micro Ops and combines them to be executed in a much more resourceful fashion. The goal of Micro Ops Fusion is to make more efficient use of the execution units which is actually a goal of another Intel technology introduced today.
The end result of Micro Ops Fusion is an increased number of Instructions (Processed) Per Clock (IPC).
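Intel did not go into detail on which Micro Ops get combined, so the following Python sketch is purely illustrative. It assumes one plausible pairing rule (a load immediately followed by an ALU operation that consumes the loaded value), and the `MicroOp` type, the `fuse` function and the register names are all hypothetical.

```python
# Toy illustration of fusing micro-ops. The pairing rule (a load immediately
# followed by an ALU op that reads the loaded value) is an assumption made
# for this sketch, not Intel's disclosed design.
from typing import List, NamedTuple, Optional


class MicroOp(NamedTuple):
    kind: str            # "load", "alu", or "fused"
    dest: str            # register written
    src: Optional[str]   # register or memory operand read, if any


def fuse(uops: List[MicroOp]) -> List[MicroOp]:
    """Merge each load + dependent ALU pair into a single fused micro-op."""
    fused = []
    i = 0
    while i < len(uops):
        cur = uops[i]
        nxt = uops[i + 1] if i + 1 < len(uops) else None
        if (nxt is not None and cur.kind == "load"
                and nxt.kind == "alu" and nxt.src == cur.dest):
            # One entry now stands in for two: it reads the load's source
            # and writes the ALU op's destination.
            fused.append(MicroOp("fused", nxt.dest, cur.src))
            i += 2
        else:
            fused.append(cur)
            i += 1
    return fused


STREAM = [
    MicroOp("load", "t0", "mem_a"),
    MicroOp("alu", "eax", "t0"),
    MicroOp("alu", "ebx", "eax"),
    MicroOp("load", "t1", "mem_b"),
    MicroOp("alu", "ecx", "t1"),
]

fused_stream = fuse(STREAM)
print(len(STREAM), "micro-ops before,", len(fused_stream), "after")
```

In this model the same work is tracked with fewer entries, which is one way fewer Micro Ops could translate into more instructions completed per clock.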
Jackson gets a name: Hyper-Threading Technology
During our Spring IDF 2001 coverage we introduced the idea of Simultaneous Multi-Threading (SMT) architecture being present in the upcoming 4-way Xeon MP processors. Today, Intel made it official and introduced the marketing name for Jackson Technology/SMT: Hyper-Threading Technology.
There are many types of parallelism that can be achieved on a microprocessor level. Instruction Level Parallelism is the simultaneous issuing and execution of multiple instructions. Thread Level Parallelism, the simultaneous execution of multiple threads, was previously only attainable through the use of multiple processors. A thread is the smallest unit of execution that an Operating System can issue to the processor(s) in the system, the inherent limitation being that only a single thread can be sent to a single processor at any given time. The problem is that a single thread rarely makes 100% use of the execution power of the CPU.
In order to attain Thread Level Parallelism you'd naturally have to make use of more than one processor since you're limited to one thread executed per CPU at any given time. Another way to execute multiple threads in parallel is by enabling virtual multiprocessing where the OS thinks that your single CPU is actually two or more CPUs and issues it more than one thread to execute. This is what Simultaneous Multi-Threading (SMT) technology enables.
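A toy simulation helps show why issuing two threads to one CPU can pay off. The Python sketch below is not a model of Intel's hardware: the 4-wide issue width, the per-cycle instruction counts and the simple "thread A first, thread B fills what is left" policy are all assumptions made for illustration.

```python
# Toy model of SMT: a CPU with a fixed number of issue slots per cycle.
# Each list entry is the number of instructions a thread has ready in one
# step of its execution. All numbers here are hypothetical.
ISSUE_SLOTS = 4  # assumed issue width


def run_sequential(threads):
    """One thread at a time: every step of every thread costs a full cycle."""
    return sum(len(thread) for thread in threads)


def run_smt(thread_a, thread_b, slots=ISSUE_SLOTS):
    """Issue thread A's next step each cycle and let thread B's next step
    share the cycle whenever it fits in the slots A leaves unused."""
    a = b = cycles = 0
    while a < len(thread_a) or b < len(thread_b):
        free = slots
        if a < len(thread_a):
            free -= thread_a[a]
            a += 1
        if b < len(thread_b) and thread_b[b] <= free:
            free -= thread_b[b]
            b += 1
        cycles += 1
    return cycles


THREAD_A = [3, 1, 4, 2, 3, 1]
THREAD_B = [2, 2, 1, 3, 1, 2]

sequential_cycles = run_sequential([THREAD_A, THREAD_B])
smt_cycles = run_smt(THREAD_A, THREAD_B)
total_instructions = sum(THREAD_A) + sum(THREAD_B)

print("sequential:", sequential_cycles, "cycles,",
      f"{total_instructions / (sequential_cycles * ISSUE_SLOTS):.0%} of slots used")
print("SMT:       ", smt_cycles, "cycles,",
      f"{total_instructions / (smt_cycles * ISSUE_SLOTS):.0%} of slots used")
```

The amount of work is identical in both runs; the only difference is that slots one thread would have left empty are handed to the other thread.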
Intel announced Hyper-Threading technology, originally known under the codename Jackson, which is the marketing spin on SMT. We have hypothesized in the past that what is now known as Hyper-Threading technology has been on all of the Pentium 4 dies created up to this point; it was simply not enabled. Hyper-Threading will finally make its debut next year on the 2-way/4-way Xeon processors and hopefully by the end of the year it will transition down to the desktop level in the Pentium 4.
Hyper-Threading requires little more than OS support for multiple thread execution. In this case, OSes like Windows XP Professional will see the single Hyper-Threading enabled processor as being two CPUs.
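The "one chip, two CPUs" view is visible to software as well. As a small sketch, Python's standard `os.cpu_count()` reports the number of logical CPUs the OS exposes, so on a machine with SMT enabled it would be expected to return more than the number of physical cores; the variable name is our own.

```python
import os

# os.cpu_count() returns the number of logical CPUs the OS reports,
# or None if that number cannot be determined.
logical_cpus = os.cpu_count()
print("Logical processors visible to the OS:", logical_cpus)
```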
Intel actually gave a demonstration of the performance improvement offered by Hyper-Threading by comparing a single Xeon processor with HT enabled and disabled. Under Maya 4.0 the HT enabled Xeon processor was able to render a scene 30% faster than the HT disabled Xeon processor. The systems were identical in configuration.
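"30% faster" can be read two ways, and Intel did not give the actual render times. The short Python sketch below works through both readings against a hypothetical 100-second baseline; the baseline and both function names are assumptions made for illustration.

```python
# Two readings of "30% faster", applied to a hypothetical baseline time.
def time_if_throughput_up(baseline_s, pct):
    """Reading 1: the CPU gets pct% more work done per second."""
    return baseline_s / (1 + pct / 100)


def time_if_duration_down(baseline_s, pct):
    """Reading 2: the render finishes in pct% less time."""
    return baseline_s * (1 - pct / 100)


BASELINE_S = 100.0  # hypothetical render time with HT disabled

higher_throughput = time_if_throughput_up(BASELINE_S, 30)
shorter_duration = time_if_duration_down(BASELINE_S, 30)
print(f"30% more throughput: {higher_throughput:.1f} s")
print(f"30% less time:       {shorter_duration:.1f} s")
```

Either way the gain is substantial for a feature that adds no second processor, but the two readings differ by several seconds on a render of this length.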
We often complain about the lower IPC of the Pentium 4, but when HT comes down to the desktop level the IPC of the processor will be improved by a potentially very significant margin.
That's all for now…
For us, it's off to a meeting with AMD then NVIDIA (yep, at IDF) and as usual we'll keep you up to date with everything that happens at the conference. Until then, enjoy ;)