Intel Xeon 7460: Six Cores to Bulldoze Opteron
by Johan De Gelas on September 23, 2008 12:00 AM EST - Posted in IT Computing
Limitations of this report
We are happy that we finally feel comfortable with most of our virtualization testing. We still have to do some in-depth profiling to be completely sure what is going on, but we decided to not wait any longer. This is only the beginning, though. We have tested several other virtualization scenarios (including Windows as Guest OS, Hyper-V as hypervisor, Oracle as database, and so on) but we are still checking the validity of those benchmarks. In other words, we are well aware that this report cannot give you a complete picture; it's only an initial rough draft.
Here are the limitations of our current virtualization testing:
- Out of all the databases, MySQL has shown the best performance on the AMD platform relative to the Intel platform. This is probably a result of the excellent Opteron and Athlon 64 optimizations in the gcc compiler.
- We use a 64-bit version of MySQL, and the Intel architectures pay a small penalty when you run a 64-bit database (no macro-op fusion for example). However, as the 64-bit MySQL performs quite a bit better than the 32-bit one, we feel we made the right decision.
- Our best Opteron is a 95W Opteron 8356, while we used a 130W Xeon X7460 and a 130W Xeon X7350. This is simply a result of what we have had available in the labs in the past months. This problem is easy to solve: the performance of the Opteron 8360SE (125W) will be between 1% and 8% higher, so for those looking at the Opteron 8360SE it is pretty easy to get an idea what this CPU could do.
- No HPC benchmarking, as we wanted to focus our efforts and time on our first virtualization results. Priorities…
Please keep these limitations in mind.
Conclusion
The third party benchmark numbers are unanimous: servers based on Intel's monster hex-core processor are the best choice for high-end database/ERP applications. Compared to the previous Xeons, performance has increased by 40% or more while power consumption has dropped. The 6-core Xeon is the clear winner and offers a very nice upgrade path for owners of current Xeon 73xx servers. We even dare to predict that the newest Nehalem based Xeons will not really enter this market before the octal-core Beckton is launched in the second half of 2009.
When it comes to the virtualization market, which is a much larger market (in shipments), it is a very different picture. Where the 6-core CPU extends an existing lead elsewhere, for virtualization the new 45nm Xeon MP comes just in time. The quad-core Opteron has been giving the Xeon 73xx a serious beating, offering up to 24% better performance while using 20-25% less power (X7350 versus 8356). If you prefer to look at CPUs with approximately the same TDP, Opteron was offering about a third more performance while consuming a few Watts less. The hot and power hungry FB-DIMMs do not help in a market where performance/Watt and more memory (higher consolidation ratios) rule, and the Opteron clearly has better virtualization support.
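As a rough illustration of what the quoted X7350 versus 8356 figures imply for performance/Watt, here is a minimal Python sketch. It assumes the 24% performance lead and the 20-25% power saving apply to the same workload, which the text does not state, so the result is best read as an upper bound. The function name is made up for this example.

```python
def relative_perf_per_watt(perf_gain, power_saving):
    """Return the perf/Watt of system A relative to system B.

    perf_gain:    fractional performance lead of A over B (0.24 = 24% faster)
    power_saving: fractional power reduction of A versus B (0.20 = 20% less)
    """
    return (1.0 + perf_gain) / (1.0 - power_saving)


# Best-case figures quoted in the text for Opteron 8356 versus Xeon X7350.
# ASSUMPTION: both figures hold for the same workload.
low = relative_perf_per_watt(0.24, 0.20)
high = relative_perf_per_watt(0.24, 0.25)
print(f"Opteron perf/Watt advantage: {low:.2f}x to {high:.2f}x")
```

Under that assumption the Opteron's performance/Watt lead comes out at roughly one and a half times that of the Xeon 73xx, which is consistent with the "serious beating" described above.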
The new 45nm Xeon X7460 brings the virtualized performance/Watt crown back to the Intel camp, and we expect the E7450 (2.4GHz) to offer an even better performance/Watt ratio. After all, the E7450 also has six cores but at a lower TDP. In the very near term, AMD will probably have no other choice than to lower the price of its fastest quad-cores. Nevertheless, the battle for the virtualization market is still not over, as both AMD and Intel have new quad-cores lined up.
Quite a few people gave us assistance with this project, and as always we would like to thank them. Our thanks go to Sanjay Sharma, Trevor Lawless, Kristof Sehmke, Matty Bakkeren, Damon Muzny, Brent Kerby, Michael Kalodrich and Angela Rosario. A very special thanks to Kaushik Banerjee, who pointed out errors in our virtualization benchmarking procedure, and Tijl Deneut, who helped me solve the weirdest problems despite the numerous setbacks we encountered in this project.
34 Comments
npp - Tuesday, September 23, 2008 - link
I didn't get this one very clearly: why should a bigger cache reduce cache syncing traffic? With a bigger cache, you would have the potential risk of one CPU invalidating a larger portion of the data another CPU already has in its own cache, hence there would be more data to move between the sockets in the end. If we exaggerate this, every CPU having a copy of the whole main memory in its own cache would obviously lead to an enormous syncing effort, not the opposite.
I'm not familiar with the cache coherence protocol used by Intel on that platform, but even in the positive scenario of a CPU having data for read-only access in its own cache, a request from another CPU for the same data (the chance of this being bigger given the large cache size) may again lead to increased inter-socket communication, since this data won't be fetched from main memory again.
In all cases, inter-socket communication should be much cheaper than the cost of a main memory access, and it shifts the balance in the right direction: avoiding main memory as long as possible. And now it's clear why Dunnington is a six- rather than an eight-core: more cores and less cache would yield a shift in the entirely opposite direction, which isn't what Intel needs until QPI arrives.
narlzac85 - Wednesday, September 24, 2008 - link
In the best case scenario (I hope the system is smart enough to do it this way), with each VM having 4 CPU cores, they can keep all their threads on one physical die. This means that all 4 cores are working on the same VM/data and should need minimal access to data that another die has changed (hypervisor/host OS processes jumping around from core to core would be about it). The inter-socket cache coherency traffic will go down (in the older quad-cores, since the 2 physical dual-cores have to communicate over the FSB, it might as well have been the same as an 8-socket system populated by dual-cores).
Nyceis - Tuesday, September 23, 2008 - link
Can we post here now? :)
JohanAnandtech - Wednesday, September 24, 2008 - link
Indeed. The IT forums gave trouble quite a few times, and we assume quite a few people do not comment in the IT forums because they have to register again. I am still searching for a good solution, as these "comment boxes" get messy really quickly.
Nyceis - Tuesday, September 23, 2008 - link
PS - Awesome article - makes me want hex-cores rather than quads in my Xen Servers :)
Nyceis - Tuesday, September 23, 2008 - link
Looks like it :)
erikejw - Tuesday, September 23, 2008 - link
Great article as always.
However, the performance / watt comparison is quite useless for virtualization systems, since they scale well at a multisystem level, and for other reasons too.
It won't hurt to make them, but what users really care about is performance / dollar (over a lifetime).
Say the system will be in use for 3 years.
That makes the total power bill for a 600W system about $2000, less than the cost of one Dunnington, and since the price difference between the Opteron and Dunnington CPUs is like $4800, you gotta be pretty ignorant to choose a system on performance / watt alone.
Let's say the AMD system costs $10000 and the Intel $14800 (will be more due to DIMM differences) and they have a 3 year life; then the total cost for the systems and power will be $12000 and $16800.
That leaves us with a real basecost/transaction ratio of
Intel 5.09 : 4.25 AMD
AMD is hence 20% more cost effective than Intel in this case.
Any knowledgeable buyer has to look at the whole picture and not at just one cost factor.
I hope that you include this in your other virtualization articles.
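The arithmetic in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the electricity price is not given in the comment and is chosen here so that the bill lands near the quoted $2000, and the throughput scores behind the 5.09 and 4.25 ratios are not stated either, so only their quotient is used. The function and constant names are made up for this example.

```python
HOURS_PER_YEAR = 24 * 365


def power_bill(watts, years, price_per_kwh):
    """Electricity cost of a machine drawing `watts` around the clock."""
    kwh = watts / 1000.0 * HOURS_PER_YEAR * years
    return kwh * price_per_kwh


# ASSUMPTION: price per kWh in dollars, not taken from the comment.
PRICE_PER_KWH = 0.127

bill = power_bill(600, 3, PRICE_PER_KWH)

# System prices from the comment, plus the power bill rounded to $2000.
amd_total = 10000 + 2000
intel_total = 14800 + 2000

# Quoted cost/transaction ratios: Intel 5.09, AMD 4.25 (lower is better).
amd_advantage = 5.09 / 4.25 - 1.0

print(f"3-year power bill: ${bill:.0f}")
print(f"total cost: AMD ${amd_total}, Intel ${intel_total}")
print(f"AMD cost advantage: {amd_advantage:.0%}")
```

Changing `years` to 5, or plugging in a local electricity price, shows how sensitive the comparison is to the power bill relative to the purchase price.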
JohanAnandtech - Wednesday, September 24, 2008 - link
You are right, the best way to do this is to work with TCO. We have done that in our Sun Fire X4450 article. And the feedback I got was to calculate over 5 years, because that was more realistic.
But for the rest I fully agree with you. Will do asap. How did you calculate the power bill?
erikejw - Wednesday, September 24, 2008 - link
Sounds good, will be interesting.
The calculation was just a quick and dirty 600W 24/7 for 3 years, using current power prices.
VM servers are supposed to run like that.
It would also be interesting to see how the Dunnington responds when using more virtual cores than physical. Will the decline be less than the older Xeons?
What is a typical (core)load when it comes to this?
The Nehalems will respond more like the Athlons in this regard and not lose as much when the load increases, at a higher level than AMD though.
I realised the other day that it seems as if AMD has built a server CPU that they take the best of and bring to the desktop market, and Intel has done it the other way around.
The Nehalem architecture seems more "serverlike" but will make a bang on the desktop side too.
kingmouf - Thursday, September 25, 2008 - link
I think this is because they have (or should I say had) a different CPU that they wanted to cover that space, the Itanium. But now they are fully concentrated on x86, so...