Software Rendering
Some of you might remember the "Kribi" engine, an ultra-powerful real-time software rendering 3D engine. Investing time in a software 3D engine may seem like madness now that the GeForce 8800 has 128 small FPUs running at 1.35GHz, but software rendering is far from dead. The new Intel Core architecture can execute up to four 64-bit FP instructions per clock cycle (three sustained), and quad cores at 2.4GHz are now cheap. That is a lot of FP power too, provided you optimize carefully for it, and that is exactly what the people at zVisuel in Lausanne, Switzerland have been doing. Software rendering has quite a few advantages. For example, it runs on every PC and the end result looks the same on every PC, although speed depends on the hardware. That is a big advantage for companies where many people use laptops.
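As a rough back-of-the-envelope check, the theoretical peaks quoted above can be multiplied out. This is only a sketch: the helper name `peak_gops` is ours, and counting one operation per clock for each GeForce 8800 FPU is a simplifying assumption, not a figure from the text. The GPU's FPUs also work at lower precision than 64-bit, so the two totals are not directly comparable.

```python
def peak_gops(units, clock_ghz, ops_per_clock):
    """Theoretical peak, in billions of FP operations per second."""
    return units * clock_ghz * ops_per_clock

# Quad-core Intel Core CPU at 2.4GHz, figures from the text:
# up to 4 FP operations per clock per core, 3 sustained.
cpu_peak = peak_gops(units=4, clock_ghz=2.4, ops_per_clock=4)
cpu_sustained = peak_gops(units=4, clock_ghz=2.4, ops_per_clock=3)

# GeForce 8800: 128 FPUs at 1.35GHz.
# ASSUMPTION: one operation per FPU per clock (not stated in the text).
gpu_peak = peak_gops(units=128, clock_ghz=1.35, ops_per_clock=1)

print(f"CPU peak:      {cpu_peak:.1f} Gops/s")
print(f"CPU sustained: {cpu_sustained:.1f} Gops/s")
print(f"GPU peak:      {gpu_peak:.1f} Gops/s")
```

Under these assumptions the GPU still has several times the raw throughput, which is why the CPU figure only becomes interesting with careful optimization.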
If you are still not convinced that real-time software rendering can offer great results, take a look at this movie.
An example of what the zVisuel Kribi 3D Engine can create
Eric Bron provided us with a benchmark based on real-world use by zVisuel's clients. The first benchmark does not use antialiasing.
As we explained here, the new Core architecture has, in theory, twice the SSE2 power of the Athlon X2. Very carefully optimized SSE2 applications such as zVisuel's 3D engine show that this translates into a 70% IPC advantage in practice. This shows very nicely why AMD needs the new K10 family: in this case the Athlon 64 architecture is starting to show its age.
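A per-clock (IPC-style) advantage like the 70% quoted above comes from dividing each benchmark score by the clock speed before comparing. The sketch below shows only that arithmetic. The function name and the sample scores are hypothetical placeholders chosen to give a round result; they are not measurements from this review.

```python
def per_clock_advantage(score_a, ghz_a, score_b, ghz_b):
    """Relative advantage of A over B after normalizing by clock speed.

    Scores are 'higher is better' (e.g. frames per second).
    Returns 0.70 for a 70% per-clock advantage.
    """
    return (score_a / ghz_a) / (score_b / ghz_b) - 1.0

# HYPOTHETICAL numbers, for illustration only:
# CPU A scores 51 at 3.0GHz, CPU B scores 24 at 2.4GHz.
advantage = per_clock_advantage(51.0, 3.0, 24.0, 2.4)
print(f"per-clock advantage: {advantage:.0%}")
```

Normalizing this way lets CPUs running at different clock speeds be compared on architecture alone.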
We ran the same benchmark again, this time with antialiasing enabled.
AA clearly makes the application more memory intensive. Two quad-core Xeons are only 38% faster than one CPU, whereas they were 50% faster in the previous benchmark. This helps the Opteron narrow the gap a little: the 3GHz Xeon is 48% faster clock for clock, instead of 70%.
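To put these scaling numbers in perspective, the speedup from adding the second CPU can be expressed as a fraction of ideal (2x) scaling. A minimal sketch: `scaling_efficiency` is our own helper name, and the inputs are simply the 50% and 38% figures quoted above.

```python
def scaling_efficiency(speedup, n_cpus):
    """Fraction of ideal linear scaling achieved with n_cpus CPUs."""
    return speedup / n_cpus

# Two quad-core Xeons versus one, figures from the text.
no_aa = scaling_efficiency(1.50, 2)    # 50% faster without antialiasing
with_aa = scaling_efficiency(1.38, 2)  # 38% faster with antialiasing

print(f"without AA: {no_aa:.0%} of ideal scaling")
print(f"with AA:    {with_aa:.0%} of ideal scaling")
```

The drop from 75% to 69% of ideal scaling is consistent with the second CPU spending more of its time waiting on memory when antialiasing is enabled.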
30 Comments
piroroadkill - Tuesday, August 7, 2007 - link
it is a car analogy
Gul Westfale - Monday, August 6, 2007 - link
good analogy there, except that mustangs (and various other cars) use pickup truck engines for cost reasons. large trucks use larger engines (often diesels) because they offer considerably more torque at much lower RPM than a smaller gasoline engine, and thus provide more pulling power.
Gul Westfale - Monday, August 6, 2007 - link
these are not regular consumer cpus, but intended for use in commercial servers and workstations. they and their motherboards cost more because they support features such as multiple sockets (so in addition to having multiple cores on one chip you can also have multiple chips on one motherboard).
yyrkoon - Monday, August 6, 2007 - link
they win 1 of 2 tests, and it is clear they are the winner ? Why ? Because they won the software rendering also ? Anyone interested enough in rendering, and HAVING to have this sort of hardware for it is NOT going to bother with software . . .
This means your conclusion on this point is incorrect, and in which case, it boils down to which application the rendering machine is going to do.
Man you guys come to the weirdest conclusions based on your own data, and I am not even the first to notice/mention this sort of thing . . .
JohanAnandtech - Monday, August 6, 2007 - link
The Quadcore wins all high resolution rendering tests. Where do you see the DC Opterons win against the Quadcore Intel in high resolution rendering? Show me a rendering engine where a 3 GHz K8 DC core is faster in high resolution rendering than a 2.33 GHz Quadcore. All decent rendering engines used in the real world will show more or less the same picture.
In fact, the "rendering performance" situation will get worse for the K8 as SSE-2 tuning becomes more common. All Intel CPUs since Core and all AMD CPUs since Barcelona will show (or are already showing) a high performance boost from using better SSE-2 code.
yyrkoon - Monday, August 6, 2007 - link
Ok, I see now with the graphs 'lower is better' on 3ds max, I missed that with the tables, which is actually what I meant this morning 'table obfuscation'. I personally do not mind tables, but when the data is not in a uniform spot, it confuses/makes it harder to read at a glance.
Anyhow, I was tired when I posted this morning, cranky, and was overly harsh I think. However it *is* much easier for me personally to read the graphs at a glance (I cannot speak for everyone though).
yyrkoon - Monday, August 6, 2007 - link
Oh, and while on the subject, you guys here at anandtech have lately mastered the art of graph obfuscation. Is it really THAT hard leaving items in the same rows / columns for different tests ? Are we trying to confuse the results, or is there some other reason this happens, and has gone completely over my head ?
JohanAnandtech - Monday, August 6, 2007 - link
The only reason is that until very recently I hadn't mastered the graphing engine. I got some weird error messages and gave up. But I have found the error, and you should see some nice graphs which don't obfuscate...
Spoelie - Monday, August 6, 2007 - link
the gif on page 2 is non-looping, so after a very quick jump from 1ghz -> 2.8ghz (why??) -> 3.2ghz , it stays put on the 3.2ghz image. If reading the article, by the time the reader sees the image, it's already 5 minutes on the last image and staying there, making it for all intents and purposes a static image instead of an animated one:)
JohanAnandtech - Monday, August 6, 2007 - link
Thanks, fixed that. The reason to show 2.8 GHz is that, for example, SPECjbb and other applications sometimes don't completely stress the CPU, and then the CPU dynamically drops back to 2.8 GHz. Those are simply the 3 stages I saw the most, and found the most interesting to show.