Objectives
This analysis compares the SPECfp95 performance of single vs. dual CPU Octane configurations. At present, this requires two analyses: 1x195MHz vs. 2x195MHz and 1x250MHz vs. 2x250MHz.
SPECint95 is not covered because the MIPSpro Auto Parallelizing Option does not appear to be relevant for running integer tasks on multi-CPU systems, as explained on my Octane Dual-CPU comparison page.
This analysis deals only with SPECfp95 because reliable data for other applications, such as LSDyna and Performer examples, is hard to obtain.
Note that I do not have any SPECfp95 data for R10K/175MHz. Since many systems will be using this CPU, please contact me if you have any relevant detailed data.
1x195MHz vs. 2x195MHz
All source data for this analysis came from www.specbench.org.
Given below is a comparison table of available single vs. dual R10000 195MHz SPECfp95 test results for Octane. Faster configurations are leftmost in the table (in the Inventor graph, they're placed at the back). After the table and 3D graphs is a short-cut index to the original results pages for the various systems.
            R10000      R10000      % Increase
            2x195MHz    1x195MHz    (1x195 -> 2x195)

tomcatv     45.7        25.3        80.6%
swim        58.9        40.6        45.1%
su2cor      17.0         9.64       76.4%
hydro2d     14.4         9.97       44.4%
mgrid       28.4        15.9        78.6%
applu       17.3        11.2        54.5%
turb3d      13.4        13.8        -2.9%
apsi        12.7        12.8        -0.8%
fpppp       29.8        29.7         0.3%
wave5       21.7        22.4        -3.1%
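The '% Increase' column follows from the two score columns as (dual / single - 1) x 100. As a minimal sketch (the names scores_195 and pct_increase are mine, not from the SPEC reports), the figures can be rechecked as follows. Because it works from the rounded scores printed in the table, a recomputed value may differ from the published column in the last digit.

```python
# SPECfp95 per-test ratios copied from the table above: (2x195MHz, 1x195MHz).
scores_195 = {
    "tomcatv": (45.7, 25.3),
    "swim":    (58.9, 40.6),
    "su2cor":  (17.0, 9.64),
    "hydro2d": (14.4, 9.97),
    "mgrid":   (28.4, 15.9),
    "applu":   (17.3, 11.2),
    "turb3d":  (13.4, 13.8),
    "apsi":    (12.7, 12.8),
    "fpppp":   (29.8, 29.7),
    "wave5":   (21.7, 22.4),
}


def pct_increase(single, dual):
    """Percentage change going from the single-CPU to the dual-CPU score."""
    return (dual / single - 1.0) * 100.0


# Print one row per test: dual score, single score, recomputed % increase.
for test, (dual, single) in scores_195.items():
    print(f"{test:8s} {dual:6.2f} {single:6.2f} {pct_increase(single, dual):+7.1f}%")
```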
Next, a separate comparison graph for each of the ten SPECfp95 tests: tomcatv, swim, su2cor, hydro2d, mgrid, applu, turb3d, apsi, fpppp and wave5.
Observations
Before I comment on the above, I strongly suggest you first read my page comparing dual-CPU Octane systems which use different R10000s (eg. 2x195 vs. 2x250).
turb3d, apsi, fpppp and wave5 are essentially unaffected (all within about 3% of the single-CPU result), while the other six tests gain significantly, from roughly 44% to 81%. This doesn't mean that the non-accelerated tests cannot be parallelised; it merely means that SGI's compilers don't currently parallelise them (whether that is because the tests genuinely cannot be accelerated is a separate issue). I say this because results from other vendors show that different vendors' autoparallelising compiler options behave in very different ways. Note: if you're wondering why some tests appear to slow down slightly, my view is that the autoparallelising option interferes slightly with the optimisations which would normally be applied to those tests on a single-CPU system. In any case, the differences are well within standard margins of error.
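One rough way to read these gains is through Amdahl's law, which gives the speedup on n CPUs as S = 1 / ((1 - p) + p/n), where p is the fraction of run time that runs in parallel. Inverting it for the observed speedup gives p = (1 - 1/S) x n / (n - 1). The sketch below is my own illustration, not something stated in the SPEC reports; the function name is hypothetical, and the model is idealised (no parallelisation overhead, no memory contention), so the results are indicative only.

```python
def implied_parallel_fraction(single, dual, cpus=2):
    """Invert Amdahl's law: the fraction of run time that would have to run
    in parallel to explain the observed speedup on `cpus` processors.

    Assumes an idealised model with no overhead; a negative result simply
    means the dual-CPU run was slower, so the model does not apply.
    """
    speedup = dual / single
    return (1.0 - 1.0 / speedup) * cpus / (cpus - 1)


# Scores taken from the 1x195MHz and 2x195MHz SPECfp95 results on this page.
for test, single, dual in [("tomcatv", 25.3, 45.7),
                           ("swim", 40.6, 58.9),
                           ("turb3d", 13.8, 13.4)]:
    p = implied_parallel_fraction(single, dual)
    print(f"{test:8s} implied parallel fraction: {p:+.2f}")
```

A test that doubled in speed would give 1.0 and a test that did not change would give 0.0, which is one way of seeing how far even the best gains here fall short of ideal scaling.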
Given the way the above results are combined into final averages, I could write at length about the obvious statistical insanity of using such final averages to perform system comparisons. However, as implied above, the dual-CPU comparison page contains everything I would wish to say on the matter. Please read the observations on that page closely.
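To make the point about averages concrete: SPEC defines the SPECfp95 composite as the geometric mean of the ten per-test ratios. The sketch below (variable and function names are mine) computes that mean for the 1x195MHz and 2x195MHz results on this page and sets the single composite gain beside the spread of per-test gains it conceals. Since it uses the rounded per-test values, the composites may differ slightly from the officially reported figures.

```python
import math

# (2x195MHz, 1x195MHz) SPECfp95 per-test ratios from the 195MHz results on this page.
scores_195 = {
    "tomcatv": (45.7, 25.3), "swim":  (58.9, 40.6),
    "su2cor":  (17.0, 9.64), "hydro2d": (14.4, 9.97),
    "mgrid":   (28.4, 15.9), "applu": (17.3, 11.2),
    "turb3d":  (13.4, 13.8), "apsi":  (12.7, 12.8),
    "fpppp":   (29.8, 29.7), "wave5": (21.7, 22.4),
}


def geometric_mean(values):
    """Geometric mean, computed via logarithms for numerical stability."""
    values = list(values)
    return math.exp(math.fsum(math.log(v) for v in values) / len(values))


dual_composite = geometric_mean(d for d, _ in scores_195.values())
single_composite = geometric_mean(s for _, s in scores_195.values())
composite_gain = (dual_composite / single_composite - 1.0) * 100.0

# Per-test gains, for comparison with the single composite figure.
per_test_gain = {t: (d / s - 1.0) * 100.0 for t, (d, s) in scores_195.items()}

print(f"composite: {single_composite:.1f} -> {dual_composite:.1f} "
      f"({composite_gain:+.1f}%)")
print(f"per-test gains range from {min(per_test_gain.values()):+.1f}% "
      f"to {max(per_test_gain.values()):+.1f}%")
```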
On the general topic of multi-CPU processing, it is important to remember that just because a particular SPECfp95 test is accelerated by the autoparallelising option, this doesn't mean your particular task will also be accelerated, even if your code type is (as far as you can tell) similar to said SPECfp95 test.
Identifying which SPECfp95 test might be like your own is hard enough already, but when dealing with parallel code possibilities, one must absolutely have proper real tests performed on any target system. For decision makers, this means asking for a test to be carried out on an example 'upgraded' system before deciding whether or not to upgrade.
Obviously, the same applies to any task, not just SPECfp95-style work. One complaint I have about dual-CPU Wintel systems is that they're often spoken of by people as if they're the jewel in the crown because they can be so cheap, but few people ever bother to question whether their task can actually take advantage of both CPUs at the same time - many cannot.
Of course, a dual-CPU system isn't always about parallelising code or tasks. Some Octane owners will intend to use a dual-CPU system as a two-seat system, with two users, two keyboards and two monitors. For such people, what matters more is confidence that any upgrade will still allow each user to take full advantage of his/her CPU without their tasks being interfered with by whatever the other user is doing. Thankfully, because of the way Octane works (e.g. HEART/CPU link speed tied to CPU speed), I'm pretty sure that one can always have such confidence. Even so, I'd always ask for tests to be carried out.
Next there follows a comparison involving R10000 250MHz.
1x250MHz vs. 2x250MHz
All source data for this analysis came from www.specbench.org.
Given below is a comparison table of available single vs. dual R10000 250MHz SPECfp95 test results for Octane. Faster configurations are leftmost in the table (in the Inventor graph, they're placed at the back). After the table and 3D graphs is a short-cut index to the original results pages for the various systems.
            R10000      R10000      % Increase
            2x250MHz    1x250MHz    (1x250 -> 2x250)

tomcatv     49.1        29.4        67.0%
swim        63.6        46.3        37.4%
su2cor      20.0        11.2        78.6%
hydro2d     15.6        11.4        36.8%
mgrid       32.2        18.5        74.1%
applu       19.9        13.2        50.8%
turb3d      16.4        16.9        -3.0%
apsi        15.7        16.0        -1.9%
fpppp       37.7        37.1         1.6%
wave5       29.3        27.4         6.9%
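Setting the two '% Increase' columns side by side shows how the dual-CPU gain shifts with clock speed. This is a small sketch using only the percentages published in the 195MHz and 250MHz tables on this page; the names are mine, and it makes no claim about why the gains differ.

```python
# Published dual-CPU gains in percent: (1x195 -> 2x195, 1x250 -> 2x250).
dual_gain = {
    "tomcatv": (80.6, 67.0),
    "swim":    (45.1, 37.4),
    "su2cor":  (76.4, 78.6),
    "hydro2d": (44.4, 36.8),
    "mgrid":   (78.6, 74.1),
    "applu":   (54.5, 50.8),
    "turb3d":  (-2.9, -3.0),
    "apsi":    (-0.8, -1.9),
    "fpppp":   (0.3, 1.6),
    "wave5":   (-3.1, 6.9),
}

# Change in the dual-CPU gain, in percentage points, going from 195MHz to 250MHz.
delta = {test: g250 - g195 for test, (g195, g250) in dual_gain.items()}

for test, (g195, g250) in dual_gain.items():
    print(f"{test:8s} {g195:+6.1f}% {g250:+6.1f}% {delta[test]:+6.1f} points")
```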
Next, a separate comparison graph for each of the ten SPECfp95 tests: tomcatv, swim, su2cor, hydro2d, mgrid, applu, turb3d, apsi, fpppp and wave5.
Observations
All of the important points have already been stated above and elsewhere (examine my other SPEC95 analysis pages).