Octane SPECfp95 Single vs. Dual R10000
CPU Performance Comparison Using Identical R10000s

Last Change: 28/May/1998

SPEC's Introduction to SPEC95
1x195MHz vs. 2x195MHz
1x250MHz vs. 2x250MHz

Note: the 2D bar graphs shown here for the various SPECfp95 tests have been drawn to the same scale. They are also at the same scale as other 2D bar graphs for dual-CPU systems, but they are not to the same scale as any 2D bar graph for a single-CPU system, or 4-CPU system, etc.

Objectives

This analysis compares the SPECfp95 performance of single vs. dual CPU Octane configurations. At present, this requires two analyses: 1x195MHz vs. 2x195MHz and 1x250MHz vs. 2x250MHz.

SPECint95 is not covered because the MIPSpro Auto Parallelizing Option does not appear to be relevant for running integer tasks on multi-CPU systems, as explained on my Octane Dual-CPU comparison page.

This analysis deals only with SPECfp95 because reliable data for other applications, such as LSDyna and Performer examples, is hard to obtain.

Note that I do not have any SPECfp95 data for R10K/175MHz. Since many systems will be using this CPU, please contact me if you have any relevant detailed data.

1x195MHz vs. 2x195MHz

As usual, a 3D Inventor model of the data is available (screenshots of this are included below). Load the file into SceneViewer or ivview, switch into Orthographic mode (ie. no perspective), then rotate the object horizontally and vertically to compare the results from different angles.

All source data for this analysis came from www.specbench.org.

Given below is a comparison table of available single vs. dual R10000 195MHz SPECfp95 test results for Octane. Faster configurations are leftmost in the table (in the Inventor graph, they're placed at the back). After the table and 3D graphs is a short-cut index to the original results pages for the various systems.

          R10000    R10000        % Increase
         2x195MHz  1x195MHz   (1x195 -> 2x195)

tomcatv    45.7      25.3            80.6%
swim       58.9      40.6            45.1%
su2cor     17.0      9.64            76.3%
hydro2d    14.4      9.97            44.4%
mgrid      28.4      15.9            78.6%
applu      17.3      11.2            54.5%
turb3d     13.4      13.8            -2.9%
apsi       12.7      12.8            -0.8%
fpppp      29.8      29.7            0.3%
wave5      21.7      22.4            -3.1%

(click on the images above to download larger versions of the views shown)

[Test Suite Description | 2x195MHz | 1x195MHz ]
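The % Increase column in the table above is simply the dual-CPU result divided by the single-CPU result, minus one, expressed as a percentage. The following is a minimal sketch of that arithmetic so readers can repeat it with their own figures; the names RESULTS_195, percent_increase and gains are my own labels for illustration and do not come from SPEC. Note that a value falling exactly on a rounding boundary (wave5 works out to -3.125%) may round either way in floating point.

```python
# SPECfp95 per-test results copied from the table above,
# stored as (2x195MHz, 1x195MHz). The dictionary name is illustrative only.
RESULTS_195 = {
    "tomcatv": (45.7, 25.3),
    "swim":    (58.9, 40.6),
    "su2cor":  (17.0, 9.64),
    "hydro2d": (14.4, 9.97),
    "mgrid":   (28.4, 15.9),
    "applu":   (17.3, 11.2),
    "turb3d":  (13.4, 13.8),
    "apsi":    (12.7, 12.8),
    "fpppp":   (29.8, 29.7),
    "wave5":   (21.7, 22.4),
}


def percent_increase(dual, single):
    """Percentage change going from the single-CPU to the dual-CPU result."""
    return (dual / single - 1.0) * 100.0


def gains(results):
    """Map each test name to its percentage change, rounded to one decimal."""
    return {
        name: round(percent_increase(dual, single), 1)
        for name, (dual, single) in results.items()
    }


if __name__ == "__main__":
    for name, pct in gains(RESULTS_195).items():
        print(f"{name:8s} {pct:+6.1f}%")
```

A negative figure means the dual-CPU result was lower than the single-CPU one for that test.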

Next, a separate comparison graph for each of the ten SPECfp95 tests:

[Comparison graphs, one per test: tomcatv, swim, su2cor, hydro2d, mgrid, applu, turb3d, apsi, fpppp, wave5]

Observations

Before I comment on the above, I strongly suggest you first read my page comparing dual-CPU Octane systems which use different R10000s (eg. 2x195 vs. 2x250).

turb3d, apsi, fpppp and wave5 are essentially unaffected, while the other six tests gain significantly (roughly 44% to 81%). This doesn't mean that the unaffected tests cannot be parallelised; it merely means that SGI's compilers don't currently parallelise them (whether the tests genuinely cannot be accelerated is a separate issue). I say this because results from other vendors show that different vendors' autoparallelising compiler options behave in very different ways.

Note: if you're wondering why some tests appear to slow down slightly, my guess is that the autoparallelising option interferes slightly with the optimisations which would normally be applied to those tests on a single-CPU system. Besides, the differences are well within standard margins of error anyway.

Given the way the above results are combined into final averages, I could write at length about the obvious statistical insanity of using such final averages to perform system comparisons. However, as implied above, the dual-CPU comparison page contains everything I would wish to say on the matter. Please read the observations on that page closely.
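For reference, the overall SPECfp95 figure is the geometric mean of the ten per-test results. The sketch below, using the 195MHz numbers from the table earlier on this page, shows how a single composite sits on top of a wide spread of per-test speedups. The function and variable names are my own, and the composites it computes are recalculated from the rounded per-test values, so they may differ slightly from the officially published figures.

```python
import math

# Per-test SPECfp95 results from the 195MHz table, in table order:
# tomcatv, swim, su2cor, hydro2d, mgrid, applu, turb3d, apsi, fpppp, wave5
DUAL_195 = [45.7, 58.9, 17.0, 14.4, 28.4, 17.3, 13.4, 12.7, 29.8, 21.7]
SINGLE_195 = [25.3, 40.6, 9.64, 9.97, 15.9, 11.2, 13.8, 12.8, 29.7, 22.4]


def geometric_mean(values):
    """Geometric mean, computed via logs to avoid a large intermediate product."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def summary(dual, single):
    """Composite figures plus the range of per-test speedups they conceal."""
    per_test = [d / s for d, s in zip(dual, single)]
    return {
        "composite_single": geometric_mean(single),
        "composite_dual": geometric_mean(dual),
        "min_speedup": min(per_test),
        "max_speedup": max(per_test),
    }


if __name__ == "__main__":
    for key, value in summary(DUAL_195, SINGLE_195).items():
        print(f"{key:17s} {value:6.2f}")
```

The composite rises even though four of the ten tests show no gain at all, which is exactly why a single average says little about any one type of workload.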

On the general topic of multi-CPU processing, it is important to remember that just because a particular SPECfp95 test is accelerated by the autoparallelising option, this doesn't mean your particular task will also be accelerated, even if your code type is (as far as you can tell) similar to said SPECfp95 test.

Identifying which SPECfp95 test might be like your own is hard enough already, but when dealing with parallel code possibilities, one must absolutely have proper real tests performed on any target system. For decision makers, this means asking for a test to be carried out on an example 'upgraded' system before deciding whether or not to upgrade.

Obviously, the same applies to any task, not just SPECfp95-style work. One complaint I have about dual-CPU Wintel systems is that they're often spoken of by people as if they're the jewel in the crown because they can be so cheap, but few people ever bother to question whether their task can actually take advantage of both CPUs at the same time - many cannot.
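One rough way to frame the question of whether a task can use both CPUs is Amdahl's law, which relates the fraction of a task that runs in parallel to the best speedup one can expect. This is a textbook model and not something the SPEC results report; the sketch below is purely illustrative, the function names are my own, and a fraction inferred from a measured speedup also absorbs memory and synchronisation overheads, so treat it as a loose estimate only.

```python
def amdahl_speedup(parallel_fraction, cpus):
    """Ideal speedup for a task whose given fraction runs perfectly in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cpus)


def implied_parallel_fraction(speedup, cpus=2):
    """Invert Amdahl's law: p = (1 - 1/S) / (1 - 1/N)."""
    if cpus < 2:
        raise ValueError("need at least 2 CPUs to infer a parallel fraction")
    if speedup <= 0.0:
        raise ValueError("speedup must be positive")
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cpus)


if __name__ == "__main__":
    # A task that is only half parallel gains far less than 2x on two CPUs.
    print(f"50% parallel on 2 CPUs: {amdahl_speedup(0.5, 2):.2f}x")
    # Speedup observed for tomcatv at 195MHz in the table above (45.7 / 25.3).
    print(f"tomcatv implied fraction: {implied_parallel_fraction(45.7 / 25.3):.2f}")
```

A task with little or no parallel portion gains almost nothing from the second CPU under this model, however cheap that CPU was.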

Of course, dual-CPU systems aren't always about parallelising code or tasks. Some Octane owners will intend to use a dual-CPU system as a two-seat system, with two users, two keyboards and two monitors. For such people, what matters more is confidence that any upgrade will still allow each user to take full advantage of his/her CPU without their tasks being interfered with by whatever the other user is doing. Thankfully, because of the way Octane works (eg. HEART/CPU link speed tied to CPU speed), I'm pretty sure that one can always have such confidence. Even so, I'd always ask for tests to be carried out.

Next there follows a comparison involving R10000 250MHz.

1x250MHz vs. 2x250MHz

A 3D Inventor model of the data is available (screenshots of this are included below). Load the file into SceneViewer or ivview, switch into Orthographic mode (ie. no perspective), then rotate the object horizontally and vertically to compare the results from different angles.

All source data for this analysis came from www.specbench.org.

Given below is a comparison table of available single vs. dual R10000 250MHz SPECfp95 test results for Octane. Faster configurations are leftmost in the table (in the Inventor graph, they're placed at the back). After the table and 3D graphs is a short-cut index to the original results pages for the various systems.

          R10000    R10000       % Increase
         2x250MHz  1x250MHz   (1x250 -> 2x250)

tomcatv    49.1      29.4           67.0%
swim       63.6      46.3           37.4%
su2cor     20.0      11.2           78.6%
hydro2d    15.6      11.4           36.8%
mgrid      32.2      18.5           74.1%
applu      19.9      13.2           50.8%
turb3d     16.4      16.9           -3.0%
apsi       15.7      16.0           -1.9%
fpppp      37.7      37.1           1.6%
wave5      29.3      27.4           6.9%