Results of Testing

Twenty-nine machines have had some or all of the tests run on them. For many machines no formal statement of their architecture has been given. Of those machines for which an architecture was known, only the Hitachi machine was more relaxed than was allowed by its architecture.

The amount of information associated with the following results varies widely. In some cases I personally made the test runs and so am confident of their validity. In many cases I relied on others to make the runs; their responses varied from an email message ("Passed all tests.") to a copy of the program output, but often with limited identification of the hardware system. These results should be regarded as tentative, until full, formal testing can be done.

The machines tested have fallen into one of three more or less clearly defined groups. Some machines were sequentially consistent (SC). This corresponds to an architecture of A(CMP,UPO,PO,CC1). Others obeyed A(CMP,UPO,RR,WW,RW,CC1), that is, they relaxed sequential consistency only in allowing reads to occur before logically preceding writes. Call this the POK architecture, since the IBM mainframes designed in Poughkeepsie were the first to follow this standard. Machines in the third group are called processor consistent (PCon) [good89]. They relax both the WR and the CC1 rules. Their architecture is A(CMP,UPO,RR,WW,RW,CC3). (More details on the testing can be found in the Appendix.) Given the degree of theoretical interest in relaxed architectures, it is surprising how few relaxations are found on actual machines.
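The relationship between the three architectures can be sketched as set arithmetic over the rule names. This is only an illustrative sketch: it assumes that PO stands for the four ordering rules RR, WW, RW and WR (an expansion inferred from the text above, not stated in it), and the names PO, ARCHITECTURES and relaxed_rules are hypothetical.

```python
# Rule names are taken from the text; expanding PO into these four
# rules is an assumption made for illustration.
PO = {"RR", "WW", "RW", "WR"}

ARCHITECTURES = {
    "SC": {"CMP", "UPO", "CC1"} | PO,
    "POK": {"CMP", "UPO", "RR", "WW", "RW", "CC1"},
    "PCon": {"CMP", "UPO", "RR", "WW", "RW", "CC3"},
}


def relaxed_rules(name):
    """Return the SC rules that the named architecture does not include."""
    return ARCHITECTURES["SC"] - ARCHITECTURES[name]


for name in ARCHITECTURES:
    print(name, sorted(relaxed_rules(name)))
```

Under that assumption the difference for POK is WR alone, and for PCon it is WR and CC1 (PCon lists CC3 in place of CC1), which matches the description above.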

The SC machines were: SC01 through SC06.

The POK machines were: PK01 through PK08.

The PCon machines were: PC01 through PC14.

A machine that did not reveal its behavior: NO01.

In addition, a half dozen machines currently in development have been tested. No publishable results are yet available for them.

A recent study [adgh96] suggests that even more relaxed behavior will be visible on the Alpha, the PowerPC, the T3D, and some other machines.

The results of testing the above 31 machines are given in more detail below. Some of the testers who provided data are not yet willing to be identified.
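Each entry that follows carries a summary line pairing a test number with a short string of marks. A minimal sketch of how such a line might be read mechanically is given here. It rests on two assumptions that are inferred from the entries rather than stated in this section: that an X marks relaxed behavior seen in a test (compare the two runs of PK06), and that the grouping follows Tests 4 and 7 (entries labelled PK show X only under Test 4, entries labelled PC show X under both). The function names are hypothetical.

```python
def parse_result_line(line):
    """Split a line such as '1 O. 2 OO. 4 XX.' into {test number: marks}."""
    tokens = line.split()
    return {int(tokens[i]): tokens[i + 1] for i in range(0, len(tokens) - 1, 2)}


def classify(line):
    """Guess the group from which tests show an X (an inferred rule)."""
    results = parse_result_line(line)
    relaxed = {test for test, marks in results.items() if "X" in marks}
    if not relaxed:
        # No relaxation seen; this alone cannot prove the machine is SC.
        return "SC or inconclusive"
    if relaxed == {4}:
        return "POK"
    if {4, 7} <= relaxed:
        return "PCon"
    return "unclassified"


pk01 = "1 O. 2 OO. 3 OO. 4 XX. 5 OO. 6 OO. 7 OO. 8 .. 9 .. 11 ... 12 ..."
print(classify(pk01))
```

A line with no X is reported as "SC or inconclusive" because, as entry NO01 shows, the absence of relaxed behavior can also mean that only one processor was active.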

SC01. The Ultra machine at New York University.

  N=4 T=4 K=20000    1991  M. Ayala
  1 O. 2 O.. 3 O.. 4 O.. 5 O.. 6 O.. 7 O.. 8 .. 9 .. 11 ... 12 ...

SC02. A Solbourne 3-way Sparc at the Compaq Corp. (73K)

  N=3 T=3 K=?        1993  D. Koenen
  1 .. 2 O.. 3 ... 4 O.. 5 O.. 6 ... 7 O.. 8 .. 9 .. 11 ... 12 ...

SC03. An SGI Onyx at the University of Minnesota. (39K)

  N=4 T=4 K=20000    1994  F. Mounes-Toussi, D. Lilja, Univ. of Minnesota
  1 .. 2 OO. 3 OO. 4 OO. 5 OO. 6 OO. 7 OO. 8 O. 9 .. 11 ... 12 ...

SC04. A 4-way Sun Sparc 630 at SUNY New Paltz.

  N=4 T=4 K=200000   1997  W. Collier
  1 OO 2 OOO 3 OOO 4 OOO 5 OOO 6 OOO 7 OOO 8 OO 9 OO 11 OOO 12 OOO
Three runs were made with operands forced into read-only state in the cache always (591K), sometimes (582K), or never (595K). All runs showed the same logical behavior. Small variations in the durations of the tests can be seen, but no general pattern is visible.

SC05. A KSR-1 machine at the University of Toronto.

The results are divided between two files, Tests T1-T7 (186K) and Tests T8-T12 (137K).
  N=8 T=8 K=500000   1996  N. Manjikian, University of Toronto
  1 OO 2 OOO 3 OOO 4 OOO 5 OOO 6 OOO 7 OOO 8 OO 9 OO 11 OOO 12 OOO

SC06. A NUMAchine at the University of Toronto. (97K)

  N=3 T=3 K=200000   1997  N. Manjikian, University of Toronto
  1 OO 2 OOO 3 ... 4 OOO 5 OOO 6 ... 7 OOO 8 OO 9 .. 11 ... 12 ...

PK01. Three IBM mainframes (a 4-way 3080, a 2-way 3090, and a 4-way 3090).

  N=4 T=4 K=200000   1992  W. Collier
  1 O. 2 OO. 3 OO. 4 XX. 5 OO. 6 OO. 7 OO. 8 .. 9 .. 11 ... 12 ...

PK02. An Amdahl mainframe.

  N=4 T=4 K=200000   1992  W. Collier
  1 O. 2 OO. 3 OO. 4 XX. 5 OO. 6 OO. 7 OO. 8 .. 9 .. 11 ... 12 ...

PK03. An Intergraph 2-way TD4 (16K)

  N=2 T=2 K=20000    1995  Anon
  1 .. 2 OO. 3 ... 4 XO. 5 ... 6 ... 7 OO. 8 O. 9 .. 11 ... 12 ...

PK04. An ALR Revolution 2-way. (35K)

  N=2 T=2 K=500000   1995  Anon
  1 .. 2 OO. 3 ... 4 XX. 5 ... 6 ... 7 OO. 8 O. 9 .. 11 ... 12 ...

PK05. An AST Manhattan V Series 5090 (14K)

  N=2 T=2 K=20000    1995  Anon
  1 .. 2 OO. 3 ... 4 XO. 5 ... 6 ... 7 OO. 8 O. 9 .. 11 ... 12 ...

PK06. A Hewlett-Packard Vectra XU 5/90.

  N=2 T=2 K=20000    1995  Anon
  1 .. 2 OO. 3 ... 4 OO. 5 ... 6 ... 7 OO. 8 O. 9 .. 11 ... 12 ...
  N=2 T=2 K=250000   1995  Anon
  1 .. 2 OO. 3 ... 4 XX. 5 ... 6 ... 7 OO. 8 O. 9 .. 11 ... 12 ...
Two runs were made: a short run (13K) and a long run (15K). The short run (K = 20000) showed sequentially consistent behavior. The long run (K = 250000) showed behavior that was only rarely, and just barely, over the edge into relaxed territory.

PK07. An XBit Computer ASUS motherboard. (18K)

  N=2 T=2 K=200000   1995  Anon
  1 OO 2 ... 3 ... 4 XX. 5 ... 6 ... 7 OO. 8 O. 9 .. 11 ... 12 ...

PK08. A 2-way Dell PowerEdge 133-2 running Windows NT 4.0.

  N=2 T=2 K=200000   1998  Ted and Bill Weidenbacher
  1 OO 2 OOO 3 ... 4 OXO 5 ... 6 ... 7 OOO 8 OO 9 .. 11 ... 12 ...
Three runs were made with operands forced into read-only state in the cache always (225K), sometimes (223K), or never (223K). All runs showed the same logical behavior. Small variations in the durations of the tests can be seen, but no general pattern is visible.

This machine was very close to being sequentially consistent. Only the "never" option showed relaxed behavior, and then only with Test T410, and then only with a distance d of -1. (See Job Run Time Note below*)

PC01. A Hitachi mainframe.

  N=4 T=4 K=200000   1992  W. Collier
  1 O. 2 OO. 3 OO. 4 XX. 5 OO. 6 OO. 7 XX. 8 .. 9 .. 11 ... 12 ...
The architecture for this machine is POK; it should not have failed Test T7.

PC02. An SGI Power Challenge.

  N=2 T=2 K=500000   1995  F. Mounes-Toussi, D. Lilja, Univ. of Minnesota
  1 .. 2 OO. 3 ... 4 OX. 5 ... 6 ... 7 OX. 8 O. 9 .. 11 ... 12 ...
This machine is SC except when operating on both integer and floating-point operands. Two runs were made: run1 (19K) and run2 (50K).

PC03. An ALR Revolution 4/100. (89K)

  N=4 T=4 K=400000   1995  Anon
  1 .. 2 OO. 3 OO. 4 XX. 5 OO. 6 OO. 7 XX. 8 O. 9 .. 11 ... 12 ...

PC04. A Compaq Proliant 3-way. (73K)

  N=3 T=3 K=500000   1995  Anon
  1 .. 2 OO. 3 OO. 4 XX. 5 OO. 6 OO. 7 XX. 8 O. 9 .. 11 ... 12 ...

PC05. A Compaq 4000 5/66. (60K)

  N=R T=R K=20000    1995  Anon
  1 .. 2 OO. 3 OO. 4 XX. 5 OO. 6 OO. 7 XX. 8 O. 9 .. 11 ... 12 ...

PC06. An NCR 2-way. (63K)

  N=2 T=2 K=500000   1995  Anon
  1 .. 2 OO. 3 ... 4 XX. 5 ... 6 ... 7 XX. 8 O. 9 .. 11 ... 12 ...

PC07. An Olivetti 3-way. (70K)

  N=3 T=3 K=500000   1995  Anon
  1 .. 2 OO. 3 ... 4 XX. 5 OO. 6 ... 7 XX. 8 O. 9 .. 11 ... 12 ...

PC08. A Sequent 6-way. (575K)

  N=6 T=6 K=250000   1995  Anon
  1 .. 2 OO. 3 OO. 4 XX. 5 OO. 6 OO. 7 XX. 8 O. 9 .. 11 ... 12 ...

PC09. A Tricord server.

  N=8 T=8 K=20000    1995  M. Olson
  1 .. 2 OOO 3 ... 4 XXX 5 ... 6 ... 7 XXX 8 O. 9 .. 11 ... 12 ...

PC10. A 2-way Sun Sparc 20 at SUNY New Paltz.

  N=2 T=2 K=200000   1997  W. Collier
  1 OO 2 OOO 3 ... 4 XXX 5 ... 6 ... 7 XXX 8 OO 9 .. 11 ... 12 ...
Three runs were made with operands forced into read-only state in the cache always (313K), sometimes (270K), or never (286K). All runs showed the same logical behavior. Small variations in the durations of the tests can be seen, but the variations are different for different tests. No general pattern is apparent.

PC11. A 2-way Sun Sparc Ultra-2 running Solaris 2.5.1 at NPAC at Syracuse University.

  N=2 T=2 K=200000   1998  W. Collier
  1 OO 2 OOO 3 ... 4 XXX 5 ... 6 ... 7 XXX 8 OO 9 .. 11 ... 12 ...
Three runs were made with operands forced into read-only state in the cache always (237K), sometimes (231K), or never (287K). All runs showed the same logical behavior. Small variations in the durations of the tests can be seen, but the variations are different for different tests. No general pattern is apparent.

PC12. An Intergraph TD-400, with two 200 MHz Pentium Pro processors, running Windows NT 4.0.

  N=2 T=2 K=500000   1998   Jim Reilly, Aqua Process Corp.
  1 OO 2 OOO 3 ... 4 XXX 5 ... 6 ... 7 XXX 8 OO 9 .. 11 ... 12 ...
Three runs were made with operands forced into read-only state in the cache always (259K), sometimes (252K), or never (256K). All three runs showed the same logical behavior. This is the most aggressively relaxed machine seen so far. Values of d = -25 were seen for tests T4 and T7. (See Job Run Time Note below*)

PC13. A 2-way SunBlade 1000 at Vassar College (731K).

  N=2 T=4 K=500000   2004  Brad Richards
  1 OO 2 OOO 3 OOO 4 XXX 5 OOO 6 OOO 7 XXX 8 OO 9 OO 11 XXX 12 XXX

PC14. A Sun E3500 with 8 400 MHz processors and 6 GB of RAM at Vassar College (741K).

  N=8 T=4 K=500000   2004  Brad Richards
  1 OO 2 OOO 3 OOO 4 XXX 5 OOO 6 OOO 7 XXX 8 OO 9 OO 11 XXX 12 XXX

NO01. A Hewlett-Packard Dual Pentium II 400 MHz with Intel 440 BX AGPset, running Windows NT 4.0. (222K)

  N=2 T=2 K=500000   1998  Dave Magram, at the HP Booth at PC Expo.
  1 OO 2 OOO 3 ... 4 OOO 5 ... 6 ... 7 OOO 8 OO 9 .. 11 ... 12 ...
What initially appeared to be SC results turned out on closer inspection to be more prosaic. It appears that only one processor was active. Thus there was no simultaneous execution and so no chance to see relaxed behavior. The machine was fast. The test output is displayed to show just how fast. (See Job Run Time Note below*)

*Job Run Time Note. The minimum run time for a job in the end-of-run report is shown as 0.5 seconds. This is not accurate. Accurate run time information can be found in the data for each test. The source of this misinformation lies in a degradation of function in Visual C++ Version 5 as compared to an earlier version. More accurate information will be available in the future.

Last updated January 4, 2006.