BenchBSSN (40x40x20)
The BenchBSSN benchmark application is the computational kernel of the state-of-the-art black hole simulations performed by the numerical relativity group at the Albert Einstein Institute (Max Planck Institute for Gravitational Physics). Their production simulations typically run across hundreds of processors for days at a time.
This benchmark assigns a constant load of 40x40x20 grid points to each processor.
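Because the load per processor is fixed, the global problem size grows with the processor count (a weak-scaling setup). The following is a minimal sketch of that arithmetic. The function name is illustrative and not part of the benchmark, and the count ignores any ghost or boundary zones that a real run may add.

```python
# Local grid assigned to every processor in this benchmark (from the text above).
LOCAL_GRID = (40, 40, 20)


def total_grid_points(num_procs, local_grid=LOCAL_GRID):
    """Global number of grid points when each processor holds `local_grid`."""
    nx, ny, nz = local_grid
    return nx * ny * nz * num_procs


print(total_grid_points(1))   # 32000 points on a single processor
print(total_grid_points(16))  # 512000 points across a 16-processor run
```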
| Machine ID | Architecture | Fortran Compiler | Single Processor (secs) [1] | MFlops [2][3] (%Peak) | Scaling (16 procs) [4] |
|---|---|---|---|---|---|
| AMD | | | | | |
| Amarok | Athlon Box | Portland 3.2-3 | 123 | 232.7 | |
| Apple | | | | | |
| G4 | Apple G4 | Absoft 7.0 | | | |
| greengrass.cct.lsu.edu | Apple G5 | IBM xlf 8.1 | 56.2 | 509.2 (6.36%) | |
| greengrass.cct.lsu.edu | Apple G5 | Absoft f90 8.2 | 73.3 | 390.4 (4.88%) | |
| Compaq | | | | | |
| Lemieux | Alpha | Native | 63.71 | 449 (22%) | |
| Hitachi | | | | | |
| Hitachi SR-8000 | SR-8000 | Native | 117.4 | 243.8 (16%) | |
| IA32 Linux | | | | | |
| Platinum | Pentium III | Portland 3.3-2 | 270.83 | 105.7 (11%) | |
| Peyote | Pentium Xeon IV | Intel 8.0 | 51.8 | 552.5 (9.11%) | |
| Tungsten | Xeon | Intel 8.0 | 49 | 130 (4.1%) | |
| Xeon | Pentium Xeon IV | Intel 8.0 | 83.9 | 341.1 (10.03%) | |
| IA64 Linux | | | | | |
| Titan | IA64 | efc 7.0.50 Beta | 220 | 130 (4.1%) | |
| TeraGrid | IA64 | Intel 8.0 | 59 | 130 (4.1%) | |
| IBM | | | | | |
| Psi | Power 4 | xlc | 136 | 210 (4%) | |
| Seaborg | SP3 | Native | 324 | 88 (6%) | |
| SGI | | | | | |
| AEI | Origin 2000 | Native MIPSpro 7.3.1.2m | 411 | 69.6 (18%) | |
Notes
1. Measured using the Cactus timer (total time) with the gettimeofday clock.
2. MFlops: calculated from the number of floating point operations counted on the Origin 2000 using perfex, compiled with standard optimisation but with multiply-adds switched off (-TARG:madd=OFF). The operation counts for the different cases are listed on the pages describing the individual benchmarks (to come).
3. MFlops here means 1,000,000 Flops.
4. Scaling is defined as (user time on one processor) / (user time on 16 processors).
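The MFlops and scaling definitions in the notes above can be sketched as follows. This is an illustration under stated assumptions, not the script used to produce the table: the function names are hypothetical, the operation count is back-computed from the Amarok row (123 s at 232.7 MFlops) because the measured perfex counts are not listed here, and the two user times passed to `scaling` are made-up placeholders.

```python
def mflops(flop_count, seconds):
    """Rate in MFlops, where 1 MFlop = 1,000,000 floating point operations."""
    return flop_count / (seconds * 1.0e6)


def percent_of_peak(rate_mflops, peak_mflops):
    """Achieved rate as a percentage of the machine's theoretical peak."""
    return 100.0 * rate_mflops / peak_mflops


def scaling(user_time_1proc, user_time_16procs):
    """User time on one processor divided by user time on 16 processors."""
    return user_time_1proc / user_time_16procs


# Assumption: operation count implied by the Amarok row, not a perfex measurement.
flop_count = 232.7e6 * 123

print(round(mflops(flop_count, 123), 1))  # 232.7

# Hypothetical timings: 100 s on one processor, 125 s on 16 processors.
print(scaling(100.0, 125.0))  # 0.8
```

Since every processor carries the same 40x40x20 load, a scaling value of 1.0 under this definition means the 16-processor run took the same user time as the single-processor run; values below 1.0 indicate time lost to parallel overhead.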
Run Notes
- [a] Before additional optimisation flags were used