Cactus Run on First Loongson 2F Chinese Cluster

On December 26, 2007, China revealed its first supercomputer of 1 teraflops utilizing the domestic Chinese CPU Loongson 2F at Hefei, designated as KD-50-I. This supercomputer was designed by a joint team led by Dr. Chen Guoliang, professor of the Computer Science and Technology Department of the University of Science and Technology of China (the primary contractor, with the Research Institute of Computational Technology of Chinese Academy of Sciences as the secondary contractor).

KD-50-I is a diskless cluster with a total of 336 Loongson-2F CPUs (750MHz) and 1GB memory for each CPU. The operating system is Debian Linux. The size of the supercomputer is about that of a household refrigerator and the cost is about $110,000.

Cactus standard testsuites were verified on multiple processors. Several jobs were run with up to 128 processors. The results are presented on the graph below. The compute time seemed to scale well while the wall time seemed to be affected by I/O and other overheads when the number of processors went beyond 16.

The white cluster is the KD-50-I cluster

#The MPI compilers/scripts are based on gcc version 4.1.2 20061115 (pre-release) (Debian 4.1.1-21).

CC               =   mpicc
LD               =   mpicxx
CXX              =   mpicxx
F77              =   mpif77
F90              =   mpif90

#compiler flags
FPPFLAGS = -traditional

OPTIMISE          =  yes
C_OPTIMISE_FLAGS  =  -pipe -O2 -finline-functions
CXX_OPTIMISE_FLAGS=  -pipe -O2 -finline-functions
F77_OPTIMISE_FLAGS=  -pipe -O2 -finline-functions
F90_OPTIMISE_FLAGS=  -pipe -O2 -finline-functions

#warning flags

#debugging flags

# emable MPI


3 Feb 2008 — elena