[Users] How to generate a profiling file using mpiP

Hee Il Kim heeilkim at gmail.com
Tue Apr 1 15:11:43 CST 2008


Thanks all,



2008/4/1, Erik Schnetter <schnetter at cct.lsu.edu>:
>
>
> how large is the cluster?  A factor of 2 is not really bad, since you
> still get a factor of 32 speedup compared to running on a single CPU.
> What is the slowdown when you go from using 1 to using 2 full nodes?



I can use at most 80 CPUs. If a factor of 2 is not really bad, and is
unavoidable on a cluster with a low-performance network, I think I'd better
stop here. I have spent too much time on this ^^

Anyway, I got many comments and suggestions from the mpich-discuss forum. I
found that our switch has good bandwidth, even though its latency is not as
good as that of high-performance hardware. It can also use Open-MX, so I
will test Open-MX to reduce the latency.

The numbers below were taken with mpich2. Note that the iteration number was
set to 32, half of the previous one.

Thanks!

Hee Il


==========================

# 1 node = 8 cpus

./CCTK_Proc0.out: | Total time for simulation | 463.46191400 | 423.44646400
./CCTK_Proc1.out: | Total time for simulation | 463.41564800 | 443.49971700
./CCTK_Proc2.out: | Total time for simulation | 463.41577600 | 441.01956200
./CCTK_Proc3.out: | Total time for simulation | 463.41576900 | 415.26195200
./CCTK_Proc4.out: | Total time for simulation | 463.41564200 | 444.48777900
./CCTK_Proc5.out: | Total time for simulation | 463.41567800 | 421.04631400
./CCTK_Proc6.out: | Total time for simulation | 463.41577200 | 421.11031800
./CCTK_Proc7.out: | Total time for simulation | 463.44232100 | 411.46171500


# 2 nodes = 16 cpus

./CCTK_Proc0.out:  | Total time for simulation | 481.08626400 | 439.36345800
./CCTK_Proc10.out: | Total time for simulation | 481.05228700 | 449.02006200
./CCTK_Proc11.out: | Total time for simulation | 481.05252200 | 423.33445700
./CCTK_Proc12.out: | Total time for simulation | 481.05242800 | 444.17976000
./CCTK_Proc13.out: | Total time for simulation | 481.05249500 | 415.08594100
./CCTK_Proc14.out: | Total time for simulation | 481.05234400 | 413.60184900
./CCTK_Proc15.out: | Total time for simulation | 481.05244200 | 407.84548900
./CCTK_Proc1.out:  | Total time for simulation | 481.05222500 | 415.46996500
./CCTK_Proc2.out:  | Total time for simulation | 481.05224300 | 415.90599200
./CCTK_Proc3.out:  | Total time for simulation | 481.05421800 | 404.89330400
./CCTK_Proc4.out:  | Total time for simulation | 481.05222600 | 446.89592900
./CCTK_Proc5.out:  | Total time for simulation | 481.05237600 | 419.93424400
./CCTK_Proc6.out:  | Total time for simulation | 481.04626200 | 423.71448100
./CCTK_Proc7.out:  | Total time for simulation | 481.09418200 | 419.81023700
./CCTK_Proc8.out:  | Total time for simulation | 481.05225000 | 430.81092400
./CCTK_Proc9.out:  | Total time for simulation | 481.05227900 | 448.58403400


# 4 nodes = 32 cpus

./CCTK_Proc0.out: | Total time for simulation | 688.29916500 | 415.56597200

max  460.74879500

# 8 nodes = 64 cpus

./CCTK_Proc0.out: | Total time for simulation | 794.68444700 | 428.21476200

max  470.84142600
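
To put a number on the slowdown from 1 node to 2, 4 and 8 nodes, the Proc0
totals above can be compared directly. This is only a rough sketch: it assumes
the first timing column is wall-clock seconds and that the four runs do the
same amount of work per CPU. Neither is stated in the output itself, and the
names proc0_time and slowdown are made up for this example.

```python
# Proc0 "Total time for simulation", first timing column, keyed by CPU count.
# Values are copied from the listings above.
# Assumption: this column is wall-clock seconds and work per CPU is constant.
proc0_time = {
    8: 463.46191400,   # 1 node
    16: 481.08626400,  # 2 nodes
    32: 688.29916500,  # 4 nodes
    64: 794.68444700,  # 8 nodes
}


def slowdown(times, base_cpus=8):
    """Return each run's time divided by the time of the base run."""
    base = times[base_cpus]
    return {cpus: t / base for cpus, t in sorted(times.items())}


ratios = slowdown(proc0_time)
for cpus, ratio in ratios.items():
    print(f"{cpus:3d} cpus: {ratio:.2f}x the 1-node time")
```

Under those assumptions the 2-node run takes about 1.04 times as long as the
1-node run, the 4-node run about 1.49 times and the 8-node run about 1.71
times.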