[Users] PUGH Questions

Erik Schnetter schnetter at cct.lsu.edu
Thu May 8 22:35:43 CDT 2008


On May 8, 2008, at 19:44:52, Frank Loeffler wrote:

> Hi,
>
> On Fri, May 09, 2008 at 02:04:55AM +0200, Andreas Schäfer wrote:
>> * In a cluster with heterogeneous machines, can PUGH perform automatic
>>  static load balancing to give faster machines more grid points?
>
> No. You could distribute the data by hand probably, but AFAIK PUGH
> cannot do that automatically. PUGH does not know about the machine
> specifics like available memory.
>
>> * Can PUGH perform dynamic load balancing?
>
> No.
>
>> * When I use numbers of nodes that are not powers of 2, the domain
>>  decomposition sometimes degrades. For instance for 31 nodes a cube
>>  would be decomposed into 31 thin slices, leading to a rather bad
>>  surface to volume ratio. Is there any way to avoid such situations?
>>  (using only powers of two as numbers of nodes is not an option ;-) )
>
> PUGH will (by default) try to divide the domain evenly among all
> processors, with the restriction that you cannot have a processor
> domain edge like this:
>      domain 1
> ----------+----------
>  domain 2 | domain 3
>           |
> Because 31 is a prime number, PUGH has no other option than to divide it
> as you described. If you choose 30, it could divide it e.g. into 2*3*5.
>
>> * Are alternatives to PUGH available that perform better with regard to
>>  the questions above?
>
> I do not know of one off the top of my head, but that does not mean that
> none exists. However, you might want to have a look at
> Carpet (www.carpetcode.org), even if you do not intend to use mesh
> refinement. I am not sure whether Carpet could e.g. handle the last
> question better. Erik Schnetter would be the right person to ask.
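
To make the 31-versus-30 point quoted above concrete, here is a minimal
sketch in Python. It is not PUGH's actual algorithm; it only assumes a
cubic domain of n^3 points and a regular px*py*pz processor grid (the
restriction described in the quote), and picks the factorization with
the smallest surface-to-volume ratio per block. The function names are
made up for this illustration.

```python
def factor_triples(nprocs):
    """Yield every (px, py, pz) with px * py * pz == nprocs."""
    for px in range(1, nprocs + 1):
        if nprocs % px:
            continue
        rest = nprocs // px
        for py in range(1, rest + 1):
            if rest % py == 0:
                yield px, py, rest // py


def surface_to_volume(n, px, py, pz):
    """Surface-to-volume ratio of one block of an n^3 cube.

    A block has edge lengths n/px, n/py, n/pz, so the ratio
    2*(1/lx + 1/ly + 1/lz) simplifies to 2*(px + py + pz)/n.
    """
    return 2.0 * (px + py + pz) / n


def best_decomposition(n, nprocs):
    """Return the processor grid with the smallest ratio (hypothetical helper)."""
    return min(factor_triples(nprocs), key=lambda p: surface_to_volume(n, *p))


if __name__ == "__main__":
    for nprocs in (30, 31):
        grid = best_decomposition(120, nprocs)
        print(nprocs, grid, surface_to_volume(120, *grid))
```

With 31 processes the only available grids are permutations of 1*1*31,
i.e. thin slices, whereas 30 processes allow 2*3*5, whose blocks have a
ratio roughly three times smaller for the same cube.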


Yes, Carpet can handle the case where the number of processors is not  
a power of 2.

What kinds of clusters are you considering?  Is the compute hardware  
different, or do you refer e.g. to the network topology?  Would your  
problems be compute bound, memory bound, communication bound, or I/O  
bound?  Do you expect the hardware performance to change at run time,  
or be approximately constant over a run?  Or do you rather expect the  
computational load to be different in different parts of the  
simulation domain?

Carpet assigns to each process a part of the problem that corresponds  
to the number of threads running in this process, assuming that each  
thread runs at the same speed.  Handling heterogeneous machines  
statically would be rather easy to implement.  For example, the thorn  
LSUThorns/DGEMM runs a short CPU benchmark at startup, which could be  
used to determine the compute power of a process.  Changing the load  
distribution (even dynamically) is not difficult; the difficult part  
is determining _how_ to change it, since different parts of the  
evolution algorithm may run with differing speeds on the different  
processors.
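
As a rough illustration of the static case, the following Python sketch
splits a number of grid points along one dimension in proportion to
per-process weights. This is not Carpet code; the weights stand in for
whatever measure is available (thread counts, or the result of a startup
benchmark), and the function name is hypothetical.

```python
def weighted_split(npoints, weights):
    """Split npoints into contiguous chunks proportional to weights.

    Chunk boundaries are placed at rounded cumulative fractions, so the
    chunk sizes always add up to exactly npoints.
    """
    if npoints < 0 or not weights or any(w <= 0 for w in weights):
        raise ValueError("need npoints >= 0 and positive weights")
    total = sum(weights)
    bounds = [0]
    acc = 0.0
    for w in weights:
        acc += w
        bounds.append(round(npoints * acc / total))
    bounds[-1] = npoints  # guard against rounding in the last boundary
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Assumed example: the third process is twice as fast as the others.
    print(weighted_split(100, [1, 1, 2]))
```

The split itself is the easy part, as noted above; choosing weights that
reflect the real speed of each part of the evolution is the hard part.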

I do not typically use heterogeneous machines; if you are interested,  
we could offer an API that lets people dynamically set the relative  
performance of each MPI process, and Carpet would then change the load  
distribution according to these performances.

-erik

-- 
Erik Schnetter <schnetter at cct.lsu.edu>   http://www.cct.lsu.edu/~eschnett/

My email is as private as my paper mail.  I therefore support encrypting
and signing email messages.  Get my PGP key from www.keyserver.net.


