[Users] PUGH Questions
Erik Schnetter
schnetter at cct.lsu.edu
Thu May 8 22:35:43 CDT 2008
On May 8, 2008, at 19:44:52, Frank Loeffler wrote:
> Hi,
>
> On Fri, May 09, 2008 at 02:04:55AM +0200, Andreas Schäfer wrote:
>> * In a cluster with heterogeneous machines, can PUGH perform automatic
>> static load balancing to give faster machines more grid points?
>
> No. You could probably distribute the data by hand, but AFAIK PUGH
> cannot do that automatically. PUGH does not know about machine
> specifics such as available memory.
>
>> * Can PUGH perform dynamic load balancing?
>
> No.
>
>> * When I use numbers of nodes that are not powers of 2, the domain
>> decomposition sometimes degrades. For instance for 31 nodes a cube
>> would be decomposed into 31 thin slices, leading to a rather bad
>> surface to volume ratio. Is there any way to avoid such situations?
>> (using only powers of two as numbers of nodes is not an option ;-) )
>
> PUGH will (by default) try to divide the domain evenly among all
> processors, with the restriction that you cannot have a processor
> domain edge like this:
>
>          domain 1
>   ----------+----------
>    domain 2 | domain 3
>             |
>
> Because 31 is a prime number, PUGH has no other option than to divide
> it as you described. If you choose 30, it could divide e.g. into 2*3*5.
>
>> * Are alternatives to PUGH available that perform better in regards to
>> the questions above?
>
> I do not know of one off the top of my head, but that does not mean
> that one does not exist. However, you might want to have a look at
> Carpet (www.carpetcode.org), even if you do not intend to use mesh
> refinement. I am not sure whether Carpet could e.g. handle the last
> question better. Erik Schnetter would be the right person to ask.
Yes, Carpet can handle the case where the number of processors is not
a power of 2.
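
To make the surface-to-volume argument above concrete, here is a minimal
sketch in Python. It is not the algorithm PUGH or Carpet actually use; it
simply enumerates every way to write the processor count as a product of
three factors and picks the one whose blocks have the smallest
surface-to-volume ratio. The function names and the cube size of 120 points
per side are assumptions made for illustration, and the full block surface
(including outer boundaries) is counted for simplicity.

```python
def factor_triples(nprocs):
    """Yield all (px, py, pz) with px*py*pz == nprocs and px <= py <= pz."""
    for px in range(1, nprocs + 1):
        if nprocs % px:
            continue
        rest = nprocs // px
        for py in range(px, rest + 1):
            if rest % py:
                continue
            pz = rest // py
            if pz >= py:
                yield (px, py, pz)


def surface_to_volume(n, dims):
    """Surface-to-volume ratio of one block when an n^3 cube is cut
    into dims[0] x dims[1] x dims[2] equally sized blocks."""
    lx, ly, lz = (n / d for d in dims)
    surface = 2 * (lx * ly + ly * lz + lx * lz)
    volume = lx * ly * lz
    return surface / volume


def best_decomposition(n, nprocs):
    """Pick the factor triple with the smallest surface-to-volume ratio."""
    return min(factor_triples(nprocs),
               key=lambda dims: surface_to_volume(n, dims))


if __name__ == "__main__":
    n = 120  # hypothetical number of grid points per side
    for nprocs in (30, 31, 32):
        dims = best_decomposition(n, nprocs)
        print(nprocs, dims, surface_to_volume(n, dims))
```

With a prime count such as 31 the only available triple is (1, 1, 31), which
is the thin-slice case described in the question; 30 admits (2, 3, 5), whose
blocks are much closer to cubes.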
What kinds of clusters are you considering? Is the compute hardware
different, or do you refer e.g. to the network topology? Would your
problems be compute bound, memory bound, communication bound, or I/O
bound? Do you expect the hardware performance to change at run time,
or be approximately constant over a run? Or do you rather expect the
computational load to be different in different parts of the
simulation domain?
Carpet assigns to each process a share of the problem proportional to
the number of threads running in this process, assuming that each
thread runs at the same speed. Handling heterogeneous machines
statically would be rather easy to implement. For example, the thorn
LSUThorns/DGEMM runs a short CPU benchmark at startup, which could be
used to determine the compute power of a process. Changing the load
distribution (even dynamically) is not difficult; the difficult part
is determining _how_ to change it, since different parts of the
evolution algorithm may run with differing speeds on the different
processors.
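
As an illustration of the static case, the following Python sketch splits a
number of grid points into contiguous chunks proportional to a per-process
weight. It is a sketch under stated assumptions, not Carpet code: the name
weighted_split and the example weights are hypothetical, and the weights
stand in for whatever measure of relative speed is available (for example a
thread count, or the result of a startup benchmark).

```python
from itertools import accumulate


def weighted_split(npoints, weights):
    """Split npoints into len(weights) contiguous chunks whose sizes are
    proportional to weights. The chunk sizes always sum to npoints."""
    total = sum(weights)
    # Cumulative weights give the ideal (fractional) chunk boundaries.
    cumulative = [0] + list(accumulate(weights))
    # Round each boundary to the nearest grid point.
    bounds = [round(npoints * c / total) for c in cumulative]
    bounds[-1] = npoints  # guard against floating-point drift
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical relative speeds: the third process is twice as fast.
    print(weighted_split(100, [1, 1, 2]))
```

Rounding the cumulative boundaries, rather than each chunk size separately,
keeps the chunks contiguous and guarantees that no grid point is lost or
counted twice. It does not address the harder question raised above, namely
how to choose the weights when different parts of the evolution run at
different relative speeds.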
I do not typically use heterogeneous machines; if you are interested,
we could offer an API that lets people dynamically set the relative
performance of each MPI process, and Carpet would then change the load
distribution according to these performances.
-erik
--
Erik Schnetter <schnetter at cct.lsu.edu> http://www.cct.lsu.edu/~eschnett/
My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from www.keyserver.net.