From tradke at aei.mpg.de Tue Apr 1 02:17:46 2008
From: tradke at aei.mpg.de (Thomas Radke)
Date: Tue, 01 Apr 2008 10:17:46 +0200
Subject: [Users] How to generate a profiling file using mpiP
In-Reply-To:
References: <47F0AF49.3080409@aei.mpg.de>
Message-ID: <47F1EFAA.7050208@aei.mpg.de>

Hee Il Kim wrote:
> Thanks Thomas,
>
> I looked at the site you pointed me to,
> http://numrel.aei.mpg.de/Research/Peyote/Docs/mpiP.html. I haven't tried
> IPM yet.
>
> I recently built a cluster of 10 compute nodes, each with dual quad-core
> Xeons (Harpertown E5420). The nodes are connected by an ordinary gigabit
> switch. Unfortunately, when I tested the system performance using the
> Cactus benchmarks BSSN_PUGH and BSSN_Carpet_Whisky, I got bad scalability
> (~700 sec for 1 CPU, 1500 sec for 64 CPUs). It's not a code problem. A
> colleague reported that his GADGET-2 test also required unexpectedly
> large communication time. I ran a ping-pong test on the network and was
> told that the result is relatively good for a gigabit system.
>
> So I'm trying to get profiling information. Actually, I don't know what
> I can do next with it, or whether the situation could be improved by a
> user-level fix. Anyway, I think the problem looks serious. I don't think
> it's solely a problem of network latency, because the benchmarks should
> not require much communication. Now I'm going to try Open-MX to reduce
> the latency. I haven't heard of any benchmarks for this kind of
> Harpertown cluster with a gigabit network. I would appreciate any
> comments and suggestions.

You could try some of the standard SPEC benchmarks
(http://www.spec.org/benchmarks.html) and compare them for varying
numbers of nodes.

--
Cheers, Thomas.
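A rough way to read the timings quoted above, sketched in Python. It assumes that the Cactus benchmarks keep the work per process fixed (weak scaling); that is an assumption, the message does not say so. Under that assumption ideal scaling would leave the run time unchanged, so the ratio of the two times is the efficiency. The function names are illustrative only.

```python
def weak_scaling_efficiency(t_one, t_n):
    """Fraction of ideal weak scaling retained; 1.0 means no slowdown."""
    return t_one / t_n


def effective_speedup(n_procs, t_one, t_n):
    """Work done per unit time on n_procs processes, relative to one process."""
    return n_procs * weak_scaling_efficiency(t_one, t_n)


# Approximate numbers from the message: ~700 s on 1 CPU, 1500 s on 64 CPUs.
t1, t64 = 700.0, 1500.0
print(f"efficiency: {weak_scaling_efficiency(t1, t64):.2f}")       # efficiency: 0.47
print(f"effective speedup: {effective_speedup(64, t1, t64):.1f}")  # effective speedup: 29.9
```

If the benchmark instead kept the total problem size fixed (strong scaling), these figures would have to be read differently.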

From schnetter at cct.lsu.edu Tue Apr 1 07:50:06 2008
From: schnetter at cct.lsu.edu (Erik Schnetter)
Date: Tue, 1 Apr 2008 08:50:06 -0500
Subject: [Users] How to generate a profiling file using mpiP
In-Reply-To: <47F1EFAA.7050208@aei.mpg.de>
References: <47F0AF49.3080409@aei.mpg.de> <47F1EFAA.7050208@aei.mpg.de>
Message-ID: <0CAFF03E-349A-4CFC-B96D-B8F1BA26985C@cct.lsu.edu>

On Apr 1, 2008, at 03:17:46, Thomas Radke wrote:
> You could try some of the standard SPEC benchmarks
> (http://www.spec.org/benchmarks.html) and compare them for varying
> numbers of nodes.

There are some HPC Challenge Benchmarks at . This site also stores some
results. These benchmarks measure low-level system properties such as
speed of memory access, communication bandwidth/latency etc.

-erik

--
Erik Schnetter http://www.cct.lsu.edu/~eschnett/

My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from www.keyserver.net.

From heeilkim at gmail.com Tue Apr 1 15:11:43 2008
From: heeilkim at gmail.com (Hee Il Kim)
Date: Wed, 2 Apr 2008 06:11:43 +0900
Subject: [Users] How to generate a profiling file using mpiP
In-Reply-To: <4952A3CC-C372-467A-B735-F1BFD3D1A5BD@cct.lsu.edu>
References: <47F0AF49.3080409@aei.mpg.de> <4952A3CC-C372-467A-B735-F1BFD3D1A5BD@cct.lsu.edu>
Message-ID:

Thanks all,

2008/4/1, Erik Schnetter :
> how large is the cluster? A factor of 2 is not really bad, since you
> still get a factor of 32 speedup compared to running on a single CPU.
> What is the slowdown when you go from using 1 to using 2 full nodes?

I can use at most 80 CPUs. If a factor of 2 is not really bad, and is
unavoidable on a cluster with a low-performance network, I think I'd
better stop here. I have spent too much time on this ^^

Anyway, I got many comments and suggestions from the mpich-discuss forum.
I found that our switch has good bandwidth, even though its latency is
not as good as that of high-performance hardware. It is also able to use
Open-MX. I will test Open-MX to reduce the latency.

The numbers below were taken with mpich2. Note that the number of
iterations was 32, half of the previous value.

Thanks!
Hee Il

==========================

# 1 node = 8 cpus

./CCTK_Proc0.out:  | Total time for simulation | 463.46191400 | 423.44646400
./CCTK_Proc1.out:  | Total time for simulation | 463.41564800 | 443.49971700
./CCTK_Proc2.out:  | Total time for simulation | 463.41577600 | 441.01956200
./CCTK_Proc3.out:  | Total time for simulation | 463.41576900 | 415.26195200
./CCTK_Proc4.out:  | Total time for simulation | 463.41564200 | 444.48777900
./CCTK_Proc5.out:  | Total time for simulation | 463.41567800 | 421.04631400
./CCTK_Proc6.out:  | Total time for simulation | 463.41577200 | 421.11031800
./CCTK_Proc7.out:  | Total time for simulation | 463.44232100 | 411.46171500

# 2 nodes = 16 cpus

./CCTK_Proc0.out:  | Total time for simulation | 481.08626400 | 439.36345800
./CCTK_Proc10.out: | Total time for simulation | 481.05228700 | 449.02006200
./CCTK_Proc11.out: | Total time for simulation | 481.05252200 | 423.33445700
./CCTK_Proc12.out: | Total time for simulation | 481.05242800 | 444.17976000
./CCTK_Proc13.out: | Total time for simulation | 481.05249500 | 415.08594100
./CCTK_Proc14.out: | Total time for simulation | 481.05234400 | 413.60184900
./CCTK_Proc15.out: | Total time for simulation | 481.05244200 | 407.84548900
./CCTK_Proc1.out:  | Total time for simulation | 481.05222500 | 415.46996500
./CCTK_Proc2.out:  | Total time for simulation | 481.05224300 | 415.90599200
./CCTK_Proc3.out:  | Total time for simulation | 481.05421800 | 404.89330400
./CCTK_Proc4.out:  | Total time for simulation | 481.05222600 | 446.89592900
./CCTK_Proc5.out:  | Total time for simulation | 481.05237600 | 419.93424400
./CCTK_Proc6.out:  | Total time for simulation | 481.04626200 | 423.71448100
./CCTK_Proc7.out:  | Total time for simulation | 481.09418200 | 419.81023700
./CCTK_Proc8.out:  | Total time for simulation | 481.05225000 | 430.81092400
./CCTK_Proc9.out:  | Total time for simulation | 481.05227900 | 448.58403400

# 4 nodes = 32 cpus

./CCTK_Proc0.out:  | Total time for simulation | 688.29916500 | 415.56597200
                                                            max 460.74879500

# 8 nodes = 64 cpus

./CCTK_Proc0.out:  | Total time for simulation | 794.68444700 | 428.21476200
                                                            max 470.84142600

From schnetter at cct.lsu.edu Tue Apr 1 15:49:21 2008
From: schnetter at cct.lsu.edu (Erik Schnetter)
Date: Tue, 1 Apr 2008 16:49:21 -0500
Subject: [Users] How to generate a profiling file using mpiP
In-Reply-To:
References: <47F0AF49.3080409@aei.mpg.de> <4952A3CC-C372-467A-B735-F1BFD3D1A5BD@cct.lsu.edu>
Message-ID: <32E1CFFF-D4B5-4325-9537-E0289A1DB252@cct.lsu.edu>

Hee Il,

you only showed the total simulation time for each process. I rather
meant that the time spent e.g. in the evolution equations and in the
boundary conditions could be different between processes. The total time
will always be very similar since it is essentially synchronised when
the termination condition is communicated.

-erik

From heeilkim at gmail.com Tue Apr 1 16:38:15 2008
From: heeilkim at gmail.com (Hee Il Kim)
Date: Wed, 2 Apr 2008 07:38:15 +0900
Subject: [Users] How to generate a profiling file using mpiP
In-Reply-To:
References: <47F0AF49.3080409@aei.mpg.de> <4952A3CC-C372-467A-B735-F1BFD3D1A5BD@cct.lsu.edu> <32E1CFFF-D4B5-4325-9537-E0289A1DB252@cct.lsu.edu>
Message-ID:

> Erik,
>
> I just wanted to show you how the communication time increases with the
> number of nodes. Most of the difference between gettimeofday and
> getrusage comes from the "Boundary enforcement" step of BSSN_MoL and the
> "Do the boundary condition" step of Whisky. The differences increase
> with the number of nodes and vary between CPUs. I could not extract more
> information, so I have attached some files. Sorry for the mass of
> numbers ^^
>
> Hee Il
>
> 2008/4/2, Erik Schnetter :
> >
> > Hee Il,
> >
> > you only showed the total simulation time for each process. I
> > rather meant that the time spent e.g. in the evolution equations and
> > in the boundary conditions could be different between processes.
> > The total time will always be very similar since it is essentially
> > synchronised when the termination condition is communicated.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: timing_info.tar.gz
Type: application/x-gzip
Size: 3658 bytes
Desc: not available
Url : http://www.cactuscode.org/pipermail/users/attachments/20080402/ddf7d90e/attachment-0001.gz

From schnetter at cct.lsu.edu Thu Apr 3 14:03:53 2008
From: schnetter at cct.lsu.edu (Erik Schnetter)
Date: Thu, 3 Apr 2008 15:03:53 -0500
Subject: [Users] help to run ADM
In-Reply-To:
References:
Message-ID: <86E3AD54-4C3E-44B1-A66A-E56E9403E3FA@cct.lsu.edu>

On Mar 10, 2008, at 03:31:11, wrote:
> Dear Sir
>
> I would like to run ADM, now that I have already done some simple
> examples like wavetoy. I have a problem with the ThornList. Would you
> please send us the proper stable thorn list for this application?

Hossein,

the Cactus web site contains some tutorials. These give a basic
introduction to downloading, installing, and building Cactus, and to
running a scalar wave example. The standard distribution of Cactus also
comes with some basic numerical relativity thorns, namely the
CactusEinstein arrangement.

Evolving the Einstein (ADM) equations is more complex since the
underlying physics is much more complicated. You may be able to get
started by looking at some of the example parameter files that come with
the CactusEinstein arrangement. I would suggest contacting one of the
numerical relativity groups who use Cactus if you are interested in
research using the Einstein equations.

If you have a particular problem downloading or compiling, we can
probably help you if you give us details about what you are trying to do
and what the problems or error messages are.

Good luck,
-erik

From NGUY0045 at ntu.edu.sg Thu Apr 3 20:25:44 2008
From: NGUY0045 at ntu.edu.sg (#NGUYEN CONG TRI#)
Date: Fri, 4 Apr 2008 10:25:44 +0800
Subject: [Users] periodic boundary condition
Message-ID:

Dear all,

I'm sorry, but it's not very clear to me how PUGH actually implements
periodic boundary conditions. For example, suppose I specify
driver::periodic_x = "yes", driver::global_nx = 65 and
driver::ghost_size = 1. During partitioning, will the driver
automatically add 2 appropriate ghost cells, so that my physical domain
is still [0, 64], or will it use points 0 and 64 as ghost cells, so that
my physical domain is only [1, 63]? I believe the latter is used for
reflection symmetry, but I'm not sure about the periodic boundary
condition.

Thank you.

Best regards,
Tri

From schnetter at cct.lsu.edu Thu Apr 3 21:16:40 2008
From: schnetter at cct.lsu.edu (Erik Schnetter)
Date: Thu, 3 Apr 2008 22:16:40 -0500
Subject: [Users] periodic boundary condition
In-Reply-To:
References:
Message-ID: <3CB0E3E9-DDD8-496A-B463-CA59D19D1932@cct.lsu.edu>

Tri,

In your case, the physical domain would be [1, 63].

Internally in PUGH, periodic boundaries are treated in the same way as
inter-processor boundaries, and no special explicit symmetry operations
are required as would be the case for reflection symmetry. This means,
among other things, that the additional points which implement
periodicity are not output, in the same way that inter-processor ghost
zones are not output. This can be slightly confusing, since one may
think that these points are missing.

When you use periodic boundaries, you have to be careful about the size
of the simulation domain that you specify. If you want to set up a
domain of extent [0; 1) with a grid spacing of 0.1 and one ghost zone,
then you need to specify xmin=-0.1, xmax=+1.0, and you need to choose 12
grid points.

The thorn CactusExamples/WaveToy1DF77 has an example with a 1D wave
equation. The example is not quite ideal, since it specifies 101 grid
points, which leads to 99 interior points, or a physical domain of size
0.99 instead of 1.0.

-erik

From D.J.Baker at soton.ac.uk Fri Apr 4 04:28:01 2008
From: D.J.Baker at soton.ac.uk (Baker D.J.)
Date: Fri, 4 Apr 2008 11:28:01 +0100
Subject: [Users] Cactus -- Bench_Whisky_Carpet benchmark
Message-ID:

Hi all,

Hopefully this is a very easy question for someone.

My knowledge of Cactus is minimal. I'm not a user of the package;
however, I am involved in preparing some applications for benchmarking.
I've recently been testing out Bench_Whisky_Carpet, since this benchmark
most closely represents the type of Cactus applications that researchers
at Southampton are running. On our latest Opteron (3 GHz) processors it
takes in the region of 11 minutes to complete, and my major concern is
that it will complete even faster as we move to other systems
(perhaps..). We have some Intel quad-core machines arriving soon, and
that will be an interesting test case.

My question is essentially how to "rebase" this benchmark without
actually losing the essence of what it's doing. That is, is there a
parameter that I can adjust to increase the number of iterations
completed or whatever?

Please could someone advise me.

Regards -- David Baker.

From schnetter at cct.lsu.edu Fri Apr 4 08:25:44 2008
From: schnetter at cct.lsu.edu (Erik Schnetter)
Date: Fri, 4 Apr 2008 09:25:44 -0500
Subject: [Users] Cactus -- Bench_Whisky_Carpet benchmark
In-Reply-To:
References:
Message-ID:

On Apr 4, 2008, at 05:28:01, Baker D.J. wrote:
> Hi all,
>
> Hopefully this is a very easy question for someone.
>
> My knowledge of Cactus is minimal. I'm not a user of the package;
> however, I am involved in preparing some applications for
> benchmarking.
> I've recently been testing out Bench_Whisky_Carpet, since this
> benchmark most closely represents the type of Cactus applications that
> researchers at Southampton are running. On our latest Opteron (3 GHz)
> processors it takes in the region of 11 minutes to complete, and my
> major concern is that it will complete even faster as we move to other
> systems (perhaps..). We have some Intel quad-core machines arriving
> soon, and that will be an interesting test case.
>
> My question is essentially how to "rebase" this benchmark without
> actually losing the essence of what it's doing. That is, is there a
> parameter that I can adjust to increase the number of iterations
> completed or whatever?
>
> Please could someone advise me.

David,

the benchmarks were designed with two goals in mind: the amount of
memory per core should be between 500 MB and 1000 MB, and a run should
finish in 10 to 15 minutes. Both can be adapted. (In fact, there are
even two benchmarks on the web page with different memory requirements.)

The parameters global_nx, global_ny, and global_nz define how much
memory is used. The parameter cctk_itlast defines how long it runs -- it
sets the number of iterations. Obviously, if you change memory usage,
the time is going to change proportionally as well. Due to the AMR
algorithm, it is "nice" to keep cctk_itlast at a small integer multiple
of a power of two, but this is not a requirement.

We are currently updating the benchmarks, including using a new and more
efficient version of the underlying mesh refinement driver. We have
finalised some vacuum benchmarks and are currently working on updating
the hydrodynamics benchmark. I suggest not using the current benchmark
for large (>100) numbers of processors. If you are interested, we can
give you access to an early version of the new benchmark.

Good luck,
-erik

From schnetter at cct.lsu.edu Sun Apr 6 22:29:47 2008
From: schnetter at cct.lsu.edu (Erik Schnetter)
Date: Sun, 6 Apr 2008 22:29:47 -0500
Subject: [Users] ADMConstraints
In-Reply-To: <200705311456.00561.szilagyi@aei.mpg.de>
References: <5AFF5575-C165-45D8-82D5-B0B77A7DB2F1@cct.lsu.edu> <200705311456.00561.szilagyi@aei.mpg.de>
Message-ID: <2F132516-BF1B-44F2-BFD4-A4765C16664E@cct.lsu.edu>

After testing for some time, I just applied corresponding changes to
ADMAnalysis.

-erik

On May 31, 2007, at 07:56:00, Bela Szilagyi wrote:
> Should there be a similar patch applied to ADMAnalysis as well?
>
> On Thursday 31 May 2007 03:58:46 Erik Schnetter wrote:
>> While calculating the constraints with ADMConstraints in the presence
>> of mesh refinement works fine, there was a problem with calculating
>> norms of the constraints, which may require time interpolation. In
>> order to do this correctly, it is necessary to keep several time
>> levels of the constraints, and to re-calculate the constraints after
>> changing the grid hierarchy. If this is not done, then the values of
>> the constraints at each output grid point are correct, but norms of
>> the constraints may not be.
>>
>> I updated ADMConstraints so that it supports this behaviour. In
>> order to remain backward compatible, this behaviour cannot be the
>> default.
>> Obtaining correct norms for the constraints requires setting the
>> parameters
>>
>> ADMConstraints::constraints_persist = yes
>> ADMConstraints::constraints_timelevels = 3
>>
>> where the number 3 depends on the time interpolation order; for linear
>> time interpolation, you only need 2 time levels.
>>
>> On the other hand, if you want to save space and time, and if you do
>> not need the norms of the constraints, then you should set the
>> parameter
>>
>> ADMConstraints::constraints_prolongation_type = "none"
>>
>> -erik
>
> Bela Szilagyi
> ----------------------------------------------------
> Max-Planck-Institut für Gravitationsphysik
> Albert-Einstein-Institut
> Tel: +49 331 567 7189
> Fax: +49 331 567 7252
> ----------------------------------------------------

From NGUY0045 at ntu.edu.sg Sat Apr 12 13:30:24 2008
From: NGUY0045 at ntu.edu.sg (#NGUYEN CONG TRI#)
Date: Sun, 13 Apr 2008 02:30:24 +0800
Subject: [Users] Compiling Cactus with PETSc
Message-ID:

Dear all,

I don't know whether it is appropriate to raise this question here. I
want to compile Cactus together with MPI (LAM) and PETSc, and I have a
problem when linking with the PETSc Fortran interface library. I receive
an error like this:

    /usr/bin/ld: cannot find -lpetscfortran
    collect2: ld returned 1 exit status
    gmake[1]: *** [/home/tri/CACTUS/Cactus/exe/cactus_singlepillar] Error 1
    gmake: *** [hello] Error 2

I'm using the newest version of PETSc, 2.3.3. I couldn't figure out a
way to fix this, although I tried the solution suggested in the PETSc
troubleshooting guide, executing the command

    make BOPT=g fortran

(which confuses me, since I don't see why there is a space in between,
and it didn't work). As for the compilers, I used:

    CC = gcc
    CXX = c++
    F90 = ifort
    F77 = ifort
    CPP = /lib/cpp
    FPP = /lib/cpp
    LD = c++
    AR = ar
    RANLIB = ranlib
    PERL = perl

Does anyone know how I can fix this problem? I would really appreciate
your help.

Best regards,
Tri.

From frank.loeffler at aei.mpg.de Sat Apr 12 15:52:36 2008
From: frank.loeffler at aei.mpg.de (Frank Loeffler)
Date: Sat, 12 Apr 2008 15:52:36 -0500
Subject: [Users] Compiling Cactus with PETSc
In-Reply-To:
References:
Message-ID: <20080412205236.GF980@numrel07.cct.lsu.edu>

On Sun, Apr 13, 2008 at 02:30:24AM +0800, #NGUYEN CONG TRI# wrote:
> /usr/bin/ld: cannot find -lpetscfortran

That means that the linker could not find the library petscfortran,
which can have several reasons. For debugging the problem, it is
important to know whether this happens because the library really does
not exist, because it exists but the linker cannot find it, or because
the linker finds it but it might not be compatible (this can happen on
mixed 32-bit/64-bit systems).

So please check first whether you can find files named
libpetscfortran.* (where the * can be e.g. 'a' or 'so'). The best place
to look would be the installation directory of PETSc. If you find it, I
would expect the problem to be Cactus or the linker not finding the
library. But let us do that after we have confirmed the first step.
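
The check suggested above can be sketched in Python as follows. This is only an illustration: the helper name find_library is made up, and the script assumes that the environment variable PETSC_DIR points to the PETSc installation (it falls back to the current directory otherwise).

```python
import os
from pathlib import Path


def find_library(root, stem="libpetscfortran"):
    """Return all regular files below `root` named `<stem>.<suffix>`."""
    return sorted(p for p in Path(root).rglob(stem + ".*") if p.is_file())


if __name__ == "__main__":
    # Assumption: PETSC_DIR points to the PETSc installation directory.
    root = os.environ.get("PETSC_DIR", ".")
    matches = find_library(root)
    if matches:
        for path in matches:
            print(path)
    else:
        print("no libpetscfortran.* found below", root)
```

An empty result corresponds to the first of the three cases above, i.e. the library really does not exist in that directory tree.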

Frank

From NGUY0045 at ntu.edu.sg Sun Apr 13 04:58:52 2008
From: NGUY0045 at ntu.edu.sg (#NGUYEN CONG TRI#)
Date: Sun, 13 Apr 2008 17:58:52 +0800
Subject: [Users] Compiling Cactus with PETSc
Message-ID:

Dear Frank,

Thanks to your help, I have figured out how to fix the problem. I
checked the PETSc website and found that from version 2.3.1 onwards the
libpetscfortran.a library has been removed (the Fortran interface and
the C interface now go into the same library). So I manually edited the
list of libraries to be included in the config file make.extra.defn. It
works fine now.

Best regards,
Tri.

From avi at sicortex.com Mon Apr 14 10:04:19 2008
From: avi at sicortex.com (Avi Purkayastha)
Date: Mon, 14 Apr 2008 10:04:19 -0500
Subject: [Users] using PETSc with the appropriate benchmark
Message-ID:

Hi all,

I hope this is the appropriate list for the following question; please
redirect me if not.

From the tutorial it seems that CactusElliptic/EllPETSc is the only
thorn using the PETSc library. However, the default Cactus thorn list
(see below) does not include CactusElliptic. So my newbie question is
the following:

I have run the default build with the Bench_BSSN_PUGH_80l.par parameter
file. If I were to add the CactusElliptic thorn and build it, would the
elliptic equations get solved when using the above benchmark, or do I
need a different benchmark?

If there is documentation which explains the above, i.e. benchmarks with
the appropriate thorn lists, that would be very helpful.

Thanks for the help.

-- Avi

P.S. The default thorn list I was referring to has CactusBase,
CactusEinstein and CactusPUGH as the high-level thorns.

From schnetter at cct.lsu.edu Mon Apr 14 10:59:10 2008
From: schnetter at cct.lsu.edu (Erik Schnetter)
Date: Mon, 14 Apr 2008 10:59:10 -0500
Subject: [Users] using PETSc with the appropriate benchmark
In-Reply-To:
References:
Message-ID: <6404F96D-2EC4-401F-A2F7-BC11650E3017@cct.lsu.edu>

On Apr 14, 2008, at 10:04:19, Avi Purkayastha wrote:
> I have run the default build with the Bench_BSSN_PUGH_80l.par
> parameter file. If I were to add the CactusElliptic thorn and build
> it, would the elliptic equations get solved when using the above
> benchmark, or do I need a different benchmark?

Avi,

the thorn EllPETSc is a wrapper of the PETSc library. It implements a
certain interface which is defined in the thorn EllBase. Another thorn
implementing the same interface is EllSOR; this thorn implements an SOR
solver, which is a much simpler algorithm, but obviously not suited for
large problems.

EllPETSc is not part of the default thorn lists because it requires
PETSc to be installed on the system, which is not always the case. One
particular problem you may come across with PETSc is that the API is
updated from time to time, and we then have to update the PETSc wrapper
in Cactus to follow these changes, while remaining backwards compatible.
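
For readers unfamiliar with SOR (successive over-relaxation), here is a minimal textbook sketch in Python for the 1D Poisson equation u'' = f with u = 0 at both ends. It is a generic illustration and not the code in EllSOR; the function name, the relaxation factor 1.8 and the test problem are choices made for this example only.

```python
import math


def sor_poisson_1d(f, h, omega=1.8, tol=1e-10, max_sweeps=20000):
    """Solve u'' = f on a uniform grid with u = 0 at both ends using SOR.

    f holds the right-hand side at every grid point (boundary entries are
    not used). Returns the solution and the number of sweeps performed.
    """
    n = len(f)
    u = [0.0] * n
    for sweep in range(1, max_sweeps + 1):
        max_change = 0.0
        for i in range(1, n - 1):
            # Gauss-Seidel value from (u[i-1] - 2 u[i] + u[i+1]) / h^2 = f[i]
            gs = 0.5 * (u[i - 1] + u[i + 1] - h * h * f[i])
            # Over-relax: move past the Gauss-Seidel value by the factor omega
            new = (1.0 - omega) * u[i] + omega * gs
            max_change = max(max_change, abs(new - u[i]))
            u[i] = new
        if max_change < tol:
            return u, sweep
    raise RuntimeError("SOR did not converge")


# Test problem: u = sin(pi x) solves u'' = -pi^2 sin(pi x) on [0, 1].
n = 33
h = 1.0 / (n - 1)
x = [i * h for i in range(n)]
f = [-math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
u, sweeps = sor_poisson_1d(f, h)
err = max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))
print(f"{sweeps} sweeps, max error {err:.2e}")
```

The number of sweeps needed grows with the number of grid points, which is the reason such a solver is not suited for large problems.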
The benchmark Bench_BSSN_PUGH_80l.par uses an explicit time integrator, which means that there are no elliptic equations which need to be solved. Therefore adding EllPETSc to your thorn list won't actually change anything in the benchmark, and it won't use PETSc. We have at the moment no benchmark for elliptic equations, but we have examples solving elliptic equations e.g. in the thorn CactusWave/IDScalarWaveElliptic. This example solves an elliptic equation for the initial condition, and then continues with a few time steps of explicit time stepping without solving an elliptic equation. This example could be standardised into a benchmark, which would mostly entail choosing interesting and reasonable problem sizes. The problem size is chosen by the parameter global_nsize e.g. in CactusWave/IDScalarWaveElliptic/par/source_petsc.par. -erik -- Erik Schnetter http://www.cct.lsu.edu/~eschnett/ My email is as private as my paper mail. I therefore support encrypting and signing email messages. Get my PGP key from www.keyserver.net. -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 194 bytes Desc: This is a digitally signed message part Url : http://www.cactuscode.org/pipermail/users/attachments/20080414/cb1877c2/attachment.bin From avi at sicortex.com Mon Apr 14 16:06:00 2008 From: avi at sicortex.com (Avi Purkayastha) Date: Mon, 14 Apr 2008 16:06:00 -0500 Subject: [Users] using PETSc with the appropriate benchmark In-Reply-To: <6404F96D-2EC4-401F-A2F7-BC11650E3017@cct.lsu.edu> References: <6404F96D-2EC4-401F-A2F7-BC11650E3017@cct.lsu.edu> Message-ID: Erik, thanks for the update. > > We have at the moment no benchmark for elliptic equations, but we > have examples solving elliptic equations e.g. in the thorn > CactusWave/IDScalarWaveElliptic.
This example solves an elliptic > equation for the initial condition, and then continues with a few > time steps of explicit time stepping without solving an elliptic > equation. This example could be standardised into a benchmark, > which would mostly entail choosing interesting and reasonable > problem sizes. > This would be of some interest to me, although solving an elliptic equation just for the initial condition would probably not be sufficient to benchmark PETSc performance and/or the wave elliptic thorn. > The problem size is chosen by the parameter global_nsize e.g. in > CactusWave/IDScalarWaveElliptic/par/source_petsc.par. By the way, I noticed that there was no CactusWave thorn in the Cactus source that I have. How does one get it separately? -- Avi From diener at cct.lsu.edu Thu Apr 17 11:49:29 2008 From: diener at cct.lsu.edu (Peter Diener) Date: Thu, 17 Apr 2008 11:49:29 -0500 (CDT) Subject: [Users] Checkpointing issue. Message-ID: Hi, Sorry about the long e-mail. I just saw the following weird behaviour when restarting from a set of Carpet mesh refinement checkpoint files. I was running with 9 refinement levels on 128 MPI processes and was restarting on the same number of processes when it seemed like the restart stalled when reading in refinement level 5. I killed the job and set IO::verbose = "full" to get more info and CarpetIOHDF5::open_one_input_file_at_a_time = yes to make sure that it didn't open more than one file at a time.
Below is a summary of the restart process on stdout from MPI process 0:

INFO (CarpetIOHDF5): opening checkpoint file 'random-1o1-med-5/checkpoint.chkpt.it_19968.file_0.h5'
INFO (CarpetIOHDF5): reading grid variables on mglevel 0 reflevel 0
INFO (CarpetIOHDF5): restarting simulation on mglevel 0 reflevel 0 at iteration 19968 (simulation time 89.856)
INFO (ADMMacros): Spatial finite differencing order: 4
INFO (Time): Timestep set to 0.576 (courant_static)
INFO (CarpetIOHDF5): reading grid variables on mglevel 0 reflevel 1
INFO (CarpetIOHDF5): restarting simulation on mglevel 0 reflevel 1 at iteration 19968 (simulation time 89.856)
INFO (ADMMacros): Spatial finite differencing order: 4
INFO (Time): Timestep set to 0.288 (courant_static)
INFO (CarpetIOHDF5): reading grid variables on mglevel 0 reflevel 2
INFO (CarpetIOHDF5): restarting simulation on mglevel 0 reflevel 2 at iteration 19968 (simulation time 89.856)
INFO (ADMMacros): Spatial finite differencing order: 4
INFO (Time): Timestep set to 0.144 (courant_static)
INFO (CarpetIOHDF5): reading grid variables on mglevel 0 reflevel 3
INFO (CarpetIOHDF5): restarting simulation on mglevel 0 reflevel 3 at iteration 19968 (simulation time 89.856)
INFO (ADMMacros): Spatial finite differencing order: 4
INFO (Time): Timestep set to 0.072 (courant_static)
INFO (CarpetIOHDF5): reading grid variables on mglevel 0 reflevel 4
INFO (CarpetIOHDF5): restarting simulation on mglevel 0 reflevel 4 at iteration 19968 (simulation time 89.856)
INFO (ADMMacros): Spatial finite differencing order: 4
INFO (Time): Timestep set to 0.036 (courant_static)
INFO (CarpetIOHDF5): reading grid variables on mglevel 0 reflevel 5
INFO (CarpetIOHDF5): opening checkpoint file 'random-1o1-med-5/checkpoint.chkpt.it_19968.file_1.h5'
INFO (CarpetIOHDF5): opening checkpoint file 'random-1o1-med-5/checkpoint.chkpt.it_19968.file_2.h5'
INFO (CarpetIOHDF5): opening checkpoint file 'random-1o1-med-5/checkpoint.chkpt.it_19968.file_97.h5'
INFO (CarpetIOHDF5): opening checkpoint file 'random-1o1-med-5/checkpoint.chkpt.it_19968.file_98.h5'
INFO (CarpetIOHDF5): reading 'ADMCONSTRAINTS::ham' from dataset 'ADMCONSTRAINTS::ham it=19968 tl=0 rl=5 c=98'
INFO (CarpetIOHDF5): reading 'PSIKADELIA::riczz' from dataset 'PSIKADELIA::riczz it=19968 tl=0 rl=5 c=98'
INFO (CarpetIOHDF5): restarting simulation on mglevel 0 reflevel 5 at iteration 19968 (simulation time 89.856)
INFO (ADMMacros): Spatial finite differencing order: 4
INFO (Time): Timestep set to 0.018 (courant_static)
INFO (CarpetIOHDF5): reading grid variables on mglevel 0 reflevel 6
INFO (CarpetIOHDF5): opening checkpoint file 'random-1o1-med-5/checkpoint.chkpt.it_19968.file_0.h5'
INFO (CarpetIOHDF5): restarting simulation on mglevel 0 reflevel 6 at iteration 19968 (simulation time 89.856)
INFO (ADMMacros): Spatial finite differencing order: 4
INFO (Time): Timestep set to 0.009 (courant_static)
INFO (CarpetIOHDF5): reading grid variables on mglevel 0 reflevel 7
INFO (CarpetIOHDF5): restarting simulation on mglevel 0 reflevel 7 at iteration 19968 (simulation time 89.856)
INFO (ADMMacros): Spatial finite differencing order: 4
INFO (Time): Timestep set to 0.0045 (courant_static)
INFO (CarpetIOHDF5): reading grid variables on mglevel 0 reflevel 8
INFO (CarpetIOHDF5): restarting simulation on mglevel 0 reflevel 8 at iteration 19968 (simulation time 89.856)

The whole process took almost 7 hours, with almost all of the time spent on reflevel 5 opening 98 files before finding the right data. So it seems that for some reason all refinement levels, except for level 5, were distributed in the same way in the original run and the restart run. What used to be on MPI process 0 on level 5 was suddenly on MPI process 98. I don't have output for the other MPI processes, so I don't know how much was moved around... Has anybody seen something like this before? Any suggestions as to what to do about it?
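One rough way to quantify a stall like this is to count, per refinement level, how many checkpoint files the log reports opening. The sketch below is only an illustration in Python and is not part of Carpet or CarpetIOHDF5. It assumes nothing beyond the two message formats visible in the output above, and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the two CarpetIOHDF5 message types seen in the log above.
# Assumption: no other message formats need to be recognised.
EVENT_RE = re.compile(
    r"reading grid variables on mglevel \d+ reflevel (\d+)"
    r"|opening checkpoint file '([^']+)'"
)


def files_opened_per_reflevel(log_text):
    """Count 'opening checkpoint file' messages per refinement level.

    Files opened before the first 'reading grid variables' message
    are counted under the key None.
    """
    counts = Counter()
    current_level = None
    for match in EVENT_RE.finditer(log_text):
        level, _filename = match.groups()
        if level is not None:
            # A new refinement level starts being read.
            current_level = int(level)
        else:
            # A checkpoint file was opened while reading current_level.
            counts[current_level] += 1
    return counts
```

Run over the stdout of each MPI process, a level whose count is far larger than the others would point at the level whose data ended up on different processes after the restart.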
Cheers, Peter Diener From hinder at gravity.psu.edu Thu Apr 17 12:49:22 2008 From: hinder at gravity.psu.edu (Ian Hinder) Date: Thu, 17 Apr 2008 13:49:22 -0400 Subject: [Users] Formaline Message-ID: <48078DA2.8050404@gravity.psu.edu> Hi, I am using the Formaline thorn to store a copy of my Cactus source tree in my output directories. In my source tree, I have some thorns which are actually symlinked from outside the tree. Formaline faithfully stores the symlinks, but not the targets of the symlink. Would it be better to dereference symlinks automatically, or to have a parameter to do this? -- Ian Hinder hinder at gravity.psu.edu http://www.gravity.psu.edu/~hinder From schnetter at cct.lsu.edu Thu Apr 17 13:41:40 2008 From: schnetter at cct.lsu.edu (Erik Schnetter) Date: Thu, 17 Apr 2008 13:41:40 -0500 Subject: [Users] Formaline In-Reply-To: <48078DA2.8050404@gravity.psu.edu> References: <48078DA2.8050404@gravity.psu.edu> Message-ID: <67F68CB0-BC68-4429-9F38-0C76C149DC24@cct.lsu.edu> On Apr 17, 2008, at 12:49:22, Ian Hinder wrote: > Hi, > > I am using the Formaline thorn to store a copy of my Cactus source > tree > in my output directories. > > In my source tree, I have some thorns which are actually symlinked > from > outside the tree. > > Formaline faithfully stores the symlinks, but not the targets of the > symlink. Would it be better to dereference symlinks automatically, or > to have a parameter to do this? Yes, that should happen automatically. find or tar probably have options for this. Are you using GNU tar? -erik -- Erik Schnetter http://www.cct.lsu.edu/~eschnett/ My email is as private as my paper mail. I therefore support encrypting and signing email messages. Get my PGP key from www.keyserver.net. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: PGP.sig Type: application/pgp-signature Size: 194 bytes Desc: This is a digitally signed message part Url : http://www.cactuscode.org/pipermail/users/attachments/20080417/d4fc3464/attachment-0001.bin From hinder at gravity.psu.edu Thu Apr 17 14:15:30 2008 From: hinder at gravity.psu.edu (Ian Hinder) Date: Thu, 17 Apr 2008 15:15:30 -0400 Subject: [Users] Formaline In-Reply-To: <67F68CB0-BC68-4429-9F38-0C76C149DC24@cct.lsu.edu> References: <48078DA2.8050404@gravity.psu.edu> <67F68CB0-BC68-4429-9F38-0C76C149DC24@cct.lsu.edu> Message-ID: <4807A1D2.7090600@gravity.psu.edu> Erik Schnetter wrote: > On Apr 17, 2008, at 12:49:22, Ian Hinder wrote: > >> Hi, >> >> I am using the Formaline thorn to store a copy of my Cactus source tree >> in my output directories. >> >> In my source tree, I have some thorns which are actually symlinked from >> outside the tree. >> >> Formaline faithfully stores the symlinks, but not the targets of the >> symlink. Would it be better to dereference symlinks automatically, or >> to have a parameter to do this? > > > Yes, that should happen automatically. find or tar probably have > options for this. Are you using GNU tar? This is on Ranger, and yes, it seems to be GNU tar. I couldn't find where in the Formaline thorn the tarballs are created. I assumed it was some script which is invoked during the build process, but I couldn't find anything. Is it in Formaline or in Cactus?
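To illustrate the difference between storing a symlink and storing its target, here is a minimal sketch using Python's standard tarfile module. It is not how Formaline builds its tarballs; it only shows the same idea, with the `dereference` argument playing the role of GNU tar's -h (--dereference) option. The function name and its arguments are hypothetical.

```python
import os
import tarfile


def archive_tree(src_dir, tar_path, follow_symlinks=True):
    """Write src_dir into a gzip-compressed tarball.

    With follow_symlinks=True, symlinked files and directories are
    stored as their targets' contents; with False, only the links
    themselves are stored.
    """
    top = os.path.basename(os.path.normpath(src_dir))
    # 'dereference' is passed through to the TarFile constructor.
    with tarfile.open(tar_path, "w:gz", dereference=follow_symlinks) as tar:
        tar.add(src_dir, arcname=top)
```

Listing the resulting archive with and without `follow_symlinks` shows whether a thorn linked from outside the tree ends up in the tarball as real files or only as a link entry.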
-- Ian Hinder hinder at gravity.psu.edu http://www.gravity.psu.edu/~hinder From schnetter at cct.lsu.edu Thu Apr 17 14:26:26 2008 From: schnetter at cct.lsu.edu (Erik Schnetter) Date: Thu, 17 Apr 2008 14:26:26 -0500 Subject: [Users] Formaline In-Reply-To: <4807A1D2.7090600@gravity.psu.edu> References: <48078DA2.8050404@gravity.psu.edu> <67F68CB0-BC68-4429-9F38-0C76C149DC24@cct.lsu.edu> <4807A1D2.7090600@gravity.psu.edu> Message-ID: On Apr 17, 2008, at 14:15:30, Ian Hinder wrote: > Erik Schnetter wrote: >> On Apr 17, 2008, at 12:49:22, Ian Hinder wrote: >> >>> Hi, >>> >>> I am using the Formaline thorn to store a copy of my Cactus source >>> tree >>> in my output directories. >>> >>> In my source tree, I have some thorns which are actually symlinked >>> from >>> outside the tree. >>> >>> Formaline faithfully stores the symlinks, but not the targets of the >>> symlink. Would it be better to dereference symlinks >>> automatically, or >>> to have a parameter to do this? >> >> >> Yes, that should happen automatically. find or tar probably have >> options for this. Are you using GNU tar? > > This is on Ranger, and yes, it seem to be GNU tar. I couldn't find > where in the Formaline thorn the tarballs are created - I assumed it > was > some script which is invoked during the build process, but I couldn't > find anything. Is it in Formaline or in Cactus? It is in Formaline, in the file src/make.configuration.deps. It could be the option -h. Do you know whether -h is standard for all tar implementations? -erik -- Erik Schnetter http://www.cct.lsu.edu/~eschnett/ My email is as private as my paper mail. I therefore support encrypting and signing email messages. Get my PGP key from www.keyserver.net. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: PGP.sig Type: application/pgp-signature Size: 194 bytes Desc: This is a digitally signed message part Url : http://www.cactuscode.org/pipermail/users/attachments/20080417/ede36e0a/attachment.bin From schnetter at cct.lsu.edu Thu Apr 17 18:28:35 2008 From: schnetter at cct.lsu.edu (Erik Schnetter) Date: Thu, 17 Apr 2008 18:28:35 -0500 Subject: [Users] Formaline In-Reply-To: <48078DA2.8050404@gravity.psu.edu> References: <48078DA2.8050404@gravity.psu.edu> Message-ID: On Apr 17, 2008, at 12:49:22, Ian Hinder wrote: > Hi, > > I am using the Formaline thorn to store a copy of my Cactus source > tree > in my output directories. > > In my source tree, I have some thorns which are actually symlinked > from > outside the tree. > > Formaline faithfully stores the symlinks, but not the targets of the > symlink. Would it be better to dereference symlinks automatically, or > to have a parameter to do this? I have arrangements which are symbolic links from outside the source tree, and these arrangements are placed into the tarballs just fine. That is, I didn't encounter your problem. It could also be the find command that needs to be updated, not the tar command. Can you try replacing the line find arrangements/$(filter %/$*,$(THORNS)) \ by find arrangements/$(filter %/$*,$(THORNS))/. \ in the file Formaline/src/make.configuration.deps? That is, you would append "/." (slash-dot) to the directory name. Does this solve your problem? -erik -- Erik Schnetter http://www.cct.lsu.edu/~eschnett/ My email is as private as my paper mail. I therefore support encrypting and signing email messages. Get my PGP key from www.keyserver.net. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: PGP.sig Type: application/pgp-signature Size: 194 bytes Desc: This is a digitally signed message part Url : http://www.cactuscode.org/pipermail/users/attachments/20080417/2abd835e/attachment.bin From avi at sicortex.com Fri Apr 18 08:12:16 2008 From: avi at sicortex.com (Avi Purkayastha) Date: Fri, 18 Apr 2008 08:12:16 -0500 Subject: [Users] BenchIO_HDF5 questions Message-ID: Hello, I have been testing the BenchIO_HDF5 benchmark and a number of issues came up: 1) I was testing on a 10 GB filesystem and seem to run out of space on runs with varying proc counts. The benchmark page lists that each proc contributes 352 MB to the overall data checkpoint, so does that mean that the total required disk space is N*352 MB, where N is the number of procs? 2) Is there an order of preference based on importance, for these three tests -- onefile, eachproc, and 8proc from the Cactus user community, or all of equal importance? 3) Is there any data from these I/O tests listed anywhere from different architecture systems with fast disks and/or filesystems, so one can get a sense of what is considered good, bad or ugly IO rates? Thanks -- Avi -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.cactuscode.org/pipermail/users/attachments/20080418/23b23190/attachment.html From tradke at aei.mpg.de Fri Apr 18 12:17:40 2008 From: tradke at aei.mpg.de (Thomas Radke) Date: Fri, 18 Apr 2008 19:17:40 +0200 Subject: [Users] Checkpointing issue. In-Reply-To: References: Message-ID: <4808D7B4.1020605@aei.mpg.de> Peter Diener wrote: > Hi, > > Sorry about the long e-mail. > > I just saw the following weird behaviour when restarting from a set of > Carpet mesh refinementcheckpoint files. I was running with 9 refinementl > levels on 128 MPI-processes and was restarting on the same number of > processes when it seemed like the restart stalled when reading in > refinement level 5. 
> > The whole process took almost 7 hours with almost all of the time spent on > reflevel 5 opening 98 files before finding the right data. > > So it seems that for some reason all refinement levels, except for level > 5, where distributed in the same way in the original run and the restart > run. What used to be on MPI process 0 on level 5 was suddenly on MPI > process 98. I don't have output for the other MPI processes, so I don't > know how much was moved around... > > Has anybody seen something like this before? > > Any suggestions as to what to do about it? Hi Peter, I don't know why the grid structure for refinement level 5 would be different in the recovery run but not for other levels. I have seen this behaviour before though but couldn't find out why Carpet was doing that. -- Cheers, Thomas. From tradke at aei.mpg.de Fri Apr 18 12:23:10 2008 From: tradke at aei.mpg.de (Thomas Radke) Date: Fri, 18 Apr 2008 19:23:10 +0200 Subject: [Users] BenchIO_HDF5 questions In-Reply-To: References: Message-ID: <4808D8FE.4090601@aei.mpg.de> Avi Purkayastha wrote: > Hello, > I have been testing the BenchIO_HDF5 benchmark and a number of issues > came up: > > 1) I was testing on a 10 GB filesystem and seem to run out of space on > runs with varying proc counts. The benchmark page lists that each proc > contributes 352 MB to the overall data checkpoint, so does that mean > that the total required disk space is N*352 MB, where N is the number > of procs? Yes. > 2) Is there an order of preference based on importance, for these three > tests -- onefile, eachproc, and 8proc from the Cactus user community, > or all of equal importance? Nowadays we almost always output in parallel on each processor because it gives the best performance. > 3) Is there any data from these I/O tests listed anywhere from > different architecture systems with fast disks and/or filesystems, so > one can get a sense of what is considered good, bad or ugly IO rates? 
I don't have any results for these benchmarks. You could take a look at the disk I/O scaling tests we have performed on our AEI clusters when they were delivered. They don't use Cactus but mimic Cactus behaviour during a parallel checkpoint. -- Cheers, Thomas. From avi at sicortex.com Fri Apr 18 13:54:30 2008 From: avi at sicortex.com (Avi Purkayastha) Date: Fri, 18 Apr 2008 13:54:30 -0500 Subject: [Users] BenchIO_HDF5 questions In-Reply-To: <4808D8FE.4090601@aei.mpg.de> References: <4808D8FE.4090601@aei.mpg.de> Message-ID: <487B02E0-C9FF-4C40-ACA9-C5379449ADC0@sicortex.com> Thomas, >> 2) Is there an order of preference based on importance, for these >> three >> tests -- onefile, eachproc, and 8proc from the Cactus user >> community, >> or all of equal importance? > > Nowadays we almost always output in parallel on each processor because > it gives the best performance. I understand from your message that your and the community's choice of 'eachproc' is due to a lack of fast disks, combined with parallel filesystems not delivering the performance that you get from 'eachproc'. But in the future, if the other choices delivered comparable throughput, would they be preferable from the algorithmic and implementation standpoint? > > I don't have any results for these benchmarks. You could take a > look at > the disk I/O scaling tests we have performed on our AEI clusters when > they were delivered. They don't use Cactus but mimic Cactus behaviour > during a parallel checkpoint. I looked at www.cactuscode.org but did not find anything there. Maybe you can point me to where that data might be?
Thanks -- Avi From schnetter at cct.lsu.edu Fri Apr 18 15:06:56 2008 From: schnetter at cct.lsu.edu (Erik Schnetter) Date: Fri, 18 Apr 2008 15:06:56 -0500 Subject: [Users] BenchIO_HDF5 questions In-Reply-To: References: Message-ID: <72F1A50F-51DC-4A8B-BE71-D595D8670B36@cct.lsu.edu> On Apr 18, 2008, at 08:12:16, Avi Purkayastha wrote: > Hello, > I have been testing the BenchIO_HDF5 benchmark and a number of > issues came up: > > 1) I was testing on a 10 GB filesystem and seem to run out of space > on runs with varying proc counts. The benchmark page lists that each > proc contributes 352 MB to the overall data checkpoint, so does that > mean that the total required disk space is N*352 MB, where N is the > number of procs? Yes, this is correct. This corresponds to a weak scaling test where the problem size grows with the number of available processors. This test emulates writing a checkpoint file where each processor writes out the complete state information to disk. > 2) Is there an order of preference based on importance, for these > three tests -- onefile, eachproc, and 8proc from the Cactus user > community, or all of equal importance? I would order them in order of decreasing importance as 8proc, eachproc, and then with a large distance onefile. We don't use onefile any more for production runs on large numbers of processors. > 3) Is there any data from these I/O tests listed anywhere from > different architecture systems with fast disks and/or filesystems, > so one can get a sense of what is considered good, bad or ugly IO > rates? I'm not aware of any. We have had concentrated benchmarking efforts for CPU power, but not for disk I/O. I could be wrong. -erik -- Erik Schnetter http://www.cct.lsu.edu/~eschnett/ My email is as private as my paper mail. I therefore support encrypting and signing email messages. Get my PGP key from www.keyserver.net. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: PGP.sig Type: application/pgp-signature Size: 194 bytes Desc: This is a digitally signed message part Url : http://www.cactuscode.org/pipermail/users/attachments/20080418/627425b3/attachment.bin From schnetter at cct.lsu.edu Fri Apr 18 15:47:32 2008 From: schnetter at cct.lsu.edu (Erik Schnetter) Date: Fri, 18 Apr 2008 15:47:32 -0500 Subject: [Users] Checkpointing issue. In-Reply-To: <4808D7B4.1020605@aei.mpg.de> References: <4808D7B4.1020605@aei.mpg.de> Message-ID: <7F8AAC52-1466-42D5-BA00-3B7E81395D37@cct.lsu.edu> On Apr 18, 2008, at 12:17:40, Thomas Radke wrote: > Peter Diener wrote: >> Hi, >> >> Sorry about the long e-mail. >> >> I just saw the following weird behaviour when restarting from a set >> of >> Carpet mesh refinementcheckpoint files. I was running with 9 >> refinementl >> levels on 128 MPI-processes and was restarting on the same number of >> processes when it seemed like the restart stalled when reading in >> refinement level 5. >> >> The whole process took almost 7 hours with almost all of the time >> spent on >> reflevel 5 opening 98 files before finding the right data. >> >> So it seems that for some reason all refinement levels, except for >> level >> 5, where distributed in the same way in the original run and the >> restart >> run. What used to be on MPI process 0 on level 5 was suddenly on MPI >> process 98. I don't have output for the other MPI processes, so I >> don't >> know how much was moved around... >> >> Has anybody seen something like this before? >> >> Any suggestions as to what to do about it? > > Hi Peter, > > I don't know why the grid structure for refinement level 5 would be > different in the recovery run but not for other levels. I have seen > this > behaviour before though but couldn't find out why Carpet was doing > that. It could be that Carpet, for some reason, decides to use a different processor decomposition, and therefore has to read in data that were written by other processors. 
(It could also be some other inconsistency in checkpointing). Peter, do you have the old and new processor decompositions? It should suffice to look at the shapes of the components in the old checkpoint file and in a checkpoint file that you write just after restarting. Since reading in so many files is so slow, we may need to change recovery to make every processor open only one file and send the data around via MPI. -erik -- Erik Schnetter http://www.cct.lsu.edu/~eschnett/ My email is as private as my paper mail. I therefore support encrypting and signing email messages. Get my PGP key from www.keyserver.net. -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 194 bytes Desc: This is a digitally signed message part Url : http://www.cactuscode.org/pipermail/users/attachments/20080418/2d8a8f5c/attachment-0001.bin From schnetter at cct.lsu.edu Fri Apr 18 16:01:29 2008 From: schnetter at cct.lsu.edu (Erik Schnetter) Date: Fri, 18 Apr 2008 16:01:29 -0500 Subject: [Users] BenchIO_HDF5 questions In-Reply-To: <487B02E0-C9FF-4C40-ACA9-C5379449ADC0@sicortex.com> References: <4808D8FE.4090601@aei.mpg.de> <487B02E0-C9FF-4C40-ACA9-C5379449ADC0@sicortex.com> Message-ID: <64BA5BC1-1974-433A-BC69-411F4A1C7E80@cct.lsu.edu> On Apr 18, 2008, at 13:54:30, Avi Purkayastha wrote: > Thomas, > >>> 2) Is there an order of preference based on importance, for these >>> three >>> tests -- onefile, eachproc, and 8proc from the Cactus user >>> community, >>> or all of equal importance? >> >> Nowadays we almost always output in parallel on each processor >> because >> it gives the best performance. > > I understand from the message that yours and the communities choice > for 'eachproc' is due to lack of fast disks in combination with > parallel f/s not delivering the performance that you are getting from > 'eachproc'. 
But in the future if the other choices did deliver > comparable performance in terms of throughput would that be > preferable from the algorithmic and implementation standpoint? Our preferred I/O method would currently be: Each node (if we combine OpenMP and MPI) or each processor (if we use only MPI) writes its data into a file. Having a single file would indeed be simpler than having N files and would in principle be preferred. We are using the HDF5 library for I/O, so this also depends on the performance of the parallel HDF5 implementation. We are currently only using the serial version of HDF5. >> I don't have any results for these benchmarks. You could take a >> look at >> the disk I/O scaling tests we have performed on our AEI clusters when >> they were delivered. They don't use Cactus but mimic Cactus behaviour >> during a parallel checkpoint. > > I looked at www.cactuscode.org but did not find anything there. Maybe > you can point me to where that data might be? Thomas is probably referring to . The AEI uses Cactus for much of its computational work; we have a collaboration between LSU and AEI in numerical relativity. -erik -- Erik Schnetter http://www.cct.lsu.edu/~eschnett/ My email is as private as my paper mail. I therefore support encrypting and signing email messages. Get my PGP key from www.keyserver.net. -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 194 bytes Desc: This is a digitally signed message part Url : http://www.cactuscode.org/pipermail/users/attachments/20080418/fde94a93/attachment.bin
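The disk-space rule confirmed earlier in this thread (N processes times 352 MB for the BenchIO_HDF5 checkpoint) is easy to turn into a quick check before a run. The following is a minimal sketch in Python, assuming 1 GB = 1024 MB and ignoring filesystem overhead and any other output; the function names are illustrative only.

```python
# Per-process contribution to the checkpoint, as quoted from the
# benchmark page in this thread.
MB_PER_PROC = 352


def checkpoint_size_mb(nprocs):
    """Total checkpoint size in MB for a weak-scaling run on nprocs."""
    return nprocs * MB_PER_PROC


def max_procs_that_fit(disk_mb):
    """Largest process count whose checkpoint fits into disk_mb."""
    return disk_mb // MB_PER_PROC


if __name__ == "__main__":
    disk_mb = 10 * 1024  # assumption: a 10 GB filesystem, 1 GB = 1024 MB
    print("64 processes need", checkpoint_size_mb(64), "MB")
    print("processes that fit:", max_procs_that_fit(disk_mb))
```

Under these assumptions a 10 GB filesystem holds the checkpoint data of at most 29 processes, while a 64-process run would need about 22 GB.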