From jtao at cct.lsu.edu Sun Dec 2 13:17:56 2007
From: jtao at cct.lsu.edu (Jian Tao)
Date: Sun, 02 Dec 2007 13:17:56 -0600
Subject: [Developers] remove dependency of NaNChecker from MoL
Message-ID: <475304E4.1060001@cct.lsu.edu>

Hi,

Currently, MoL explicitly requires NaNChecker. It is better to
remove the dependency of NaNChecker from MoL by using an alias
function.

----------------------------------------------------
In the interface.ccl of MoL
---------------------------
#USES INCLUDE: NaNChecker.h

#################################
### Functions from NaNChecker ###
#################################

CCTK_INT FUNCTION CheckVarsForNaN \
    (CCTK_POINTER_TO_CONST IN cctkGH, \
     CCTK_INT IN report_max, \
     CCTK_STRING IN vars, \
     CCTK_STRING IN check_for, \
     CCTK_STRING IN action_if_found)
USES FUNCTION CheckVarsForNaN

In the interface.ccl of NaNChecker
----------------------------------
# Provide the function to check NaNs
CCTK_INT FUNCTION CheckVarsForNaN \
    (CCTK_POINTER_TO_CONST IN cctkGH, \
     CCTK_INT IN report_max, \
     CCTK_STRING IN vars, \
     CCTK_STRING IN check_for, \
     CCTK_STRING IN action_if_found)
PROVIDES FUNCTION CheckVarsForNaN WITH NaNChecker_CheckVarsForNaN LANGUAGE C

Regards,
Jian

From tradke at aei.mpg.de Mon Dec 3 09:52:49 2007
From: tradke at aei.mpg.de (Thomas Radke)
Date: Mon, 03 Dec 2007 16:52:49 +0100
Subject: [Developers] remove dependency of NaNChecker from MoL
In-Reply-To: <475304E4.1060001@cct.lsu.edu>
References: <475304E4.1060001@cct.lsu.edu>
Message-ID: <47542651.5040105@aei.mpg.de>

Jian Tao wrote:
> Hi,
>
> Currently, MoL explicitly requires NaNChecker. It is better to
> remove the dependency of NaNChecker from MoL by using an alias
> function.
>
> ----------------------------------------------------
> In the interface.ccl of MoL
> ---------------------------
> USES FUNCTION CheckVarsForNaN

In order to keep runtime backwards compatibility MoL should REQUIRE
FUNCTION CheckVarsForNaN.
There is no convention about names for aliased functions but I suggest
prefixing them with the name of the providing thorn/implementation.

> In the interface.ccl of NaNChecker
> ----------------------------------
> PROVIDES FUNCTION CheckVarsForNaN WITH NaNChecker_CheckVarsForNaN
> LANGUAGE C

For completeness, NaNChecker should provide its full API via aliased
functions.

--
Cheers, Thomas.

From szilagyi at aei.mpg.de Mon Dec 3 09:57:39 2007
From: szilagyi at aei.mpg.de (Bela Szilagyi)
Date: Mon, 3 Dec 2007 16:57:39 +0100
Subject: [Developers] remove dependency of NaNChecker from MoL
In-Reply-To: <47542651.5040105@aei.mpg.de>
References: <475304E4.1060001@cct.lsu.edu> <47542651.5040105@aei.mpg.de>
Message-ID: <200712031657.39178.szilagyi@aei.mpg.de>

> For completeness, NaNChecker should provide its full API via aliased
> functions.

One feature I always wished for in NaNChecker is to be able to call it at
any time for any set of variables. Which, I guess, could be well taken care
of by adding the aliased function interface.

Another feature I find equally important (with mesh refinement) is to check
past time-levels as well, if available.

Personally I ended up writing my own nanchecking routine, looking for all the
variables I use and on all their timelevels. Perhaps if we give easy access
to the functionality of NaNChecker (and extend it a bit), it could become a
useful debugging tool (rather than being used mainly as an emergency brake,
as things stand at the moment).
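
The kind of routine described in the message above, one that looks at a
chosen set of variables on all of their timelevels, can be sketched in plain
C. This is only an illustration under stated assumptions: the grid_var
structure and count_nonfinite() are hypothetical names made up for the
example, and it is neither NaNChecker's implementation nor its API.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical description of one grid variable: data[tl] points to the
 * npoints values stored for timelevel tl. Not a Cactus data structure. */
typedef struct {
    const char *name;
    int num_timelevels;
    size_t npoints;
    const double *const *data;
} grid_var;

/* Count NaN and infinite values on every timelevel of every variable.
 * At most report_max offending points are printed to stderr; a negative
 * report_max prints all of them. Returns the total number found. */
static size_t count_nonfinite(const grid_var *vars, size_t nvars, int report_max)
{
    size_t found = 0;

    assert(vars != NULL || nvars == 0);
    for (size_t v = 0; v < nvars; ++v) {
        for (int tl = 0; tl < vars[v].num_timelevels; ++tl) {
            for (size_t i = 0; i < vars[v].npoints; ++i) {
                if (isfinite(vars[v].data[tl][i]))
                    continue;
                if (report_max < 0 || found < (size_t)report_max)
                    fprintf(stderr,
                            "non-finite value in %s, timelevel %d, point %zu\n",
                            vars[v].name, tl, i);
                ++found;
            }
        }
    }
    return found;
}
```

Because the caller passes the list of variables explicitly, such a routine can
be called at any point of a run, which is the property asked for above.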
From tradke at aei.mpg.de Mon Dec 3 10:28:33 2007
From: tradke at aei.mpg.de (Thomas Radke)
Date: Mon, 03 Dec 2007 17:28:33 +0100
Subject: [Developers] remove dependency of NaNChecker from MoL
In-Reply-To: <200712031657.39178.szilagyi@aei.mpg.de>
References: <475304E4.1060001@cct.lsu.edu> <47542651.5040105@aei.mpg.de> <200712031657.39178.szilagyi@aei.mpg.de>
Message-ID: <47542EB1.8080300@aei.mpg.de>

Bela Szilagyi wrote:
> One feature I always wished for in NaNChecker is to be able to call it at
> any time for any set of variables. Which, I guess, could be well taken care
> of by adding the aliased function interface.
>
> Another feature I find equally important (with mesh refinement) is to check
> past time-levels as well, if available.
>
> Personally I ended up writing my own nanchecking routine, looking for all the
> variables I use and on all their timelevels. Perhaps if we give easy access
> to the functionality of NaNChecker (and extend it a bit), it could become a
> useful debugging tool (rather than being used mainly as an emergency brake,
> as things stand at the moment).

Hi Bela,

both features are there already, and they are even documented :-)
For checking individual timelevels please refer to the end of section 2
of the NaNChecker thorn documentation, and for a description of the API
to section 4.

--
Cheers, Thomas.

From schnetter at cct.lsu.edu Mon Dec 3 11:11:22 2007
From: schnetter at cct.lsu.edu (Erik Schnetter)
Date: Mon, 3 Dec 2007 11:11:22 -0600
Subject: [Developers] remove dependency of NaNChecker from MoL
In-Reply-To: <47542651.5040105@aei.mpg.de>
References: <475304E4.1060001@cct.lsu.edu> <47542651.5040105@aei.mpg.de>
Message-ID:

On Dec 3, 2007, at 09:52:49, Thomas Radke wrote:

> Jian Tao wrote:
>> Hi,
>>
>> Currently, MoL explicitly requires NaNChecker. It is better to
>> remove the dependency of NaNChecker from MoL by using an alias
>> function.
>>
>> ----------------------------------------------------
>> In the interface.ccl of MoL
>> ---------------------------
>> USES FUNCTION CheckVarsForNaN
>
> In order to keep runtime backwards compatibility MoL should REQUIRE
> FUNCTION CheckVarsForNaN.

Many people use MoL without the NaNChecker; if MoL requires the
function, NaNChecker always needs to be present, even if it is not
used. That is inconvenient. I would keep it as "uses" and add a
run-time check depending on MoL's parameter settings.

> There is no convention about names for aliased functions but I suggest
> prefixing them with the name of the providing thorn/implementation.

That would require renaming the existing function, which we shouldn't
do. Do you have a suggestion for a good name?

>> In the interface.ccl of NaNChecker
>> ----------------------------------
>> PROVIDES FUNCTION CheckVarsForNaN WITH NaNChecker_CheckVarsForNaN
>> LANGUAGE C
>
> For completeness, NaNChecker should provide its full API via aliased
> functions.

I think there is only one other function, SetVarsToNaN.

-erik

--
Erik Schnetter

My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from www.keyserver.net.

From tradke at aei.mpg.de Mon Dec 3 11:38:30 2007
From: tradke at aei.mpg.de (Thomas Radke)
Date: Mon, 03 Dec 2007 18:38:30 +0100
Subject: [Developers] remove dependency of NaNChecker from MoL
In-Reply-To:
References: <475304E4.1060001@cct.lsu.edu> <47542651.5040105@aei.mpg.de>
Message-ID: <47543F16.8020404@aei.mpg.de>

Erik Schnetter wrote:
> Many people use MoL without the NaNChecker; if MoL requires the
> function, NaNChecker always needs to be present, even if it is not
> used.
> That is inconvenient. I would keep it as "uses" and add a
> run-time check depending on MoL's parameter settings.

If I remember right a thorn needs to be activated in order to provide
aliased functions. Yes, then this would be inconvenient.

>> There is no convention about names for aliased functions but I suggest
>> prefixing them with the name of the providing thorn/implementation.
>
> That would require renaming the existing function, which we shouldn't
> do. Do you have a suggestion for a good name?

Not really, and I don't know how many thorns would need to be changed to
use renamed functions. So maybe we should just stick with
CheckVarsForNaN() and SetVarsToNaN().

--
Cheers, Thomas.

From oweidner at cct.lsu.edu Mon Dec 3 18:29:05 2007
From: oweidner at cct.lsu.edu (Ole Weidner)
Date: Mon, 3 Dec 2007 18:29:05 -0600
Subject: [Developers] CactusIO::IOJpeg standard vs. remove mode / output format PATCH
Message-ID:

Aloha,

following up the thread "CactusIO::IOJpeg standard vs. remove mode" on
the users mailing list
(http://www.cactuscode.org/old/pipermail/users/2007-December.txt),
here is the corresponding patch.

Cheers,
Ole

--- StripMime Report -- processed MIME parts ---
multipart/mixed
  text/plain (text body -- kept)
  application/octet-stream
  text/plain (text body -- kept)
---

From tradke at aei.mpg.de Tue Dec 4 04:56:08 2007
From: tradke at aei.mpg.de (Thomas Radke)
Date: Tue, 04 Dec 2007 11:56:08 +0100
Subject: [Developers] cannot generate libthorn_CactusBindings.a on IBM SP5 because argument list is too long
Message-ID: <47553248.1010309@aei.mpg.de>

Hi,

the Cactus integration tests (for which the results can be found in the
portal https://portal.cactuscode.org) are also run every night on an IBM
SP5 at LSU. There I have the problem that, for a configuration with very
many thorns, the Cactus bindings library cannot be generated:

> gmake[2]: execvp: ar: The parameter or environment lists are too long.
> gmake[2]: *** [/scratch/tradke/Cactus/configs/Einstein/lib/libthorn_CactusBindings.a] Error 127

Didn't we once have a similar problem with command lines being too long?
Was there a solution to that problem?

--
Cheers, Thomas.

From tradke at aei.mpg.de Tue Dec 4 05:25:22 2007
From: tradke at aei.mpg.de (Thomas Radke)
Date: Tue, 04 Dec 2007 12:25:22 +0100
Subject: [Developers] synchronise all processors before aborting on parameter errors
Message-ID: <47553922.9090202@aei.mpg.de>

Hi,

there exists a PARAM_CHECK bin in which thorns can schedule routines to
check the consistency of parameters and have the run stopped (using
CCTK_Abort) if there are errors.
Now Bela reported the problem that, for multi-processor simulations
using certain Infiniband MPI implementations, the run would die
prematurely because some processors call CCTK_Abort() earlier than
others, and in the logfile one cannot easily find the real reason for
the abort anymore.

Putting output buffer caching issues aside, the problem could be fixed
by inserting a CCTK_Barrier() call in the flesh function
CCTKi_FinaliseParamWarn(), just before it would check whether there were
any (local) parameter errors and then call CCTK_Abort().

I guess this small performance penalty would be acceptable? Or does
someone have a better solution?

--
Cheers, Thomas.

From goodale at cct.lsu.edu Tue Dec 4 05:32:51 2007
From: goodale at cct.lsu.edu (Tom Goodale)
Date: Tue, 4 Dec 2007 11:32:51 +0000 (GMT)
Subject: [Developers] cannot generate libthorn_CactusBindings.a on IBM SP5 because argument list is too long
In-Reply-To: <47553248.1010309@aei.mpg.de>
References: <47553248.1010309@aei.mpg.de>
Message-ID:

On Tue, 4 Dec 2007, Thomas Radke wrote:

> Hi,
>
> the Cactus integration tests (for which the results can be found in the
> portal https://portal.cactuscode.org) are also run every night on an IBM
> SP5 at LSU.
> There I have the problem that, for a configuration with very
> many thorns, the Cactus bindings library cannot be generated:
>
>> gmake[2]: execvp: ar: The parameter or environment lists are too long.
>> gmake[2]: *** [/scratch/tradke/Cactus/configs/Einstein/lib/libthorn_CactusBindings.a] Error 127

Ouch.

> Didn't we once have a similar problem with command lines being too long?
> Was there a solution to that problem?

There used to be two cases: in older versions of Cactus all files were
put as .o files in one directory and linked together - we got around that
by making a library for each thorn (which makes more sense anyway); I
think there was also a problem with the shell on older IBMs but that
should not be a problem with bash.

One way it could be fixed here would be to incrementally build the ar -
i.e. make an archive for the files in one directory, and then append files
from other directories. The alternative is incremental linking and making
a .o file for each thorn, which would have the advantage of catching
duplicate symbols at link time, but a disadvantage in being very linker
dependent on how it's done so needing more effort to port to different
architectures.

Cheers,

Tom

From goodale at cct.lsu.edu Tue Dec 4 05:40:02 2007
From: goodale at cct.lsu.edu (Tom Goodale)
Date: Tue, 4 Dec 2007 11:40:02 +0000 (GMT)
Subject: [Developers] synchronise all processors before aborting on parameter errors
In-Reply-To: <47553922.9090202@aei.mpg.de>
References: <47553922.9090202@aei.mpg.de>
Message-ID:

On Tue, 4 Dec 2007, Thomas Radke wrote:

> Hi,
>
> there exists a PARAM_CHECK bin in which thorns can schedule routines to
> check the consistency of parameters and have the run stopped (using
> CCTK_Abort) if there are errors.
> Now Bela reported the problem that, for multi-processor simulations
> using certain Infiniband MPI implementations, the run would die
> prematurely because some processors call CCTK_Abort() earlier than
> others, and in the logfile one cannot easily find the real reason for
> the abort anymore.
>
> Putting output buffer caching issues aside, the problem could be fixed
> by inserting a CCTK_Barrier() call in the flesh function
> CCTKi_FinaliseParamWarn(), just before it would check whether there were
> any (local) parameter errors and then call CCTK_Abort().
>
> I guess this small performance penalty would be acceptable? Or does
> someone have a better solution?

It sounds good to me.

Tom

From schnetter at cct.lsu.edu Tue Dec 4 10:52:37 2007
From: schnetter at cct.lsu.edu (Erik Schnetter)
Date: Tue, 4 Dec 2007 10:52:37 -0600
Subject: [Developers] cannot generate libthorn_CactusBindings.a on IBM SP5 because argument list is too long
In-Reply-To: <47553248.1010309@aei.mpg.de>
References: <47553248.1010309@aei.mpg.de>
Message-ID: <3CD201A6-F26B-4561-8DA2-14FF8A922C01@cct.lsu.edu>

On Dec 4, 2007, at 04:56:08, Thomas Radke wrote:

> Hi,
>
> the Cactus integration tests (for which the results can be found in the
> portal https://portal.cactuscode.org) are also run every night on an IBM
> SP5 at LSU. There I have the problem that, for a configuration with very
> many thorns, the Cactus bindings library cannot be generated:
>
>> gmake[2]: execvp: ar: The parameter or environment lists are too long.
>> gmake[2]: *** [/scratch/tradke/Cactus/configs/Einstein/lib/libthorn_CactusBindings.a] Error 127
>
> Didn't we once have a similar problem with command lines being too long?
> Was there a solution to that problem?

I have a work-around using xargs and some additional make goals. The
additional goals build a text file containing all object files, and
then ar is called via xargs to create the thorn library piecewise.
See the goal $(NAME), and the additional goals near the end of this file.

-erik

-------------- next part --------------
A non-text attachment was scrubbed...
Name: make.thornlib
Type: application/octet-stream
Size: 5555 bytes
Desc: not available
Url : http://www.cactuscode.org/pipermail/developers/attachments/20071204/23684737/attachment.obj

From schnetter at cct.lsu.edu Tue Dec 4 10:55:17 2007
From: schnetter at cct.lsu.edu (Erik Schnetter)
Date: Tue, 4 Dec 2007 10:55:17 -0600
Subject: [Developers] cannot generate libthorn_CactusBindings.a on IBM SP5 because argument list is too long
In-Reply-To:
References: <47553248.1010309@aei.mpg.de>
Message-ID:

On Dec 4, 2007, at 05:32:51, Tom Goodale wrote:

> On Tue, 4 Dec 2007, Thomas Radke wrote:
>
>> Hi,
>>
>> the Cactus integration tests (for which the results can be found in the
>> portal https://portal.cactuscode.org) are also run every night on an IBM
>> SP5 at LSU. There I have the problem that, for a configuration with very
>> many thorns, the Cactus bindings library cannot be generated:
>>
>>> gmake[2]: execvp: ar: The parameter or environment lists are too long.
>>> gmake[2]: *** [/scratch/tradke/Cactus/configs/Einstein/lib/libthorn_CactusBindings.a] Error 127
>
> Ouch.
>
>> Didn't we once have a similar problem with command lines being too long?
>> Was there a solution to that problem?
>
> There used to be two cases: in older versions of Cactus all files were
> put as .o files in one directory and linked together - we got around that
> by making a library for each thorn (which makes more sense anyway); I
> think there was also a problem with the shell on older IBMs but that
> should not be a problem with bash.
>
> One way it could be fixed here would be to incrementally build the ar -
> i.e. make an archive for the files in one directory, and then append files
> from other directories. The alternative is incremental linking and making
> a .o file for each thorn, which would have the advantage of catching
> duplicate symbols at link time,

This by itself would be very useful, because it catches a common
programming error that is otherwise very difficult to detect. The trap
is: one copies a thorn, gives it a new name, and makes modifications to
it. Unless all function names are changed, there is a big chance of a
segmentation fault at run time, since the linker will "randomly" choose
which of two functions with identical names to use. The other is
discarded without warning.

This is called a feature of Unix linkers, because this makes it possible
to override functions at link time.

> but a disadvantage in being very linker
> dependent on how it's done so needing more effort to port to different
> architectures.

It would be sufficient to have this on the main architectures, where it
would catch the errors. Other architectures can still use libraries.

-erik

From schnetter at cct.lsu.edu Tue Dec 4 10:58:00 2007
From: schnetter at cct.lsu.edu (Erik Schnetter)
Date: Tue, 4 Dec 2007 10:58:00 -0600
Subject: [Developers] synchronise all processors before aborting on parameter errors
In-Reply-To: <47553922.9090202@aei.mpg.de>
References: <47553922.9090202@aei.mpg.de>
Message-ID: <0ED902E5-EE2C-4AF6-9F06-66B8635F97AC@cct.lsu.edu>

On Dec 4, 2007, at 05:25:22, Thomas Radke wrote:

> Hi,
>
> there exists a PARAM_CHECK bin in which thorns can schedule routines to
> check the consistency of parameters and have the run stopped (using
> CCTK_Abort) if there are errors.
> Now Bela reported the problem that, for multi-processor simulations
> using certain Infiniband MPI implementations, the run would die
> prematurely because some processors call CCTK_Abort() earlier than
> others, and in the logfile one cannot easily find the real reason for
> the abort anymore.
>
> Putting output buffer caching issues aside, the problem could be fixed
> by inserting a CCTK_Barrier() call in the flesh function
> CCTKi_FinaliseParamWarn(), just before it would check whether there were
> any (local) parameter errors and then call CCTK_Abort().
>
> I guess this small performance penalty would be acceptable? Or does
> someone have a better solution?

In addition to this good idea, we could insert a sleep(10) in
CCTK_Abort, so that other processors have a bit of time to catch up
before aborting. This should often be enough for them to produce some
additional debug output. (I'm thinking of a new parameter INT
sleep_time_before_abort, with a default value of 10 or so.)

-erik
From goodale at cct.lsu.edu Tue Dec 4 11:46:41 2007
From: goodale at cct.lsu.edu (Tom Goodale)
Date: Tue, 4 Dec 2007 17:46:41 +0000 (GMT)
Subject: [Developers] synchronise all processors before aborting on parameter errors
In-Reply-To: <0ED902E5-EE2C-4AF6-9F06-66B8635F97AC@cct.lsu.edu>
References: <47553922.9090202@aei.mpg.de> <0ED902E5-EE2C-4AF6-9F06-66B8635F97AC@cct.lsu.edu>
Message-ID:

On Tue, 4 Dec 2007, Erik Schnetter wrote:

> On Dec 4, 2007, at 05:25:22, Thomas Radke wrote:
>
>> Hi,
>>
>> there exists a PARAM_CHECK bin in which thorns can schedule routines to
>> check the consistency of parameters and have the run stopped (using
>> CCTK_Abort) if there are errors.
>> Now Bela reported the problem that, for multi-processor simulations
>> using certain Infiniband MPI implementations, the run would die
>> prematurely because some processors call CCTK_Abort() earlier than
>> others, and in the logfile one cannot easily find the real reason for
>> the abort anymore.
>>
>> Putting output buffer caching issues aside, the problem could be fixed
>> by inserting a CCTK_Barrier() call in the flesh function
>> CCTKi_FinaliseParamWarn(), just before it would check whether there were
>> any (local) parameter errors and then call CCTK_Abort().
>>
>> I guess this small performance penalty would be acceptable? Or does
>> someone have a better solution?
>
>
> In addition to this good idea, we could insert a sleep(10) in CCTK_Abort, so
> that other processors have a bit of time to catch up before aborting. This
> should often be enough for them to produce some additional debug output.
> (I'm thinking of a new parameter INT sleep_time_before_abort, with a default
> value of 10 or so.)
Makes sense (although needs to be a separate commit).

Cheers,

Tom

From baiotti at ea.c.u-tokyo.ac.jp Wed Dec 5 00:06:11 2007
From: baiotti at ea.c.u-tokyo.ac.jp (Luca Baiotti)
Date: Wed, 5 Dec 2007 15:06:11 +0900 (JST)
Subject: [Developers] MPI_Finalize
Message-ID: <20071205150611.2E667304@ea.c.u-tokyo.ac.jp>

Dear developers,

has there been any follow-up to this proposal?

Ciao
Luca

>Erik Schnetter wrote:
>> Currently, the flesh calls MPI_Init, but the driver is supposed to call
>> MPI_Finalize. This happens presumably in the terminate bin as
>> Driver_Terminate. According to the MPI standard, no externally visible
>> operations may be performed after that, so that the driver may as well
>> call exit at the same time. If not, strange errors can happen, as
>> reported today elsewhere by Luca Baiotti.
>>
>> I propose to have the flesh call MPI_Finalize, to make things symmetric
>> to MPI_Init. It would do so very late in CCTKi_ShutdownCactus after
>> printing "Done.". This means that thorns could continue to schedule
>> actions in the terminate bin after Driver_Terminate, and in the shutdown
>> bin.
>>
>> I attach a diff. Of course, PUGH and Carpet would also need to be
>> changed, so that they don't call MPI_Finalize any more.
>>
>> -erik

-------------- next part --------------
An embedded message was scrubbed...
From: "Luca Baiotti"
Subject: Re: MPI_Finalize
Date: Wed, 5 Dec 2007 13:37:50 +0900 (JST)
Size: 1434
Url: http://www.cactuscode.org/pipermail/developers/attachments/20071205/4d423f8a/attachment.mht

From tradke at aei.mpg.de Wed Dec 5 10:58:01 2007
From: tradke at aei.mpg.de (Thomas Radke)
Date: Wed, 05 Dec 2007 17:58:01 +0100
Subject: [Developers] MPI_Finalize
In-Reply-To: <20071205150611.2E667304@ea.c.u-tokyo.ac.jp>
References: <20071205150611.2E667304@ea.c.u-tokyo.ac.jp>
Message-ID: <4756D899.8040207@aei.mpg.de>

Luca Baiotti wrote:
> Dear developers,
>
> has there been any follow-up to this proposal?
I like Erik's suggestion to have the flesh invoke both MPI_Init() and
MPI_Finalize(), rather than leaving the latter up to the driver.
Apart from the potentially unnecessary Cactus::process_termination
parameter logic, the diff looks good to me.

http://www.cactuscode.org/old/pipermail/developers/2007-October/005446.html

--
Cheers, Thomas.

From tradke at aei.mpg.de Wed Dec 5 11:13:11 2007
From: tradke at aei.mpg.de (Thomas Radke)
Date: Wed, 05 Dec 2007 18:13:11 +0100
Subject: [Developers] synchronise all processors before aborting on parameter errors
In-Reply-To:
References: <47553922.9090202@aei.mpg.de> <0ED902E5-EE2C-4AF6-9F06-66B8635F97AC@cct.lsu.edu>
Message-ID: <4756DC27.3020409@aei.mpg.de>

Tom Goodale wrote:
> On Tue, 4 Dec 2007, Erik Schnetter wrote:
>
>
>>On Dec 4, 2007, at 05:25:22, Thomas Radke wrote:
>>
>>
>>>Hi,
>>>
>>>there exists a PARAM_CHECK bin in which thorns can schedule routines to
>>>check the consistency of parameters and have the run stopped (using
>>>CCTK_Abort) if there are errors.
>>>Now Bela reported the problem that, for multi-processor simulations
>>>using certain Infiniband MPI implementations, the run would die
>>>prematurely because some processors call CCTK_Abort() earlier than
>>>others, and in the logfile one cannot easily find the real reason for
>>>the abort anymore.
>>>
>>>Putting output buffer caching issues aside, the problem could be fixed
>>>by inserting a CCTK_Barrier() call in the flesh function
>>>CCTKi_FinaliseParamWarn(), just before it would check whether there were
>>>any (local) parameter errors and then call CCTK_Abort().
>>>
>>>I guess this small performance penalty would be acceptable? Or does
>>>someone have a better solution?
>>
>>
>>In addition to this good idea, we could insert a sleep(10) in CCTK_Abort, so
>>that other processors have a bit of time to catch up before aborting. This
>>should often be enough for them to produce some additional debug output.
>>(I'm thinking of a new parameter INT sleep_time_before_abort, with a default
>>value of 10 or so.)

It does sleep already before calling abort(). I committed that patch to
src/comm/CactusDefaultComm.c back in 2003 already, although it has the
delay hard-coded to 5 seconds. Is that not enough?

> Makes sense (although needs to be a separate commit).

Okay, I will then just commit the CCTK_Barrier() in
CCTKi_FinaliseParamWarn().

--
Cheers, Thomas.

From szilagyi at aei.mpg.de Wed Dec 5 11:18:40 2007
From: szilagyi at aei.mpg.de (Bela Szilagyi)
Date: Wed, 5 Dec 2007 18:18:40 +0100
Subject: [Developers] synchronise all processors before aborting on parameter errors
In-Reply-To: <4756DC27.3020409@aei.mpg.de>
References: <47553922.9090202@aei.mpg.de> <4756DC27.3020409@aei.mpg.de>
Message-ID: <200712051818.41069.szilagyi@aei.mpg.de>

I definitely don't seem to have to wait 5 seconds before the job aborts in
case of incorrect parameters.

> It does sleep already before calling abort(). I committed that patch to
> src/comm/CactusDefaultComm.c back in 2003 already, although it has the
> delay hard-coded to 5 seconds. Is that not enough?
>
> > Makes sense (although needs to be a separate commit).
>
> Okay, I will then just commit the CCTK_Barrier() in
> CCTKi_FinaliseParamWarn().

From tradke at aei.mpg.de Wed Dec 5 11:29:35 2007
From: tradke at aei.mpg.de (Thomas Radke)
Date: Wed, 05 Dec 2007 18:29:35 +0100
Subject: [Developers] synchronise all processors before aborting on parameter errors
In-Reply-To: <200712051818.41069.szilagyi@aei.mpg.de>
References: <47553922.9090202@aei.mpg.de> <4756DC27.3020409@aei.mpg.de> <200712051818.41069.szilagyi@aei.mpg.de>
Message-ID: <4756DFFF.1070308@aei.mpg.de>

Bela Szilagyi wrote:
> I definitely don't seem to have to wait 5 seconds before the job aborts in
> case of incorrect parameters.

You are right, CCTK_Abort() is overloaded by the driver, and neither
PUGH nor Carpet has the sleep() in its overloading function.

--
Cheers, Thomas.
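
Why the 5-second delay never shows up can be illustrated with a small
function-pointer sketch in C. It is a deliberately simplified model and not
the flesh's actual overloading machinery: abort_plan, default_abort(),
driver_abort(), overload_abort() and plan_abort() are hypothetical names, and
each implementation only returns what it would do instead of sleeping and
terminating.

```c
#include <assert.h>
#include <stddef.h>

/* What an abort implementation would do: wait delay_seconds, then
 * terminate with exit_code. Returned rather than executed, so the
 * dispatch can be inspected without ending the process. */
typedef struct {
    unsigned delay_seconds;
    int exit_code;
} abort_plan;

typedef abort_plan (*abort_fn)(int exit_code);

/* Stand-in for the default routine, which waits 5 seconds first. */
static abort_plan default_abort(int exit_code)
{
    abort_plan plan = {5u, exit_code};
    return plan;
}

/* Stand-in for a driver's replacement, which terminates immediately. */
static abort_plan driver_abort(int exit_code)
{
    abort_plan plan = {0u, exit_code};
    return plan;
}

/* Currently registered implementation; starts out as the default. */
static abort_fn current_abort = default_abort;

/* Register a replacement; the previous implementation is no longer called. */
static void overload_abort(abort_fn fn)
{
    if (fn != NULL)
        current_abort = fn;
}

/* Dispatch to whichever implementation is currently registered. */
static abort_plan plan_abort(int exit_code)
{
    assert(current_abort != NULL);
    return current_abort(exit_code);
}
```

Once overload_abort(driver_abort) has been called, plan_abort() no longer
reaches default_abort() at all, so any delay implemented there is bypassed
unless the replacement routine implements one itself.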
From tradke at aei.mpg.de Wed Dec 5 11:44:39 2007
From: tradke at aei.mpg.de (Thomas Radke)
Date: Wed, 05 Dec 2007 18:44:39 +0100
Subject: [Developers] cannot generate libthorn_CactusBindings.a on IBM SP5 because argument list is too long
In-Reply-To: <3CD201A6-F26B-4561-8DA2-14FF8A922C01@cct.lsu.edu>
References: <47553248.1010309@aei.mpg.de> <3CD201A6-F26B-4561-8DA2-14FF8A922C01@cct.lsu.edu>
Message-ID: <4756E387.9050702@aei.mpg.de>

Erik Schnetter wrote:
> On Dec 4, 2007, at 04:56:08, Thomas Radke wrote:
>
>> Hi,
>>
>> the Cactus integration tests (for which the results can be found in the
>> portal https://portal.cactuscode.org) are also run every night on an IBM
>> SP5 at LSU. There I have the problem that, for a configuration with very
>> many thorns, the Cactus bindings library cannot be generated:
>>
>>> gmake[2]: execvp: ar: The parameter or environment lists are too long.
>>> gmake[2]: *** [/scratch/tradke/Cactus/configs/Einstein/lib/libthorn_CactusBindings.a] Error 127
>>
>>
>> Didn't we once have a similar problem with command lines being too long?
>> Was there a solution to that problem?
>
>
>
> I have a work-around using xargs and some additional make goals. The
> additional goals build a text file containing all object files, and
> then ar is called via xargs to create the thorn library piecewise. See
> the goal $(NAME), and the additional goals near the end of this file.

Thanks, Erik!

I tried with your modified version of make.thornlib, however it seems
I'm missing some other modification of yours:

> gmake[3]: *** No rule to make target `deps'. Stop.
> gmake[2]: *** [make.checked] Error 2
> gmake[1]: *** [/scratch/tradke/Cactus/configs/Einstein/lib/libthorn_A1JobChaining.a] Error 2

Anyway, to my surprise I got a different error message today:

> ld: 0711-781 ERROR: TOC overflow. TOC size: 81120 Maximum size: 65536
> gmake[1]: *** [/scratch/tradke/Cactus/exe/cactus_Einstein] Error 12

which indicates that I got further than yesterday.
Googling the error message turned up a workaround for the TOC overflow
problem:

  http://gcc.gnu.org/ml/gcc/2000-12/msg00509.html

and adding "-Wl,-bbigtoc" to LDFLAGS really worked!

Let's see whether the Einstein integration tests build successfully
tonight.

--
Cheers, Thomas.

From schnetter at cct.lsu.edu Wed Dec 5 12:17:29 2007
From: schnetter at cct.lsu.edu (Erik Schnetter)
Date: Wed, 5 Dec 2007 12:17:29 -0600
Subject: [Developers] cannot generate libthorn_CactusBindings.a on IBM SP5 because argument list is too long
In-Reply-To: <4756E387.9050702@aei.mpg.de>
References: <47553248.1010309@aei.mpg.de> <3CD201A6-F26B-4561-8DA2-14FF8A922C01@cct.lsu.edu> <4756E387.9050702@aei.mpg.de>
Message-ID: <324C2464-30CE-48AE-BF9E-EFB22DEDC7F4@cct.lsu.edu>

On Dec 5, 2007, at 11:44:39, Thomas Radke wrote:

> Erik Schnetter wrote:
>> On Dec 4, 2007, at 04:56:08, Thomas Radke wrote:
>>
>>> Hi,
>>>
>>> the Cactus integration tests (for which the results can be found in the
>>> portal https://portal.cactuscode.org) are also run every night on an IBM
>>> SP5 at LSU. There I have the problem that, for a configuration with very
>>> many thorns, the Cactus bindings library cannot be generated:
>>>
>>>> gmake[2]: execvp: ar: The parameter or environment lists are too long.
>>>> gmake[2]: *** [/scratch/tradke/Cactus/configs/Einstein/lib/libthorn_CactusBindings.a] Error 127
>>>
>>>
>>> Didn't we once have a similar problem with command lines being too long?
>>> Was there a solution to that problem?
>>
>>
>>
>> I have a work-around using xargs and some additional make goals. The
>> additional goals build a text file containing all object files, and
>> then ar is called via xargs to create the thorn library piecewise. See
>> the goal $(NAME), and the additional goals near the end of this file.
>
> Thanks, Erik!
I tried with your modified version of make.thornlib, > however it seems I'm missing some other modification of yours: > >> gmake[3]: *** No rule to make target `deps'. Stop. >> gmake[2]: *** [make.checked] Error 2 >> gmake[1]: *** [/scratch/tradke/Cactus/configs/Einstein/lib/ >> libthorn_A1JobChaining.a] Error 2 Here is my file make.subdir, which contains the target "deps" -erik -- Erik Schnetter My email is as private as my paper mail. I therefore support encrypting and signing email messages. Get my PGP key from www.keyserver.net. -------------- next part -------------- A non-text attachment was scrubbed... Name: make.subdir Type: application/octet-stream Size: 1285 bytes Desc: not available Url : http://www.cactuscode.org/pipermail/developers/attachments/20071205/5fc6d6fd/attachment.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://www.cactuscode.org/pipermail/developers/attachments/20071205/5fc6d6fd/attachment.bin From tradke at aei.mpg.de Thu Dec 6 04:51:13 2007 From: tradke at aei.mpg.de (Thomas Radke) Date: Thu, 06 Dec 2007 11:51:13 +0100 Subject: [Developers] synchronise all processors before aborting on parameter errors In-Reply-To: <4756DC27.3020409@aei.mpg.de> References: <47553922.9090202@aei.mpg.de> <0ED902E5-EE2C-4AF6-9F06-66B8635F97AC@cct.lsu.edu> <4756DC27.3020409@aei.mpg.de> Message-ID: <4757D421.4030400@aei.mpg.de> Thomas Radke wrote: > Okay, I will then just commit the CCTK_Barrier() in > CCTKi_FinaliseParamWarn(). I just did that. It also needed a fix in PUGH's overloadable routine for CCTK_Barrier() so that it can be called before a grid hierarchy exists - please update this thorn as well. Carpet's overloadable routine was safe already. -- Cheers, Thomas. 
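For readers following the libthorn_CactusBindings.a thread: the xargs workaround described above (write the object file names to a text file, then let xargs call ar on them piecewise) can be tried out in isolation. The following is only a minimal sketch, not the actual make.thornlib rules. The function name show_batches, the library name libthorn_Demo.a, the object names, the batch size of 2 and the "ar rc" flags are all made up for illustration, and "echo" is put in front of "ar" so that the commands are printed rather than run.

```shell
#!/bin/sh
# Sketch only: prints the ar commands that xargs would run, one per batch.
# All names here (show_batches, libthorn_Demo.a, *.o) are hypothetical.

show_batches() {
    max=$1; shift
    list=$(mktemp) || return 1
    # One object file name per line, playing the role of the object list file.
    printf '%s\n' "$@" > "$list"
    # xargs starts a new command for every $max names it reads from the list.
    xargs -n "$max" echo ar rc libthorn_Demo.a < "$list"
    rm -f "$list"
}

show_batches 2 a.o b.o c.o d.o e.o
```

This prints three ar command lines with at most two object files each. The approach relies on ar adding members to an existing archive when called with "r", so that repeated calls build the library piece by piece. That the ar flags used in the Cactus build include "r" is an assumption here.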
From tradke at aei.mpg.de Thu Dec 6 07:03:34 2007 From: tradke at aei.mpg.de (Thomas Radke) Date: Thu, 06 Dec 2007 14:03:34 +0100 Subject: [Developers] cannot generate libthorn_CactusBindings.a on IBM SP5 because argument list is too long In-Reply-To: <324C2464-30CE-48AE-BF9E-EFB22DEDC7F4@cct.lsu.edu> References: <47553248.1010309@aei.mpg.de> <3CD201A6-F26B-4561-8DA2-14FF8A922C01@cct.lsu.edu> <4756E387.9050702@aei.mpg.de> <324C2464-30CE-48AE-BF9E-EFB22DEDC7F4@cct.lsu.edu> Message-ID: <4757F326.5030409@aei.mpg.de> Erik Schnetter wrote: > Here is my file make.subdir, which contains the target "deps" Thanks, Erik ! With this everything builds fine now. I have tested your patch also on three other machines to check whether it would break anything - it didn't. Okay to commit the patch ? -- Cheers, Thomas. -------------- next part -------------- A non-text attachment was scrubbed... Name: too-many-arguments.patch Type: text/x-patch Size: 5021 bytes Desc: not available Url : http://www.cactuscode.org/pipermail/developers/attachments/20071206/27a9082f/attachment.bin From goodale at cct.lsu.edu Thu Dec 6 08:31:10 2007 From: goodale at cct.lsu.edu (Tom Goodale) Date: Thu, 6 Dec 2007 14:31:10 +0000 (GMT) Subject: [Developers] cannot generate libthorn_CactusBindings.a on IBM SP5 because argument list is too long In-Reply-To: <4757F326.5030409@aei.mpg.de> References: <47553248.1010309@aei.mpg.de> <3CD201A6-F26B-4561-8DA2-14FF8A922C01@cct.lsu.edu> <4756E387.9050702@aei.mpg.de> <324C2464-30CE-48AE-BF9E-EFB22DEDC7F4@cct.lsu.edu> <4757F326.5030409@aei.mpg.de> Message-ID: On Thu, 6 Dec 2007, Thomas Radke wrote: > Erik Schnetter wrote: >> Here is my file make.subdir, which contains the target "deps" > > Thanks, Erik ! > > With this everything builds fine now. I have tested your patch also on three > other machines to check whether it would break anything - it didn't. > > Okay to commit the patch ? Not as it stands. 
There's some extra commented out lines in there which I assume Erik has been using for experimentation. Also I think the 'deps' stuff is not part of the patch needed for this change. Please tidy it up and create the minimal necessary patch before committing.

Cheers,

Tom

From tradke at aei.mpg.de Fri Dec 14 18:10:22 2007
From: tradke at aei.mpg.de (Thomas Radke)
Date: Fri, 14 Dec 2007 18:10:22 -0600
Subject: [Developers] cannot generate libthorn_CactusBindings.a on IBM SP5 because argument list is too long
In-Reply-To:
References: <47553248.1010309@aei.mpg.de> <3CD201A6-F26B-4561-8DA2-14FF8A922C01@cct.lsu.edu> <4756E387.9050702@aei.mpg.de> <324C2464-30CE-48AE-BF9E-EFB22DEDC7F4@cct.lsu.edu> <4757F326.5030409@aei.mpg.de>
Message-ID: <47631B6E.8010905@aei.mpg.de>

Tom Goodale wrote:
> On Thu, 6 Dec 2007, Thomas Radke wrote:
>
>
>>Erik Schnetter wrote:
>>
>>>Here is my file make.subdir, which contains the target "deps"
>>
>>Thanks, Erik !
>>
>>With this everything builds fine now. I have tested your patch also on three
>>other machines to check whether it would break anything - it didn't.
>>
>>Okay to commit the patch ?
>
>
> Not as it stands. There's some extra commented out lines in there which I
> assume Erik has been using for experimentation. Also I think the 'deps'
> stuff is not part of the patch needed for this change. Please tidy it up
> and create the minimal necessary patch before committing.

Okay, here is a version which doesn't have the deps stuff in it anymore.

I added a few more comments on the various methods how to create an archive. I also increased the threshold of switching from just copying to incrementally adding to the list of object files to be archived from 100 files to 1000. This works on the problematic IBM SP5.

--
Cheers, Thomas.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: make.thornlib.patch Type: text/x-patch Size: 3215 bytes Desc: not available Url : http://www.cactuscode.org/pipermail/developers/attachments/20071214/51dce2ed/attachment-0001.bin From baiotti at ea.c.u-tokyo.ac.jp Mon Dec 17 20:38:37 2007 From: baiotti at ea.c.u-tokyo.ac.jp (Luca Baiotti) Date: Tue, 18 Dec 2007 11:38:37 +0900 Subject: [Developers] IO::recover = autoprobe Message-ID: <476732AD.60904@ea.c.u-tokyo.ac.jp> Hi, I have used (by mistake) the following settings: IO::recover = autoprobe IO::recover_dir = "checkpoint" IO::recover_file = "checkpoint.chkpt.it_83200" while I should have used IO::recover = manual However, what should be the behaviour in this case? I think it should check whether the specified checkpoint file exist and, in case the checkpoint file was found, recover from it or, otherwise, start from initial data. But I have experienced a different behaviour, namely that the run started from initial data, even if the checkpoint file existed in the specified directory. Could some developer please check? Ciao Luca From schnetter at cct.lsu.edu Mon Dec 17 15:18:31 2007 From: schnetter at cct.lsu.edu (Erik Schnetter) Date: Mon, 17 Dec 2007 15:18:31 -0600 Subject: [Developers] cannot generate libthorn_CactusBindings.a on IBM SP5 because argument list is too long In-Reply-To: <47631B6E.8010905@aei.mpg.de> References: <47553248.1010309@aei.mpg.de> <3CD201A6-F26B-4561-8DA2-14FF8A922C01@cct.lsu.edu> <4756E387.9050702@aei.mpg.de> <324C2464-30CE-48AE-BF9E-EFB22DEDC7F4@cct.lsu.edu> <4757F326.5030409@aei.mpg.de> <47631B6E.8010905@aei.mpg.de> Message-ID: <475D3C89-7ACA-4496-A6F7-0C1323DBC8C3@cct.lsu.edu> On Dec 14, 2007, at 18:10:22, Thomas Radke wrote: > Tom Goodale wrote: >> On Thu, 6 Dec 2007, Thomas Radke wrote: >>> Erik Schnetter wrote: >>> >>>> Here is my file make.subdir, which contains the target "deps" >>> >>> Thanks, Erik ! >>> >>> With this everything builds fine now. 
I have tested your patch
>>> also on three other machines to check whether it would break
>>> anything - it didn't.
>>>
>>> Okay to commit the patch ?
>> Not as it stands. There's some extra commented out lines in there
>> which I assume Erik has been using for experimentation. Also I
>> think the 'deps' stuff is not part of the patch needed for this
>> change. Please tidy it up and create the minimal necessary patch
>> before committing.
>
> Okay, here is a version which doesn't have the deps stuff in it
> anymore.
>
> I added a few more comments on the various methods how to create an
> archive.
> I also increased the threshold of switching from just copying to
> incrementally adding to the list of object files to be archived
> from 100 files to 1000. This works on the problematic IBM SP5.

I think the patch is fine and should be applied.

-erik

> Index: make.thornlib
> ===================================================================
> RCS file: /cactusdevcvs/Cactus/lib/make/make.thornlib,v
> retrieving revision 1.34
> diff -u -r1.34 make.thornlib
> --- make.thornlib 16 Aug 2005 17:26:48 -0000 1.34
> +++ make.thornlib 15 Dec 2007 00:03:52 -0000
> @@ -52,7 +52,7 @@
>  LOCAL_SUBDIRS := . $(SUBDIRS)
>
>  # Include all the make.code.defn files for the subdirectories
> -# These have to be wrapped to allow us to concatanate all the
> +# These have to be wrapped to allow us to concatenate all the
>  # SRCS definitions, complete with subdirectory names.
>  # Using -include to prevent warnings the first time the make.identity files
>  # need to be made.
> @@ -77,9 +77,27 @@
>
>  $(NAME): $(addsuffix /make.checked, $(SUBDIRS) $(THORNBINDINGS))
>  	if [ -r $(NAME) ] ; then echo Updating $(NAME) ; else echo Creating $(NAME) ; fi
> -	if [ -r $@ ] ; then rm $@ ; fi
> -	$(AR) $(ARFLAGS) $@ $(OBJS)
> +	if [ -r $@ ] ; then rm -f $@ ; fi
> +### create an archive of the object files
> +#
> +## This naive method will fail on some machines (eg. IBM SP5)
> +## when there are too many object files to be passed on the command line.
> +# $(AR) $(ARFLAGS) $@ $(OBJS)
> +#
> +## This creates a list of all object files and incrementally archives them
> +## in batches not larger than $(OBJS-words-max) files at a time.
> +	$(MAKE) -f $(MAKE_DIR)/make.thornlib $(NAME).objectlist
> +	xargs -n $(OBJS-words-max) $(AR) $(ARFLAGS) $@ < $(NAME).objectlist
> +	$(RM) $(NAME).objectlist
> +## Alternatively, we could create a single object file from the object
> +## files and put it into an archive.
> +# ld -r -o $@.o $(OBJS)
> +# $(AR) $(ARFLAGS) $@ $@.o
>  	if test "x$(USE_RANLIB)" = "xyes" ; then $(RANLIB) $(RANLIBFLAGS) $@ ; fi
> +## Or we could create a dynamic library the object files.
> +## to do: use a two-level namespace
> +## (this requires knowing the dependencies of each thorn library)
> +# libtool -dynamic -arch_only ppc -o $@ $(OBJS) -flat_namespace -undefined suppress -single_module
>  	@echo $(DIVIDER)
>
>  # Extra stuff for allowing make to recurse into directories
> @@ -98,3 +116,44 @@
>  $(addsuffix /make.identity, $(SUBDIRS) $(THORNBINDINGS)):
>  	if [ ! -d $(dir $@) ] ; then $(MKDIR) $(MKDIRFLAGS) $(dir $@) ; fi
>  	echo CCTK_THIS_SUBDIR := $(dir $@) > $@
> +
> +
> +
> +# Create a file containing the names of all object files.
> +
> +# Since the list may be too long to be passed to a shell, it is split
> +# into a set of rules which add lines to a file. This file can later
> +# be used via xargs.
> +
> +OBJS-words = $(words $(OBJS))
> +OBJS-words-max = 1000
> +
> +ifeq ($(shell test $(OBJS-words) -le $(OBJS-words-max) && echo 1), 1)
> +
> +# The list is short. Create the file directly, which is faster.
> +
> +.PHONY: $(NAME).objectlist
> +$(NAME).objectlist:
> +	echo $(OBJS) > $(NAME).objectlist
> +
> +else
> +
> +# The list is long. Create the file via a set of rules, one rule per
> +# object file.
> +
> +OBJS-added = $(OBJS:%=%.added)
> +
> +.PHONY: $(NAME).objectlist
> +$(NAME).objectlist: $(OBJS-added)
> +
> +# Truncate the file
> +.PHONY: $(NAME).objectlist.create
> +$(NAME).objectlist.create:
> +	: > $(NAME).objectlist
> +
> +# Add a line to the file
> +.PHONY: $(OBJS-added)
> +$(OBJS-added): $(NAME).objectlist.create
> +	echo $(@:%.added=%) >> $(NAME).objectlist
> +
> +endif
> _______________________________________________
> Developers mailing list
> Developers at cactuscode.org
> http://www.cactuscode.org/mailman/listinfo/developers

--
Erik Schnetter
http://www.cct.lsu.edu/~eschnett/

My email is as private as my paper mail. I therefore support encrypting and signing email messages. Get my PGP key from www.keyserver.net.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig
Type: application/pgp-signature
Size: 186 bytes
Desc: This is a digitally signed message part
Url : http://www.cactuscode.org/pipermail/developers/attachments/20071217/d79c7b77/attachment.bin

From schnetter at cct.lsu.edu Tue Dec 18 00:38:11 2007
From: schnetter at cct.lsu.edu (Erik Schnetter)
Date: Mon, 17 Dec 2007 23:38:11 -0700
Subject: [Developers] IO::recover = autoprobe
In-Reply-To: <476732AD.60904@ea.c.u-tokyo.ac.jp>
References: <476732AD.60904@ea.c.u-tokyo.ac.jp>
Message-ID:

On Dec 17, 2007, at 19:38:37, Luca Baiotti wrote:

> Hi,
>
> I have used (by mistake) the following settings:
>
> IO::recover = autoprobe
> IO::recover_dir = "checkpoint"
> IO::recover_file = "checkpoint.chkpt.it_83200"
>
> while I should have used
>
> IO::recover = manual
>
>
> However, what should be the behaviour in this case? I think it should
> check whether the specified checkpoint file exist and, in case the
> checkpoint file was found, recover from it or, otherwise, start from
> initial data. But I have experienced a different behaviour, namely that
> the run started from initial data, even if the checkpoint file existed
> in the specified directory.
> > Could some developer please check? With autoprobe, the parameter recover_file is ignored. Cactus is supposed to look into recover_dir and restart from the latest checkpoint found there. If there is no checkpoint, it is supposed to start from scratch. I am using the autoprobe feature regularly, and it works fine for me. There could have been something wrong with the checkpoint files or with the specification of the directory. Do you still have the output from the run, which should contain explanatory comments of the recovery routine about whether it found checkpoint files and what it was going to do? -erik -- Erik Schnetter http://www.cct.lsu.edu/ ~eschnett/ My email is as private as my paper mail. I therefore support encrypting and signing email messages. Get my PGP key from www.keyserver.net. -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://www.cactuscode.org/pipermail/developers/attachments/20071217/bc7852e3/attachment.bin From kellerma at aei.mpg.de Tue Dec 18 07:29:01 2007 From: kellerma at aei.mpg.de (Thorsten Kellermann) Date: Tue, 18 Dec 2007 14:29:01 +0100 Subject: [Developers] CCTK_ORIGIN_SPACE ? Message-ID: <4767CB1D.7010106@aei.mpg.de> Hy all together Can somebody explain me the meaning of this macro, cctk_origin_space ? I guess what it should do, but it's not really clear to me, what its purpose. 
Thorsten From baiotti at ea.c.u-tokyo.ac.jp Tue Dec 18 03:14:39 2007 From: baiotti at ea.c.u-tokyo.ac.jp (Luca Baiotti) Date: Tue, 18 Dec 2007 18:14:39 +0900 Subject: [Developers] IO::recover = autoprobe In-Reply-To: References: <476732AD.60904@ea.c.u-tokyo.ac.jp> Message-ID: <47678F7F.9050107@ea.c.u-tokyo.ac.jp> Erik Schnetter wrote: > On Dec 17, 2007, at 19:38:37, Luca Baiotti wrote: > >> Hi, >> >> I have used (by mistake) the following settings: >> >> IO::recover = autoprobe >> IO::recover_dir = "checkpoint" >> IO::recover_file = "checkpoint.chkpt.it_83200" >> >> while I should have used >> >> IO::recover = manual >> >> >> However, what should be the behaviour in this case? I think it should >> check whether the specified checkpoint file exist and, in case the >> checkpoint file was found, recover from it or, otherwise, start from >> initial data. But I have experienced a different behaviour, namely that >> the run started from initial data, even if the checkpoint file existed >> in the specified directory. >> >> Could some developer please check? > > > With autoprobe, the parameter recover_file is ignored. Cactus is > supposed to look into recover_dir and restart from the latest checkpoint > found there. If there is no checkpoint, it is supposed to start from > scratch. > > I am using the autoprobe feature regularly, and it works fine for me. > > There could have been something wrong with the checkpoint files or with > the specification of the directory. Do you still have the output from > the run, which should contain explanatory comments of the recovery > routine about whether it found checkpoint files and what it was going to > do? Hi Erik, I attach the stdout, where, however, I cannot see relevant messages from CarpetIOHDF5. Ciao Luca -------------- next part -------------- A non-text attachment was scrubbed... 
Name: damiana.out Type: application/octet-stream Size: 1465816 bytes Desc: not available Url : http://www.cactuscode.org/pipermail/developers/attachments/20071218/27d69fc8/attachment-0001.obj From hinder at gravity.psu.edu Tue Dec 18 12:16:03 2007 From: hinder at gravity.psu.edu (Ian Hinder) Date: Tue, 18 Dec 2007 13:16:03 -0500 Subject: [Developers] CCTK_ORIGIN_SPACE ? In-Reply-To: <4767CB1D.7010106@aei.mpg.de> References: <4767CB1D.7010106@aei.mpg.de> Message-ID: <47680E63.7090405@gravity.psu.edu> Thorsten Kellermann wrote: > Hy all together > > Can somebody explain me the meaning of this macro, cctk_origin_space ? > I guess what it should do, but it's not really clear to me, what its > purpose. The macro CCTK_ORIGIN_SPACE (and its associated variable, cctk_origin_space) are documented in the Cactus Users' Guide: http://www.cactuscode.org/old/Guides/Stable/UsersGuide/UsersGuideHTML/node94.html Here, it says: "cctk_origin_space An array of cctk_dim CCTK_REALs with the spatial coordinates of the global origin of the grid. The coordinates of the 21#21th local grid point in the 16#16 direction can e.g. in C be calculated by x = CCTK_ORIGIN_SPACE(0) + (cctk_lsh[0] + i) * CCTK_DELTA_SPACE(0)." Now, this brings up several issues: (1) I think the example is wrong. I think it should say x = CCTK_ORIGIN_SPACE(0) + (cctk_lbnd[0] + i) * CCTK_DELTA_SPACE(0). (2) The HTML documentation on the cactuscode.org website contains expressions like the '21#21' above. What are these? While I am on the subject of documentation, something I have wanted for a long time is an online HTML version of the Cactus reference manual. It is currently only available in DVI (!!), PostScript and PDF. Worse, the internal hyperlinks in the PDF version don't seem to work for me, which means one has to scroll through pages and pages manually in order to find the function one wants. Also, the URL above mentions 'old' which is a reference I think to the site redesign that occurred in June 2006. 
Is there a more modern version? The site redesign is mentioned on the main page as 'recent'. Perhaps this message should be changed? Further, there is a note in the documentation section that says the documentation is for the 'released' version. Are there a significant number of users using the released version? Wow, that is a lot of complaints for one email :) -- Ian Hinder hinder at gravity.psu.edu http://www.gravity.psu.edu/~hinder From schnetter at cct.lsu.edu Tue Dec 18 14:52:58 2007 From: schnetter at cct.lsu.edu (Erik Schnetter) Date: Tue, 18 Dec 2007 13:52:58 -0700 Subject: [Developers] CCTK_ORIGIN_SPACE ? In-Reply-To: <47680E63.7090405@gravity.psu.edu> References: <4767CB1D.7010106@aei.mpg.de> <47680E63.7090405@gravity.psu.edu> Message-ID: <62CE76C9-A0B1-47B3-84F6-38ABC20C55C4@cct.lsu.edu> On Dec 18, 2007, at 11:16:03, Ian Hinder wrote: > Thorsten Kellermann wrote: >> Hy all together >> >> Can somebody explain me the meaning of this macro, >> cctk_origin_space ? >> I guess what it should do, but it's not really clear to me, what its >> purpose. > > The macro CCTK_ORIGIN_SPACE (and its associated variable, > cctk_origin_space) are documented in the Cactus Users' Guide: > > http://www.cactuscode.org/old/Guides/Stable/UsersGuide/ > UsersGuideHTML/node94.html > > Here, it says: > > "cctk_origin_space > > An array of cctk_dim CCTK_REALs with the spatial coordinates of > the > global origin of the grid. The coordinates of the 21#21th local grid > point in the 16#16 direction can e.g. in C be calculated by x = > CCTK_ORIGIN_SPACE(0) + (cctk_lsh[0] + i) * CCTK_DELTA_SPACE(0)." > > Now, this brings up several issues: > > (1) I think the example is wrong. I think it should say > > x = CCTK_ORIGIN_SPACE(0) + (cctk_lbnd[0] + i) * CCTK_DELTA_SPACE(0). Yes. This has already been corrected in the latex version, it's just that the online pdf and html are out of date. > (2) The HTML documentation on the cactuscode.org website contains > expressions like the '21#21' above. 
What are these? > > While I am on the subject of documentation, something I have wanted > for > a long time is an online HTML version of the Cactus reference manual. Converting latex to html is very difficult. Usually, equations look just plain ugly, and things like tables or figures don't turn out nice either. Since PDF can be searched, are indexed by google, and have hyperlinks, I would be inclined to only have PDF. > It is currently only available in DVI (!!), PostScript and PDF. > Worse, > the internal hyperlinks in the PDF version don't seem to work for me, > which means one has to scroll through pages and pages manually in > order > to find the function one wants. > > Also, the URL above mentions 'old' which is a reference I think to the > site redesign that occurred in June 2006. Is there a more modern > version? The site redesign is mentioned on the main page as 'recent'. > Perhaps this message should be changed? > > Further, there is a note in the documentation section that says the > documentation is for the 'released' version. Are there a significant > number of users using the released version? > > Wow, that is a lot of complaints for one email :) Thanks for the pointers! -erik -- Erik Schnetter http://www.cct.lsu.edu/ ~eschnett/ My email is as private as my paper mail. I therefore support encrypting and signing email messages. Get my PGP key from www.keyserver.net. -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://www.cactuscode.org/pipermail/developers/attachments/20071218/2412e8ae/attachment.bin From schnetter at cct.lsu.edu Tue Dec 18 14:56:27 2007 From: schnetter at cct.lsu.edu (Erik Schnetter) Date: Tue, 18 Dec 2007 13:56:27 -0700 Subject: [Developers] CCTK_ORIGIN_SPACE ? 
In-Reply-To: <4767CB1D.7010106@aei.mpg.de> References: <4767CB1D.7010106@aei.mpg.de> Message-ID: <42654B53-FE4D-4E5B-9461-F598EA90C806@cct.lsu.edu> On Dec 18, 2007, at 06:29:01, Thorsten Kellermann wrote: > Hy all together > > Can somebody explain me the meaning of this macro, cctk_origin_space ? > I guess what it should do, but it's not really clear to me, what its > purpose. There are two entities, the variable cctk_origin_space, and the macro CCTK_ORIGIN_SPACE. The variable refers to the coarsest grid, the macro to the current refinement level. Without mesh refinement, both are the same. CCTK_ORIGIN_SPACE is the coordinate of the leftmost grid point of the current level, i.e., of the grid point with global index 0 in C. On multiple processors, it is the overall leftmost grid point, not the leftmost grid point of the current processor. -erik -- Erik Schnetter http://www.cct.lsu.edu/ ~eschnett/ My email is as private as my paper mail. I therefore support encrypting and signing email messages. Get my PGP key from www.keyserver.net. -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 186 bytes Desc: This is a digitally signed message part Url : http://www.cactuscode.org/pipermail/developers/attachments/20071218/1daac353/attachment.bin From baiotti at ea.c.u-tokyo.ac.jp Tue Dec 18 17:23:37 2007 From: baiotti at ea.c.u-tokyo.ac.jp (Luca Baiotti) Date: Wed, 19 Dec 2007 08:23:37 +0900 Subject: [Developers] CCTK_ORIGIN_SPACE ? In-Reply-To: <62CE76C9-A0B1-47B3-84F6-38ABC20C55C4@cct.lsu.edu> References: <4767CB1D.7010106@aei.mpg.de> <47680E63.7090405@gravity.psu.edu> <62CE76C9-A0B1-47B3-84F6-38ABC20C55C4@cct.lsu.edu> Message-ID: <47685679.2040905@ea.c.u-tokyo.ac.jp> Erik Schnetter wrote: >> While I am on the subject of documentation, something I have wanted for >> a long time is an online HTML version of the Cactus reference manual. > > Converting latex to html is very difficult. 
Usually, equations look
> just plain ugly, and things like tables or figures don't turn out nice
> either. Since PDF can be searched, are indexed by google, and have
> hyperlinks, I would be inclined to only have PDF.

I haven't used it myself, but you may want to have a look at LateXSL, an on-the-fly LaTeX to HTML converter. Here is the SourceForge site:

http://latexsl.sourceforge.net/

It was written by a friend of ours and is intended to make it easy for a web page author to put math formulas in a web page, with better-looking results than other solutions give. It's very much beta software, but it already displays lots of LaTeX.

Ciao

Luca
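
Coming back to the question this thread started with: the corrected formula, x = CCTK_ORIGIN_SPACE(0) + (cctk_lbnd[0] + i) * CCTK_DELTA_SPACE(0), can be checked by hand with a few numbers. The snippet below is only a sketch of that arithmetic in shell (with awk for the floating-point part), not Cactus code. The function name coord and the values for origin, spacing and lower bound are made up, and it assumes a single uniform grid on which cctk_lbnd[0] is the global index of the first local grid point.

```shell
#!/bin/sh
# Illustration only: all numbers below are hypothetical.
# coord ORIGIN DELTA LBND I   prints   ORIGIN + (LBND + I) * DELTA
coord() {
    awk -v origin="$1" -v delta="$2" -v lbnd="$3" -v i="$4" \
        'BEGIN { printf "%.2f\n", origin + (lbnd + i) * delta }'
}

# Origin -1.0, spacing 0.25, local patch starting at global index 4,
# local index 2: the global index is 4 + 2 = 6.
coord -1.0 0.25 4 2
```

With these numbers the result is -1.0 + 6 * 0.25 = 0.50. Putting cctk_lsh[0] (the local grid size) in place of cctk_lbnd[0] would in general give a different global index, which is why the example in the Users' Guide needed correcting.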