Routine Description Via The Scheduler
-------------------------------------

Author: Erik Schnetter, Tom Goodale, +
Date: December 2002
Status: Idea kicked about on developer mailing list

What follows is simply an archive of emails following Erik's suggestion for a project to annotate routine descriptions in the Cactus scheduler.

Erik:
----

In the course of discussions with Scott Hawley and Ian Hawke, and also earlier in discussions with Richard Günther here in Tübingen, the idea came about to annotate routine descriptions in the schedule with a list of what Fortran would call intent(in), intent(out), and intent(inout) specifications for grid variables. With this list, one could check whether the preconditions for calling a certain routine have been met.

Ideally, this list would be written in a formalised way, so that it can be checked automatically by the scheduler. The scheduler could use it when calculating the schedule (making certain "after" and "before" declarations superfluous), or could use it to check a schedule for consistency. The checks would also catch dependency errors between different bins.

Without AMR, these dependencies can be managed, but are becoming increasingly difficult to understand -- think of MoL, gauge conditions, boundary conditions, metric states, and whatever other complications currently exist, such as coupling geometry and hydro. With AMR, the scheduling bins are traversed several times on different grid levels, and it is basically impossible to ensure that the call graph calculated by the scheduler is correct. This is even more so as currently no intent specifications exist, not even manually written ones.

That the formal declarations in the schedule.ccl file are correct could even be checked by the driver: intent(in) variables can be checksummed to detect changes, and intent(out) variables can be initialised to NaNs.
Variables that a routine should neither read nor write can be copied away to a safe place and then be deallocated. This would help ensure that the declarations are correct.

It remains to find a good way to specify these declarations. Declaring intents is only a beginning, because one wants to distinguish between the interior and the boundary of grids, and wants to include the conformal state of the metric, and because the intents depend on certain state variables (such as shift-has-storage).

I think it would form a nice computational science project for a Master's thesis or part of a PhD thesis.

Tom:
---

Another thing to remember is that with the friends mechanism the person writing the schedule.ccl doesn't have the information to produce all the intent stuff.

I know we discussed intent in/out when we were coming up with the ccl in the first place - I'll see if I can find any of the notes - I don't think we actually came to any conclusion about it in the end, apart from the problem with friends and that we couldn't support it in f77. You've certainly taken the idea much further.

This is also related to issues of more efficient parallelisation - if you know when a variable is next going to be actually used, rather than just passed in because it is on the variable list, you can then sync it any time between the sync in the scheduler and the next time it is used, thus allowing more things to be synced at once.

I'm not sure what you mean by getting rid of certain before/after statements - I know with such a scheme we have more info about variable use, but I can't see how that affects the schedule. If you can come up with a nice example that would be good.

Obviously some of this, such as checksums, is only useful for the debugging phase, but translating schedule intents into const statements for C and intent statements for Fortran would be generally useful and would obviously help the optimiser to do a better job. I'll definitely put that on the list for 4.1.
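[Illustrative sketch, not part of the original emails.] The debugging-phase check discussed above (checksum the intent-in variables, fill the intent-out variables with NaNs, then inspect both after the call) could look roughly like the following Python. Everything here is an assumption made for illustration: the names checksum, call_checked, good_routine and bad_routine are hypothetical, grid variables are modelled as plain lists of floats, and nothing in it is Cactus API.

```python
import hashlib
import math
import struct


def checksum(values):
    """Digest of a list of floats; any change to the data changes the digest."""
    return hashlib.sha256(struct.pack(f"{len(values)}d", *values)).hexdigest()


def call_checked(routine, variables, intent_in, intent_out):
    """Call routine(variables) and return a list of intent violations.

    variables maps names to lists of floats (a stand-in for grid variables);
    intent_in and intent_out are lists of variable names (hypothetical).
    """
    # Remember what the intent-in variables looked like before the call.
    before = {name: checksum(variables[name]) for name in intent_in}
    # Poison the outputs so that a routine which fails to set them is caught.
    for name in intent_out:
        variables[name][:] = [math.nan] * len(variables[name])

    routine(variables)

    errors = []
    for name in intent_in:
        if checksum(variables[name]) != before[name]:
            errors.append(f"{name}: declared intent-in but was modified")
    for name in intent_out:
        if any(math.isnan(v) for v in variables[name]):
            errors.append(f"{name}: declared intent-out but not fully set")
    return errors


def good_routine(v):
    """Reads phi, sets every point of rhs."""
    v["rhs"][:] = [2.0 * x for x in v["phi"]]


def bad_routine(v):
    """Overwrites its input and never sets rhs."""
    v["phi"][0] = 0.0
```

With phi and rhs as two-point lists, call_checked(good_routine, ...) returns an empty list, while bad_routine yields one violation per variable. A routine that legitimately produces NaNs would be flagged as well, which is one reason such a check belongs in a debugging mode only.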
As for syntax, I guess something like

  schedule foo blah blah blah
  {
    INTENT-IN: var1, var2, group1, group2, var3
    INTENT-OUT: group3, var4, ...
  } ...

would work for the simple case. For the case where you want to differentiate between boundaries and interiors things of course become far more complex, and such things could only be verified with checksumming, as there is no language support to declare part of an array constant. Having the intents depend on some condition is again something there is no language support for, so would revert to checksumming. Both these cases negate the optimiser advantage which intents could give you.

On the implementation side this of course requires separate CCTK_ARGUMENTS macros per routine, which is not a pleasant thing to contemplate when different routines in the same source file could have different argument lists, which is probably why we dropped the idea for 4.0. On the other hand we could just remove CCTK_ARGUMENTS and have a different argument macro per routine - e.g.

  CCTK_ARGUMENTS_FOO
  DECLARE_CCTK_ARGUMENTS_FOO

The alternative would be to enhance the perl cpp to actually parse the source files, work out which subroutine is using CCTK_ARGUMENTS, and expand it accordingly, but that requires a lot more work, though it is cleaner for a thorn developer. One argument for the separate macros is that it cleans up the Fortran<->C calling, making it more obvious to people that arguments may change.

(Incidentally, either of these mechanisms could also be used for the specific-argument-list enhancement whereby you specify an explicit or semi-explicit list of vars in the scheduler for some routines.)

John Shalf:
----------

There is an additional benefit to these annotations for grid variables. They will allow the scheduler to offer a longer period of time to put the network transfers in the background.
Currently, the network transfers are partly backgrounded because MPI_Isend/MPI_Irecv are used to set up the boundary exchanges from a Sync/SyncGroup operation. However, because the variables might be used immediately after the Sync completes, the entire sync operation must be completed before execution can continue (so there is little opportunity to completely hide the latency of this boundary exchange behind some computation). If, however, the scheduler knows the next time that a variable is actually required, then it could choose to defer the MPI_Waitall() operation for the async transfer until the routine that requires that variable is scheduled. This kind of scheduling intelligence wouldn't net you much on a tightly coupled MPP, but it might be useful for very high-latency environments (like wide-area grid computing jobs) provided that the MPI_Isend/MPI_Irecv combination is truly asynchronous.
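
[Illustrative sketch, not part of the original emails.] To make the deferred-wait idea concrete, the Python below works out, from per-routine intent lists, the latest point at which each boundary exchange has to be complete. The schedule representation and the names plan_waits and example_schedule are hypothetical; a real driver would start the transfers with MPI_Isend/MPI_Irecv and place the MPI_Waitall at the point computed here. The sketch assumes that a later write to a synced variable also forces the wait, since the transfer is still filling that variable's ghost zones.

```python
def plan_waits(schedule):
    """Find where each sync's transfer must have completed.

    schedule is a list of (routine, reads, writes, syncs) tuples in execution
    order (a hypothetical representation). A transfer starts right after the
    routine that syncs the variable and must be waited for before the next
    routine that reads or writes it.

    Returns a list of (variable, start_after, wait_before) tuples;
    wait_before is None when no later routine in the list touches the
    variable, so the wait can be left until the end of the list.
    """
    plans = []
    for i, (routine, _reads, _writes, syncs) in enumerate(schedule):
        for var in syncs:
            wait_before = None
            for later, reads, writes, _syncs in schedule[i + 1:]:
                if var in reads or var in writes:
                    wait_before = later
                    break
            plans.append((var, routine, wait_before))
    return plans


# Two evolution routines sync their results; only later routines read them.
example_schedule = [
    ("evolve_phi", {"phi_old"}, {"phi"}, ["phi"]),
    ("evolve_psi", {"psi_old"}, {"psi"}, ["psi"]),
    ("analysis", {"phi"}, {"energy"}, []),
    ("output", {"energy", "psi"}, set(), []),
]
```

For example_schedule, plan_waits reports that the phi exchange only has to finish before analysis and the psi exchange before output, so both transfers can overlap with the routines in between. Without intent information the scheduler has to assume that the very next routine may use the variable, which is the situation described above.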