C2 Infrastructure Thorns

Chapter C2
Infrastructure Thorns

Concepts and terminology (Overloading and registration of functions)
The cGH structure — what it is and how to use it
Extending the cGH structure
Querying group and variable information
Providing an I/O layer
Providing a communication layer
Providing a reduction operator
Providing an interpolation operator
Overloadable functions

C2.1 Concepts and Terminology

C2.1.1 Overloading and Registration

The flesh defines a core API which guarantees the presence of a set of functions. Although the flesh guarantees the presence of these functions, they can be provided by thorns. Thorns do this by either the overloading or the registration of functions.

Overloading

Some functions can only be provided by one thorn. The first thorn to overload this function succeeds, and any later attempt to overload the function fails. For each overloadable function, there is a function with a name something like CCTK_Overload... which is passed the function pointer.

Registration

Some functions may be provided by several thorns. The thorns register their function with the flesh, and when the flesh-provided function is called, the flesh calls all the registered functions.

C2.1.2 GH Extensions

A GH extension is a way to associate data with each cGH. This data should be data that is required to be associated with a particular GH by a thorn.

Each GH extension is given a unique handle.

C2.1.3 I/O Methods

An I/O method is a distinct way to output data. Each I/O method has a unique name, and the flesh-provided I/O functions operate on all registered I/O methods.

C2.2 GH Extensions

A GH extension is created by calling CCTK_RegisterGHExtension, with the name of the extension. This returns a unique handle that identifies the extension. (This handle can be retrieved at any time by a call to CCTK_GHExtensionHandle.)

Associated with a GH extension are three functions

SetupGH: this is used to actually create the data structure holding the extension. It is called when a new cGH is created.
InitGH: this is used to initialise the extension. It is called after the scheduler has been initialised on the cGH.
ScheduleTraverseGH: this is called whenever the schedule tree is due to be traversed on the GH. It should initialise the data on the cGH and the call CCTK_ScheduleTraverse to traverse the schedule tree.

C2.3 Overloadable and Registerable Functions in Main


Function	Default

CCTK_Initialise

CCTK_Evolve

CCTK_Shutdown

C2.4 Overloadable and Registerable Functions in Comm


Function	Default

CCTK_SyncGroup

CCTK_SyncGroupsByDirI

CCTK_EnableGroupStorage

CCTK_DisableGroupStorage

CCTK_EnableGroupComm

CCTK_DisableGroupComm

CCTK_Barrier

CCTK_Reduce

CCTK_Interp

CCTK_ParallelInit

C2.5 Overloadable and Registerable Functions in I/O


Function	Default

CCTK_OutputGH

CCTK_OutputVarAsByMethod

C2.6 Drivers

The flesh does not know about memory allocation for grid variables, about how to communicate data when synchronisation is called for, or about multiple patches or adaptive mesh refinement. All this is the job of a driver.

This chapter describes how to add a driver to your code.

C2.6.1 Anatomy

A driver consists of a Startup routine which creates a GH extension, registers its associated functions, and overloads the communication functions. It may optionally register interpolation, reduction, and I/O methods.

A driver may also overload the default Initialisation and Evolution routines, although a simple unigrid evolver is supplied in the flesh.

C2.6.2 Startup

A driver consists of a GH extension, and the following overloaded functions.

CCTK_EnableGroupStorage
CCTK_DisableGroupStorage
CCTK_ArrayGroupSizeB
CCTK_QueryGroupStorageB
CCTK_SyncGroup
CCTK_SyncGroupsByDirI
CCTK_EnableGroupComm
CCTK_DisableGroupComm
CCTK_Barrier
CCTK_OverloadParallelInit
CCTK_OverloadExit
CCTK_OverloadAbort
CCTK_OverloadMyProc
CCTK_OverloadnProcs

The overloadable function CCTK_SyncGroup is deprecated, a driver should instead provide a routine to overload the more general function CCTK_SyncGroupsByDirI.

C2.6.3 The GH Extension

The GH extension is where the driver stores all its grid-dependent information. This is stuff like any data associated with a grid variable (e.g. storage and communication state), how many grids if it is AMR, ... It is very difficult to describe in general, but one simple example might be

struct SimpleExtension
{
  /* The data associated with each variable */
  /* data[var][timelevel][ijk]                */
  void ***data
} ;

with a SetupGH routine like

struct SimpleExtension *SimpleSetupGH(tFleshConfig *config, int conv_level, cGH *GH)
{
   struct SimpleExtension *extension;

   extension = NULL;

   if(conv_level < max_conv_level)
   {
      /* Create the extension */
      extension = malloc(sizeof(struct SimpleExtension));

      /* Allocate data for all the variables */
      extension->data = malloc(num_vars*sizeof(void**));

      for(var = 0 ; var < num_vars; var++)
      {
        /* Allocate the memory for the time levels */
        extension->data[var] = malloc(num_var_time_levels*sizeof(void *));

        for(time_level = 0; time_level < num_var_time_level; time_level++)
        {
          /* Initialise the data to NULL */
          extension->data[var][time_level] = NULL;
        }
      }
    }

   return extension;
}

Basically, what this example is doing is preparing a data array for use. The function can query the flesh for information on every variable. Note that scalars should always have memory actually assigned to them.

An InitGH function isn’t strictly necessary, and in this case, it could just be a dummy function.

The ScheduleTraverseGH function needs to fill out the cGH data, and then call CCTK_ScheduleTraverse to have the functions scheduled at that point executed on the grid

int SimpleScheduleTraverseGH(cGH *GH, const char *where)
{
  int retcode;
  int  var;
  int  gtype;
  int  ntimelevels;
  int  level;
  int  idir;

  extension = (struct SimpleExtension *)GH->extensions[SimpleExtension];

  for (idir=0;idir<GH->cctk_dim;idir++)
  {
    GH->cctk_levfac[idir] = 1;
    GH->cctk_nghostzones[idir] = extension->nghostzones[idir];
    GH->cctk_lsh[idir]         = extension->lnsize[idir];
    GH->cctk_gsh[idir]         = extension->nsize[idir];
    GH->cctk_bbox[2*idir]      = extension->lb[extension->myproc][idir] == 0;
    GH->cctk_bbox[2*idir+1]    = extension->ub[extension->myproc][idir]
                              == extension->nsize[idir]-1;
    GH->cctk_lbnd[idir]        = extension->lb[extension->myproc][idir];
    GH->cctk_ubnd[idir]        = extension->ub[extension->myproc][idir];

  }

  for(var = 0; var < extension->nvariables; var++)
  {
    gtype = CCTK_GroupTypeFromVarI(var);
    ntimelevels = CCTK_MaxTimeLevelsVI(var);

    for(level = 0; level < ntimelevels; level++)
    {
      switch(gtype)
      {
        case CCTK_SCALAR :
          GH->data[var][level] = extension->variables[var][level];
          break;
        case CCTK_GF     :
          GH->data[var][level] =
            ((pGF ***)(extension->variables))[var][level]->data;
          break;
        case CCTK_ARRAY :
          GH->data[var][level] =
            ((pGA ***)(extension->variables))[var][level]->data;
          break;
        default:
          CCTK_WARN(CCTK_WARN_ALERT,"Unknown group type in SimpleScheduleTraverse");
      }
    }
  }

  retcode = CCTK_ScheduleTraverse(where, GH, NULL);

  return retcode;

}

The third argument to CCTK_ScheduleTraverse is actually a function which will be called by the scheduler when it wants to call a function scheduled by a thorn. This function is given some information about the function to call, and is an alternative place where the cGH can be setup.

This function is optional, but a simple implementation might be

int SimpleCallFunction(void *function,
                       cFunctionData *fdata,
                       void *data)
{
  void (*standardfunc)(void *);

  int (*noargsfunc)(void);

  switch(fdata->type)
  {
    case FunctionNoArgs:
      noargsfunc = (int (*)(void))function;
      noargsfunc();
      break;
    case FunctionStandard:
      switch(fdata->language)
      {
        case LangC:
          standardfunc = (void (*)(void *))function;
          standardfunc(data);
          break;
        case LangFortran:
          fdata->FortranCaller(data, function);
          break;
        default :
          CCTK_WARN(CCTK_WARN_ALERT, "Unknown language.");
      }
      break;
    default :
      CCTK_WARN(CCTK_WARN_ALERT, "Unknown function type.");
  }

  /* Return 0, meaning didn’t synchronise */
  return 0;
}

The return code of the function signifies whether or not the function synchronised the groups in this functions synchronisation list of not.

The flesh will synchronise them if the function returns false.

Providing this function is probably the easiest way to do multi-patch or AMR drivers.

C2.6.4 Memory Functions

These consist of

CCTK_EnableGroupStorage
CCTK_DisableGroupStorage
CCTK_QueryGroupStorageB
CCTK_ArrayGroupSizeB

En/Disable Group Storage

These are responsible for switching the memory for all variables in a group on or off. They should return the former state, e.g. if the group already has storage assigned, they should return 1.

In our simple example above, the enabling routine would look something like

int SimpleEnableGroupStorage(cGH *GH, const char *groupname)
{

  extension = (struct SimpleExtension *)GH->extensions[SimpleExtension];

  if(extension->data[first][0][0] == NULL)
  {
    for(var = first; var <= last; var++)
    {
      allocate memory for all time levels;
    }
    retcode = 0;
  }
  else
  {
    retcode = 1;
  }

  return retcode;
}

The disable function is basically the reverse of the enable one.

The QueryGroupStorage function basically returns true or false if there is storage for the group, and the ArrayGroupSize returns the size of the grid function or array group in a particular direction.

En/Disable Group Comm

These are the communication analogues to the storage functions. Basically, they flag that communication is to be done on that group or not, and may initialise data structures for the communication.

C2.7 I/O Methods

The flesh by itself does not provide output for grid variables or other data. Instead, it provides a mechanism for thorns to register their own routines as I/O methods, and to invoke these I/O methods by either the flesh scheduler or by other thorn routines.

This chapter explains how to implement your own I/O methods and register them with the flesh.

C2.7.1 I/O Method Registration

All I/O methods have to be registered with the flesh before they can be used.

The flesh I/O registration API provides a routine CCTK_RegisterIOMethod() to create a handle for a new I/O method which is identified by its name (this name must be unique for all I/O methods). With such a handle, a thorn can then register a set of functions (using the CCTK_RegisterIOMethod*() routines from the flesh) for doing periodic, triggered, and/or unconditional output.

The following example shows how a thorn would register an I/O method, SimpleIO, with routines to provide all these different types of output.

  void SimpleIO_Startup (void)
  {
    int handle = CCTK_RegisterIOMethod ("SimpleIO");
    if (handle >= 0)
    {
      CCTK_RegisterIOMethodOutputGH (handle, SimpleIO_OutputGH);

      CCTK_RegisterIOMethodTimeToOutput (handle, SimpleIO_TimeToOutput);
      CCTK_RegisterIOMethodTriggerOutput (handle, SimpleIO_TriggerOutput);

      CCTK_RegisterIOMethodOutputVarAs (handle, SimpleIO_OutputVarAs);
    }
  }

C2.7.2 Periodic Output of Grid Variables

The flesh scheduler will automatically call CCTK_OutputGH() at every iteration after the CCTK_ANALYSIS time bin. This function loops over all I/O methods and calls their routines registered as OutputGH() (see also Section C1.2.3).

int SimpleIO_OutputGH (const cGH *GH);

The OutputGH() routine itself should loop over all variables living on the GH grid hierarchy, and do all necessary output if requested (this is usually determined by I/O parameter settings).

As its return code, it should pass back the number of variables which were output at the current iteration, or zero if no output was done by this I/O method.

C2.7.3 Triggered Output of Grid Variables

Besides the periodic output at every so many iterations using OutputGH(), analysis and output of grid variables can also be done via triggers. For this, a TimeToOutput() routine is registered with an I/O method. This routine will be called by the flesh scheduler at every iteration before CCTK_ANALYSIS with the triggering variable(s) as defined in the schedule block for all CCTK_ANALYSIS routines (see Section C1.5.4).

If the TimeToOutput() routine decides that it is now time to do output, the flesh scheduler will invoke the corresponding analysis routine and also request output of the triggering variable(s) using TriggerOutput().

int SimpleIO_TimeToOutput (const cGH *GH, int varindex);
int SimpleIO_TriggerOutput (const cGH *GH, int varindex);

Both routines get passed the index of a possible triggering grid variable.

TimeToOutput()

should return a non-zero value if analysis and output for varindex should take place at the current iteration, and zero otherwise.

TriggerOutput()

should return zero for successful output of variable varindex, and a negative value in case of an error.

C2.7.4 Unconditional Output of Grid Variables

An I/O method’s OutputVarAs() routine is supposed to do output for a specific grid variable if ever possible. It will be invoked by the flesh I/O API routines CCTK_OutputVar*() for unconditional, non-triggered output of grid variables (see also Section C1.7.2).

A function registered as an OutputVarAs() routine has the following prototype:

int SimpleIO_OutputVarAs (const cGH *GH, const char *varname, const char *alias);

The variable to output, varname, is given by its full name. The full name may have appended an optional I/O options string enclosed in curly braces (with no space between the full name and the opening curly brace). In addition to that, an alias string can be passed which then serves the purpose of constructing a unique name for the output file.

The OutputVarAs() routine should return zero if output for varname was done successfully, or a negative error code otherwise.

C2.8 Checkpointing/Recovery Methods

Like for I/O methods, checkpointing/recovery functionality must be implemented by thorns. The flesh only provides specific time bins at which the scheduler will call thorns’ routines, in order to perform checkpointing and/or recovery.

This chapter explains how to implement checkpointing and recovery methods in your thorn. For further information, see the documentation for thorn CactusBase/IOUtil.

C2.8.1 Checkpointing Invocation

Thorns register their checkpointing routines at CCTK_CPINITIAL and/or CCTK_CHECKPOINT and/or CCTK_TERMINATE. These time bins are scheduled right after all initial data has been set up, after every evolution timestep, and after the last time step of a simulation, respectively. (See Section C1.2.3 for a description of all timebins). Depending on parameter settings, the checkpoint routines decide whether to write an initial data checkpoint, and when to write an evolution checkpoint.

Each checkpoint routine should save all information to persistent storage, which is necessary to restart the simulation at a later time from exactly the same state. Such information would include

the current settings of all parameters
the contents of all grid variables which have global storage assigned and are not tagged with checkpoint="no" (see also Section D2.2.4 on page D20 for a list of possible tags)
Note that grid variables should be synced before writing them to disk.
other relevant information such as the current iteration number and physical time, the number of processors, etc.

C2.8.2 Recovery Invocation

Recovering from a checkpoint is a two-phase operation for which the flesh provides distinct timebins for recovery routines to be scheduled at:

CCTK_RECOVER_PARAMETERS: This time bin is executed before CCTK_STARTUP, in which the parameter file is parsed. From these parameter settings, the recovery routines should decide whether recovery was requested, and if so, restore all parameters from the checkpoint file and overwrite those which aren’t steerable.
The flesh loops over all registered recovery routines to find out whether recovery was requested. Each recovery routine should, therefore, return a positive integer value for successful parameter recovery (causing a shortcut of the loop over all registered recovery routines), zero for no recovery, or a negative value to flag an error.
If recovery was requested, but no routine could successfully recover parameters, the flesh will abort the run with an error message. If no routine recovered any parameters, i.e. if all parameter recovery routines returned zero, then this indicates that this run is not a recovery run.
If parameter recovery was performed successfully, the scheduler will set the recovered flag which—in combination with the setting of the Cactus::recovery_mode parameter—decides whether any thorn routine scheduled for CCTK_INITIAL and CCTK_POSTINITIAL will be called. The default is to not execute these initial time bins during recovery, because the initial data will be set up from the checkpoint file during the following CCTK_RECOVER_VARIABLES time bin.
CCTK_RECOVER_VARIABLES: Recovery routines scheduled for this time bin are responsible for restoring the contents of all grid variables with storage assigned from the checkpoint.
Depending on the setting of Cactus::recovery_mode, they should also decide how to treat errors in recovering individual grid variables. Strict recovery (which is the default) requires all variables to be restored successfully (and will stop the code if not), whereas a relaxed mode could, e.g. allow for grid variables, which are not found in the checkpoint file, to be gracefully ignored during recovery.

C2.9 Clocks for Timing

To add a Cactus clock, you need to write several functions to provide the timer functionality (see Section C1.9.1), and then register these functions with the flesh as a named clock.

The function pointers are placed in function pointer fields of a cClockFuncs structure. The fields of this structure are are:

create: void *(*create)(int)
destroy: void (*destroy)(int, void *)
start: void (*start)(int, void *)
stop: void (*stop)(int, void *)
reset: void (*reset)(int, void *)
get: void (*get)(int, void *, cTimerVal *)
set: void (*set)(int, void *, cTimerVal *)
n_vals: int

The first int argument of the functions may be used in any way you see fit.

The n_vals field holds the number of elements in the vals array field of the cTimerVal structure used by your timer (usually 1).

The return value of the create function will be a pointer to a new structure representing your clock.

The second void* argument of all the other functions will be the pointer returned from the create function.

The get and set functions should write to and read from (respectively) a structure pointed to by the cTimerVal* argument:

typedef enum {val_none, val_int, val_long, val_double} cTimerValType;

typedef struct
{
  cTimerValType type;
  const char *heading;
  const char *units;
  union
{
    int        i;
    long int   l;
    double     d;
  } val;
  double seconds;
  double resolution;
} cTimerVal;

The heading field is the name of the clock, the units field holds a string describing the type held in the val field, and the seconds field is the time elapsed in seconds. The resolution field is the smallest non-zero difference in values of two calls to the timer, in seconds. For example, it could reflect that the clock saves the time value internally as an integer value representing milliseconds.

To name and register the clock with the flesh, call the function

CCTK_ClockRegister( "my_clock_name", &clock_func ).

[prev] [prev-tail] [front] [up]

Chapter C2Infrastructure Thorns