Category
Import and Export
Function
Imports datasets from an external HDF5 data file.
Syntax
result, max_index = ImportHDF5 (filename, origin, thickness, stride, index,
reopen, single_precision, user, password,
subject, num_streams, vectorimport);
Inputs
| Name | Type | Default | Description |
|---|---|---|---|
| filename | string | (none) | filename or URL of the HDF5 file to import datasets from |
| origin | integer list or vector | NULL | lower-left corner grid point of the slab to read |
| thickness | integer list or vector | NULL | thickness in grid points of the slab to read |
| stride | integer list or vector | NULL | include every <stride>-th grid point in the slab to read |
| index | integer or string | 0 | index or name of the dataset to import |
| reopen | flag | 0 | reopen file on each execution |
| single_precision | flag | 1 | import double precision floating-point datasets as single precision |
| user | string | (none) | user name for standard FTP authentication during remote file access |
| password | string | (none) | password for standard FTP authentication during remote file access |
| subject | string | (none) | subject name for GSI authentication during remote file access |
| num_streams | integer | 1 | number of parallel streams to use during remote file access |
| vectorimport | flag | 0 | import the dataset as a vector array |
Outputs
| Name | Type | Description |
|---|---|---|
| result | field | the dataset imported as a field with regular positions and connections |
| max_index | integer | largest possible dataset index |
Functional Details
The module imports data from a single dataset of an HDF5 datafile and provides it as a field with a position-dependent data component described on a regular grid.
The imported data can comprise the full N-dimensional dataset in the file or just a slab of it. A slab is an orthogonal subregion within the full dataset, potentially with fewer than N dimensions, and defined by its origin, thickness, and stride parameters. If the vectorimport flag is set, the last dimension of the dataset is treated as a vector index, and the dataset is imported as an (N-1)-dimensional array (or less, if a slab is selected), with each element being a vector.
An HDF5 datafile may contain several datasets, each identified by a unique index starting from 0. The ImportHDF5 module imports data from one dataset at a time, selected by its index or name.
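The effect of the vectorimport flag on the shape of the imported array can be sketched as follows. This is an illustration only, not the module's code: the helper name `split_vector_shape` is hypothetical, and rejecting datasets with fewer than two dimensions is an assumption of the sketch.

```python
def split_vector_shape(dims, vectorimport=False):
    """Return (grid_shape, vector_length) for a dataset of shape dims.

    Hypothetical helper. With vectorimport set, the last dimension is
    treated as the vector index; otherwise every dimension is a grid axis
    and each element is a scalar (vector length 1).
    """
    dims = tuple(dims)
    if vectorimport:
        # Assumption of this sketch: a vector import needs at least one
        # grid dimension in addition to the vector dimension.
        if len(dims) < 2:
            raise ValueError("vector import needs at least 2 dimensions")
        return dims[:-1], dims[-1]
    return dims, 1


# A 64 x 64 x 3 dataset read as a 2D grid of 3-vectors:
print(split_vector_shape((64, 64, 3), vectorimport=True))
# The same dataset read as a 3D grid of scalars:
print(split_vector_shape((64, 64, 3)))
```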
The datafile can be either an HDF5 file on a local filesystem or a remote HDF5 file streamed via a live socket connection or provided by a remote GridFTP server. Each HDF5 datafile must satisfy the following conditions:
Since OpenDX fields require their data component to be either TYPE_INT, TYPE_FLOAT, or TYPE_DOUBLE, the ImportHDF5 module silently ignores any other types of datasets found within the HDF5 datafile.
Depending on a dataset's floating-point precision the data component in the created OpenDX field will have a data type of either float (for single-precision data) or double (for double-precision data). Double-precision data can also be converted on-the-fly into single precision data during the import.
The origin and delta attributes must be vectors of N floating-point numbers where N denotes the number of dimensions of the dataset. Their values should describe the dataset's underlying sample space as a regular grid.
The ImportHDF5 module uses the attributes' values as N-dimensional regular arrays in order to construct the positions and bounding box components for the imported field. If a dataset has no origin or delta attribute, the module will use a vector with default values of N zeros or N ones respectively.
Note: The connections component for the imported field will be constructed from the actual dimensions of the selected slab to read (not from the dataset's dimensions). If a slab was selected with a thickness of 1 in a given dimension the connections between points in that dimension will be eliminated (thus effectively decreasing the connection component's dimensions and changing the imported field's connections type accordingly).
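As a rough illustration of the two points above, the following Python sketch derives per-axis grid coordinates from the origin and delta attributes (with the documented defaults of zeros and ones) and counts the connection dimensions that survive for a given slab thickness. The function names are hypothetical, and the sketch covers a full dataset only: how the module offsets positions for a slab with a nonzero origin or a stride greater than 1 is not described here and is not modelled.

```python
def regular_positions(dims, origin=None, delta=None):
    """Per-axis coordinates of a regular grid with the given dimensions.

    origin and delta stand in for the dataset's attributes of the same
    name; missing attributes default to N zeros and N ones.
    """
    n = len(dims)
    origin = [0.0] * n if origin is None else list(origin)
    delta = [1.0] * n if delta is None else list(delta)
    if len(origin) != n or len(delta) != n:
        raise ValueError("origin and delta must have one entry per dimension")
    return [[origin[d] + i * delta[d] for i in range(dims[d])]
            for d in range(n)]


def connection_rank(thickness):
    """Number of dimensions that still have connections between points.

    Dimensions in which the slab is only 1 grid point thick drop out.
    """
    return sum(1 for t in thickness if t > 1)


# OpenDX element types for regular connections of rank 1, 2, and 3.
ELEMENT_TYPES = {1: "lines", 2: "quads", 3: "cubes"}

print(regular_positions((3, 2), origin=(1.0, 0.0), delta=(0.5, 2.0)))
print(ELEMENT_TYPES[connection_rank((10, 10, 1))])
```

In this sketch, a 3D dataset read with a slab thickness of 1 in its last dimension ends up with rank-2 connections, which corresponds to the quads element type.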
When an HDF5 datafile is opened, the module browses through its contents in order to build a list of all floating-point datasets available in the file. Individual datasets can then be addressed by their index into that list. Additionally, if datasets have a time attribute attached to them (as is usually the case for a time series HDF5 datafile) the list will be sorted by the values of that attribute. The value of the largest possible dataset index is available on the max_index output tab of the module.
On each execution of the module, the data from a single dataset, as selected either by its index or name, is read from the HDF5 datafile, either completely as a full dataset or as a slab according to the origin, thickness, and stride slab parameters as explained below.
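The indexing scheme described above might look like the following Python sketch, which uses plain (name, attributes) pairs in place of real HDF5 datasets. The function names are hypothetical. Sorting only when every dataset carries a time attribute is an assumption of this sketch; the behaviour for files that mix timed and untimed datasets is not stated here.

```python
def build_dataset_list(datasets):
    """Order datasets the way the index parameter addresses them.

    datasets is a list of (name, attrs) pairs in file order, where attrs
    is a dict of attribute values.
    """
    # Assumption of this sketch: sort only if all datasets have a time.
    if datasets and all("time" in attrs for _, attrs in datasets):
        return sorted(datasets, key=lambda item: item[1]["time"])
    return list(datasets)


def select_dataset(dataset_list, index=0):
    """Resolve an integer index or a dataset name to (position, max_index)."""
    max_index = len(dataset_list) - 1
    if isinstance(index, str):
        for position, (name, _) in enumerate(dataset_list):
            if name == index:
                return position, max_index
        raise KeyError("no dataset named %r" % index)
    if not 0 <= index <= max_index:
        raise IndexError("index must be in the range [0, %d]" % max_index)
    return index, max_index


# Hypothetical time series stored out of order in the file.
datasets = [
    ("phi_it2", {"time": 0.2}),
    ("phi_it0", {"time": 0.0}),
    ("phi_it1", {"time": 0.1}),
]
ordered = build_dataset_list(datasets)
print([name for name, _ in ordered])
print(select_dataset(ordered, "phi_it1"))
```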
In addition to the four data, positions, connections, and bounding box components, the imported field will also get a copy of all the attributes attached to the selected dataset, with the same values, data type, and dimensions as their original. Additionally, if no name attribute existed in the selected dataset the ImportHDF5 module will create a string attribute with the name of the dataset as found in the HDF5 datafile and also add it to the imported field.
filename |
A required string parameter to specify the name of the HDF5 file to import datasets from. The HDF5 file can be
|
origin |
An optional parameter to specify the origin of a slab to read from a requested dataset. This parameter must be a list of integers or a vector of integer elements, giving the coordinates in grid points of the slab's start positions within the dataset. The positions are counted starting at 0 and must be in the range [0, dims_i - 1] where dims_i is the size of the dataset in dimension i. For each dataset dimension, a separate position coordinate can be specified; 0 is taken as the default coordinate for unspecified dimensions; if origin has more elements than the number of dataset dimensions, the excess elements are ignored and a warning message is printed. If origin is given as NULL (which is the default value for this parameter) then the slab's origin is taken as all zeros (i.e. the slab starts at the dataset's origin). |
thickness |
An optional parameter to specify the thickness of a slab to read from a requested dataset. This parameter must be a list of integers or a vector of integer elements, giving the slab's size in grid points. The values for thickness must be in the range [0, dims_i - origin_i] where dims_i is the size of the dataset in dimension i, and origin_i is the slab's origin in that dimension. For each dataset dimension, a separate thickness can be specified; if thickness has more elements than the number of dataset dimensions, the excess elements are ignored and a warning message is printed. For unspecified dimensions, if thickness_i is given as zero, or if thickness is given as NULL (which is the default value for this parameter), the slab's thickness defaults to dims_i - origin_i. |
stride |
An optional parameter to specify downsampling factors for a slab to read from a requested dataset. The parameter must be a list of integers or a vector of integer elements, giving the number of grid points to move in each dimension to get to the next grid point to be included in the slab. The values for stride must be >= 1 so that at least one grid point will be included. For each dataset dimension, a separate stride can be specified; 1 is taken as the default stride for unspecified dimensions; if stride has more elements than the number of dataset dimensions, the excess elements are ignored and a warning message is printed. If stride is given as NULL (which is the default value for this parameter) then the slab's stride is taken as all ones (i.e. the slab includes thickness_i grid points). |
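The defaulting and range rules for origin, thickness, and stride can be summarised in a short Python sketch. The function `resolve_slab` is hypothetical and not part of the module. The number of grid points per dimension is computed as thickness divided by stride, rounded up; that formula is an assumption of the sketch, since the text above only states the count for a stride of 1.

```python
def resolve_slab(dims, origin=None, thickness=None, stride=None):
    """Fill in slab defaults and return (origin, thickness, stride, counts)."""
    n = len(dims)

    def pad(values, default):
        # NULL (None) means "use the default in every dimension".
        values = [] if values is None else list(values)
        # Excess elements are ignored (the module also prints a warning).
        values = values[:n]
        return values + [default] * (n - len(values))

    origin = pad(origin, 0)
    thickness = pad(thickness, 0)   # 0 means "everything up to the edge"
    stride = pad(stride, 1)

    counts = []
    for i in range(n):
        if not 0 <= origin[i] <= dims[i] - 1:
            raise ValueError("origin out of range in dimension %d" % i)
        remaining = dims[i] - origin[i]
        if thickness[i] == 0:
            thickness[i] = remaining
        if not 1 <= thickness[i] <= remaining:
            raise ValueError("thickness out of range in dimension %d" % i)
        if stride[i] < 1:
            raise ValueError("stride must be >= 1 in dimension %d" % i)
        # Assumed point count: ceiling of thickness / stride.
        counts.append(-(-thickness[i] // stride[i]))
    return origin, thickness, stride, counts


# Hypothetical 100 x 50 x 20 dataset with partially specified parameters.
print(resolve_slab((100, 50, 20), origin=(10,), thickness=(0, 25, 1),
                   stride=(2, 4)))
```

In this example, the unspecified origin and stride entries fall back to 0 and 1, and the zero thickness in the first dimension expands to the 90 grid points remaining after the origin.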
index |
A required integer parameter to specify the index of the dataset to import, or a string parameter to specify the name of the dataset to import. If index is an integer, its value must be in the range [0, max_index] where max_index is the total number of datasets found in the HDF5 file, minus 1. This value is available on the max_index output. |
reopen |
An optional hidden flag parameter to specify whether the HDF5 file as given by filename should be reopened at each module execution (disabled by default). The reopen flag should be set for HDF5 files whose contents might change over time (as is usually the case for streamed HDF5 files). |
single_precision |
An optional hidden flag parameter to specify whether double-precision floating-point datasets should be converted into single precision during the import (enabled by default). The single_precision flag must be cleared (set to 0) in order to preserve the floating-point precision during an import of double-precision datasets. |
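What is lost when a double-precision value is converted to single precision can be shown with a small Python sketch. It only demonstrates the numerical effect of such a conversion using the standard `struct` module; it is not how the module itself performs the conversion, and `to_single_precision` is a hypothetical name.

```python
import struct


def to_single_precision(value):
    """Round a Python float (double precision) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# 0.5 is exactly representable in single precision, 0.1 is not.
print(to_single_precision(0.5) == 0.5)
print(to_single_precision(0.1) == 0.1)
print(abs(to_single_precision(0.1) - 0.1))
```

Values that are not exactly representable in 32 bits come back slightly changed, which is the precision that clearing the single_precision flag preserves.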
user |
An optional hidden string parameter to specify the user name for standard FTP authentication during remote file access to a GridFTP server. If no user name is specified, ftp (for the anonymous user) will be used. |
password |
An optional hidden string parameter to specify the password for standard FTP authentication during remote file access to a GridFTP server. The password must be given in clear text. If no password is specified, anonymous (for the anonymous user) will be used. |
subject |
An optional hidden string parameter to specify the subject name for GSI authentication during remote file access to a GridFTP server. If no subject name is specified, the subject name of the user's GSI proxy will be used. |
num_streams |
An optional hidden integer parameter to specify the number of parallel streams to use during remote file access. By default only a single stream will be used for each connection to a GridFTP server. |
vectorimport |
An optional hidden flag parameter to specify whether the dataset should be imported as a vector array (disabled by default). The vectorimport flag must be set in order to preserve the vector dimension during an import of vector datasets (e.g., position, velocity, etc.). |
See Also
Import, Slab
Example Visual Programs
SlabViz.net
The example program is contained in the net/ subdirectory of the OpenDXutils package. Please make sure that you change into this directory before running the program because it uses an HDF5 sample data file located relative to that directory.
Further Documentation
The ImportHDF5 data import module is provided by the OpenDXutils package. More information about this package can be found on the Cactus Code visualization page for OpenDX.
Last modified: $Header: /cactus/CactusWebSite/VizTools/ImportHDF5.html,v 1.6 2006/09/12 08:52:32 tradke Exp $