FiberBundleHDF5  $Id: FiberHDF5.dfg,v 1.8 2006/12/12 12:32:50 werner Exp $
File Format Definition

Introduction

This page defines the FiberBundleHDF5 (F5) file format.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119. (http://www.ietf.org/rfc/rfc2119.txt)

File Elements

F5 will never use attribute names starting with '_'. User extensions SHOULD use names starting with '_' until the extension is included in F5.

F5 will ignore all hdf5 elements (groups, datasets, ...) starting with '_'. Applications are free to use such names for things which are never planned to be included into F5.
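The underscore rule can be sketched as a small predicate. This is a minimal illustration in plain Python that does not touch the HDF5 library; the function names `is_ignored_by_f5` and `split_names` are hypothetical and not part of any F5 implementation.

```python
def is_ignored_by_f5(name: str) -> bool:
    """Return True if F5 ignores an element or attribute with this name.

    Assumption: only the first character of the name matters.
    """
    return name.startswith("_")


def split_names(names):
    """Separate names into (f5_visible, user_private) lists, keeping order."""
    visible = [n for n in names if not is_ignored_by_f5(n)]
    private = [n for n in names if is_ignored_by_f5(n)]
    return visible, private
```

An application browsing a group would only hand the first list to its F5 reader and leave the second list untouched.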

Root element

The root element contains a group version with an attribute version, which is a url indicating the file format version. Using a url is inspired by the dtd naming scheme used by xml. (Note: it is not possible to attach an attribute to the root element itself. Otherwise this would be an appropriate place.)

All urls MUST be limited to 511 characters + 1 trailing 0 to make it easier for applications to process the files.

Applications SHOULD check if urls respect this limit.
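A check of this limit might look as follows. This is a sketch under the assumption that urls are plain ASCII, so that characters and bytes coincide; the names `F5_MAX_URL_CHARS` and `url_respects_limit` are hypothetical.

```python
F5_MAX_URL_CHARS = 511  # plus one trailing 0, i.e. 512 bytes in total


def url_respects_limit(url: str) -> bool:
    """Check the 511-character limit on an F5 url.

    Assumption: the url is plain ASCII; non-ASCII urls are rejected
    here rather than guessed at.
    """
    try:
        encoded = url.encode("ascii")
    except UnicodeEncodeError:
        return False
    return len(encoded) <= F5_MAX_URL_CHARS
```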

For now the version is of the format:

  • version = "http://www.zib.de/visual/F5-MAJOR.MINOR.RELEASE/"
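An application could extract the three version numbers from this url with a regular expression. The sketch below assumes that MAJOR, MINOR and RELEASE are non-negative decimal integers, which the text does not state explicitly; `parse_f5_version` is a hypothetical helper name.

```python
import re

# Pattern for "http://www.zib.de/visual/F5-MAJOR.MINOR.RELEASE/".
# Assumption: each component is a non-negative decimal integer.
_VERSION_RE = re.compile(r"http://www\.zib\.de/visual/F5-(\d+)\.(\d+)\.(\d+)/")


def parse_f5_version(url: str):
    """Return (major, minor, release) as ints, or None if the url does not match."""
    match = _VERSION_RE.fullmatch(url)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())
```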

The domain f5.org should be registered.

Toplevel Groups

The root level of an F5 file contains two reserved groups called

  • FIBER_CONTENTS ("/TableOfContents"), which contains a reverse mapping of the objects contained in the F5 file as an alternative way of browsing. This group is RECOMMENDED; it can be re-created from other information in the file and merely serves as a caching mechanism. An implementation should check the time stamp information to see whether this group is in sync with the other data.
  • FIBER_HDF5_GLOBAL_CHARTS ("Charts"), which contains a list of coordinate systems including named type definitions which are used throughout the file. This group is REQUIRED, as it is the place to store named types.
  • Parameter space slices and grid groups:
    • The group FIBER_CONTENTS / FIBER_PARAMETER_SPACE ("/TableOfContents/Parameters/") contains an arbitrary number of arbitrarily named subgroups. Each of these subgroup names defines a parameter in the parameter space. If this RECOMMENDED subgroup does not exist, then backward compatibility with version 0.1.0 assumes that there is one parameter called FIBER_HDF5_TIME_ATTRIB ("Time"). This corresponds to an F5 entry named "/TableOfContents/Parameters/Time/". Each of these parameter subgroups may contain arbitrary user-specific semantic information, which is not used by F5.
    • Having defined the parameters of the parameter space, we distinguish between two types of group entries in the root level:
      • groups with at least one attribute that is named like one of the entries of the parameter group; these groups are called parameter slices
      • groups which contain no such attribute; these groups are called grid groups
    • parameter slice groups may contain other parameter slice groups as well as grid groups.
    • the names of parameter slice groups and grid groups are arbitrary, except that groups in the top level must not be named FIBER_CONTENTS or FIBER_HDF5_GLOBAL_CHARTS. According to HDF5, they MUST NOT contain slashes '/', and it is RECOMMENDED that they conform to the identifier rules of C, i.e. don't contain spaces or special characters.
    • a grid group may also be a symlink to another grid group; a parameter slice group may never be a symlink.
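The distinction between parameter slices and grid groups can be sketched as below. The code models a group only by its name and its attribute names and does not read an actual file; `parameter_names`, `classify_group` and `RESERVED_TOPLEVEL` are hypothetical names chosen for this illustration.

```python
RESERVED_TOPLEVEL = {"TableOfContents", "Charts"}


def parameter_names(parameter_subgroups):
    """Names that define the parameter space.

    `parameter_subgroups` stands for the subgroup names found under
    "/TableOfContents/Parameters/", or None if that group is absent.
    For backward compatibility with version 0.1.0 a missing group means
    a single parameter called "Time".
    """
    if parameter_subgroups is None:
        return {"Time"}
    return set(parameter_subgroups)


def classify_group(name, attribute_names, parameters, toplevel=True):
    """Classify a group as 'reserved', 'parameter slice' or 'grid'."""
    if toplevel and name in RESERVED_TOPLEVEL:
        return "reserved"
    # At least one attribute named like a parameter makes a parameter slice.
    if parameters & set(attribute_names):
        return "parameter slice"
    return "grid"
```

For example, with the parameters "Time" and "Pressure", a group carrying a "Time" attribute is classified as a parameter slice, while a group without such an attribute is a grid group.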

Time Slice

Note that time slices are replaced by parameter slices in F5 version 0.1.1 (see below).

  • Naming: 'T=%f' (RECOMMENDED)
  • hdf5 Type: Group (REQUIRED)
  • Attributes:
    • Time: class float (REQUIRED)
  • Definition (v.0.1.0): Any toplevel group with an attribute "Time" of a type that is convertible to float is a "Time Slice". Groups that do not conform may have another meaning (/Charts, /TableOfContents) or may be completely ignored. In general, such non-timeslice groups contain secondary (derivable) but global information.
  • In a future extension, time slices might be ordered hierarchically, such that a time slice group contains multiple other time slice groups that correspond to times up to the next time slice group on the same level. This will allow more efficient browsing of datasets with a huge number of timesteps and less "important" data on finer time resolution, e.g. time-refined adaptive mesh refinements.

Parameter Slice

  • Naming: 's=f' for each parameter s with value f (RECOMMENDED)
  • hdf5 Type: Group (REQUIRED)
  • Attributes:
    • at least one attribute of class float (REQUIRED) named according to the parameter names in the parameter space ("/TableOfContents/Parameters/"). Example:
      /TableOfContents/Parameters/Time/               GROUP
      /TableOfContents/Parameters/Pressure/           GROUP
      /T=0.4/                                         GROUP                   (Parameter slice group)
      /T=0.4/Time                                     FLOAT ATTRIBUTE                 
      /T=0.4/P=15/                                    GROUP                   (Parameter slice group)
      /T=0.4/P=15/Pressure                            FLOAT ATTRIBUTE
      /T=0.4/P=15/SimulationData/                     GROUP                   (Grid group)
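
The RECOMMENDED group names of the example above could be generated as sketched here. The '%g' number format is an assumption, since the text does not prescribe one, and the helper names `slice_group_name` and `slice_path` are hypothetical.

```python
def slice_group_name(symbol, value):
    """Build a group name of the RECOMMENDED form 's=f'.

    Assumption: '%g' formatting is used for the value.
    """
    return "%s=%g" % (symbol, value)


def slice_path(settings):
    """Join nested parameter slice names into an absolute path.

    `settings` is a sequence of (symbol, value) pairs, outermost first.
    """
    if not settings:
        return "/"
    return "/" + "/".join(slice_group_name(s, v) for s, v in settings) + "/"
```

Note that the group name is only a convention; the attribute on the group, not its name, decides which parameter value the slice belongs to.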
      

Grids

  • Naming:
    • SHOULD make sense to humans.
    • SHOULD only contain alphanumerics.
    • SHOULD start with a letter.
  • hdf5 Type: Group (REQUIRED)
  • Attributes:
    • MUST NOT contain any attributes which are named as in the parameter space (v. 0.1.1) or a floating point time attribute (v. 0.1.0)
    • "TimeStep" (RECOMMENDED) Useful for data originating from subsequent simulations. All fields of a grid object must refer to the same integer timestep. The attribute may be omitted on interpolated grid objects, which marks them as secondary data that can be recomputed from grids carrying TimeStep information. The timestep value itself is arbitrary, but must be unique among all Grid objects with the same Grid identifier.
  • Definition: Any group beyond a time slice group is a grid group.
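
The naming recommendations for grids can be checked mechanically, apart from "SHOULD make sense to humans". The sketch below assumes that alphanumerics means ASCII letters and digits and that a name starts with an ASCII letter; `grid_name_warnings` is a hypothetical helper name.

```python
def grid_name_warnings(name):
    """Return the naming recommendations that `name` does not follow.

    Assumption: 'alphanumerics' means ASCII letters and digits.
    """
    if not name:
        return ["name is empty"]
    warnings = []
    if not (name.isascii() and name.isalnum()):
        warnings.append("contains non-alphanumeric characters")
    if not (name[0].isascii() and name[0].isalpha()):
        warnings.append("does not start with a letter")
    return warnings
```

Since these are SHOULD rules, a reader would report the warnings rather than reject the file.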

Fields

Fields SHOULD use the same naming conventions as grids do.

The name of a field MUST be unique. If fields in more than one grid have the same name, they are different representations of the same data. E.g. a field might be stored as a uniform scalar field. At the same time this field might be evaluated on a surface and stored as a surface field in another grid. The name of these two fields MUST be the same. This enables an application to change the data of the uniform field and know that it has to change the data of the surface field at the same time. If the application is not able to understand the surface representation it SHOULD warn the user and MIGHT want to remove the surface representation.
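
The consequence of this rule for an application that modifies a field can be sketched as follows. The mapping from grid names to field names is a stand-in for what would be read from the file, and the function names are hypothetical.

```python
from collections import defaultdict


def representations_by_field(grids):
    """Map each field name to the grids that store a representation of it.

    `grids` maps a grid name to the field names it contains.
    """
    index = defaultdict(list)
    for grid_name, field_names in grids.items():
        for field_name in field_names:
            index[field_name].append(grid_name)
    return dict(index)


def grids_to_update(grids, field_name, understood_grids):
    """Split the grids holding `field_name` into (can_update, must_warn)."""
    holders = representations_by_field(grids).get(field_name, [])
    can_update = [g for g in holders if g in understood_grids]
    must_warn = [g for g in holders if g not in understood_grids]
    return can_update, must_warn
```

When the second list is not empty, the application SHOULD warn the user, because those representations would otherwise go out of sync with the modified data.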

Each F5 field might be represented by more than one hdf5 dataset, each carrying the name of the field, plus further supporting hdf5 datasets/groups.

hdf5 datasets are stored in fortran order.

The basic structure of the grid is described by the field named 'Positions'. This is a reserved keyword and MUST NOT be used by any other field.

Each field has one or more F5 ContentTypes. A field ContentType describes how the field is represented by hdf5 objects. The field ContentType is a url.
ContentTypes defined in the standard will start with the version url of the file (see above). The next part will point to a specific part of the specification together with its version.

User extension ContentTypes use other urls.
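
Telling standard ContentTypes from user extensions then reduces to a prefix test against the version url of the file. The sketch below is an illustration only; splitting the remainder at '/' is a convenience assumed here, not a structure the format enforces, and both function names are hypothetical.

```python
def is_standard_content_type(content_type, file_version_url):
    """True if the ContentType starts with the version url of the file."""
    return content_type.startswith(file_version_url)


def specification_part(content_type, file_version_url):
    """Return the pieces after the version url, or None for user extensions."""
    if not is_standard_content_type(content_type, file_version_url):
        return None
    rest = content_type[len(file_version_url):]
    return [piece for piece in rest.split("/") if piece]
```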

A field ContentType could e.g. be

  • "http://www.zib.de/visual/F5-MAJOR.MINOR.PATCH/Regular-MAJOR.MINOR.PATCH/StandardCartesian/Uniform/VertexCentered/"

A field with name "ImageData" with this ContentType is represented as:

Points/                                         Group
Points/StandardCartesianChart3D/                Group
Points/StandardCartesianChart3D/Positions       Group
    Attribute: DataspaceDims {3}
        Type:      32-bit big-endian integer
        Data:  30, 20, 10
    Attribute: base      scalar
        Type:      struct {
                   "x"                +0    IEEE 32-bit big-endian float
                   "y"                +4    IEEE 32-bit big-endian float
                   "z"                +8    IEEE 32-bit big-endian float
               } 12 bytes
        Data:  {-1, -1, -1}
    Attribute: delta     scalar
        Type:      struct {
                   "x"                +0    IEEE 32-bit big-endian float
                   "y"                +4    IEEE 32-bit big-endian float
                   "z"                +8    IEEE 32-bit big-endian float
               } 12 bytes
        Data:  {0.2, 0.1, 0.0666667}
    Attribute: extent    {3}
        Type:      32-bit big-endian integer
        Data:  10, 20, 30
Points/StandardCartesianChart3D/ImageData       Dataset {30/30 20/20 10/10}
    Type:     

For a strict definition of all the attributes, see below. Note that the values in extent and DataspaceDims == dims(Dataset) are swapped because the data are stored in fortran order.
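
The relation between extent and DataspaceDims can be sketched in a few lines. The first two functions follow from the fortran-order rule (x varies fastest). The third one, which places vertex (i, j, k) at base + index * delta, is an assumption about the meaning of base and delta for a uniform grid. All function names are hypothetical.

```python
def dataspace_dims(extent):
    """Reverse the extent (x, y, z) to get the hdf5 dataset dims (z, y, x)."""
    return tuple(reversed(extent))


def linear_offset(i, j, k, extent):
    """Offset of element (i, j, k) in storage where x varies fastest."""
    nx, ny, nz = extent
    if not (0 <= i < nx and 0 <= j < ny and 0 <= k < nz):
        raise IndexError("index outside extent")
    return i + nx * (j + ny * k)


def vertex_position(i, j, k, base, delta):
    """Assumed position of vertex (i, j, k): base + index * delta per component."""
    return tuple(b + n * d for b, n, d in zip(base, (i, j, k), delta))
```

For the example above, an extent of (10, 20, 30) gives dataset dims of (30, 20, 10), and stepping once in y moves 10 elements forward in storage.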

The loosest ContentType is

  • "http://www.zib.de/visual/F5-MAJOR.MINOR.PATCH/", which only indicates that this field is stored in the F5 file format and not in something else. An application might well figure out what the data represent, but it is not explicitly defined for this field.

A field ContentType is just a string in compliance with the url naming scheme. It doesn't enforce any hierarchy, though it might be useful to have one.

Users MUST have some kind of access to the domain they're using for user defined constraints to avoid name space pollution.

All ContentTypes of each field (including the Positions) are listed as attributes of a group Fields/Fieldname. F5 uses the content of the hdf5 attributes of this group regardless of their names. The first ContentType should be named ContentType000, the next ContentType001, and so on.
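
The attribute naming convention and the name-independent reading rule can be sketched as follows. The attributes of a Fields/Fieldname group are modelled as a plain dict here; both function names are hypothetical.

```python
def content_type_attribute_names(count):
    """Names ContentType000, ContentType001, ... for `count` ContentTypes."""
    return ["ContentType%03d" % index for index in range(count)]


def content_types_of(attributes):
    """Collect every attribute value of a Fields/Fieldname group.

    F5 uses the content of all attributes regardless of their names, so
    the names only serve to give a stable order here.
    """
    return [attributes[name] for name in sorted(attributes)]
```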

The group Fields is also useful as a directory of the fields contained in this grid.

An application SHOULD try to deal with all the listed ContentTypes. Think of the full field ContentType as the sum of all the F5 field constraints. It might be possible to ignore some of the constraints during read-only access without losing relevant information (e.g. if a field is stored in two representations, such as cartesian and polar, which could be mapped to each other), but this is not guaranteed. An application SHOULD warn the user about F5 field constraints it is ignoring. It is not safe to ignore constraints during write access on an existing file (when modifying or appending data). If the application does not understand the semantics of a ContentType, it will not be able to correctly modify the data. If write access is required nonetheless, the application might consider saving the data to a new file including only the constraints it is aware of. Another way could be to delete all data not used by the constraints the application is aware of. This should only be done after user confirmation.
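
The access rules of this paragraph can be condensed into a small decision function. This is a sketch only; the return values 'ok', 'warn' and 'refuse' and the function names are hypothetical and not defined by F5.

```python
def unknown_content_types(listed, understood):
    """ContentTypes listed for a field that the application does not understand."""
    return [ct for ct in listed if ct not in understood]


def access_policy(listed, understood, want_write):
    """Return 'ok', 'warn' or 'refuse' for the requested access.

    'ok':     every listed ContentType is understood.
    'warn':   read-only access with ignored ContentTypes; warn the user.
    'refuse': in-place write access is not safe; the alternatives are
              writing a new file or, after user confirmation, deleting
              the data the application does not understand.
    """
    if not unknown_content_types(listed, understood):
        return "ok"
    return "refuse" if want_write else "warn"
```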

The ContentType information for the above example would be:

Fields                                                        Group
Fields/Positions                                              Group
Fields/Positions/                                             Group
    Attribute: ContentType {STRING}
        "http://www.zib.de/visual/F5-0.1.0/RegularGrid-0.1.0/StandardCartesian/Uniform/"
Fields/ImageData                                              Group
Fields/ImageData/                                             Group
    Attribute: ContentType {STRING}
        "http://www.zib.de/visual/F5-0.1.0/RegularGrid-0.1.0/StandardCartesian/Uniform/VertexCentered/"

Basic building blocks

Simple dataset

Examples: curvilinear coordinates, values of a scalar field.

Storing a value for every node in fortran order.

F5 Field ContentTypes

Each list entry describes a template for the Positions part of a field type and the additional data needed to store a field at these positions. The trees reside in an F5 Grid hdf5 group.

  • "VertexCentered StandardCartesian Uniform"
    • REQUIRED for Positions
      Fields                                                        Group
      Fields/Positions                                              Group
      Fields/Positions/                                             Group
          Attribute: ContentType {STRING}
              "http://www.zib.de/visual/F5-0.1.0/RegularGrid-0.1.0/StandardCartesian/Uniform/"
      Points/                                         Group
      Points/StandardCartesianChart3D/             Group
      Points/StandardCartesianChart3D/Positions    Group
          Attribute: DataspaceDims {3}
              Type:  class H5T_INTEGER 
              Data:  {extent in z, extent in y, extent in x}
          Attribute: base      scalar
              Type:      struct {
                         "x"                class H5T_FLOAT
                         "y"                class H5T_FLOAT
                         "z"                class H5T_FLOAT
                     }
              Data:  {basex, basey, basez}
          Attribute: delta     scalar
              Type: same type as Attribute base
              Data:  {deltax, deltay, deltaz}
          Attribute: extent    {3}
              Type:  class H5T_INTEGER 
              Data:  {extent in x, extent in y, extent in z}
      
      • Note that the compound datatype used for base and delta MUST be built up of three entries named x, y and z. Using exactly these names indicates that this is a cartesian chart. Using this order indicates that it is a standard chart. A more general type could use another ordering of x, y and z. The reference to detect a Standard Cartesian chart is the type of the base attribute, not the name of the group. Though the name of the group is also fixed, a more general implementation need not check the name of the group. But it has to check the compound type used to store positions.
      • Note that the ordering of the entries in DataspaceDims and extent is reversed indicating that the hdf5 datasets are stored in fortran order.
    • REQUIRED for MyField
      Fields/MyField                                                Group
      Fields/MyField/                                               Group
          Attribute: ContentType {STRING}
              "http://www.zib.de/visual/F5-0.1.0/RegularGrid-0.1.0/StandardCartesian/Uniform/VertexCentered/"
      Points/StandardCartesianChart3D/MyField    Dataset {extentz, extenty, extentx}
          Type: any hdf5 type, see Conventions for compound types.
      
    • RECOMMENDED for Positions
      Points/StandardCartesianChart3D/Positions    
          Attribute: components_min scalar
              Type: same type as Attribute base
              Data:  {minx, miny, minz}
          Attribute: components_max scalar
              Type: same type as Attribute base
              Data:  {maxx, maxy, maxz}
      
    • RECOMMENDED for MyField
      Points/StandardCartesianChart3D/MyField    
          Attribute: components_min scalar
              Type: same type as MyField hdf5 dataset 
              Data: minimum of each component 
          Attribute: components_max scalar
              Type: same type as MyField hdf5 dataset 
              Data: maximum of each component 
      
  • "VertexCentered StandardCartesian zStacked"
  • "VertexCentered StandardCartesian Rectilinear"
  • "VertexCentered StandardCartesian Curvilinear"
  • "VertexCentered StandardPolar Uniform"
    • REQUIRED for Positions
      Fields                                                        Group
      Fields/Positions                                              Group
      Fields/Positions/...
      Points/                                         Group
      Points/StandardPolarChart3D/                 Group
      Points/StandardPolarChart3D/Positions        Group
          Attribute: DataspaceDims {3}
              Type:  class H5T_INTEGER 
              Data:  {extent in phi, extent in theta, extent in r}
          Attribute: base      scalar
              Type:      struct {
                         "r"                class H5T_FLOAT
                         "theta"            class H5T_FLOAT
                         "phi"              class H5T_FLOAT
                     }
              Data:  {baser, basetheta, basephi}
          Attribute: delta     scalar
              Type:      struct {
                         "r"                class H5T_FLOAT
                         "theta"            class H5T_FLOAT
                         "phi"              class H5T_FLOAT
                     }
              Data:  {deltar, deltatheta, deltaphi}
          Attribute: extent    {3}
              Type:  class H5T_INTEGER 
              Data:  {extent in r, extent in theta, extent in phi}
      
    • REQUIRED for MyField
      Fields/MyField                                                Group
      Fields/MyField/VertexCentered StandardPolar Uniform           SHOULD be Group
      Points/StandardPolarChart3D/MyField    Dataset {extentphi, extenttheta, extentr}
          Type: any hdf5 type, see Conventions for compound types.
      
  • "VertexCentered StandardPolar zStacked"
  • "VertexCentered StandardPolar Rectilinear"
  • "VertexCentered StandardPolar Curvilinear"
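
Detecting the chart from the compound type of the base attribute, as required in the notes for the StandardCartesian entry above, can be sketched as follows. The member names are passed in as a plain sequence instead of being read from an hdf5 type. Applying the same name-and-order rule to r, theta, phi for polar charts is an assumption drawn from the StandardPolar listing, and `chart_kind` is a hypothetical helper name.

```python
STANDARD_CHARTS = {
    ("x", "y", "z"): "StandardCartesian",
    # Assumption: the polar chart follows the same name-and-order rule.
    ("r", "theta", "phi"): "StandardPolar",
}


def chart_kind(member_names):
    """Classify a chart from the member names of the `base` compound type.

    Returns 'StandardCartesian' or 'StandardPolar' when names and order
    match, 'Cartesian' or 'Polar' when only the set of names matches
    (another ordering), and None otherwise.
    """
    names = tuple(member_names)
    if names in STANDARD_CHARTS:
        return STANDARD_CHARTS[names]
    for standard_names, kind in STANDARD_CHARTS.items():
        if sorted(names) == sorted(standard_names):
            return kind[len("Standard"):]
    return None
```

The group name (e.g. StandardCartesianChart3D) is deliberately not consulted, since the type of the base attribute is the reference.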