---
# **F5 PERFORMANCE EXTENSION: TableOfContents and Parameter Space**
---

# 0. Introduction and Scope

This document specifies the **TableOfContents** (ToC) performance extension to the F5
data model. It is independent of and orthogonal to the type extension series
(F5_Extension_1_NamedTypes.md, F5_Extension_2_CombMaps.md). It adds no semantic content
visible to an end user: a file with a TableOfContents is semantically equivalent to
one without. What it provides is **O(1) lookup** for information that would otherwise
require O(timeslices × grids) traversal.

The ToC is **semantically optional but practically essential** for large files.

The fundamental asymmetry: information in the ToC is **trivial to write** — the
writer already knows what it is writing — but **expensive to reconstruct by reading**
— it requires traversing the entire slice hierarchy.

This document also specifies:
- **Parameter attributes** (§2): how parameter values are typed and annotated beyond
  a plain double, with time as the primary example
- **Units** (§2.4): the open problem of unit specification, and the F5 approach
- **Parameter space** (§5): generalisation of the time axis to n dimensions

**Relationship to the core spec:** All rules from F5_Layout.md apply. The ToC is
supplementary. Readers MUST NOT require its presence.

---

# 1. TableOfContents Group Structure

```
/TableOfContents/
    TypeInfo               <- default named type for field grouping (§3)
    <AdditionalTypeInfo>   <- optional additional named types (§3.3)
    Grids/
        <GridName>/        <- one subgroup per Grid (§4)
            F5::TimeTable  <- extendable dataset of parameter-value/path pairs
            <SlicePath>    <- soft link per slice
    Fields/
        <FieldName>/       <- one subgroup per Field name (§4.3)
            <GridName>     <- soft link -> /TableOfContents/Grids/<GridName>
    Parameters/
        <ParamName>/       <- one subgroup per parameter dimension (§5)
```

---

# 2. Parameter Attributes: Typed Values Beyond a Plain Double

## 2.1 The core problem

Every slice group carries one attribute per parameter dimension (§5). The most common
case is a single `Time` attribute. A naive implementation stores this as a plain
`double`, which is insufficient.

A plain double conflates fundamentally different quantities:
- **Calendar time**: a specific moment on the human calendar (satellite observation
  at 14:32 UTC, August 29, 2005)
- **Simulation coordinate time**: a unitless or physics-unit parameter with no
  calendar meaning (black hole merger at t = 3533 M, where M is the ADM mass)
- **Scaled time**: time measured in units that scale with a physical parameter
  (in numerical relativity, the time unit is often the total mass of the system,
  making it simultaneously "time" and "mass" in geometrized units)

When multiple datasets from different sources are combined — as in the Hurricane
Katrina visualization merging satellite, atmospheric, and storm-surge data — treating
all parameter values as plain doubles leads to silent errors. The Mars Climate Orbiter
failure (1999), caused by mixing imperial and metric units, illustrates the class of
failure that typed parameter values prevent.

## 2.2 The F5 approach: named HDF5 types for parameter attributes

The F5 solution is to store parameter attributes using a **named HDF5 type** rather
than a plain double. The named type is numerically identical to `H5T_NATIVE_DOUBLE`
but carries semantic information as attributes on the type object.

For the `Time` parameter:

```
/TableOfContents/Parameters/Time/
    F5::Time           <- named datatype (= H5T_NATIVE_DOUBLE numerically)
                         Attribute: TimeUnits  <F5_TimeUnits value or string>
                         Attribute: offset     <double; reference epoch if applicable>
                         Attribute: comment    <UTF-8 string; human-readable>
```

The `Time` attribute on each slice group uses this named type:

```
/t=3533.4/
    Attribute: Time    3533.4    (type: F5::Time, TimeUnits=F5_TIME_UNITLESS)
```

Using the named type has two consequences:
1. HDF5's automatic type conversion is suppressed — a reader cannot silently read a
   calendar-time value as a unitless coordinate without explicitly choosing to do so
2. All slice groups share one type definition, so global semantic metadata (units,
   reference epoch) is updated once on the type object and is immediately visible to
   all slices

For non-time parameter dimensions the same pattern applies: a named type derived from
`H5T_NATIVE_DOUBLE` with appropriate unit information as attributes. See §2.4 on the
open problem of units for non-time parameters.

## 2.3 F5_TimeUnits: a suggestive starting point

The paper (Buleu & Benger, NCUR 2007) introduced an enum for time units:

```c
typedef enum {
    F5_TIME_UNSPECIFIED   = 0,  /* unit unknown                              */
    F5_TIME_UNITLESS      = 1,  /* dimensionless coordinate                  */
    F5_TIME_NANOSECONDS   = 2,
    F5_TIME_MICROSECONDS  = 3,
    F5_TIME_MILLISECONDS  = 4,
    F5_TIME_SECONDS       = 5,
    F5_TIME_MINUTES       = 6,
    F5_TIME_HOURS         = 7,
    F5_TIME_DAYS          = 8,
    F5_TIME_YEARS         = 9,
    F5_TIME_MEGAYEARS     = 10,
    F5_TIME_ELECTRONVOLTS = 11, /* quantum mechanics (hbar/eV)               */
    F5_TIME_METRES        = 12  /* c=1 units (light-travel time)             */
} F5_TimeUnits;
```

This enum is **suggestive, not normative, and explicitly open-ended**. It covers the
most common cases encountered in the original implementation but cannot cover all
possible time units. A closed enumeration for units is impossible in principle:

- SI prefixes alone multiply every base unit by the full prefix set, so enumerating
  all combinations is impractical, though some approaches do attempt it. A more
  compact approach stores the prefix as a numeric scale factor, which can express
  any prefix to the precision a floating-point number can represent
- Physical scales create non-standard units: in black hole merger simulations, the
  natural time unit is the total ADM mass M of the system — time is simultaneously
  a mass in geometrized units (G=c=1)
- There is a meaningful distinction between "unitless" (time as a pure coordinate
  with no physical scale) and "scalable" (time measured in units that are themselves
  physical quantities varying per simulation)

The enum should be understood as a registry of frequently used values, analogous to
EPSG codes for geoscientific coordinate reference systems or HDF5 filter registration
numbers for compression algorithms: a curated common subset, extensible by convention,
never exhaustive. `F5_TIME_UNSPECIFIED` is always available as the fallback; any value
is more informative than a bare double with no annotation.
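
As an illustration of how a reader might consume such a registry, the sketch below maps the enum values that are exact multiples of the second to a scale factor and reports every other value as non-convertible. The function name `f5_time_unit_in_seconds` is hypothetical and not part of the F5 API. Years and megayears are deliberately left unconverted, since the enum does not say which definition of a year is meant.

```c
#include <stddef.h>

/* Enum values as listed above. */
typedef enum {
    F5_TIME_UNSPECIFIED   = 0,
    F5_TIME_UNITLESS      = 1,
    F5_TIME_NANOSECONDS   = 2,
    F5_TIME_MICROSECONDS  = 3,
    F5_TIME_MILLISECONDS  = 4,
    F5_TIME_SECONDS       = 5,
    F5_TIME_MINUTES       = 6,
    F5_TIME_HOURS         = 7,
    F5_TIME_DAYS          = 8,
    F5_TIME_YEARS         = 9,
    F5_TIME_MEGAYEARS     = 10,
    F5_TIME_ELECTRONVOLTS = 11,
    F5_TIME_METRES        = 12
} F5_TimeUnits;

/* Hypothetical helper, not part of the F5 API.
 * Returns 1 and stores the length of one unit in seconds in *seconds when the
 * unit is an exact multiple of the second. Returns 0 and leaves *seconds
 * untouched for every other value, including values unknown to this reader. */
int f5_time_unit_in_seconds(F5_TimeUnits unit, double *seconds)
{
    double s;
    switch (unit) {
    case F5_TIME_NANOSECONDS:  s = 1e-9;    break;
    case F5_TIME_MICROSECONDS: s = 1e-6;    break;
    case F5_TIME_MILLISECONDS: s = 1e-3;    break;
    case F5_TIME_SECONDS:      s = 1.0;     break;
    case F5_TIME_MINUTES:      s = 60.0;    break;
    case F5_TIME_HOURS:        s = 3600.0;  break;
    case F5_TIME_DAYS:         s = 86400.0; break; /* assumes a day of exactly 86400 s */
    default:
        return 0; /* unitless, unspecified, scale-dependent, or unknown */
    }
    if (seconds != NULL) {
        *seconds = s;
    }
    return 1;
}
```

The `default` branch is what keeps the registry open-ended: a reader built against an older list treats a newer value like `F5_TIME_UNSPECIFIED` instead of failing.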

The per-field versioning mechanism (§3) ensures that future changes to the unit
representation — including adoption of emerging standards — can be introduced without
invalidating existing files.

## 2.4 Units: an open problem

The general problem of unit specification for scientific data has no solution that
satisfies all communities. Two reference points:

**CGNS units** (CFD General Notation System): a comprehensive engineering approach
covering mass, length, time, temperature, angle, and their combinations. Well-adopted
in aerospace and CFD. The approach is enum-based — a closed list of named units —
consistent with engineering practice but contrary to the F5 philosophy of favouring
structure over enumeration. CGNS is a useful reference for coverage and naming
conventions.

**C++ mp-units (P1935, targeting C++29)**: a proposed standard library for physical
quantities and units. Type-safe, dimension-aware, SI-complete. If adopted into the
C++ standard, it will represent the most rigorous software-level unit treatment
available. F5 SHOULD be designed to be compatible with mp-units encoding when it
stabilises, since the per-field versioning mechanism (§3) allows the unit
representation to evolve without breaking existing files.

**Current F5 approach:** The `F5_TimeUnits` enum and a UTF-8 `comment` attribute on
the named type are the current mechanism. This is explicitly a starting point. The
design principle is: *anything is better than a bare double*. A string annotation
that says "solar masses" is more informative than silence, even if it is not
machine-enforceable. A named type that carries any unit information is safer than a
plain double. Subsequent extensions will tighten this as unit standards mature.

String attributes for units on `Parameters/` subgroups are RECOMMENDED as advisory
annotations:

```
/TableOfContents/Parameters/Time/
    Attribute: Units   "M_sun"   (advisory, non-normative)
```

The `F5::Time` named type approach (§2.2) and the string attribute approach are
complementary, not contradictory: the named type provides the machine-readable
mechanism; the string attribute provides the human-readable documentation. Both SHOULD
be present when the unit is known.

## 2.5 Calendar vs. non-calendar time

For calendar time, the `offset` attribute stores a reference epoch. The absolute time
of a slice is `offset + TimeValue`. For unitless simulation time, the `offset`
attribute is absent or zero and is semantically meaningless.

The deeper taxonomy of time — timescales (TAI, UTC, TDT, TCB, UT1), leap seconds,
calendars (Gregorian, Julian, etc.), time zones — is deferred to a future time
semantics extension. The Buleu & Benger (2007) paper provides a roadmap. The current
`F5::Time` implementation is a necessary first step, not a complete solution.

---

# 3. TypeInfo: Field Grouping and Versioning via Named Types

## 3.1 Purpose

The `TypeInfo` named type in the TableOfContents is committed once and shared via
HDF5 named type links by all Fields in the file that are associated with it. Its
values enumerate the field storage types from core spec §8:

```c
typedef enum {
    F5_UNKNOWN_ARRAY_TYPE             = 0,
    F5_CONTIGUOUS                     = 1,
    F5_SEPARATED_COMPOUND             = 2,
    F5_CONSTANT                       = 3,
    F5_FRAGMENTED_CONTIGUOUS          = 4,
    F5_FRAGMENTED_SEPARATED_COMPOUND  = 5,
    F5_DIRECT_PRODUCT                 = 6,
    F5_INDEX_PERMUTATION              = 7,
    F5_UNIFORM_SAMPLING               = 8,
    F5_FRAGMENTED_UNIFORM_SAMPLING    = 9
} F5_TypeInfo;
```

## 3.2 Field-level grouping, not file-level

The `TypeInfo` named type is a field-level grouping mechanism. Fields that reference
the same named `TypeInfo` object belong to the same group — sharing whatever
attributes are attached to that type object.

The most important use of grouping is **field-level versioning**: within one file,
different Fields may have been written by different versions of the F5 library, and
the layout convention for connectivity types may differ between library versions.
F5 avoids implicit naming conventions by design, so naming is not a versioning concern.

HDF5-level compression filters (deflate, szip, etc.) are transparent to readers —
the HDF5 library handles decompression automatically without reader knowledge. These
do not affect TypeInfo grouping.

An F5-level precision transformation is distinct and NOT transparent: a field of
double values (coordinates in particular) may be stored in single precision after
subtracting a numerical offset. This halves the storage size while retaining most of
the precision, because the offset removes the large-magnitude part of each value
before truncation to float. It is also beneficial for GPU rendering, where
double-precision arithmetic is slow and VRAM is precious. However, shader code must
apply the offset explicitly; this transformation cannot be hidden from the reader.

The detection mechanism is a normative attribute on the field or fragment:

```c
#define FIBER_FRAGMENT_NUMERICALSHIFT_ATTRIBUTE  "Fiber::NumericalShift"
```

A reader checks for the presence of `Fiber::NumericalShift` on each field or
fragment and applies the offset before use if found. No TypeInfo version check is
needed for the current mechanism.
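
The arithmetic behind the shift can be sketched as follows. This is a minimal illustration, assuming the attribute holds a single double that the writer subtracts before narrowing to float and the reader adds back after widening. The function names are hypothetical, and the actual encoding of `Fiber::NumericalShift` (for example per-component shifts) is not specified here.

```c
#include <stddef.h>

/* Writer side (assumed convention): subtract the shift in double precision,
 * then narrow the small remainder to single precision. */
float f5_shift_encode(double value, double shift)
{
    return (float)(value - shift);
}

/* Reader side: widen the stored float to double, then add the shift back. */
double f5_shift_decode(float stored, double shift)
{
    return (double)stored + shift;
}

/* Applies the reader-side step to a whole array of stored values. */
void f5_shift_decode_array(const float *stored, size_t count,
                           double shift, double *out)
{
    for (size_t i = 0; i < count; ++i) {
        out[i] = f5_shift_decode(stored[i], shift);
    }
}
```

For a coordinate such as 1234567.891 with a shift of 1234567.0, the float only has to represent 0.891, so the round-trip error is far smaller than that of casting the full value to float directly.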

However, if a future version supersedes this mechanism in a way that conflicts with
the current one, TypeInfo versioning becomes the discriminator: a field written under
specification version 0.1.5 uses `Fiber::NumericalShift` as defined here, while a
field written under a future version may use a different attribute or encoding. The
TypeInfo version attribute (§3.4) allows a reader to select the correct interpretation
per field rather than per file.

Versioning is not the only reason to group Fields. Multiple `TypeInfo` named
types with the same version but different user-defined attributes allow grouping by
any property: **provenance** (from which data source or simulation code was a Field
produced), **quality level** (raw vs. post-processed vs. derived), or any application-
defined category. The grouping mechanism is general; versioning is one application.

## 3.3 Multiple TypeInfo types per file

The `/TableOfContents/` group MAY contain multiple named types with TypeInfo
semantics. The type named `TypeInfo` is the **default** — used by all Fields not
requiring special grouping. Additional named types may be committed for Fields
sharing a distinct set of attributes:

```
/TableOfContents/
    TypeInfo          <- default; version 2.0.0; standard layout
    TypeInfo_v1       <- legacy layout from older writer version
    TypeInfo_ADCIRC   <- provenance: Fields from the ADCIRC surge model
    TypeInfo_MM5      <- provenance: Fields from the MM5 atmospheric model
```

All Fields referencing `TypeInfo_ADCIRC` share whatever attributes are on that type
object. A single O(1) write to that type object updates global metadata for all such
Fields simultaneously — regardless of how many datasets reference it. This
underutilised HDF5 capability is particularly powerful for provenance management in
large multi-source files.

Provenance information is especially relevant for file merging and splitting
operations (referenced in core spec §4.3). When merging files from different sources,
an advanced merge tool SHOULD assign source-specific TypeInfo types to the incoming
Fields rather than assigning all Fields the default `TypeInfo`. This preserves
provenance across the merge and allows the merged file to record which Fields
originated from which source. The converse applies to file splitting: fields sharing
a TypeInfo with specific provenance attributes can be extracted as a coherent group.
This is OPTIONAL for simple merge/split tools but RECOMMENDED when provenance matters.

## 3.4 Versioning attributes

Every `TypeInfo` named type SHOULD carry:

```
Attribute: URL      "https://www.fiberbundle.net/F5-0.1.5/"  (ASCII string)
Attribute: version  {0, 1, 5}                                 (int[3])
```

The URL points to the specification version subfolder at fiberbundle.net. Each
released specification version has its own subfolder, so a reader encountering a file
can locate the exact specification document that governed its writing. Additional
`TypeInfo` types SHOULD carry the same attributes with version values appropriate
to their layout context.
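
A reader that selects an interpretation per field needs to order these version triples. The sketch below assumes the three integers are compared lexicographically, most significant first. That ordering and the function names are illustrative assumptions, not something this specification mandates.

```c
/* Compares two version triples lexicographically (assumed ordering).
 * Returns a negative value, zero, or a positive value when a is older than,
 * equal to, or newer than b. */
int f5_version_compare(const int a[3], const int b[3])
{
    for (int i = 0; i < 3; ++i) {
        if (a[i] != b[i]) {
            return (a[i] < b[i]) ? -1 : 1;
        }
    }
    return 0;
}

/* Returns 1 when `version` is the same as or newer than `minimum`. */
int f5_version_at_least(const int version[3], const int minimum[3])
{
    return f5_version_compare(version, minimum) >= 0;
}
```

A reader would read the `version` attribute of the `TypeInfo` type that a Field references and branch on the result, so two Fields in one file can be interpreted under different rules.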

---

# 4. Grid and Field Lookup

## 4.1 TimeTable datasets — naming and structure

Each Grid's ToC entry contains one or more extendable HDF5 datasets, each storing
one record per slice for one parameter space. The **name of the dataset encodes the
parameter space**, placing information into structure rather than into a separate
metadata attribute.

**Naming the TimeTable dataset** follows the F5 principle of placing information
into structure rather than into reserved names. Three approaches are available and
all are valid:

**Canonical (parameter-name = dataset-name):** For a single time parameter, the
TimeTable is a 1D extensible dataset named after the parameter:
```
/TableOfContents/Grids/Carpet/Time    Dataset {937/Inf}
```
Compound type: `{double Time, char[56] SliceName}`

**Two separate datasets do NOT model a 2D parameter space** — they model two
independent 1D parameter spaces. A reader cannot determine from two separate datasets
whether a given slice belongs to a point in Time × MassRatio space or to two
independent one-dimensional sequences.

A true **n-dimensional parameter space** is encoded as a single dataset whose
**HDF5 rank encodes the parameter space dimension**, named by joining the parameter
names with `&` in alphabetical order:
```
/TableOfContents/Grids/Carpet/MassRatio&Time    Dataset {1 × N/Inf}
```
Compound type: `{double MassRatio, double Time, char[48] SliceName}`

An appendable TimeTable is extended along exactly one unlimited dimension. For an
appendable 2D parameter space, the fixed dimension has extent 1 and the unlimited
dimension is extended on each append. If the full parameter space is known in advance
(post-simulation assembly), a proper 2D dataset `{M × N}` may be used without the
extent-1 constraint, but it cannot be extended by append.

The dataset name `MassRatio&Time` makes the parameter space composition explicit
without requiring additional metadata. A reader determines the parameter space
dimension from the dataset rank and identifies the parameters from the compound
type member names and/or the `&`-separated dataset name. The dataset name matches
the parameter attribute names on slice groups joined with `&` (§5.2).
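
The naming rule can be sketched as a small helper that sorts the parameter names and joins them with `&`. The function name `f5_param_space_name` is hypothetical, and the sketch assumes that "alphabetical" means byte-wise, case-sensitive `strcmp` order, which this document does not define further.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings (byte-wise, case-sensitive). */
static int compare_names(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Hypothetical helper: sorts names[] in place and writes the '&'-joined
 * dataset name to out. Returns 0 on success, -1 if out is too small. */
int f5_param_space_name(const char **names, size_t count,
                        char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';
    qsort(names, count, sizeof names[0], compare_names);

    for (size_t i = 0; i < count; ++i) {
        size_t len = strlen(names[i]);
        size_t need = len + (i > 0 ? 1 : 0);  /* name plus separator */
        if (used + need + 1 > out_size) {     /* +1 for the terminator */
            return -1;
        }
        if (i > 0) {
            out[used++] = '&';
        }
        memcpy(out + used, names[i], len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}
```

Because the names are sorted before joining, a writer and a reader arrive at the same dataset name regardless of the order in which they enumerate the parameters.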

**Nested (group/dataset with identical names):** When scoping soft links per
parameter space is desirable for performance, a subgroup is introduced and the
dataset inside it reuses the group name:
```
/TableOfContents/Grids/Carpet/Time/          <- group
    Time    Dataset {937/Inf}                 <- dataset name = group name
    t=000000000...   -> /t=000000000...       <- links scoped to this param space
```
The redundancy `Time/Time` (group containing dataset of same name) is intentional —
it encodes "this is a parameter-space container" in the structure without reserving a
new name. A reader checks whether `Grids/<GridName>/<ParamName>` is a dataset (flat,
canonical) or a group (nested, scoped links).

**Legacy (`F5::TimeTable`):** The reference implementation uses the reserved name
`F5::TimeTable` for the time dataset. This name is a useful fallback: it covers
variations in time attribute naming (`t`, `time`, `Time`, `TIME`) without requiring
strict matching to a parameter name, and it is recognizable to legacy readers.
Readers SHOULD treat a dataset named `F5::TimeTable` as a TimeTable for the time
parameter. New writers targeting the full parameter-space model SHOULD prefer the
canonical or nested layout.

The three layouts are progressive and non-contradictory. A flat `F5::TimeTable`
file can be read by all readers; a full nested multi-parameter file requires a
reader that implements the parameter space extension.

**The TimeTable dataset and soft links are intentionally redundant.** They serve
distinct purposes:
- The TimeTable dataset is for **high-performance access**: a single sequential read
  yields the complete list of parameter values and slice paths, enabling O(log N)
  binary search on the sorted in-memory copy (§4.2) without any group traversal.
- The soft links enable **direct human and tool access**: `h5ls` and similar tools
  can navigate to any slice group without reading the TimeTable dataset. They make
  the ToC self-documenting.

This redundancy enables consistency checking: a validator can verify that every
TimeTable entry has a corresponding soft link and vice versa. An unresolvable soft
link whose TimeTable entry references an external file signals a potentially missing
external file. This is not an F5 error but a file management issue that the user can
resolve by copying the external file. A reader SHOULD distinguish:
- "Slice does not exist": no TimeTable entry and no soft link
- "Slice exists but file is absent": TimeTable entry present, soft link present
  but unresolvable (external file missing)

Both the TimeTable and the soft links record the slice group path as a string.
These strings MUST be textually identical to each other and to the actual slice
group path. A validator SHOULD check this three-way consistency.

**Entry structure:**

```c
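/* F5_ENTRY_SIZE is an illustrative name for the total entry size in bytes;
 * 64 is the value used by the reference implementation. */
#define F5_ENTRY_SIZE 64

typedef struct {
    double  ParameterValue;                            /* parameter value for this slice    */
    char    SliceName[F5_ENTRY_SIZE - sizeof(double)]; /* absolute HDF5 path to slice group */
} F5_TableEntry;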
```

The total entry size SHOULD be a power of two. The reference implementation uses
64 bytes (8 bytes double + 56 bytes path string). For multi-parameter entries where
the compound type stores k doubles, the string length is `entry_size - 8k`:

| Parameters | Double bytes | String bytes | Total |
|------------|-------------|--------------|-------|
| 1          | 8           | 56           | 64    |
| 2          | 16          | 48           | 64    |
| 3          | 24          | 40           | 64    |
| 4          | 32          | 32           | 64    |

Implementations MAY choose a larger power-of-two total (128, 256) for longer paths.
The chunk size SHOULD be a whole multiple of the filesystem block size:
`chunk_entries × entry_size = m × filesystem_block` for some positive integer m.
The reference implementation uses 1024 entries per chunk (64 KB for 64-byte entries).

The power-of-two size is a performance RECOMMENDATION. The HDF5 API exposes the
type structure at runtime; readers determine layout from the type definition.
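
The sizing arithmetic in the table above can be written out as a few helpers. The function names are hypothetical; the constant 8 is the size of one double as used throughout this section.

```c
#include <stddef.h>

/* Returns 1 when n is a nonzero power of two. */
int f5_is_power_of_two(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Bytes left for the SliceName string when k double parameters share an
 * entry of entry_size bytes (entry_size - 8k). Returns 0 when the doubles
 * leave no room for a string. */
size_t f5_slice_name_bytes(size_t entry_size, size_t k)
{
    size_t double_bytes = 8u * k;
    return (double_bytes < entry_size) ? entry_size - double_bytes : 0;
}

/* Size of one chunk in bytes for a given number of entries per chunk. */
size_t f5_chunk_bytes(size_t chunk_entries, size_t entry_size)
{
    return chunk_entries * entry_size;
}
```

With a 64-byte entry, `f5_slice_name_bytes` reproduces the String bytes column of the table for one to four parameters, and `f5_chunk_bytes(1024, 64)` gives the 65536-byte chunk of the reference implementation.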

## 4.2 Write protocol and ordering

```
/TableOfContents/Grids/<GridName>/
    F5::TimeTable           Dataset {N/Inf}
    t=000000000.0000000000  -> /t=000000000.0000000000
    t=000000003.7750000000  -> /t=000000003.7750000000
    ...
```

When writing a new slice:
1. Extend the parameter's TimeTable dataset by one entry; write the parameter
   value(s) and the slice group path string
2. Create a soft link whose **name is textually identical to the slice path string**,
   pointing to the slice group at that path

The `SliceName` string in the TimeTable entry, the soft link name, and the actual
slice group path MUST all be textually identical and consistent. A validator SHOULD
check this three-way consistency.
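
Generating the slice path once and reusing the same buffer for the TimeTable entry and the soft link is a simple way to keep the strings consistent. The sketch below assumes the zero-padded pattern seen in the listing above (width 20, ten decimals), which is inferred from that example and not mandated by this document. The function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Formats the absolute slice group path for a time value, following the
 * pattern of the listing above (an inferred convention).
 * Returns 0 on success, -1 if the result does not fit into out. */
int f5_slice_path(double time, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "/t=%020.10f", time);
    return (n > 0 && (size_t)n < out_size) ? 0 : -1;
}

/* In the listing above the soft link name is the slice path without its
 * leading '/'. Returns a pointer into slice_path, not a copy. */
const char *f5_link_name(const char *slice_path)
{
    return (slice_path[0] == '/') ? slice_path + 1 : slice_path;
}
```

A writer would call `f5_slice_path` once per slice, store the result in `SliceName`, create the slice group at that path, and pass `f5_link_name` of the same buffer as the link name.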

Soft links MAY point to slice groups in **external HDF5 files**. In this case the
target is physically absent if the external file is not present, but remains
semantically valid — its existence is asserted by the ToC entry. A physically absent
slice group is not an error; it signals a missing external file, resolvable by file
copy. Readers SHOULD distinguish "slice does not exist" (no ToC entry) from "slice
exists but file is absent" (ToC entry present, link unresolvable).
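
The distinction a reader is asked to make can be expressed as a three-state classification. The enum and function names below are hypothetical. In a real reader the two inputs would come from the TimeTable lookup and from an attempt to resolve the link, for instance with `H5Oexists_by_name`.

```c
typedef enum {
    F5_SLICE_ABSENT,       /* no ToC entry: the slice does not exist        */
    F5_SLICE_AVAILABLE,    /* ToC entry present and the link resolves       */
    F5_SLICE_FILE_MISSING  /* ToC entry present, but the link is unresolvable */
} F5SliceState;

/* Classifies a slice from two observations made by the reader. */
F5SliceState f5_classify_slice(int has_toc_entry, int link_resolves)
{
    if (!has_toc_entry) {
        return F5_SLICE_ABSENT;
    }
    return link_resolves ? F5_SLICE_AVAILABLE : F5_SLICE_FILE_MISSING;
}
```

Only `F5_SLICE_FILE_MISSING` calls for a message to the user, since it can be resolved by supplying the external file.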

**Named HDF5 types and external links:** Named HDF5 types do not propagate across
external file links — an external file cannot share the global `/TableOfContents/`
named types of the linking file. The rule is:

- Files **with a ToC** store all named types globally in `/TableOfContents/`. This
  is the preferred layout.
- Files **without a ToC** (ToC is optional) store named types locally within their
  own group hierarchy.
- When a soft link points to an external file, that external file MUST carry its
  own copies of all named types used by its Fields. If the external file has its own
  ToC, its named types are already there. If not, they must be present locally.

A merge tool assembling multiple files into one SHOULD promote all named types into
the merged file's `/TableOfContents/`. Local copies in the merged file are created
only when a genuine structural incompatibility prevents sharing — that is, when two
named types share a name but differ in structure. Version differences alone are NOT
incompatibilities: a merge tool SHOULD read fields written under an older spec
version and write them under the current version, analogous to HDF5 library version
bounds (`H5Pset_libver_bounds`). The per-field TypeInfo versioning (§3.2) enables
the merge tool to record which spec version governed each Field's original layout.

**No sorting is required on write.** Writers append in the order slices are produced.
Readers that need ordered access sort the in-memory copy after loading. Requiring
sorted order would force rewriting the entire dataset on each append for
non-monotonic sequences — a prohibitive cost for interactive applications and
keyframe editors.

For multi-dimensional parameter spaces (§5), the parameter space is encoded as a
single dataset of appropriate rank (§4.1). The concept of "sorted" is ill-defined
for 2D+ parameter spaces with non-regular sampling. All TimeTable datasets are
extended along exactly one unlimited axis regardless of the logical parameter space
dimension (§4.1), so entries appear in append order. Sorting within the file is
neither achievable nor required.

Grid ToC groups MAY carry attributes from the actual Grid group:

```
Attribute: Refinement  {1, 1, 1}   <- AMR refinement levels (int array)
```

## 4.3 Field reverse lookup

```
/TableOfContents/Fields/<FieldName>/
    <GridName>   Soft Link {/TableOfContents/Grids/<GridName>}
```

One subgroup per Field name; one soft link per Grid containing that Field. The first
time a Field/Grid pair is written, create the link. Subsequent slices of the same
pair require no action.

---

# 5. Parameter Space

## 5.1 Time as a special case of parameter space

In current F5 files, each slice group carries a single `Time` attribute using the
`F5::Time` named type (§2). The `Parameters/Time/` ToC subgroup documents the
time parameter dimension. This is the base case: a 1-dimensional parameter space.

A time series is a 1D path through an n-dimensional parameter space. Other parameter
dimensions might include:
- Physical parameters: mass ratio, spin, eccentricity (binary merger parameter study)
- Numerical parameters: resolution level, damping coefficient
- Ensemble parameters: random seed, perturbation amplitude

## 5.2 Parameter identification — strict naming

The `Parameters/` subgroup defines the parameter dimensions present in a file. The
subgroup names are the **normative attribute names** that writers MUST use on slice
groups.

**If `/TableOfContents/Parameters/Time/` exists**, then each slice group MUST carry
an attribute named exactly `Time` (case-sensitive). Alternative names such as `time`,
`t`, or `T` are NOT permitted. The parameter name in the ToC and the attribute name
on slice groups MUST match exactly.

This strict naming rule enables O(1) attribute lookup by name and prevents ambiguity
in files that combine data from multiple sources with different naming conventions.
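
A validator can check this rule with exact string comparison. The sketch below is a minimal illustration with hypothetical function names. It takes the subgroup names found under `Parameters/` and the attribute names found on one slice group, both as plain string arrays.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if attrs[] contains name exactly (case-sensitive), else 0. */
int f5_has_attribute(const char *const *attrs, size_t n_attrs, const char *name)
{
    for (size_t i = 0; i < n_attrs; ++i) {
        if (strcmp(attrs[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Returns the index of the first ToC parameter that has no exactly matching
 * attribute on the slice group, or -1 when every parameter is matched. */
long f5_first_missing_parameter(const char *const *params, size_t n_params,
                                const char *const *attrs, size_t n_attrs)
{
    for (size_t i = 0; i < n_params; ++i) {
        if (!f5_has_attribute(attrs, n_attrs, params[i])) {
            return (long)i;
        }
    }
    return -1;
}
```

Because the comparison uses `strcmp`, a slice group carrying `time` does not satisfy a ToC parameter named `Time`.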

## 5.3 Multiple parameter dimensions and grid base spaces

A Grid over a 1-dimensional parameter space (Time only) and a Grid over a
2-dimensional parameter space (Time + MassRatio) are fundamentally different: they
are fibers over different base spaces. In fiber bundle terms, the base manifold of
the first Grid is 1-dimensional; the base manifold of the second is 2-dimensional.
These Grids MUST NOT be merged or interchanged.

Parameter space identity is determined by the **slice group attributes**, not by
the Grid group. A Grid itself does not know which parameter space it lives on —
analogously to a fiber in a fiber bundle, which does not know which point of the
base space it is attached to. The ToC associates Grids with parameter spaces by
organising slice paths into TimeTable datasets of appropriate rank and compound type.

This separation has a practical consequence: **merging multiple 1D time-series into
a 2D parameter space requires no changes to existing Grid objects**. A merge can be
a zero-data-copy operation using only external links. The merged file contains a new
2D ToC structure whose TimeTable entries reference slice groups in the original
1D files via external links. The Grid groups themselves are untouched.

Example: merging three binary black hole simulations (each a 1D time series at a
different mass ratio q=0.5, q=0.75, q=1.0) into a 2D parameter space:

```
/TableOfContents/Grids/Carpet/
    MassRatio&Time    Dataset {1 × 3×937/Inf}
        { Time=0.0,    MassRatio=0.5,  SliceName="/t=0.0"  }  -> external file 1
        { Time=0.0,    MassRatio=0.75, SliceName="/t=0.0"  }  -> external file 2
        { Time=0.0,    MassRatio=1.0,  SliceName="/t=0.0"  }  -> external file 3
        ...
    t=0.0  -> /t=0.0   (which itself is an external link to the appropriate file)
```

The Grid group MAY carry arbitrary attributes (simulation parameters, command line,
provenance) but these are not F5-normative and do not participate in parameter space
identification.

## 5.4 Multi-parameter slice structure

A slice group in a 2-parameter space carries one attribute per parameter dimension:

```
/t=100.0_q=0.5/
    Attribute: Time       100.0   (type: F5::Time, TimeUnits=F5_TIME_UNITLESS)
    Attribute: MassRatio  0.5     (type: F5::MassRatio or plain double)
```

The attribute type for non-time parameters SHOULD follow the same named-type approach
as `F5::Time` (§2.2): a double stored under a named type that carries unit and semantic
information. In the absence of a defined named type, a plain double is permitted as a
fallback, with a string `Units` attribute on the corresponding `Parameters/<Name>/`
subgroup as advisory documentation:

```
/TableOfContents/Parameters/MassRatio/
    Attribute: Units   "dimensionless"   (advisory, non-normative)
```

The unit specification problem for arbitrary parameter dimensions is as open as for
time (§2.4). The current approach — named type where available, string annotation as
fallback — is an interim position. Future adoption of a unit standard (such as
C++29 mp-units or an F5-specific extension) will be accommodated via the per-field
versioning mechanism without breaking existing files.

## 5.5 The TimeStep attribute

Integer simulation step counters are often the primary identifier in numerical
simulations. The floating-point time value is derived from the step counter and the
timestep size, and for non-equidistant timestepping, recovering the step counter from
the time value is numerically unstable. A normative optional attribute on slice groups
preserves the step counter directly:

```c
#define FIBER_HDF5_TIMESTEP_ATTRIB  "TimeStep"
```

When present, `TimeStep` is an integer attribute on the slice group recording the
simulation step counter. Its presence is OPTIONAL but RECOMMENDED for simulation
output. A reader SHOULD use `TimeStep` for step-counter arithmetic rather than
deriving it from the `Time` double value.

The `TimeStep` attribute is independent of the `Time` attribute and of the ToC
structure. Both may coexist on the same slice group.
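
The fragility of recovering the counter from the time value shows up even with a fixed step size. The sketch below, with hypothetical function names, accumulates the time the way a simulation loop would and then tries to recover the counter by truncating the quotient.

```c
/* Accumulates `steps` increments of dt, as a time-stepping loop would. */
double f5_accumulate_time(long steps, double dt)
{
    double t = 0.0;
    for (long i = 0; i < steps; ++i) {
        t += dt;
    }
    return t;
}

/* Naive recovery of the step counter from the time value by truncation. */
long f5_step_from_time(double time, double dt)
{
    return (long)(time / dt);
}
```

With IEEE 754 double arithmetic, ten increments of 0.1 sum to slightly less than 1.0, so `f5_step_from_time(f5_accumulate_time(10, 0.1), 0.1)` truncates to 9 although ten steps were taken. Rounding instead of truncating helps in this equidistant case but has no counterpart when the step size varies, which is why storing `TimeStep` directly is the more robust choice.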

## 5.6 `Parameters/` subgroup content

```
/TableOfContents/Parameters/<ParamName>/
    F5::<ParamName>    <- optional named type for this parameter (RECOMMENDED)
    Attribute: Units   <- optional advisory string (RECOMMENDED when applicable)
```

All content in `Parameters/` subgroups is OPTIONAL. A reader can derive aggregate
information (range, distinct values, count) from the TimeTable dataset (§4.1). Writers
are not required to maintain synchronised aggregate attributes such as min/max.

## 5.7 Multiple parameter spaces and file merging

A single F5 file may contain multiple distinct one-dimensional parameter spaces with
different semantics — for example, `Parameters/Time_Unitless/` (numerical relativity
simulation time) and `Parameters/Time_JulianDate/` (observational calendar time).
This allows files from different sources to be merged without forcing a common time
type.

For file merging when source files have incompatible time semantics, three strategies
are available:

1. **Create separate parameter spaces** (RECOMMENDED): each source file's time type
   becomes a distinct entry in `Parameters/`. Grids from each source retain their
   original time semantics. The merged file contains both parameter spaces.
2. **Refuse to merge**: safe but inflexible. Appropriate when the merge tool cannot
   determine the relationship between the time types.
3. **Retype to a common type**: lossy; not recommended unless the conversion is exact.

Strategy 1 is the recommended approach because it preserves semantic information,
is structurally supported by the current spec, and allows a downstream reader or
visualization tool to select the appropriate parameter space for its context.

For named type handling during merges, see §4.2. The key principle: version
differences are not incompatibilities; a merge tool promotes all named types into
the merged file's ToC and creates local copies only for genuine structural conflicts.

---

# 6. Reader Behavior

## 6.1 ToC-aware readers

1. Open `/TableOfContents/Parameters/` to determine the parameter space dimensions
2. Open `/TableOfContents/Grids/` to enumerate available Grids (cost independent of the number of slices)
3. For each Grid, read its `F5::TimeTable` dataset — one sequential read
4. Sort in memory if needed — O(N log N) once, then O(log N) per query
5. Use `SliceName` to open the target slice group — O(1)
6. Open `/TableOfContents/Fields/` to enumerate available Fields (cost independent of the number of slices)
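
Steps 3 to 5 can be sketched in memory as follows, assuming the `F5::TimeTable` dataset has already been read into an array. The entry layout and the function names `timetable_sort` and `timetable_find` are illustrative assumptions. The lookup policy shown (latest slice at or before the requested time) is one possible choice, not something this specification mandates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed in-memory image of one F5::TimeTable entry. */
typedef struct {
    double time;
    char   slice_name[56];
} TimeTableEntry;

/* Three-way comparison on the time value, as required by qsort. */
static int cmp_time(const void *a, const void *b)
{
    double x = ((const TimeTableEntry *)a)->time;
    double y = ((const TimeTableEntry *)b)->time;
    return (x > y) - (x < y);
}

/* Step 4: sort once after reading, O(N log N). */
void timetable_sort(TimeTableEntry *e, size_t n)
{
    if (n > 1)
        qsort(e, n, sizeof *e, cmp_time);
}

/* Query on the sorted table, O(log N): returns the entry with the largest
   time <= t, or NULL if every entry is later than t. */
const TimeTableEntry *timetable_find(const TimeTableEntry *e, size_t n, double t)
{
    size_t lo = 0, hi = n;
    assert(e != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (e[mid].time <= t)
            lo = mid + 1;   /* e[mid] qualifies; look for a later one */
        else
            hi = mid;       /* e[mid] is too late */
    }
    return lo ? &e[lo - 1] : NULL;
}
```

The `slice_name` of the returned entry is then what a reader would pass to its HDF5 group-open call in step 5.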

## 6.2 Readers without ToC support

Readers iterate root-group children to find slice groups and discover Grids/Fields
by traversal per core spec §13. Readers MUST NOT require the ToC to be present.

## 6.3 Consistency

The ToC is a derived structure. Readers SHOULD treat inconsistency (missing entries,
stale soft links) as a warning, not a fatal error, and MAY fall back to direct
traversal if a ToC entry fails to resolve. Writers MUST ensure that all ToC entries
have been appended before closing the file.
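
The fallback behaviour might look like the following sketch. The array of `SliceGroup` records is an assumed stand-in for the root-group children that direct traversal would discover, and `locate_slice` is a hypothetical name. A real reader would use HDF5 link and group calls instead of string comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed stand-in for one slice group found by traversing the root group. */
typedef struct {
    double      time;  /* value of the slice's Time attribute */
    const char *path;  /* absolute path of the slice group */
} SliceGroup;

/* Try the ToC-supplied path first; if it does not resolve, warn and fall back
   to a linear scan over the traversed slice groups. Returns the slice path or
   NULL, and reports through *used_fallback which route was taken. */
const char *locate_slice(const char *toc_path, double time,
                         const SliceGroup *root, size_t n, int *used_fallback)
{
    assert(used_fallback != NULL);
    *used_fallback = 0;
    for (size_t i = 0; toc_path != NULL && i < n; ++i)
        if (strcmp(root[i].path, toc_path) == 0)
            return root[i].path;            /* ToC entry resolved */

    fprintf(stderr, "warning: ToC entry did not resolve, using traversal\n");
    *used_fallback = 1;
    for (size_t i = 0; i < n; ++i)
        if (root[i].time == time)           /* exact match on stored value */
            return root[i].path;
    return NULL;                            /* slice does not exist at all */
}
```

The point of the sketch is only the control flow: a stale entry degrades performance to that of a reader without ToC support, but does not make the file unreadable.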

---

# 7. Structural Example

From an actual F5 file (binary black hole evolution, 937 timeslices, Carpet AMR):

```
/TableOfContents/
    TypeInfo               enum { Contiguous=1, ..., FragmentedUniformSampling=9 }
                             Attribute: URL      "https://www.fiberbundle.net/F5-0.1.5/"
                             Attribute: version  {0, 1, 5}

    Grids/
        Carpet/
            Attribute: Refinement  {1, 1, 1}
            F5::TimeTable      Dataset {937/Inf}, 64 bytes/entry, chunk 1024
                                 (legacy name; equivalent to canonical "Time")
                { Time=0.0,     SliceName="/t=000000000.0000000000" }
                { Time=3.775,   SliceName="/t=000000003.7750000000" }
                ...
                { Time=3533.4,  SliceName="/t=000003533.4000000000" }
            t=000000000.0000000000  -> /t=000000000.0000000000
            ...  (soft links: redundant with TimeTable, serve tool/human access)

    Fields/
        Positions/
            Carpet  -> /TableOfContents/Grids/Carpet
        WEYLSCAL4::Psi4R/
            Carpet  -> /TableOfContents/Grids/Carpet
        WEYLSCAL4::Psi4I/
            Carpet  -> /TableOfContents/Grids/Carpet

    Parameters/
        Time/
            F5::Time           (named type: double + TimeUnits=F5_TIME_UNITLESS)
            Attribute: Units   "M"   (ADM mass of the system)
```
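
The `SliceName` values in this example are consistent with a zero-padded, fixed-width rendering of the time value. The sketch below reproduces that pattern and packs it into a 64-byte entry. Both the format string and the struct layout are inferred from the example above and are assumptions, not normative definitions. `make_entry` is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed entry layout: 8-byte double + 56-byte path = 64 bytes. */
typedef struct {
    double time;
    char   slice_name[56];
} TimeTableEntry;

_Static_assert(sizeof(TimeTableEntry) == 64, "entry expected to occupy 64 bytes");

/* Fills *e for parameter value t. Returns 0 on success, or -1 if the
   formatted slice name would not fit into the fixed-size field. */
int make_entry(TimeTableEntry *e, double t)
{
    assert(e != NULL);
    memset(e, 0, sizeof *e);   /* deterministic padding bytes */
    /* Width 20 with 10 decimals, zero-padded: e.g. 000000003.7750000000 */
    int n = snprintf(e->slice_name, sizeof e->slice_name, "/t=%020.10f", t);
    if (n < 0 || (size_t)n >= sizeof e->slice_name)
        return -1;
    e->time = t;
    return 0;
}
```

For the values shown above, `make_entry` with 3.775 produces `/t=000000003.7750000000` and with 3533.4 produces `/t=000003533.4000000000`. Zero padding means that lexical order of the names matches numerical order for non-negative times.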

---

# 8. Summary of Normative Elements

| Element | Status |
|---|---|
| Named type approach for parameter attributes (§2.2) | RECOMMENDED |
| F5_TimeUnits enum: suggestive open-ended registry (§2.3) | Non-normative |
| Units: string annotation as minimum (§2.4) | RECOMMENDED |
| TypeInfo enum values as specified (§3.1) | Normative |
| Multiple TypeInfo types for grouping (§3.3) | OPTIONAL |
| TypeInfo URL pointing to versioned spec subfolder (§3.4) | RECOMMENDED |
| TypeInfo version int[3] attribute (§3.4) | RECOMMENDED |
| Power-of-two entry size for F5::TimeTable (§4.1) | RECOMMENDED |
| Chunk size as power-of-two multiple (§4.1) | RECOMMENDED |
| Write protocol: append without sorting (§4.2) | Normative |
| Soft link per slice in Grid ToC (§4.2) | Normative |
| Field reverse lookup structure (§4.3) | Normative |
| Strict parameter name matching: ToC name = attribute name (§5.2) | Normative |
| Different parameter-space dimensionalities = different Grids (§5.3) | Normative |
| Parameters/ subgroup per dimension (§5) | RECOMMENDED |
| All Parameters/ content optional (§5.6) | Normative |
| ToC presence: readers MUST NOT require it (§6.2) | Normative |
| ToC consistency (§6.3) | Normative |

---

# 9. Literature and Background

**Time semantics in HDF5:**
Buleu, A.E.; Advisor: Benger, W.: "An Ontological Scheme for Specifying Time in
HDF5." Proceedings of The National Conference on Undergraduate Research (NCUR) 2007,
Dominican University of California, San Rafael, California, April 12-14, 2007.
Center for Computation and Technology, Louisiana State University.

**Units standards for reference:**
CGNS (CFD General Notation System): https://cgns.github.io — comprehensive unit
coverage for aerospace/CFD, enumeration-based approach.
C++ mp-units (P1935R5): https://mpusz.github.io/mp-units/ — proposed C++29 standard
for physical quantities and units, type-safe and SI-complete.

**Hurricane Katrina visualization (motivating application):**
Venkataraman, S., Benger, W., Long, A., Jeong, B., Renambot, L.: "Visualizing
Hurricane Katrina: large data management, rendering and display challenges."
GRAPHITE 2006, Malaysia, pp. 209-212.

**Carpet AMR framework (motivating application for the ToC):**
Schnetter, E., Hawley, S.H., Hawke, I.: "Evolutions in 3D numerical relativity using
fixed mesh refinement." Classical and Quantum Gravity 21(6), pp. 1465-1488 (2004).

**F5 data model:**
Benger, W. (2005): PhD thesis, FU Berlin / ZIB.
https://www.fiberbundle.net/papers/TensorFieldViz.pdf

Benger, W. et al. (2009): GraVisMa 2009, Plzen, pp. 80-89.
http://gravisma.zcu.cz/GraVisMa-2009/Papers_2009/!_2009_GraVisMa_proceedings-FINAL.pdf

---

## Usage Restriction

This specification and all documents in the F5 specification series are published for
academic and personal use. Use of this specification, or any implementation derived
from it, by employees of military or defense-related organizations, or within
facilities producing weapons or conducting research on weapon design, or for any other
military purpose, is explicitly prohibited and contrary to the intent of the authors.

This restriction follows the spirit and terms of the light++ license under which the
original F5 reference implementation was published. The rationale is stated there
directly: software is technology, technology conveys power, and the inventor bears
responsibility for deciding to whom that power is granted. This specification was
developed to advance scientific understanding — not to enable harm.

See: https://www.fiberbundle.net/doc/copyright.html

---

# **End of Document**

---
