---
# **F5 MASTER SPECIFICATION**
---

# 0. Introduction and Scope

This document defines the canonical F5 layout model for reading, validating, traversing, and interpreting F5 HDF5 files. It is the authoritative specification and supersedes all heuristics, naming conventions, or legacy assumptions. All semantics are explicit and identity‑based.

The F5 model is:

- deterministic  
- backend‑agnostic  
- semantically explicit  
- time‑aware  
- fragment‑aware  
- chart‑aware  
- index‑space‑aware  
- geometry‑driven rather than storage‑driven  

This specification is intended for:

- code generators  
- validators  
- readers and writers  
- tooling authors  
- researchers  
- anyone implementing F5 semantics  

The canonical hierarchy is:

Timeslice  
→ Grid  
→ Skeleton  
→ Representation  
→ Field  
→ Fragment datasets

Each level has strict semantics and cannot be reordered or extended with additional hierarchy layers.

---

# 1. Canonical Hierarchy

The F5 hierarchy is structurally fixed:

Timeslice  
  └── Grid  
        └── Skeleton  
              └── Representation  
                    └── Field  
                          └── Fragment datasets

Each level has a specific role:

- Timeslice: temporal grouping, identified by a scalar Time attribute
- Grid: collection of skeletons representing one physical domain
- Skeleton: topological structure (vertices, edges, faces, cells, etc.) 
- Representation: coordinate or relative representation of a skeleton
- Field: data defined over the skeleton’s index space
- Fragment: contiguous or structured subset of a field

Additional hierarchy levels MUST NOT be inserted.

Group names do not define semantics unless explicitly stated.

---

# 2. Timeslices

## 2.1 Definition

A root‑level group is a timeslice if and only if it has a scalar attribute named:

Time

The attribute MUST be convertible to a double‑precision real.  
Group names are irrelevant.

## 2.2 Acceptable Time values

The Time attribute MAY be stored as:

- integer  
- floating‑point  
- text that parses unambiguously to a floating‑point value  

It MUST be scalar.  
Non‑scalar values (arrays, compound types) are malformed.
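The conversion rule can be sketched as follows. This is a minimal illustration, not a normative algorithm: the function name `parse_time` is hypothetical, and rejecting booleans and non-finite values (NaN, infinities) is an assumption of this sketch rather than a rule stated in this document.

```python
import math
from typing import Optional


def parse_time(value: object) -> Optional[float]:
    """Return the Time attribute as a float, or None if it is malformed."""
    # Assumption: booleans are rejected although Python treats them as ints.
    if isinstance(value, bool):
        return None
    try:
        if isinstance(value, (int, float)):
            result = float(value)
        elif isinstance(value, bytes):
            result = float(value.decode("utf-8", "replace").strip())
        elif isinstance(value, str):
            result = float(value.strip())
        else:
            # Lists, tuples, dicts and other non-scalar values are malformed.
            return None
    except (ValueError, OverflowError):
        return None
    # Assumption: NaN and infinities are treated as malformed.
    return result if math.isfinite(result) else None
```

A reader built this way would treat a `None` result as a malformed candidate timeslice rather than aborting file processing.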

## 2.3 Behavior on malformed or missing Time

- Missing Time → group is not a timeslice
- Presence of a Time attribute makes the group a candidate timeslice; candidate timeslices are validated per 12.1.1.
- File processing MUST NOT abort solely because a timeslice attribute is malformed; see 12.1.2 for propagation rules.

## 2.4 Ordering

Valid timeslices are ordered by ascending numeric Time.  
Tie‑breaking (if needed): lexicographic group name.
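As an illustration, the ordering rule reduces to a two-part sort key. The function name and the `(group_name, time)` tuple layout are hypothetical, and "lexicographic" is assumed here to mean Python's default code-point string comparison.

```python
from typing import List, Tuple


def order_timeslices(slices: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Sort (group_name, time) pairs by ascending time, then by group name."""
    return sorted(slices, key=lambda item: (item[1], item[0]))
```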

## 2.5 Future generalization

The model MAY later support multidimensional parameter slices.  
This is not implemented and MUST NOT be assumed.

---

# 3. Timeslice Merging

If multiple root groups convert to the same numeric Time, they represent the same physical timeslice and MUST be merged.

## 3.1 Merge semantics

### Identification
Groups with identical converted Time values form a merge set.

### Logical identity
The merged timeslice uses:

- the numeric Time value  
- a canonical name = lexicographically smallest group name in the merge set  

### Children union
Child groups are merged recursively:

- unique names → included directly  
- duplicate names → recursively merged  

### Attribute resolution
If an attribute appears in multiple merged groups:

- if semantically equal → keep  
- if different → warning; choose value from lexicographically smallest group  

### Dataset merging
If datasets share the same name:

- if dataspaces or semantic types differ → fatal error
- otherwise → treat as the same logical dataset

### Diagnostics
Record:

- merged group names  
- attribute conflicts  
- fatal inconsistencies  

## 3.2 Ordering after merge

Merged logical timeslices are ordered by ascending numeric Time.
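The identification, logical-identity, attribute-resolution, and ordering steps can be sketched on plain Python data. This is an illustration under stated assumptions: the `Group` tuple is a hypothetical stand-in for a root group, "semantically equal" is approximated by `==`, and a conflicting attribute keeps the value from the lexicographically smallest group that defines it. Recursive child and dataset merging are omitted.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

# Hypothetical stand-in for a root group: (group_name, converted_time, attributes).
Group = Tuple[str, float, Dict[str, Any]]


def merge_timeslices(groups: List[Group]) -> Tuple[List[Dict[str, Any]], List[str]]:
    """Merge groups with equal Time; return (merged timeslices, diagnostics)."""
    merge_sets: Dict[float, List[Group]] = defaultdict(list)
    for group in groups:
        merge_sets[group[1]].append(group)

    merged: List[Dict[str, Any]] = []
    diagnostics: List[str] = []
    for time in sorted(merge_sets):  # ascending numeric Time
        # Sorting by name puts the lexicographically smallest group first.
        members = sorted(merge_sets[time], key=lambda g: g[0])
        names = [name for name, _, _ in members]
        if len(names) > 1:
            diagnostics.append(f"merged groups {names} at Time={time}")
        attributes: Dict[str, Any] = {}
        for name, _, attrs in members:
            for key, value in attrs.items():
                if key not in attributes:
                    attributes[key] = value  # smallest defining group wins
                elif attributes[key] != value:
                    diagnostics.append(
                        f"attribute conflict on {key!r} in group {name!r} at Time={time}"
                    )
        merged.append({"name": names[0], "time": time, "attributes": attributes})
    return merged, diagnostics
```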

---

# 4. Grids

A Grid groups skeletons that belong to the same physical domain within a timeslice.

## 4.1 Identification

A Grid MAY define the attribute:

F5::GridID

If present, this is the canonical identifier for the grid.  
If absent, the grid’s group name is used as a fallback identifier, and a warning is recommended.

Grids with the same identifier across timeslices represent the same physical grid evolving in time.

## 4.2 Structure

A Grid MUST contain at least one Skeleton subgroup.  
A Grid MAY also contain a Charts subgroup defining local charts.

## 4.3 Merging grids across files

When loading multiple files:

- Grids with the same identifier at the same timeslice MUST be merged.  
- Child skeletons, representations, fields, and datasets are merged by name.  
- Identical dataset paths refer to the same logical dataset; if the datasets differ in dataspace or type, this is a fatal error.  
- Fragment datasets without offsets are concatenated in file load order.  
- Overlapping fragments produce a warning; last file wins.

## 4.4 Deterministic traversal

Traversal order of grids is:

1. by F5::GridID if present  
2. otherwise lexicographic by group name  

This ordering is for traversal only; it has no semantic meaning.
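The identifier fallback from 4.1 and the traversal order above might be combined as in the sketch below. The function names and the dictionary layout are hypothetical. Sorting grids with and without F5::GridID under a single string key is an assumption, since this document does not say how the two groups interleave.

```python
from typing import Dict, List, Tuple


def grid_identifier(group_name: str, attrs: Dict[str, object]) -> Tuple[str, bool]:
    """Return (identifier, used_fallback) for a Grid group."""
    if "F5::GridID" in attrs:
        return str(attrs["F5::GridID"]), False
    # Fallback to the group name; a caller would typically emit a warning here.
    return group_name, True


def traversal_order(grids: Dict[str, Dict[str, object]]) -> List[str]:
    """Return group names ordered by identifier, with the name as tie-breaker."""
    return sorted(grids, key=lambda name: (grid_identifier(name, grids[name])[0], name))
```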

---

# 5. Skeletons

A Skeleton describes a topological structure such as vertices, edges, faces, or cells.

## 5.1 Required attributes

A Skeleton carries the following attributes:

- F5::SkeletonDimensionality (int): mandatory
- F5::rank (int): recommended
- IndexDepth (int): mandatory

Missing mandatory attributes are a fatal error.
Name‑based inference is permitted only when importing non‑F5 or legacy formats that rely on conventional naming (e.g., “Points”, “Vertices”, “Triangles”).
This mechanism MUST never be applied to native F5 files.

Writers of F5 files MUST NOT rely on naming conventions to convey semantics.
Although names carry no semantics in the F5 model, writers of native F5 files are strongly encouraged to choose human‑interpretable names for Skeletons and Representations.
This improves readability and usability without introducing semantic meaning.


If F5::rank is absent, readers MUST infer rank from F5::SkeletonDimensionality.
Missing rank is a warning, not an error.
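A validator could separate fatal errors from warnings as sketched below. The function name is hypothetical, and the inference "rank equals SkeletonDimensionality" is an assumption: this document requires the inference but does not spell out the formula.

```python
from typing import Dict, List, Tuple

MANDATORY = ("F5::SkeletonDimensionality", "IndexDepth")


def check_skeleton(attrs: Dict[str, int]) -> Tuple[Dict[str, int], List[str], List[str]]:
    """Return (effective attributes, fatal errors, warnings) for one Skeleton."""
    errors = [f"missing mandatory attribute {name}" for name in MANDATORY if name not in attrs]
    warnings: List[str] = []
    effective = dict(attrs)
    if "F5::rank" not in attrs and "F5::SkeletonDimensionality" in attrs:
        # Assumption: the inferred rank equals the dimensionality.
        effective["F5::rank"] = attrs["F5::SkeletonDimensionality"]
        warnings.append("F5::rank missing; inferred from F5::SkeletonDimensionality")
    return effective, errors, warnings
```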

## 5.2 Optional attributes

FiberLib::FragmentLayout     (int vector)  
FiberLib::NumberOfFragments  (int)  
Refinement                   (int vector)

Refinement (int vector) describes per‑dimension refinement factors.
Its length MUST equal F5::SkeletonDimensionality.
Each element specifies the refinement factor along that dimension.
The Skeleton’s refinement level is defined as max(Refinement).
If the Refinement attribute is absent, the Skeleton’s refinement level is 0.
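The refinement-level rule can be illustrated with a small helper. The function name is hypothetical, and returning 0 for an empty vector is an assumption made so that the sketch never calls `max` on an empty sequence.

```python
from typing import Optional, Sequence


def refinement_level(refinement: Optional[Sequence[int]], dimensionality: int) -> int:
    """Return max(Refinement), or 0 when the attribute is absent."""
    if refinement is None:
        return 0
    if len(refinement) != dimensionality:
        raise ValueError(
            f"Refinement has {len(refinement)} entries, expected {dimensionality}"
        )
    # Assumption: an empty vector (dimensionality 0) yields level 0.
    return max(refinement, default=0)
```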

## 5.3 Semantics

- IndexDepth: number of indexing traversals required to reach vertices.  
- SkeletonDimensionality: geometric or embedding dimension.  
- rank: a structural descriptor of the Skeleton’s dimensionality and refinement behavior. It is redundant with SkeletonDimensionality but useful for consistency checking and for readers ingesting non‑HDF5 sources. rank is advisory: readers MAY use it for traversal or presentation, but MUST NOT rely on it for semantic interpretation.


## 5.4 Structural rules

- Skeletons are not nested.  
- A Grid MAY contain multiple Skeletons.  
- Missing optional attributes produce warnings, not errors.  
- Skeletons define the index space for all fields under their representations.

## 5.5 Deterministic ordering

Skeleton ordering has no semantic meaning and is not required for any F5 operation. Readers MAY traverse Skeletons in any order. For reproducibility, readers that choose a deterministic traversal SHOULD use the following keys in order: IndexDepth, SkeletonDimensionality, Refinement, lexicographic group name.
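One possible sort key for the recommended ordering is sketched below. The function name is hypothetical. Comparing the Refinement vector element by element as a tuple, and treating an absent Refinement as an empty tuple, are assumptions of this sketch.

```python
from typing import Dict, Tuple


def skeleton_sort_key(name: str, attrs: Dict[str, object]) -> Tuple[int, int, Tuple[int, ...], str]:
    """Build the key (IndexDepth, SkeletonDimensionality, Refinement, name)."""
    # Assumption: a missing Refinement sorts as an empty tuple.
    refinement = tuple(attrs.get("Refinement", ()))
    return (attrs["IndexDepth"], attrs["F5::SkeletonDimensionality"], refinement, name)
```

A reader would pass this key to `sorted`, for example `sorted(skeletons, key=lambda n: skeleton_sort_key(n, skeletons[n]))`, where `skeletons` maps group names to attribute dictionaries.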

## 5.6 Examples

Vertices: IndexDepth = 0  
Edges: IndexDepth = 1, SkeletonDimensionality = 1  
Faces: IndexDepth = 1, SkeletonDimensionality = 2  
Cells: IndexDepth = 1, SkeletonDimensionality = 3  
Higher nesting: IndexDepth = 2 or more

---

# 6. Representations and Charts

A Representation is a subgroup of a Skeleton that defines how the skeleton’s elements are interpreted geometrically or relationally.

A subgroup of a Skeleton is a Representation if its name matches either:

1. a Chart name (coordinate representation), or 
2. another Skeleton name (relative representation).

If the relative representation refers to a Skeleton in another Grid or Timeslice, the rules in 6.5 apply: its group name is overridden by the F5::Reference attribute.

If a subgroup name matches both a Chart name and a Skeleton name, the Chart match takes precedence: readers MUST treat the subgroup as a coordinate representation and issue a warning.

Naming a Chart identically to a Skeleton (or vice versa) is a modeling error and MUST be prevented by writers.
An HDF5 file that contains such a conflict is malformed.

Representations MAY reference Skeletons that appear later in the file or in other merged files.
Resolution of relative representations occurs after all Skeletons in the Grid have been loaded.
If a referenced Skeleton cannot be resolved at that stage, this is a fatal error.

Subgroups of a Skeleton that match neither a Chart name nor a Skeleton name are ignored with a warning, unless they contain an F5::Reference attribute (see 6.5), in which case they are treated as explicit references.
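The classification rules of this section can be summarized in a single decision function. This is a sketch with hypothetical names: the inputs are plain sets of names rather than HDF5 objects, and the returned labels are illustrative.

```python
from typing import List, Set, Tuple


def classify_representation(
    name: str,
    attribute_names: Set[str],
    chart_names: Set[str],
    skeleton_names: Set[str],
) -> Tuple[str, List[str]]:
    """Return (kind, warnings); kind is 'relative', 'coordinate', or 'ignored'."""
    # An explicit reference overrides any name matching.
    if "F5::Reference" in attribute_names:
        return "relative", []
    if name in chart_names:
        warnings: List[str] = []
        if name in skeleton_names:
            warnings.append(f"{name!r} names both a Chart and a Skeleton; Chart takes precedence")
        return "coordinate", warnings
    if name in skeleton_names:
        return "relative", []
    return "ignored", [f"subgroup {name!r} matches neither a Chart nor a Skeleton name"]
```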

HDF5 is a random-access hierarchical store.
Readers MUST NOT assume any ordering of groups or datasets.
Skeletons MUST be resolved independently of traversal order (see 6.6).

Recovery mode is a non-normative, reader-controlled error-tolerance mode in which readers MAY attempt best-effort recovery from modeling errors (for example, chart/skeleton name conflicts). Recovery mode is not part of the F5 normative model; readers MUST document when they operate in recovery mode and record diagnostics.


## 6.1 Coordinate Representations

A coordinate representation maps skeleton elements to coordinates in a Chart.

### 6.1.1 Charts

Charts live under:

/Charts/

Charts define:

- coordinate systems  
- named datatypes  
- precision variants  
- optional default datatype  

### 6.1.2 Local Charts

Each Grid MAY define local charts under:

<Grid>/Charts/

Local charts MUST contain a GlobalChart attribute pointing to the corresponding chart under /Charts/.

### 6.1.3 Transformations

Transformations between charts are represented as subgroups.  
Both directions MUST be present for a complete transformation pair.

## 6.2 Relative Representations

A relative representation maps skeleton elements to indices of another skeleton.

Example:

Faces/Points

Meaning: faces are defined by indices into points.

Relative representations define connectivity but do not define geometry.

## 6.3 Units and axes

Units and axis order come from named datatypes, not from representations.  
Representations do not define units.

## 6.4 Default Chart (clarified)

If a Grid does not define any Chart:

- A default "StandardCartesianChart3D" chart MUST be assumed.  
- This behavior is mandatory for backward compatibility.  
- A warning is recommended but not required.  
- Explicit charts are strongly preferred.

## 6.5 Cross-Grid / Cross-Timeslice Skeleton references

F5::Reference is an optional attribute on a Representation group. It holds an HDF5 object reference or a path that points to the target Skeleton group, which MAY be in another Grid or Timeslice. When present, the Representation is a relative representation to the referenced Skeleton regardless of the Representation's name. Cross-timeslice or cross-grid references MUST use F5::Reference.
When F5::Reference is present, name-matching is not used for Representation type determination.
A writer SHOULD use a name that hints at the destination Skeleton, e.g. "Points_T=0" for a reference to /T=0/<Grid>/Points. This is merely for human readability.

### 6.5.1 Representation References

F5::Reference is either an HDF5 object reference or a string attribute.
A string attribute containing a full path is recommended because it allows easier name resolution for readers.
An HDF5 object reference (H5Rcreate_object(), H5R_OBJECT) MAY be used to better utilize the underlying HDF5 capabilities, but it requires readers to identify Skeletons via a lookup table keyed by HDF5 IDs rather than by HDF5 path.
A path in the string attribute MUST always be absolute. HDF5 paths have no notation for a parent group (such as ".." in a filesystem), and a Representation group is always a child of a Skeleton group, so a relative path cannot reach another Skeleton.
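A reader might pre-check the string form before attempting resolution, as in the sketch below. The function name is hypothetical. Rejecting empty components, "." components, and the bare root path goes beyond the absolute-path rule above and is an assumption of this sketch.

```python
def is_valid_reference_path(path: str) -> bool:
    """Accept absolute paths without empty, '.' or '..' components."""
    if not path.startswith("/") or path == "/":
        return False
    components = path[1:].split("/")
    # Assumption: '', '.' and '..' components are all rejected.
    return all(part not in ("", ".", "..") for part in components)
```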


## 6.6 Skeleton discovery and resolution strategy

### 6.6.1 Local discovery requirement
Readers MUST collect and index all Skeletons that are children of the same Grid before resolving Representations that reference Skeletons within that Grid.

### 6.6.2 Cross-Grid and cross-Timeslice references
Representations MAY reference Skeletons in other Grids or Timeslices via explicit references (see F5::Reference). When a Representation references a Skeleton outside its local Grid or Timeslice, readers MUST resolve that reference before treating the Representation as valid.

### 6.6.3 On-demand global discovery
Readers MAY discover Skeletons across Grids and Timeslices lazily (on demand) rather than scanning the entire file(s) up front. On-demand discovery is the recommended default for large datasets because it avoids unnecessary I/O and memory use.

### 6.6.4 Optional eager discovery mode
Readers MAY offer an optional eager (pre-scan) mode that enumerates all Skeletons across all Grids and Timeslices before resolving Representations. This mode is permitted but not required; it is intended for tools that prioritize global analysis or human-readable diagnostics over minimal I/O.

### 6.6.5 Hybrid strategy
Readers MAY implement a hybrid strategy: perform a light metadata pass to collect skeleton identifiers and cheap attributes, then perform full discovery only for Skeletons that are actually referenced or requested by the application.

### 6.6.6 Implementation guidance

### 6.6.6.1 Dependency graph
Readers that resolve cross-references SHOULD build a dependency graph of Skeletons and Representations. 
Use the graph to:
- determine resolution order,
- detect missing targets, and
- identify cycles in derivation or reference chains.

### 6.6.6.2 Cycle detection
Readers MUST detect cycles in cross-Skeleton or cross-Representation references. On cycle detection, readers MUST abort the resolution chain for the cycle, mark the involved entities as unresolved/partial, and emit a diagnostic describing the cycle.
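Cycle detection on the dependency graph can be done with a depth-first search that tracks the nodes on the current path. The sketch below is one common approach, not a prescribed algorithm: the function name and the adjacency-dictionary layout are hypothetical, and the recursive form assumes reference chains shallow enough for Python's recursion limit.

```python
from typing import Dict, List, Optional


def find_cycle(dependencies: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of nodes, or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    state: Dict[str, int] = {}
    stack: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = GREY
        stack.append(node)
        for target in dependencies.get(node, []):
            if state.get(target, WHITE) == GREY:
                # The target is already on the current path: a cycle closes here.
                return stack[stack.index(target):] + [target]
            if state.get(target, WHITE) == WHITE:
                found = visit(target)
                if found:
                    return found
        stack.pop()
        state[node] = BLACK
        return None

    for node in dependencies:
        if state.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None
```

The returned node list can be used directly in the diagnostic that describes the cycle.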

### 6.6.6.3 Caching and memory management
Readers that perform on-demand discovery SHOULD cache resolved Skeleton metadata and index structures to avoid repeated I/O. Caches SHOULD be bounded and evictable to limit memory usage for very large files.

### 6.6.6.4 Diagnostics and fallbacks
If a referenced Skeleton cannot be resolved (missing, invalid, or in an inaccessible file), readers MUST:
- treat the referencing Representation as invalid or partial according to the fatal-error propagation rules in 12.1.2, and
- emit a clear diagnostic indicating the unresolved reference and its origin.
Readers MAY provide configurable fallback behavior (for example, best-effort local derivation) but such fallbacks are implementation choices and MUST be documented by the reader.

### 6.6.6.5 Non-normative performance considerations
- Prefer on-demand discovery for large datasets and interactive use.
- Use eager discovery for batch analysis or when the application explicitly requests a global view.
- When resolving references across files, prefer metadata-only operations first (existence checks, attributes) before loading large datasets.

### 6.6.6.6 Normative summary
Readers MUST collect local Grid Skeletons before resolving local Representations.
Readers MUST resolve explicit cross-Grid or cross-Timeslice references before accepting a Representation as valid.
Readers MAY perform global Skeleton discovery eagerly, but SHOULD default to on-demand discovery for efficiency.
Readers MUST detect and handle cycles, cache metadata prudently, and emit diagnostics for unresolved references.


---

# 7. Fields

A Field is a child of a Representation and attaches data to the skeleton’s index space.  
Fields describe *what* is stored, not *how* it is geometrically interpreted — geometry comes from the Positions field.

## 7.1 Identification

Fields are semantically identified by their datatype.
If multiple fields in the same Representation share the same datatype, their field names MAY be used to disambiguate. However, when the datatype alone is insufficient or unavailable to determine algebraic or geometric interpretation, TypeInfo is required (see 8.5).

Field names have no semantic meaning within F5 (Positions is the sole exception, as F5 assigns it normative structural meaning). They exist only to distinguish multiple fields of identical datatype. F5 is agnostic to any additional semantics an application may attach to field names; applications MAY use field names to encode application-specific meaning such as 'Velocity', 'Temperature', or 'Color', and F5 tools MUST neither require nor reject such names.

### 7.1.1 Positions
Positions is the only field with special semantics.

In coordinate representations, Positions MUST use the coordinate datatype defined by the Chart and encode geometric coordinates.

In relative representations, Positions MUST be an integer array whose elements index into the target Skeleton’s index space.
The arity of each entry (e.g., 2 for edges, 3 for triangles) is determined by the topological structure of the target Skeleton.

The two uses of Positions are distinguished solely by the type of Representation (coordinate vs. relative).

### 7.1.2 Requirement rule for omitted Positions and Fields

A Representation MUST contain exactly one Positions field unless the Positions values are intentionally omitted. Omission of Positions is permitted and is not by itself a fatal error.

### 7.1.2.1 Reader obligations

If a Representation contains a Positions field, readers use it as the authoritative geometric embedding.

If Positions is omitted, readers MUST treat the Representation as potentially partial and MUST NOT assume geometry is available.

Readers MAY attempt to derive omitted Positions using any heuristics or methods they implement. Derivation is an implementation choice. Readers that cannot or will not derive Positions MUST treat the Representation as partial and MUST emit a diagnostic indicating the missing geometry.

Readers that derive Positions MAY do so at any granularity including per-Representation, per-fragment, or per-element. Readers that compute and cache derived values SHOULD record provenance (human-readable string attribute describing the performed data derivation action) and a timestamp (7.1.2.6) on the cached values.

### 7.1.2.2 Cycle detection and safety

Readers that attempt derivation MUST detect and break cycles in derivation dependencies. If a derivation chain forms a cycle, the reader MUST stop derivation, treat the involved Representations as partial, and emit a diagnostic describing the cycle.

### 7.1.2.3 Preference guidance for readers

When multiple derivation paths are available, readers SHOULD prefer methods that maximize numerical stability, then minimize computational cost. This preference is guidance only and not normative.

### 7.1.2.4 Interoperability principle

Interoperability is achieved by conservative defaults: if Positions are absent and no reader-supported derivation exists, treat the Representation as partial rather than attempting speculative, cross-file operations.

### 7.1.2.5 Non-normative guidance
Examples of reader strategies include chart transformations, indirection mappings, refinement inheritance, temporal interpolation, and procedural generation. Implementations will vary by application and numerical requirements.

Caching: readers that compute derived values are encouraged to cache them locally and record a timestamp (7.1.2.6) to aid downstream tools.

Diagnostics: readers SHOULD provide clear diagnostics indicating whether a Representation is usable for a requested operation and why (missing Positions, unsupported derivation method, cycle detected).

### 7.1.2.6 Timestamps

Use the HDF5 timestamp property on groups and datasets (e.g. H5O_info2_t via H5Oget_info3() with H5O_INFO_TIME).
When supporting another file format, use attributes or another format-specific storage method.


## 7.2 Field contents

A Field MAY contain:

- a single dataset (contiguous field)  
- multiple datasets (fragmented field)  
- subgroups representing separated compound components  
- attributes defining procedural fields  

Fields MAY mix these forms as long as semantics remain consistent.

## 7.3 Field–Skeleton relationship

A Field’s index space MUST match the Skeleton’s index space.  
This is enforced through:

- fragment offsets  
- fragment sizes  
- procedural definitions  
- or implicit coverage rules  

A Field MAY be partial (see Section 9).

## 7.4 Time‑dependence

A Field MAY be:

- time‑independent (same dataset identity across timeslices)  
- time‑dependent (new dataset identity at a timeslice)  
- partially time‑dependent (some fragments change, others remain linked)

Symbolic links MUST be used to express time‑independence.

---

# 8. Field Types

F5 supports several field types, each with distinct semantics.

## 8.1 Contiguous Field

A contiguous field consists of a single dataset containing all values.

The dataset’s size defines the field’s size.

## 8.2 Fragmented Contiguous Field

A fragmented field consists of multiple datasets, each representing a fragment.

Each fragment MAY define:

- offset (index‑space offset)
- Range
- Fiber::NumericalShift
- CellSize 

Fragment names have **no semantic meaning**.

Fragment placement is determined solely by:

1. the fragment’s offset attribute (if present), and 
2. the coordinates in the Positions field.

Traversal order of fragments has **no semantic effect**.

## 8.3 Separated Compound Field

A compound datatype MAY be stored as separate datasets:

x  
y  
z

Each component MAY be:

- contiguous  
- fragmented  
- partially fragmented  

All components MUST share consistent fragment attributes.

## 8.4 Procedural Fields

Procedural fields define values algorithmically.

### UniformSampling

Attributes:

- base  
- offset  

Value at index i:

base + offset * i
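A minimal sketch of the formula above follows. The function name is hypothetical, and treating `base` and `offset` as equal-length component vectors is an assumption; the formula itself is applied component by component.

```python
from typing import List, Sequence


def uniform_sample(base: Sequence[float], offset: Sequence[float], i: int) -> List[float]:
    """Return base + offset * i, component by component."""
    if len(base) != len(offset):
        raise ValueError("base and offset must have the same number of components")
    return [b + o * i for b, o in zip(base, offset)]
```

For example, `uniform_sample([0.0, 1.0], [0.5, 2.0], 3)` evaluates to `[1.5, 7.0]`.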

### DirectProduct

Defined by N one‑dimensional arrays.  
The field value is the **Cartesian product** of these arrays.

This is used to construct **rectilinear grids** (non‑uniform regular grids).
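The Cartesian product of per-axis arrays can be illustrated with `itertools.product`. The function name is hypothetical, and the enumeration order (last axis varying fastest) is a property of `itertools.product`, not something this document prescribes.

```python
from itertools import product
from typing import List, Sequence, Tuple


def direct_product(axes: Sequence[Sequence[float]]) -> List[Tuple[float, ...]]:
    """Return every combination of one value per axis (last axis varies fastest)."""
    return list(product(*axes))
```

Two axes with 2 and 3 samples yield the 6 points of a rectilinear grid, and the axis spacing does not need to be uniform.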

### FragmentedUniformSampling

UniformSampling applied per fragment, with fragment‑level attributes.

## 8.5 TypeInfo

TypeInfo is required when the datatype alone is insufficient or unavailable to determine the field's algebraic or geometric interpretation; this includes cases where multiple fields share the same low-level datatype and names alone are insufficient for unambiguous interpretation.

Examples include:

- multiple fields sharing the same datatype,
- tensor fields whose rank or variance cannot be inferred from the datatype,
- fields participating in chart transformations,
- compound datatypes lacking explicit component semantics,
- fields stored as a group rather than a dataset, e.g. a separated compound layout.

In all other cases, TypeInfo is recommended for clarity.

A Field MAY exist without any datasets or fragments only if TypeInfo is present.
In this case, TypeInfo defines the field’s datatype and algebraic meaning.
Such a field is considered empty but well-typed, and fragments MAY be added later.
A field with no datasets and no TypeInfo is malformed.

---

# 9. Fragment Semantics

Fragments are the fundamental unit of partial storage.

## 9.1 Fragment identity

Fragment identity is determined by **dataset object identity**, not by:

- name  
- content  
- order  
- path  

Two fragments are identical only if they reference the same HDF5 dataset object.

## 9.2 Fragment ordering

Fragment names are irrelevant.  
Traversal order has no semantic meaning.

Readers MUST treat fragments as **random‑access containers**.

Any geometric or data‑processing algorithm MUST derive ordering from **coordinates**, not from fragment order.

## 9.3 Fragment attributes

Fragment attributes define placement and interpretation.

If multiple fields define fragment attributes:

- attributes are merged  
- inconsistent attributes produce a warning  
- file load order determines precedence  
- inconsistent attributes across files are an error condition

## 9.4 Partial fields

A field is implicitly partial if some regions of the index space are not covered by any fragment.

Querying a location with no fragment coverage yields a default value:

- zero, or  
- the HDF5 fill value (if defined)

No explicit “partial” marker is required.
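A lookup over a partial field can be sketched in one dimension as follows. The `Fragment` tuple and the function name are hypothetical, and preferring the fill value over zero whenever one is defined is an assumption about how the two defaults are ranked.

```python
from typing import List, Optional, Tuple

# Hypothetical 1D fragment: (index-space offset, stored values).
Fragment = Tuple[int, List[float]]


def query(fragments: List[Fragment], index: int, fill_value: Optional[float] = None) -> float:
    """Return the stored value at index, or the default if no fragment covers it."""
    # Fragments are scanned in arbitrary order; placement comes from the offset only.
    for offset, values in fragments:
        if offset <= index < offset + len(values):
            return values[index - offset]
    # Assumption: a defined fill value takes precedence over zero.
    return 0.0 if fill_value is None else fill_value
```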

---

# 10. Time‑Dependence and Identity Rules

Time‑dependence in F5 is defined strictly in terms of **dataset object identity**, not content equality.  
This ensures deterministic interpretation across files, writers, and timeslices.

## 10.1 First occurrence

The first occurrence of a field is the earliest timeslice containing a **real dataset object** for that field.

Symbolic links do not count as first occurrences; they refer back to an earlier dataset.

## 10.2 Time‑independent fields

A field is time‑independent over an interval if all timeslices reference the **same dataset object identity**.

This MUST be expressed using **symbolic links**.

Writers MUST use symbolic links to express identity.
Copying a dataset destroys identity information and MUST NOT be used to express time‑independence.
If a writer copies a dataset intentionally, the reader will treat it as a distinct dataset with distinct semantics.

## 10.3 Time‑dependent fields

A field is time‑dependent at timeslice T if it contains a **new dataset object** at that timeslice.

Time‑dependence MAY be:

- full (entire field changes)  
- partial (only some fragments change)  

## 10.4 Fragment‑level time‑dependence

Fragments MAY be time‑dependent independently of each other.

Examples:

- Fragment 0 is linked across timeslices → time‑independent  
- Fragment 1 is replaced at timeslice T → time‑dependent  

This allows efficient incremental updates.
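Fragment-level classification by object identity might look like the sketch below. It assumes the backend supplies a hashable identity token per dataset object (how that token is obtained from HDF5 is left open here); the function name and the dictionary layout are hypothetical. Fragment names serve only as labels in the result and play no part in the comparison.

```python
from typing import Dict, Hashable


def classify_fragments(
    previous: Dict[str, Hashable], current: Dict[str, Hashable]
) -> Dict[str, str]:
    """Label each current fragment by comparing object identity tokens."""
    seen = set(previous.values())
    return {
        # Identity, not name or content, decides the label.
        name: "time-independent" if token in seen else "time-dependent"
        for name, token in current.items()
    }
```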

## 10.5 Forbidden duplication

Duplicating a dataset instead of linking is forbidden because:

- it destroys identity information  
- it breaks time‑independence detection  
- it introduces ambiguity  

Readers MUST treat every dataset object as semantically distinct unless it is literally the same HDF5 object (via link).
Readers cannot detect whether two datasets were copied intentionally or accidentally.
Identity is determined solely by HDF5 object identity.

Writers MUST use symbolic links to express identity.

---

# 11. Skeleton Index‑Space Rules

Skeletons define the index space for all fields under their representations.  
Skeletons themselves do not have an intrinsic size; their size is derived from fields.

## 11.1 Field size

A field’s size is determined by its internal structure:

- contiguous field → dataset size
- fragmented field → sum of fragment sizes
- procedural field → implied size from attributes

These sizes contribute to the skeleton’s index‑space coverage.

## 11.2 Skeleton size

Skeleton size is defined as the union of index coverage across all fields attached to the Skeleton.
Coverage MAY come from:

- contiguous datasets,
- fragmented datasets,
- procedural definitions,
- or any combination thereof.

### 11.2.1 Rules

An unfragmented field defines the full Skeleton index space.
If multiple unfragmented fields exist, they MUST all have identical size; otherwise the Skeleton is invalid.

Fragmented fields MAY cover any subset of this index space.

If no unfragmented field exists, the Skeleton index space is defined as the union of coverage across all fragmented or procedural fields.

If an unfragmented field exists on a Skeleton, all fragmented fields on that Skeleton MUST cover only indices within the unfragmented field's index space. Any fragment that defines coverage outside the unfragmented field's index space is a fatal error. 


Empty fields contribute no coverage.

This rule ensures:

- consistent index space  
- compatibility across fields  
- predictable behavior for partial fields  
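The rules in 11.2.1 can be sketched for a one-dimensional index space. The names are hypothetical, coverage is modeled as half-open `(start, stop)` ranges, and an unfragmented field is assumed to start at index 0. Fatal conditions are represented by `ValueError`.

```python
from typing import List, Tuple

# Hypothetical 1D coverage: half-open index range (start, stop).
Range = Tuple[int, int]


def index_space(unfragmented_sizes: List[int], fragment_ranges: List[Range]) -> List[Range]:
    """Return the Skeleton index space as merged ranges, or raise ValueError."""
    if unfragmented_sizes:
        if len(set(unfragmented_sizes)) != 1:
            raise ValueError("unfragmented fields differ in size; Skeleton is invalid")
        size = unfragmented_sizes[0]
        for start, stop in fragment_ranges:
            if start < 0 or stop > size:
                raise ValueError(f"fragment [{start}, {stop}) lies outside [0, {size})")
        return [(0, size)]
    # No unfragmented field: the index space is the union of fragment coverage.
    merged: List[Range] = []
    for start, stop in sorted(fragment_ranges):
        if start >= stop:
            continue  # empty fragments contribute no coverage
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], stop))
        else:
            merged.append((start, stop))
    return merged
```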

## 11.3 Consistency requirements

All fields attached to a skeleton MUST:

- share the same index space
- provide values for all indices unless partial
- use consistent fragment attributes
- use consistent fragment offsets
- use consistent procedural definitions

If a field is partial, missing regions MUST return default values (see Section 9).

Partiality is a structural property: a field is partial if its fragments or datasets do not cover the full Skeleton index space.
The F5 model does not distinguish intentional from accidental partiality; readers MUST treat all partial fields uniformly.
Writers are responsible for ensuring that partial fields are semantically correct.


## 11.4 Geometry determines ordering

Ordering of elements in the index space is determined by:

- coordinates in the Positions field
- fragment offsets
- procedural definitions

Fragment names and storage order have no effect on index‑space ordering.

---

# 12. Validation Rules

Validation ensures that F5 files are semantically consistent and safe to interpret.

## 12.1 Fatal errors

The following conditions are fatal:
- missing required skeleton attributes
- incompatible datasets during merge
- conflicting dataspaces
- inconsistent fragment attributes across files

### 12.1.1 Timeslice validation
A group that contains a Time attribute is a candidate timeslice. A candidate timeslice becomes a valid timeslice only after its Time attribute is successfully parsed as a scalar real.

A candidate timeslice with a non-scalar or non-convertible Time attribute → fatal error for that candidate timeslice (see 12.1.2 for propagation).

A group with no Time attribute → not a timeslice; ignored without error.

### 12.1.2 Fatal error propagation
Fatal errors are local to the entity in which they occur, but propagate along all structural and referential dependencies.
A fatal error invalidates the affected entity and all entities that depend on it, regardless of where they appear in the hierarchy or across timeslices.

The fatal-error propagation rule applies to references resolved via F5::Reference. If a referenced Skeleton in another Timeslice or Grid is invalid, any Representation that references it via F5::Reference is invalid.

Examples:
- Invalid Skeleton → all Representations, Fields, and Fragments referencing that Skeleton are invalid, even across Grids or Timeslices
- Invalid Representation → all Fields and Fragments under that Representation are invalid
- Invalid Grid → only that Grid within its Timeslice is invalid, and all references to its Skeletons from other Grids or Timeslices are invalid
- Invalid Timeslice → only that Timeslice is invalid, and all references to its Grids or Skeletons from other Timeslices are invalid
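Propagation along dependencies amounts to a reachability search over reversed dependency edges. The sketch below uses a breadth-first traversal; the function name, the entity labels, and the `depends_on` dictionary layout are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def propagate_invalid(depends_on: Dict[str, List[str]], initially_invalid: Set[str]) -> Set[str]:
    """Return every entity that is invalid directly or through a dependency."""
    # Reverse the edges: for each target, list the entities that depend on it.
    dependents: Dict[str, List[str]] = {}
    for entity, targets in depends_on.items():
        for target in targets:
            dependents.setdefault(target, []).append(entity)

    invalid = set(initially_invalid)
    queue = deque(initially_invalid)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in invalid:
                invalid.add(dependent)
                queue.append(dependent)
    return invalid
```

Entities outside the returned set are unaffected and remain eligible for processing.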

### 12.1.3 Continuation rule
Readers MUST continue processing all unaffected entities.
A fatal error MUST NOT abort processing of the entire file unless the root structure itself is malformed.


## 12.2 Warnings

Warnings SHOULD be issued for:

- missing optional attributes  
- fallback to group names for grid identification  
- ignored malformed timeslices  
- default chart assumption  
- fragment attribute conflicts resolved by file load order  

Warnings do not abort processing.

## 12.3 Diagnostics

Readers SHOULD record:

- merge operations  
- attribute conflicts  
- overlapping fragments  
- time‑dependence intervals  
- fallback behaviors  
- missing or partial fields  

Diagnostics are not part of the file format but are essential for tooling.

---

# 13. Deterministic Traversal Rules

Traversal order is defined for reproducibility but has **no semantic meaning**.
Geometry and coordinates determine semantics, not traversal order.

Traversal order:

1. Timeslices by numeric Time
2. Grids by F5::GridID or group name
3. Skeletons by IndexDepth, SkeletonDimensionality, Refinement, then name (recommended)
4. Representations by name
5. Fields by datatype, name
6. Fragments by arbitrary order (names irrelevant)
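
The first three ordering rules map naturally onto sort keys. The following is a minimal sketch, assuming each entity is available as a dictionary of its attributes plus a `name` entry; the key functions, the example Skeletons, and the treatment of a missing Refinement attribute as an empty tuple are assumptions made for illustration.

```python
def timeslice_key(ts):
    # Timeslices are ordered by numeric Time.
    return float(ts["Time"])

def grid_key(grid):
    # F5::GridID when present, group name otherwise.
    return str(grid.get("F5::GridID", grid["name"]))

def skeleton_key(sk):
    # Assumption: Refinement is optional and an absent value sorts first.
    return (sk["IndexDepth"], sk["SkeletonDimensionality"],
            tuple(sk.get("Refinement", ())), sk["name"])

# Hypothetical Skeletons with illustrative attribute values.
skeletons = [
    {"name": "Cells",  "IndexDepth": 1, "SkeletonDimensionality": 3},
    {"name": "Points", "IndexDepth": 0, "SkeletonDimensionality": 0},
    {"name": "Faces",  "IndexDepth": 1, "SkeletonDimensionality": 2},
]
ordered = [sk["name"] for sk in sorted(skeletons, key=skeleton_key)]
```

Because the order carries no semantic meaning, any other stable key would serve equally well as long as every reader of the same files uses the same one.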


## 13.1 Fragment traversal clarification

Because fragment names are irrelevant and geometry determines ordering:

- traversal order of fragments MUST NOT affect results  
- algorithms MUST treat fragments as random‑access containers  
- coordinate‑based queries MUST be used for geometric operations  

This is a core F5 principle:  
**geometry determines meaning; storage layout does not.**
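
A quick way to check this property in an implementation is to assemble the same field from fragments visited in different orders. The sketch below uses a hypothetical 1-D model in which each fragment is an `(offset, values)` pair and the fragments do not overlap.

```python
import random

def assemble(fragments, size, default=0.0):
    """Place 1-D fragments into a field of the given size by offset."""
    field = [default] * size
    for offset, values in fragments:
        field[offset:offset + len(values)] = values
    return field

fragments = [(0, [1.0, 2.0]), (4, [5.0, 6.0]), (2, [3.0, 4.0])]
reference = assemble(fragments, 6)

# Visiting the fragments in any other order yields the same field.
shuffled = fragments[:]
random.shuffle(shuffled)
assert assemble(shuffled, 6) == reference
```

Placement depends only on the offsets, so the storage or traversal order of the fragments never enters the result.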

---

# 14. Reader Expectations

Readers (and tools implementing this specification) MUST adhere to the following behavioral requirements.

## 14.1 Timeslice handling

Readers MUST:

- identify candidate timeslices by the presence of a Time attribute and validate them using 12.1.1.
- merge timeslices with identical numeric Time values  
- order timeslices by numeric Time  
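
Merging and ordering can be done in one pass by grouping on the parsed Time value. This sketch assumes candidate timeslices have already been validated and arrive as `(group_name, time_attribute)` pairs; it treats "identical numeric Time" as exact equality of the converted doubles, which is an assumption.

```python
from collections import defaultdict

def merge_and_order(timeslices):
    """Group timeslice names by numeric Time and return them in Time order."""
    merged = defaultdict(list)
    for name, time_attr in timeslices:
        merged[float(time_attr)].append(name)
    return sorted(merged.items())

# Hypothetical group names; they play no role in the ordering.
result = merge_and_order([("T=late", 2.0), ("a", "0.5"), ("b", 0.5), ("c", 1)])
# result == [(0.5, ["a", "b"]), (1.0, ["c"]), (2.0, ["T=late"])]
```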

## 14.2 Grid handling

Readers MUST:

- identify grids by F5::GridID when present  
- fall back to group name when absent  
- merge grids across files deterministically  
- treat grid ordering as traversal‑only, not semantic  
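
The identification rule, together with the fallback warning from 12.2, can be sketched as follows. The function name and the dictionary view of attributes are hypothetical, and the sketch does not address whether a GridID and a group name with the same value should be treated as the same grid.

```python
import warnings

def grid_identity(group_name, attrs):
    """Return the key under which a grid is identified and merged."""
    if "F5::GridID" in attrs:
        return attrs["F5::GridID"]
    # Fallback to the group name is permitted but reported as a warning.
    warnings.warn(f"grid '{group_name}' has no F5::GridID; using group name")
    return group_name
```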

## 14.3 Skeleton handling

Readers MUST:

- validate required skeleton attributes
- treat missing optional attributes as warnings
- derive skeleton size from fields
- enforce consistent index‑space semantics

## 14.4 Representation handling

Readers MUST:

- identify coordinate representations via chart names
- identify relative representations via skeleton names
- resolve local charts via GlobalChart attributes
- assume default chart if none is defined (StandardCartesianChart3D)
- treat chart transformations as optional but recommended

## 14.5 Field handling

Readers MUST:

- identify fields by datatype  
- support contiguous, fragmented, separated compound, and procedural fields
- treat Representations with absent Positions as partial per 7.1.2.
- treat fragment names as irrelevant
- treat fragments as random‑access containers
- use coordinates to determine geometric ordering
- support partial fields and default values

## 14.6 Time‑dependence handling

Readers MUST:

- detect time‑independence via symbolic links  
- treat dataset duplication as semantic change  
- support fragment‑level time‑dependence  
- track identity across timeslices  
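
Identity tracking can be illustrated without any HDF5 calls by modelling each dataset's identity as an opaque token. In the sketch below the tokens are hypothetical `(file number, object address)` pairs and the field paths are invented; a real reader would obtain an equivalent identity from its HDF5 binding.

```python
def classify_time_dependence(prev_slice, next_slice):
    """Compare two timeslices given as {field_path: object_identity}."""
    report = {}
    for path, identity in next_slice.items():
        if path not in prev_slice:
            report[path] = "new"
        elif prev_slice[path] == identity:
            report[path] = "time-independent"   # same object, e.g. via a link
        else:
            report[path] = "changed"            # distinct dataset
    return report

# Hypothetical identities modelled as (file number, object address) pairs.
t0 = {"Points/Positions": (1, 800), "Points/Density": (1, 1200)}
t1 = {"Points/Positions": (1, 800), "Points/Density": (1, 4096)}
report = classify_time_dependence(t0, t1)
```

Note that the comparison never inspects dataset contents: a duplicated dataset counts as a change even if its values happen to be equal.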

## 14.7 Fragment handling

Readers MUST:

- ignore fragment names  
- use fragment offsets and coordinates for placement  
- merge fragment attributes using file load order  
- warn on inconsistent attributes  
- treat overlapping fragments as warnings  
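
Overlap detection reduces to an axis-aligned box test in index space. The sketch below assumes each fragment is described by an `(offset, shape)` pair of equal-length tuples; fragments are reported by position in the input list because their names carry no meaning.

```python
from itertools import combinations

def boxes_overlap(a, b):
    """True if two (offset, shape) index boxes share at least one index."""
    (off_a, shape_a), (off_b, shape_b) = a, b
    return all(oa < ob + sb and ob < oa + sa
               for oa, sa, ob, sb in zip(off_a, shape_a, off_b, shape_b))

def overlap_warnings(fragments):
    """Return a warning message for every overlapping pair of fragments."""
    return [f"fragments {i} and {j} overlap"
            for (i, a), (j, b) in combinations(enumerate(fragments), 2)
            if boxes_overlap(a, b)]

# Two adjacent 4x4 tiles and a 2x2 tile straddling their shared border.
fragments = [((0, 0), (4, 4)), ((4, 0), (4, 4)), ((3, 3), (2, 2))]
messages = overlap_warnings(fragments)
```

The pairwise test is quadratic in the number of fragments, which is adequate for a sketch; a production reader would likely use a spatial index.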

## 14.8 Geometry‑driven semantics

Readers MUST:

- derive ordering from coordinates, not storage order  
- treat geometry as the authoritative source of meaning  
- ensure that algorithms produce identical results regardless of fragment order  

This is a core F5 principle.

---

# 15. Summary of the Canonical F5 Model

This section summarizes the entire specification in a concise, normative list.

## 15.1 Structural model

- Timeslices identified by scalar Time  
- Timeslices with same numeric Time merged  
- Grids identified by F5::GridID or group name  
- Skeletons define topology and index space  
- Representations define coordinate or relative meaning  
- Fields attach data to skeleton index space  
- Fragments store partial data  

## 15.2 Geometry‑driven semantics

- Geometry determines ordering  
- Fragment names are irrelevant  
- Storage layout has no semantic meaning  
- Readers MUST use coordinates for all geometric operations  

## 15.3 Field types

- contiguous  
- fragmented  
- separated compound  
- procedural (UniformSampling, DirectProduct, FragmentedUniformSampling)

## 15.4 Fragment semantics

- identity is dataset identity  
- ordering is irrelevant  
- offsets and coordinates determine placement  
- partial fields allowed  
- default values for uncovered regions  

## 15.5 Time‑dependence

- symbolic links express time‑independence  
- dataset duplication expresses change  
- fragment‑level time‑dependence supported  

## 15.6 Validation

- required attributes enforced  
- optional attributes warn  
- Fragment attribute conflicts:
  - within a single file → warning
  - across multiple files → fatal error
- deterministic merging required  

## 15.7 Charts

- charts define coordinate systems  
- local charts reference global charts  
- default "StandardCartesianChart3D" chart assumed if none defined  

---

# 16. Integrated Clarifications (OQ‑1 to OQ‑5)

All open questions have been resolved and integrated into the specification.  
For completeness, they are restated here.

## OQ‑1: Fragment names and ordering

- Fragment names have no semantic meaning.  
- Ordering of fragments is irrelevant.  
- Placement is determined solely by fragment offsets and coordinates.  
- Geometry determines ordering, not storage layout.  
- Deterministic traversal is unnecessary for semantics.

## OQ‑2: Partial fields

- Partiality is implicit from missing fragments.  
- Querying uncovered regions yields default values (zero or HDF5 fill value).  
- No explicit partial marker is required.
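
The lookup behaviour for partial fields can be sketched in a few lines. The 1-D `(offset, values)` fragment model and the function name are assumptions; the `default` parameter stands in for whichever default applies (zero or the HDF5 fill value).

```python
def query(fragments, index, default=0.0):
    """Return the value at a 1-D index, or the default if no fragment covers it."""
    for offset, values in fragments:
        if offset <= index < offset + len(values):
            return values[index - offset]
    return default

# Indices 2, 3 and 4 are not covered by any fragment.
fragments = [(0, [1.0, 2.0]), (5, [9.0])]
```

With these fragments, `query(fragments, 1)` returns the stored value `2.0`, while `query(fragments, 3)` falls through to the default.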

## OQ‑3: DirectProduct semantics

- DirectProduct uses the Cartesian product of component arrays.  
- Used to construct rectilinear (non‑uniform regular) grids.
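
As an illustration, the Cartesian product of per-axis coordinate arrays yields every point of a rectilinear grid. The function name is hypothetical, and the enumeration order (last axis varying fastest, as produced by `itertools.product`) is an assumption rather than something this section prescribes.

```python
from itertools import product

def direct_product(*axes):
    """Return all grid points of a rectilinear grid from per-axis coordinates."""
    return list(product(*axes))

x = [0.0, 0.1, 0.5]      # non-uniform spacing along x
y = [0.0, 2.0]
points = direct_product(x, y)
```

Only `len(x) + len(y)` coordinates are stored, while `len(x) * len(y)` points are described, which is the storage advantage of a procedural field.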

## OQ‑4: Fragment attribute merging

- File load order determines precedence.  
- Inconsistent fragment attributes within a single file produce warnings.  
- Inconsistent fragment attributes across multiple files are a fatal error.
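
The merging rule can be sketched as below, following the within-file and across-file distinction of Section 15.6. Everything named here is hypothetical: the exception class, the `(file_id, attrs)` input shape, and in particular the choice to keep the first value seen when a conflict within one file is only a warning.

```python
import warnings

class FragmentAttributeError(Exception):
    """Raised for attribute conflicts between fragments from different files."""

def merge_fragment_attributes(fragments):
    """Merge attributes from (file_id, attrs) pairs given in file load order."""
    merged = {}            # attribute name -> (value, file_id of first occurrence)
    for file_id, attrs in fragments:
        for name, value in attrs.items():
            if name not in merged:
                merged[name] = (value, file_id)
                continue
            kept_value, kept_file = merged[name]
            if value == kept_value:
                continue
            if file_id != kept_file:
                raise FragmentAttributeError(
                    f"attribute '{name}' differs between files {kept_file} and {file_id}")
            # Assumption: within one file the first value seen is kept.
            warnings.warn(f"attribute '{name}' is inconsistent within file {file_id}")
    return {name: value for name, (value, _) in merged.items()}
```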

## OQ‑5: Default chart

- If no chart is defined, assume a default "StandardCartesianChart3D" chart.  
- This is mandatory for backward compatibility.  
- A warning is recommended but not required.  
- Explicit charts are strongly preferred.
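
A reader might implement the fallback roughly as follows. The function name and the `warn` switch are assumptions; the switch reflects that the warning is recommended but not required.

```python
import warnings

DEFAULT_CHART = "StandardCartesianChart3D"

def resolve_chart(defined_charts, warn=True):
    """Return the chart names to use, falling back to the default chart."""
    if defined_charts:
        return list(defined_charts)
    if warn:
        warnings.warn(f"no chart defined; assuming {DEFAULT_CHART}")
    return [DEFAULT_CHART]
```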

---

# 17. Conceptual Note: “How is it?” vs. “What is it?”

The F5 model does not classify datasets by predefined geometric or topological types.
Instead, it describes how a dataset is structured, not what it is.

Applications MUST NOT ask:

“Is this a triangular surface?”

but instead:

“Does this dataset have the properties of a triangular surface?”

This distinction is essential:

- A triangular surface is also a point cloud.
- It MAY be part of a refinement hierarchy.
- It MAY coexist with line or cell structures on the same vertices.
- It MAY represent only a subset of a larger domain.

F5 avoids implicit assumptions and predefined categories.
The model exposes structure; applications infer meaning from that structure.

This design philosophy is central to the expressive power of F5.


# **18. Structural Principles for Hierarchical Refinement**

This appendix describes the structural principles that allow hierarchical refinement to be expressed within the F5 model. It introduces no new semantics beyond those already defined in Sections 1–17. Instead, it clarifies how the existing concepts—Skeletons, Representations, Fields, IndexDepth, and Fragments—combine to express refinement structures in a fully general and topology‑preserving manner.

The purpose of this appendix is to provide a design guide. It does not prescribe any specific refinement scheme, naming convention, or data layout. All refinement structures MUST be derivable from the core F5 rules.

---

## **18.1 Refinement as a Topological Relation**

Refinement is a relation between topological entities.  
In F5, all topological entities are represented by **Skeletons**, and all relations between Skeletons are represented by **Representations**.

Therefore:

- A refinement hierarchy MUST be expressed as a sequence of Skeletons.
- Refinement relations MUST be expressed as Representations between these Skeletons.
- No additional hierarchy levels or special metadata are required.

This follows directly from the canonical hierarchy:

Timeslice → Grid → Skeleton → Representation → Field → Fragment datasets

---

## **18.2 Fragments as Topological Entities**

Fragments are subsets of a field’s index space.  
Topologically, a fragment corresponds to a set of cells, and a cell corresponds to a set of vertices.
Fragments are not topological entities themselves; they are represented as topological entities only when modeled via Skeletons with IndexDepth ≥ 2.

Thus, refinement tiles MAY be represented by Skeletons whose **IndexDepth** reflects their topological structure:

- IndexDepth = 0 → vertices (points)  
- IndexDepth = 1 → cells (sets of vertices)  
- IndexDepth = 2 → sets of cells (fragments)  
- IndexDepth = 3 → sets of fragments, and so on  

This allows refinement tiles to be treated as first‑class topological entities.
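
The nesting expressed by IndexDepth can be made concrete with a small recursive expansion. In this sketch, `levels[d]` is a hypothetical in-memory table listing, for each element at depth `d`, the indices of the depth `d - 1` elements it consists of.

```python
def resolve_to_vertices(levels, depth, index):
    """Expand element `index` at the given IndexDepth into its vertex indices.

    levels[d] lists, for each element at depth d, the indices of the
    depth d-1 elements it consists of; levels[0] is unused.
    """
    if depth == 0:
        return {index}
    vertices = set()
    for child in levels[depth][index]:
        vertices |= resolve_to_vertices(levels, depth - 1, child)
    return vertices

levels = [
    None,                               # depth 0: vertices carry no nested indices
    [(0, 1, 2), (1, 2, 3), (3, 4, 5)],  # depth 1: cells as sets of vertices
    [(0, 1), (2,)],                     # depth 2: fragments as sets of cells
]
```

For example, `resolve_to_vertices(levels, 2, 0)` expands the first fragment through its two cells down to the vertex set `{0, 1, 2, 3}`.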

---

## **18.3 Refinement Levels as Skeletons**

Each refinement level is represented by a distinct Skeleton.  
These Skeletons:

- share the same embedding dimension,  
- differ in their index spaces,  
- MAY differ in IndexDepth depending on the refinement scheme,  
- MAY define optional Refinement attributes to indicate level.

The F5 model does not constrain the number of refinement levels or their structure.

---

## **18.4 Refinement Relations as Relative Representations**

A refinement relation between two Skeletons is expressed as a **relative representation**.

A subgroup of a Skeleton is a *refinement* Representation if its name matches the name of another Skeleton, either within the same Grid or reached through the cross-grid or cross-timeslice references of Section 6.5.

Thus, refinement relations are expressed structurally as:

```
Skeleton_L / Skeleton_{L+1}
```

This representation has the same index space as `Skeleton_L`.  
Its Fields describe how each element of `Skeleton_L` relates to elements of `Skeleton_{L+1}`.

No new keywords or attributes are required.
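
The `Skeleton_L / Skeleton_{L+1}` layout can be modelled with nested dictionaries to show the two structural constraints involved. All names in this sketch are hypothetical: `Coarse` and `Fine` stand for two Skeletons of one Grid, `Children` is an invented Field name, and `size` is a stand-in for the size of a Skeleton's index space.

```python
# Hypothetical names: "Coarse" and "Fine" are Skeletons of one Grid, and
# "Children" is a Field on the relative representation Coarse/Fine.
grid = {
    "Coarse": {
        "size": 2,                                   # elements in Coarse's index space
        "Fine": {"Children": [[0, 1, 2, 3], [4, 5, 6, 7]]},
    },
    "Fine": {"size": 8},
}

def check_refinement(grid, parent, child, field):
    """Validate a refinement Field stored under grid[parent][child]."""
    mapping = grid[parent][child][field]
    # The representation shares the parent's index space: one entry per parent element.
    if len(mapping) != grid[parent]["size"]:
        return False
    # Every referenced index must exist in the child Skeleton's index space.
    return all(0 <= i < grid[child]["size"] for entry in mapping for i in entry)
```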

A physically important example is AMR (Adaptive Mesh Refinement) governed by the Courant-Friedrichs-Lewy (CFL) condition. The CFL condition bounds the time step at each refinement level by a quantity proportional to the cell size at that level, so finer levels advance with smaller time steps than coarser ones. As a result, the coarse-level refinement Representation is partially time-dependent: it references a fine-level Skeleton that exists at a different (earlier) timeslice than the coarse level's current timeslice. Such inter-level refinement relations MUST use F5::Reference (see 6.5) pointing to the fine-level Skeleton at the appropriate timeslice. The refinement Representation itself therefore changes only as often as the coarse level advances, not at every fine sub-step, which is a natural expression of partial time-dependence within the F5 model.



---

## **18.5 Fields on Refinement Representations**

Every Representation MAY contain Fields.  
Fields on a refinement representation:

- have the same index space as the parent Skeleton,  
- MAY be contiguous, fragmented, or procedural,  
- MAY contain identifiers, indices, or other values referencing the child Skeleton.

The F5 model does not prescribe the datatype or structure of these fields.  
Their semantics follow from the Representation they belong to.

---

## **18.6 Fragment‑Level Refinement**

When refinement is defined at the fragment (tile) level:

- fragments are represented as elements of a Skeleton with appropriate IndexDepth,  
- refinement relations are expressed as relative representations between fragment Skeletons,  
- Fields on these representations encode the refinement mapping.

Fragment names have no semantic meaning.  
Fragment identity is determined by dataset identity, as defined in Section 9.

---

## **18.7 Refinement Across Different Topological Dimensions**

Refinement MAY occur:

- between fragment Skeletons,  
- between fragment Skeletons and point Skeletons,  
- between fragment Skeletons and higher‑dimensional Skeletons (e.g., triangles, tetrahedra),  
- or between Skeletons of different IndexDepth.

The F5 model imposes no restrictions on the dimensionality of Skeletons participating in refinement relations.

All such relations MUST be expressed structurally through relative representations.

---

## **18.8 Optional Coordinate Embeddings**

Skeletons participating in refinement hierarchies MAY optionally define coordinate representations.
Coordinate representations MUST refer to a chart; relative representations MUST refer to another Skeleton.

These coordinate representations:

- use the same chart mechanism as all other coordinate representations,
- MAY define Positions fields giving geometric embeddings of refinement elements.

Positions fields MAY be omitted when their information is derivable from other Representations, indirection mappings, refinement relationships, or chart transformations, consistent with the general Positions rule in Section 7.1.

The F5 model does not require explicit coordinate embeddings for refinement structures.

---

## **18.9 No Special Keywords or Reserved Names**

The F5 model does not introduce any special keywords, attribute names, or reserved identifiers for refinement.

All refinement structures MUST be expressible using:

- Skeletons  
- Representations  
- Fields  
- IndexDepth  
- Fragments  

No additional constructs are permitted.

---

## **18.10 Derivability from Core Principles**

All refinement structures MUST be derivable from the following core principles:

1. **Skeletons define topological entities.**  
2. **Representations define relations between Skeletons.**  
3. **Fields attach data to Skeleton index spaces.**  
4. **Fragments partition field index spaces.**  
5. **IndexDepth expresses nested topological structure.**  
6. **No semantics are encoded in names.**  
7. **All semantics arise from structure.**

These principles are sufficient to express:

- hierarchical point clouds,  
- hierarchical meshes,  
- AMR refinement,  
- multi‑resolution grids,  
- and mixed‑dimensional refinement structures.

No additional rules are required.
These principles are sufficient for an AI or human reader to derive refinement structures without explicit examples.

---

## **18.11 IndexDepth Design Guidelines for Scalable Topological Modeling**

This section provides structural guidelines for choosing `IndexDepth` values in a way that ensures long‑term stability, extensibility, and composability of Skeletons. These guidelines do not introduce new semantics; they follow directly from the core principles defined in Sections 1–17.

The purpose of these guidelines is to ensure that Skeletons remain structurally stable even when intermediate topological levels are added or removed, and that tools can reliably interpret Skeletons based solely on their dimensionality and index‑depth signatures.

These guidelines are consistent with the examples in Section 5.6.


---

### **18.11.1 Maximal‑Depth Principle**

If a topological entity can participate in multiple nested levels of structure, its Skeleton SHOULD be assigned an `IndexDepth` equal to the **maximum depth** of nesting it could ever require, even if some intermediate levels are not currently present.

This ensures that:

- the Skeleton’s index space remains stable over time, 
- intermediate Skeletons (e.g., Edges between Points and Lines) can be added or removed without restructuring, 
- higher‑order or refined variants of the entity can be introduced without altering existing Skeletons, 
- tools can reliably identify Skeletons by their `(SkeletonDimensionality, IndexDepth)` pair.

This principle follows from the definition of `IndexDepth` as the number of nested index spaces between a Skeleton and its vertices.

---

### **18.11.2 Optional Intermediate Skeletons**

Intermediate topological Skeletons (e.g., Edges between Points and Lines) are optional.  
Their presence or absence MUST NOT require restructuring of Skeletons at higher index depths.

A Skeleton defined with a maximal `IndexDepth` MAY coexist with or without intermediate Skeletons.
If intermediate Skeletons are introduced later, they simply occupy the appropriate index‑depth level without affecting existing structures.

---

### **18.11.3 Stability of Index Spaces**

Skeletons SHOULD be designed so that their index spaces do not change when:

- intermediate Skeletons are added or removed,  
- refinement structures are introduced,  
- higher‑order elements are added,  
- additional Representations are defined.

Assigning a maximal `IndexDepth` ensures that the index space of a Skeleton remains stable and predictable.

---

### **18.11.4 Uniformity Across Datasets**

For a given topological entity type (e.g., lines, surfaces, volumes), all datasets SHOULD use the same `(SkeletonDimensionality, IndexDepth)` pair, regardless of whether intermediate levels are present.

This uniformity enables:

- consistent interpretation across datasets,  
- predictable traversal logic,  
- deterministic code generation,  
- compatibility with refinement structures.

---

### **18.11.5 Compatibility With Refinement Structures**

Refinement structures (Sections 18.1–18.10) rely on stable index‑depth assignments.  
Skeletons representing refined entities MUST be able to participate in relative representations without requiring redefinition of their index spaces.

Using maximal index depth ensures that refinement relations can be expressed structurally, even when intermediate levels are absent.

---

### **18.11.6 Derivability From Core Principles**

The Maximal‑Depth Principle is not an additional rule; it follows directly from:

1. Skeletons represent topological entities.  
2. IndexDepth expresses nested topological structure.  
3. Representations express relations between Skeletons.  
4. Skeleton identity MUST be stable across Representations.  
5. No semantics are encoded in names.  
6. All semantics arise from structure.

These principles imply that Skeletons SHOULD be assigned index depths that remain valid under all future structural extensions.


# **End of Section 18.11**

---

# **End of Section 18**

---

# **19. Mathematical Domains Required for Fitting a Dataset into the F5 Layout**

This section provides the mathematical context that underlies the F5 model.
These concepts are not additional requirements of the format; they explain the structures defined in Sections 1–18.

The domains are listed in order of conceptual priority.

---

## **19.1 Differential Geometry**

Differential geometry provides the foundational language for:

- coordinate charts  
- tangent and cotangent spaces  
- vector and covector fields  
- tensor fields  
- coordinate transformations  
- geometric embeddings  
- metric‑dependent quantities  
- geometric invariants  

In the F5 model:

- **coordinate representations** correspond to charts on a manifold,  
- **Positions fields** provide embeddings of skeleton elements into a chart,  
- **tensor‑valued fields** MUST be stored in coordinate representations because they transform under chart changes,  
- **named datatypes** encode tensorial transformation rules.

Differential geometry is therefore essential for:

- interpreting coordinate‑dependent fields,  
- understanding how fields transform between charts,  
- ensuring that geometric quantities are stored in the correct representation.

---

## **19.2 Topology**

Topology provides the structural foundation for:

- vertices, edges, faces, and cells  
- nested index spaces (IndexDepth)  
- connectivity relations  
- refinement relations  
- cell complexes  
- adjacency and incidence  
- partial coverings and fragment partitions  

In the F5 model:

- **Skeletons** represent topological entities,  
- **IndexDepth** expresses nested topological structure,  
- **relative representations** express incidence relations between skeletons,  
- **refinement structures** are topological relations between skeletons at different levels.

Topology determines:

- the structure of Skeletons,  
- the meaning of Representations,  
- the interpretation of Fields as functions on index spaces.

---

## **19.3 Fiber‑Bundle Theory**

Fiber‑bundle theory provides the unifying mathematical framework for the F5 model.

A fiber bundle consists of:

- a **base space** (the index space of a Skeleton),  
- a **fiber** (the datatype of a Field),  
- a **projection** (the attachment of field values to indices),  
- and **transition functions** between local trivializations (e.g., chart transformations).

In the F5 model:

- every Field is a **section of a fiber bundle**,  
- the **fiber** is defined by the Field’s named datatype,  
- the **base** is the Skeleton’s index space,  
- **coordinate representations** correspond to local trivializations of the bundle,  
- **chart transformations** correspond to transition functions.

This perspective explains:

- why Fields attach to Skeletons,  
- why tensor fields MUST live in coordinate representations,  
- why named datatypes MUST encode transformation rules,  
- why geometry and topology are strictly separated.

Fiber‑bundle theory is the conceptual backbone of the entire F5 design.

---

## **19.4 Geometric Algebra and Tensor Algebra**

Geometric algebra (or classical tensor algebra) provides the mathematical language for typing fields.

Required concepts include:

- tangent vectors  
- cotangent vectors  
- multivectors  
- differential forms  
- metric‑dependent and metric‑independent quantities  
- transformation rules under chart changes  
- basis representations  
- tensor rank and variance  

In the F5 model:

- **named datatypes** encode the algebraic type of a field,  
- **coordinate representations** provide the basis in which components are stored,  
- **chart transformations** define how components transform.

This domain is essential for:

- distinguishing vectors from covectors,  
- distinguishing tensors of different rank,  
- ensuring correct transformation behavior,  
- interpreting physical quantities correctly.
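
The practical difference between vectors and covectors shows up as soon as the chart changes. The sketch below uses a deliberately simple, hypothetical chart change (an axis-wise rescaling) and plain Python lists; the function names are illustrative and not part of F5.

```python
def transform_vector(jacobian, v):
    """Contravariant components: v'^i = J^i_j v^j."""
    return [sum(jacobian[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]

def transform_covector(inverse_jacobian, w):
    """Covariant components: w'_i = w_j (J^-1)^j_i."""
    return [sum(w[j] * inverse_jacobian[j][i] for j in range(len(w))) for i in range(len(w))]

# Chart change x' = 2x, y' = 4y, so J = diag(2, 4) and J^-1 = diag(0.5, 0.25).
J     = [[2.0, 0.0], [0.0, 4.0]]
J_inv = [[0.5, 0.0], [0.0, 0.25]]

v = [1.0, 1.0]          # tangent vector components in the old chart
w = [3.0, 8.0]          # covector components in the old chart

v_new = transform_vector(J, v)           # [2.0, 4.0]
w_new = transform_covector(J_inv, w)     # [1.5, 2.0]

# The pairing of a covector with a vector is chart-independent.
pairing_old = sum(a * b for a, b in zip(w, v))          # 11.0
pairing_new = sum(a * b for a, b in zip(w_new, v_new))  # 11.0
```

Storing both quantities as untyped pairs of floats would lose exactly the information needed to choose between the two transformation rules, which is why the algebraic type has to travel with the field.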

---

## **19.5 Geometry**

Geometry provides:

- embeddings into ℝⁿ  
- metric interpretation  
- geometric queries  
- geometric ordering  
- spatial reasoning  

In the F5 model:

- geometry is introduced only through coordinate representations,  
- Positions fields define embeddings,  
- geometry determines ordering and spatial queries,  
- storage layout has no geometric meaning.

---

## **19.6 Index‑Space Theory**

Index‑space theory covers:

- discrete index sets  
- nested index spaces (IndexDepth)  
- mappings between index spaces  
- partial coverage  
- fragment offsets  
- procedural fields  

This domain is essential for:

- interpreting Skeletons,  
- understanding relative representations,  
- handling fragmented fields,  
- interpreting refinement structures.

---

## **19.7 Set‑Theoretic Semantics**

The F5 model is fundamentally set‑theoretic:

- Skeletons define sets,
- Representations define functions between sets,
- Fields define functions from index sets to values,
- Fragments define partitions of sets.

This domain is required for:

- understanding identity,
- interpreting refinement mappings,
- handling partial fields,
- merging datasets.

---

## **19.8 Identity Theory**

Identity in F5 is defined by **HDF5 object identity**, not content.

This domain covers:

- object identity vs equality
- symbolic links
- identity propagation
- time‑dependence semantics

Identity theory is essential for:

- time‑dependent fields,
- fragment‑level updates,
- merging across files.

---

## **19.9 Optional Domains**

Depending on the dataset, additional mathematical domains MAY be relevant:

- measure theory (densities, integrals)  
- graph theory (connectivity queries)  
- algebraic topology (homology, cohomology)  
- numerical analysis (interpolation, discretization)  

These domains are not required by the F5 model but MAY be used by applications.

---

## **19.10 Derivability**

All mathematical structures required to fit a dataset into the F5 layout are derivable from:

1. differential geometry
2. topology
3. fiber‑bundle theory
4. geometric/tensor algebra
5. index‑space theory
6. identity theory

No additional mathematical assumptions are required.

---

# **End of Section 19**

---
