diff --git a/doc/html/FF-IH_FileGroup.gif b/doc/html/FF-IH_FileGroup.gif new file mode 100644 index 0000000000..b0d76f5071 Binary files /dev/null and b/doc/html/FF-IH_FileGroup.gif differ diff --git a/doc/html/FF-IH_FileObject.gif b/doc/html/FF-IH_FileObject.gif new file mode 100644 index 0000000000..8eba623b1d Binary files /dev/null and b/doc/html/FF-IH_FileObject.gif differ diff --git a/doc/html/H5.format.html b/doc/html/H5.format.html index afcd4442b2..8c0d8b208a 100644 --- a/doc/html/H5.format.html +++ b/doc/html/H5.format.html @@ -1,144 +1,169 @@
+Other HDF5 documents and links +Introduction to HDF5 + |
++ |
+HDF5 User Guide +HDF5 Reference Manual + |
The format of a HDF5 file on disk encompasses several - key ideas of the current HDF4 & AIO file formats as well as - addressing some short-comings therein. The new format is +
+ + ![]() | ||
+ Figure 1: Relationships among the HDF5 root group, other groups, and objects
+ + | ||
+ ![]() | ||
+ Figure 2: HDF5 objects -- datasets, datatypes, or dataspaces
+ + |
The format of an HDF5 file on disk encompasses several + key ideas of the HDF4 and AIO file formats and + addresses some shortcomings therein. The new format is more self-describing than the HDF4 format and is more uniformly applied to data objects in the file. -
An HDF5 file can be thought of as a directed graph. - The nodes of this graph are the higher-level HDF5 objects, - including groups, datasets, datatypes, and dataspaces. - This document describes the lower-level data objects used by - the HDF5 library to represent those higher-level objects and - their properties. +
An HDF5 file appears to the user as a directed graph. + The nodes of this graph are the higher-level HDF5 objects + that are exposed by the HDF5 APIs: -
At the lowest level, an HDF5 file is made up of the following - objects:
At the lowest level, as information is actually written to the disk, + an HDF5 file is made up of the following objects: +
This document describes the lower-level data objects; + the higher-level objects and their properties are described + in the HDF5 User's Guide. + + -
The file driver information block is an optional region of the + +
The file driver information block is an optional region of the file which contains information needed by the file driver in - order to reopen a file. The format of the driver information + order to reopen a file. The format of the file driver information block is: - +
byte | -byte | -byte | -byte |
---|---|---|---
byte | +byte | +byte | +byte |
Version | Reserved (zero) | @@ -540,12 +568,12 @@ each high-level object.
Field Name | -Description | -
---|---
Field Name | +Description | +
Version | The version number of the driver information block. The file format documented here is version zero. | @@ -563,14 +591,15 @@ each high-level object. termination which identifies the driver and version number of the Driver Information block. The predefined drivers supplied with the HDF5 library are identified by the - letters "NCSA" followed by the first four characters of + letters
byte | -byte | -byte | -byte | - -
---|---|---|---|
Node Signature | - -|||
Node Type | -Node Level | -Entries Used | - -|
Address of Left Sibling | - -|||
Address of Right Sibling | - -|||
Key 0 (variable size) | - -|||
Address of Child 0 | - -|||
Key 1 (variable size) | - -|||
Address of Child 1 | - -|||
... | - -|||
Key 2K (variable size) | - -|||
Address of Child 2K | - -|||
Key 2K+1 (variable size) | -
-
Field Name | -Description | -
---|---|
Node Signature | -The value ASCII 'TREE' is used to indicate the - beginning of a B-link tree node. This gives file - consistency checking utilities a better chance of - reconstructing a damaged file. | -
Node Type | -Each B-link tree points to a particular type of data.
- This field indicates the type of data as well as
- implying the maximum degree K of the tree and
- the size of each Key field.
- -
|
-
Node Level | -The node level indicates the level at which this node - appears in the tree (leaf nodes are at level zero). Not - only does the level indicate whether child pointers - point to sub-trees or to data, but it can also be used - to help file consistency checking utilities reconstruct - damanged trees. | -
Entries Used | -This determines the number of children to which this - node points. All nodes of a particular type of tree - have the same maximum degree, but most nodes will point - to less than that number of children. The valid child - pointers and keys appear at the beginning of the node - and the unused pointers and keys appear at the end of - the node. The unused pointers and keys have undefined - values. | -
Address of Left Sibling | -This is the file address of the left sibling of the - current node relative to the boot block. If the current - node is the left-most node at this level then this field - is the undefined address (all bits set). | -
Address of Right Sibling | -This is the file address of the right sibling of the - current node relative to the boot block. If the current - node is the right-most node at this level then this - field is the undefined address (all bits set). | -
Keys and Child Pointers | -Each tree has 2K+1 keys with 2K - child pointers interleaved between the keys. The number - of keys and child pointers actually containing valid - values is determined by the `Entries Used' field. If - that field is N then the B-link tree contains - N child pointers and N+1 keys. | -
Key | -The format and size of the key values is determined by - the type of data to which this tree points. The keys are - ordered and are boundaries for the contents of the child - pointer. That is, the key values represented by child - N fall between Key N and Key - N+1. Whether the interval is open or closed on - each end is determined by the type of data to which the - tree points. | -
Address of Children | -The tree node contains file addresses of subtrees or - data depending on the node level (0 implies data - addresses). | -
A symbol table is a group internal to the file that allows - arbitrary nesting of objects (including other symbol - tables). A symbol table maps a set of names to a set of file - address relative to the file boot block. Certain meta data - for an object to which the symbol table points can be cached - in the symbol table in addition to (or in place of?) the - object header. - -
An HDF5 object name space can be stored hierarchically by - partitioning the name into components and storing each - component in a symbol table. The symbol table entry for a - non-ultimate component points to the symbol table containing - the next component. The symbol table entry for the last - component points to the object being named. - -
A symbol table is a collection of symbol table nodes pointed - to by a B-link tree. Each symbol table node contains entries - for one or more symbols. If an attempt is made to add a - symbol to an already full symbol table node containing - 2K entries, then the node is split and one node - contains K symbols and the other contains - K+1 symbols. - -
-
byte | -byte | -byte | -byte | - -
---|---|---|---|
Node Signature | - -|||
Version Number | -Reserved for Future Use | -Number of Symbols | - -|
Symbol Table Entries |
-
-
Field Name | -Description | -
---|---|
Node Signature | -The value ASCII 'SNOD' is used to indicate the - beginning of a symbol table node. This gives file - consistency checking utilities a better chance of - reconstructing a damaged file. | -
Version Number | -The version number for the symbol table node. This - document describes version 1. | -
Number of Symbols | -Although all symbol table nodes have the same length, - most contain fewer than the maximum possible number of - symbol entries. This field indicates how many entries - contain valid data. The valid entries are packed at the - beginning of the symbol table node while the remaining - entries contain undefined values. | -
Symbol Table Entries | -Each symbol has an entry in the symbol table node. - The format of the entry is described below. | -
Each symbol table entry in a symbol table node is designed to allow - for very fast browsing of commonly stored scientific objects. - Toward that design goal, the format of the symbol-table entries - includes space for caching certain constant meta data from the - object header. - -
-
byte | -byte | -byte | -byte | -
---|---|---|---|
Name Offset (<size> bytes) | -|||
Object Header Address | -|||
Symbol-Type | -|||
Reserved | -|||
Scratch-pad Space (16 bytes) |
-
-
Field Name | -Description | -
---|---|
Name Offset | -This is the byte offset into the symbol table local - heap for the name of the symbol. The name is null - terminated. | -
Object Header Address | -Every object has an object header which serves as a - permanent home for the object's meta data. In addition - to appearing in the object header, the meta data can be - cached in the scratch-pad space. | -
Symbol-Type | -The symbol type is determined from the object header.
- It also determines the format for the scratch-pad space.
- The value zero indicates that no object header meta data
- is cached in the symbol table entry.
- -
|
-
Reserved | -These for bytes are present so that the scratch pad - space is aligned on an eight-byte boundary. They are - always set to zero. | -
Scratch-Pad Space | -This space is used for different purposes, depending - on the value of the Symbol Type field. Any meta-data - about a dataset object represented in the scratch-pad - space is duplicated in the object header for that - dataset. Furthermore, no data is cached in the symbol - table entry scratch-pad space if the object header for - the symbol table entry has a link count greater than - one. | -
The symbol table entry scratch-pad space is formatted - according to the value of the Symbol Type field. If the - Symbol Type field has the value zero then no information is - stored in the scratch pad space. - -
If the Symbol Type field is one, then the scratch pad space - contains cached meta data for another symbol table with the format: - -
-
byte | -byte | -byte | -byte | - -
---|---|---|---|
Address of B-tree | - -|||
Address of Name Heap | -
-
Field Name | -Description | -
---|---|
Address of B-tree | -This is the file address for the symbol table's - B-tree. | -
Address of Name Heap | -This is the file address for the symbol table's local - heap that stores the symbol names. | -
-
byte | -byte | -byte | -byte | -
---|---|---|---|
Offset to Link Value | -
-
Field Name | -Description | -
---|---|
Offset to Link Value | -The value of a symbolic link (that is, the name of the - thing to which it points) is stored in the local heap. - This field is the 4-byte offset into the local heap for - the start of the link value, which is null terminated. | -
A heap is a collection of small heap objects. Objects can be - inserted and removed from the heap at any time and the address - of a heap doesn't change once the heap is created. Note: this - is the "local" version of the heap mostly intended for the - storage of names in a symbol table. The storage of small - objects in a global heap is described below. - -
-
byte | -byte | -byte | -byte | -
---|---|---|---|
Heap Signature | -|||
Reserved (zero) | -|||
Data Segment Size | -|||
Offset to Head of Free-list (<size> bytes) | -|||
Address of Data Segment | -
-
Field Name | -Description | -
---|---|
Heap Signature | -The valid ASCII 'HEAP' is used to indicate the - beginning of a heap. This gives file consistency - checking utilities a better chance of reconstructing a - damaged file. | -
Data Segment Size | -The total amount of disk memory allocated for the heap - data. This may be larger than the amount of space - required by the object stored in the heap. The extra - unused space holds a linked list of free blocks. | -
Offset to Head of Free-list | -This is the offset within the heap data segment of the - first free block (or all 0xff bytes if there is no free - block). The free block contains <size> bytes that - are the offset of the next free chunk (or all 0xff bytes - if this is the last free chunk) followed by <size> - bytes that store the size of this free chunk. | -
Address of Data Segment | -The data segment originally starts immediately after - the heap header, but if the data segment must grow as a - result of adding more objects, then the data segment may - be relocated to another part of the file. | -
Objects within the heap should be aligned on an 8-byte boundary. - -
Each HDF5 file has a global heap which stores various types of - information which is typically shared between datasets. The - global heap was designed to satisfy these goals: - -
The implementation of the heap makes use of the memory - management already available at the file level and combines that - with a new top-level object called a collection to - achieve Goal B. The global heap is the set of all collections. - Each global heap object belongs to exactly one collection and - each collection contains one or more global heap objects. For - the purposes of disk I/O and caching, a collection is treated as - an atomic object. -
-
byte | +byte | +byte | +byte | + +
---|---|---|---|
Node Signature | + +|||
Node Type | +Node Level | +Entries Used | + +|
Address of Left Sibling | + +|||
Address of Right Sibling | + +|||
Key 0 (variable size) | + +|||
Address of Child 0 | + +|||
Key 1 (variable size) | + +|||
Address of Child 1 | + +|||
... | + +|||
Key 2K (variable size) | + +|||
Address of Child 2K | + +|||
Key 2K+1 (variable size) | +
+
Field Name | +Description | +||||||||
---|---|---|---|---|---|---|---|---|---|
Node Signature | +The ASCII character string TREE is
+ used to indicate the
+ beginning of a B-link tree node. This gives file
+ consistency checking utilities a better chance of
+ reconstructing a damaged file. |
+ ||||||||
Node Type | +Each B-link tree points to a particular type of data.
+ This field indicates the type of data as well as
+ implying the maximum degree K of the tree and
+ the size of each Key field.
+ +
|
+ ||||||||
Node Level | +The node level indicates the level at which this node + appears in the tree (leaf nodes are at level zero). Not + only does the level indicate whether child pointers + point to sub-trees or to data, but it can also be used + to help file consistency checking utilities reconstruct + damaged trees. | +
Entries Used | +This determines the number of children to which this + node points. All nodes of a particular type of tree + have the same maximum degree, but most nodes will point + to fewer than that number of children. The valid child + pointers and keys appear at the beginning of the node + and the unused pointers and keys appear at the end of + the node. The unused pointers and keys have undefined + values. | +
Address of Left Sibling | +This is the file address of the left sibling of the + current node relative to the super block. If the current + node is the left-most node at this level then this field + is the undefined address (all bits set). | +||||||||
Address of Right Sibling | +This is the file address of the right sibling of the + current node relative to the super block. If the current + node is the right-most node at this level then this + field is the undefined address (all bits set). | +||||||||
Keys and Child Pointers | +Each tree has 2K+1 keys with 2K + child pointers interleaved between the keys. The number + of keys and child pointers actually containing valid + values is determined by the Entries Used field. If + that field is N then the B-link tree contains + N child pointers and N+1 keys. | +||||||||
Key | +The format and size of the key values are determined by
+ the type of data to which this tree points. The keys are
+ ordered and are boundaries for the contents of the child
+ pointer; that is, the key values represented by child
+ N fall between Key N and Key
+ N+1. Whether the interval is open or closed on
+ each end is determined by the type of data to which the
+ tree points.
+ + The format of the key depends on the node type. + For nodes of node type 1, the key is formatted as follows: +
+ For nodes of node type 0, the key is formatted as follows: +
|
+ ||||||||
Child Pointers | +The tree node contains file addresses of subtrees or + data depending on the node level. Nodes at Level 0 point + to data addresses, either data chunk or group nodes. + Nodes at non-zero levels point to other nodes of the + same B-tree. | +
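The fixed part of the node layout above can be sketched in Python as follows. This is only an illustration under stated assumptions: little-endian byte order, a 2-byte Entries Used field (inferred from the four-column layout), and a caller-supplied address width `size_of_offsets` are not specified in this section, and the function name `parse_btree_node_header` is hypothetical.

```python
import struct

def parse_btree_node_header(buf, size_of_offsets=8):
    """Decode the signature, type, level, entry count and sibling addresses.

    Assumptions (not stated in this section): little-endian fields,
    2-byte Entries Used, and addresses that are size_of_offsets bytes wide.
    """
    if buf[:4] != b"TREE":
        raise ValueError("not a B-link tree node")
    node_type, node_level, entries_used = struct.unpack_from("<BBH", buf, 4)
    pos = 8
    left = int.from_bytes(buf[pos:pos + size_of_offsets], "little")
    pos += size_of_offsets
    right = int.from_bytes(buf[pos:pos + size_of_offsets], "little")
    undefined = (1 << (8 * size_of_offsets)) - 1  # "all bits set"
    return {
        "node_type": node_type,        # 0 = group tree, 1 = chunk tree
        "node_level": node_level,      # 0 = leaf
        "entries_used": entries_used,
        "left_sibling": None if left == undefined else left,
        "right_sibling": None if right == undefined else right,
    }
```

Mapping the undefined address to `None` keeps the "left-most" and "right-most" cases easy to test for in calling code.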
+ Each B-tree node looks like this: + +
key[0] | + | child[0] | + | key[1] | + | child[1] | + | key[2] | + | ... | + | ... | + | key[N-1] | + | child[N-1] | + | key[N] | +
The following question must next be answered: + "Is the value described by key[i] contained in + child[i-1] or in child[i]?" + The answer depends on the type of tree. + In trees for groups (node type 0) the object described by + key[i] is the greatest object contained in + child[i-1] while in chunk trees (node type 1) the + chunk described by key[i] is the least chunk in + child[i]. + +
That means that key[0] for group trees is sometimes unused; + it points to offset zero in the heap, which is always the + empty string and compares as "less-than" any valid object name. + +
And key[N] for chunk trees is sometimes unused; + it contains a chunk offset which compares as "greater-than" + any other chunk offset and has a chunk byte size of zero + to indicate that it is not actually allocated. + + +
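The two containment rules can be compared side by side in a short Python sketch. It assumes keys have already been decoded into ordinary comparable values (names for group trees, a single integer offset for chunk trees, which simplifies real chunk keys); the function name `child_index` is hypothetical.

```python
from bisect import bisect_left, bisect_right

GROUP_TREE = 0  # node type 0
CHUNK_TREE = 1  # node type 1

def child_index(node_type, keys, target):
    """Return i such that child[i] may hold target, or None if out of range.

    keys holds the N+1 valid keys in order; the node has N children.
    """
    n_children = len(keys) - 1
    if node_type == GROUP_TREE:
        # key[i+1] is the greatest object in child[i]:
        # child[i] covers key[i] < target <= key[i+1]
        i = bisect_left(keys, target) - 1
    elif node_type == CHUNK_TREE:
        # key[i] is the least chunk in child[i]:
        # child[i] covers key[i] <= target < key[i+1]
        i = bisect_right(keys, target) - 1
    else:
        raise ValueError("unknown node type")
    return i if 0 <= i < n_children else None
```

With the group keys `["", "delta", "mango"]`, the empty string in key[0] acts only as a lower bound, which matches the remark that key[0] is sometimes unused in group trees.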
A group is an object internal to the file that allows + arbitrary nesting of objects (including other groups). + A group maps a set of names to a set of file + addresses relative to the base address. Certain meta data + for an object to which the group points can be duplicated + in the group symbol table in addition to the object header. + +
An HDF5 object name space can be stored hierarchically by + partitioning the name into components and storing each + component in a group. The group entry for a + non-ultimate component points to the group containing + the next component. The group entry for the last + component points to the object being named. + +
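As a rough illustration of this component-by-component lookup, the Python sketch below models each group as a plain dictionary from names to entries. The dictionary model, the `/` separator, and the function name `resolve` are assumptions made for the example; they are not part of the on-disk format described here.

```python
def resolve(root, path):
    """Follow one group entry per name component and return the final object.

    Groups are modeled as dicts (an assumption for illustration only);
    anything else is treated as a non-group object.
    """
    node = root
    for component in [c for c in path.split("/") if c]:
        if not isinstance(node, dict):
            raise KeyError(f"{component!r}: parent is not a group")
        node = node[component]
    return node
```

Every non-ultimate component must name a group, so a lookup that passes through a non-group object fails before reaching the last component.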
A group is a collection of group nodes pointed + to by a B-link tree. Each group node contains entries + for one or more symbols. If an attempt is made to add a + symbol to an already full group node containing + 2K entries, then the node is split and one node + contains K symbols and the other contains + K+1 symbols. + +
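The split rule can be sketched in Python as below. The text does not say which of the two resulting nodes receives K symbols and which receives K+1, nor that entries are kept sorted within a node; both are assumptions of this example, and `insert_symbol` is a hypothetical name.

```python
def insert_symbol(node, name, k):
    """Insert name into a group node, splitting a node that holds 2K entries.

    node is a list of symbol names; returns a list of one or two nodes.
    Assumption: the lower node receives K symbols and the upper K + 1.
    """
    merged = sorted(node + [name])
    if len(node) < 2 * k:
        return [merged]
    # 2K + 1 names in total: K go to one node, K + 1 to the other.
    return [merged[:k], merged[k:]]
```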
+
byte | +byte | +byte | +byte | + +
---|---|---|---|
Node Signature | + +|||
Version Number | +Reserved for Future Use | +Number of Symbols | + +|
Group Entries |
+
+
Field Name | +Description | +
---|---|
Node Signature | +The ASCII character string SNOD is
+ used to indicate the
+ beginning of a group node. This gives file
+ consistency checking utilities a better chance of
+ reconstructing a damaged file. |
+
Version Number | +The version number for the group node. This + document describes version 1. | +
Number of Symbols | +Although all group nodes have the same length, + most contain fewer than the maximum possible number of + symbol entries. This field indicates how many entries + contain valid data. The valid entries are packed at the + beginning of the group node while the remaining + entries contain undefined values. | +
Group Entries | +Each symbol has an entry in the group node. + The format of the entry is described below. | +
Each group entry in a group node is designed + to allow for very fast browsing of stored objects. + Toward that design goal, the group entries + include space for caching certain constant meta data from the + object header. + +
+
byte | +byte | +byte | +byte | +
---|---|---|---|
Name Offset (<size> bytes) | +|||
Object Header Address | +|||
Cache Type | +|||
Reserved | +|||
Scratch-pad Space (16 bytes) |
+
+
Field Name | +Description | +
---|---|
Name Offset | +This is the byte offset into the group local + heap for the name of the object. The name is null + terminated. | +
Object Header Address | +Every object has an object header which serves as a + permanent location for the object's meta data. In addition + to appearing in the object header, some meta data can be + cached in the scratch-pad space. | +
Cache Type | +The cache type is determined from the object header.
+ It also determines the format for the scratch-pad space.
+ +
|
+
Reserved | +These four bytes are present so that the scratch-pad + space is aligned on an eight-byte boundary. They are + always set to zero. | +
Scratch-pad Space | +This space is used for different purposes, depending + on the value of the Cache Type field. Any meta data + about a dataset object represented in the scratch-pad + space is duplicated in the object header for that + dataset. This meta data can include the datatype + and the size of the dataspace for a dataset whose datatype + is atomic and whose dataspace is fixed and has fewer than + four dimensions. + Furthermore, no data is cached in the group + entry scratch-pad space if the object header for + the group entry has a link count greater than + one. | +
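A group entry with the layout above could be decoded along the following lines. This Python sketch assumes little-endian fields, a 4-byte Cache Type, and caller-supplied widths `size` (for the name offset) and `addr_size` (for the object header address); those widths and the name `parse_group_entry` are assumptions, since this section gives them only as `<size>` and as an address.

```python
import struct

def parse_group_entry(buf, size=8, addr_size=8):
    """Split one group entry into its five fields.

    Assumptions: little-endian encoding, 4-byte Cache Type, 4 reserved
    bytes, and a 16-byte scratch-pad that closes the entry.
    """
    name_offset = int.from_bytes(buf[0:size], "little")
    pos = size
    header_addr = int.from_bytes(buf[pos:pos + addr_size], "little")
    pos += addr_size
    cache_type, reserved = struct.unpack_from("<II", buf, pos)
    pos += 8
    scratch = bytes(buf[pos:pos + 16])
    if len(scratch) != 16:
        raise ValueError("truncated group entry")
    return {
        "name_offset": name_offset,            # offset into the local heap
        "object_header_address": header_addr,
        "cache_type": cache_type,
        "reserved": reserved,                  # expected to be zero
        "scratch_pad": scratch,
    }
```

With 8-byte offsets and addresses the entry occupies 40 bytes, and the scratch-pad starts at byte 24, which is on an eight-byte boundary as the Reserved field intends.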
The group entry scratch-pad space is formatted + according to the value in the Cache Type field. + +
If the Cache Type field contains the value zero
+ (0) then no information is
+ stored in the scratch-pad space.
+
+
If the Cache Type field contains the value one
+ (1), then the scratch-pad space
+ contains cached meta data for another object header
+ in the following format:
+
+
+
byte | +byte | +byte | +byte | + +
---|---|---|---|
Address of B-tree | + +|||
Address of Name Heap | +
+
Field Name | +Description | +
---|---|
Address of B-tree | +This is the file address for the root of the + group's B-tree. | +
Address of Name Heap | +This is the file address for the group's local + heap, in which are stored the symbol names. | +
If the Cache Type field contains the value two
+ (2), then the scratch-pad space
+ contains cached meta data for a symbolic link
+ in the following format:
+
+
+
byte | +byte | +byte | +byte | +
---|---|---|---|
Offset to Link Value | +
+
Field Name | +Description | +
---|---|
Offset to Link Value | +The value of a symbolic link (that is, the name of the + thing to which it points) is stored in the local heap. + This field is the 4-byte offset into the local heap for + the start of the link value, which is null terminated. | +
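The three scratch-pad interpretations can be summarized in a small Python sketch. It assumes little-endian fields and an address width `addr_size` supplied by the caller; the function name `decode_scratch_pad` and the dictionary keys are hypothetical.

```python
def decode_scratch_pad(cache_type, scratch, addr_size=8):
    """Interpret the 16-byte scratch-pad according to the Cache Type field."""
    if cache_type == 0:
        # Nothing is cached.
        return None
    if cache_type == 1:
        # Cached group meta data: B-tree address, then name heap address.
        btree = int.from_bytes(scratch[:addr_size], "little")
        heap = int.from_bytes(scratch[addr_size:2 * addr_size], "little")
        return {"btree_address": btree, "name_heap_address": heap}
    if cache_type == 2:
        # Symbolic link: 4-byte offset of the link value in the local heap.
        return {"link_value_offset": int.from_bytes(scratch[:4], "little")}
    raise ValueError(f"unknown cache type {cache_type}")
```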
A heap is a collection of small heap objects. Objects can be + inserted and removed from the heap at any time. + The address of a heap does not change once the heap is created. + References to objects are stored in the group table; + the names of those objects are stored in the local heap. + +
+
byte | +byte | +byte | +byte | +
---|---|---|---|
Heap Signature | +|||
Reserved (zero) | +|||
Data Segment Size | +|||
Offset to Head of Free-list (<size> bytes) | +|||
Address of Data Segment | +
+
Field Name | +Description | +
---|---|
Heap Signature | +The ASCII character string HEAP
+ is used to indicate the
+ beginning of a heap. This gives file consistency
+ checking utilities a better chance of reconstructing a
+ damaged file. |
+
Data Segment Size | +The total amount of disk memory allocated for the heap + data. This may be larger than the amount of space + required by the objects stored in the heap. The extra + unused space holds a linked list of free blocks. | +
Offset to Head of Free-list | +This is the offset within the heap data segment of the + first free block (or all 0xff bytes if there is no free + block). Each free block begins with <size> bytes that + hold the offset of the next free block (or all 0xff bytes + if this is the last free block) followed by <size> + bytes that store the size of this free block. | +
Address of Data Segment | +The data segment originally starts immediately after + the heap header, but if the data segment must grow as a + result of adding more objects, then the data segment may + be relocated, in its entirety, to another part of the + file. | +
Objects within the heap should be aligned on an 8-byte boundary. + +
Each HDF5 file has a global heap which stores various types of + information which is typically shared between datasets. The + global heap was designed to satisfy these goals: + +
The implementation of the heap makes use of the memory + management already available at the file level and combines that + with a new top-level object called a collection to + achieve Goal B. The global heap is the set of all collections. + Each global heap object belongs to exactly one collection and + each collection contains one or more global heap objects. For + the purposes of disk I/O and caching, a collection is treated as + an atomic object. + +
+
byte | +byte | +byte | +byte | +
---|---|---|---|
Magic Number | +|||
Version | +Reserved | + + +||
Collection Size | +|||
Global Heap Object 1 + (described below) |
+ |||
Global Heap Object 2 |
+ |||
... |
+ |||
Global Heap Object N |
+ |||
Global Heap Object 0 (free space) |
+
+
Field Name | +Description | +
---|---|
Magic Number | +The magic number for global heap collections are the
+ four bytes G , C , O ,
+ and L . |
+
Version | +Each collection has its own version number so that new + collections can be added to old files. This document + describes version zero of the collections. + |
Collection Data Size | +This is the size in bytes of the entire collection + including this field. The default (and minimum) + collection size is 4096 bytes which is a typical file + system block size and which allows for 170 16-byte heap + objects plus their overhead. | +
Object 1 through N | +The objects are stored in any order with no + intervening unused space. | +
Object 0 | +Object 0 (zero), when present, represents the free space in + the collection. Free space always appears at the end of + the collection. If the free space is too small to store + the header for Object 0 (described below) then the + header is implied and the collection contains no free space. + |
+
byte | +byte | +byte | +byte | +
---|---|---|---|
Object ID | +Reference Count | +||
Reserved | +|||
Object Data Size | +|||
Object Data |
+
+
Field Name | +Description | +
---|---|
Object ID | +Each object has a unique identification number within a
+ collection. The identification numbers are chosen so that
+ new objects have the smallest value possible with the
+ exception that the identifier 0 always refers to the
+ object which represents all free space within the
+ collection. |
+
Reference Count | +All heap objects have a reference count field. An + object which is referenced from some other part of the + file will have a positive reference count. The reference + count for Object 0 is always zero. | +
Reserved | +Zero padding to align next field on an 8-byte + boundary. | +
Object Size | This is the size of the the fields + above plus the object data stored for the object. The + actual storage size is rounded up to a multiple of + eight. | +
Object Data | +The object data is treated as a one-dimensional array + of bytes to be interpreted by the caller. | +
The Free-space Index is a collection of blocks of data, + dispersed throughout the file, which are currently not used by + any file objects. + +
The super block contains a pointer to root of the free-space description;
+ that pointer is currently (i.e., in HDF5 Release 1.2) required
+ to be the undefined address 0xfff...ff
.
+
+
The free-sapce index is not otherwise publicly defined at this time. + + + -
Data objects contain the real information in the file. These objects compose the scientific data and other information which @@ -1500,7 +1661,7 @@ each high-level object.
0
1
2-7
The following is a list of currently defined header messages:
The Simple Dimensionality message describes the number +
The Simple Dataspace message describes the number of dimensions and size of each dimension that the data object has. This message is only used for datasets which have a - simple, rectilinear grid layout, datasets requiring a more - complex layout (irregularly or unstructured grids, etc) must use - the Data-Space message for expressing the space the - dataset inhabits. + simple, rectilinear grid layout; datasets requiring a more + complex layout (irregularly structured or unstructured grids, etc.) + must use the Complex Dataspace message for expressing + the space the dataset inhabits. + (Note: The Complex Dataspace functionality is + not yet implemented (as of HDF5 Release 1.2). It is not described + in this document.)
Description | |
---|---|
Version | +This value is used to determine the format of the + Simple Dataspace Message. When the format of the + information in the message is changed, the version number + is incremented and can be used to determine how the + information in the object header is formatted. | +
Dimensionality | This value is the number of dimensions that the data @@ -1766,6 +1943,7 @@ each high-level object. |
The data type message defines the data type for each data point - of a dataset. A data type can describe an atomic type like a +
The datatype message defines the datatype for each data point + of a dataset. A datatype can describe an atomic type like a fixed- or floating-point type or a compound type like a C - struct. A data type does not, however, describe how data points - are combined to produce a dataset. Data types are stored on disk - as a data type message, which is a list of data type classes and + struct. A datatype does not, however, describe how data points + are combined to produce a dataset. Datatypes are stored on disk + as a datatype message, which is a list of datatype classes and their associated properties.
0-15 | Number of Members. This field contains the number - of members defined for the compound data type. The member + of members defined for the compound datatype. The member definitions are listed in the Properties field of the data type message. |
The Properties field of a compound data type is a list of the - member definitions of the compound data type. The member +
The Properties field of a compound datatype is a list of the + member definitions of the compound datatype. The member definitions appear one after another with no intervening bytes. - The member types are described with a recursive data type + The member types are described with a recursive datatype message.
@@ -2536,11 +2719,88 @@ each high-level object.
Data type examples are here. +
+
Bits | +Meaning | +
---|---|
0-15 | +Number of Members. The number of name/value + pairs defined for the enumeration type. | +
16-23 | +Reserved (zero). | +
+
Byte | +Byte | +Byte | +Byte | +
---|---|---|---|
Parent Type |
+ |||
Names |
+ |||
Values |
+
Parent Type: | +Each enumeration type is based on some parent type, + usually an integer. The information for that parent type is + described recursively by this field. | +
Names: | +The name for each name/value pair. Each name is + stored as a null terminated ASCII string in a multiple of + eight bytes. The names are in no particular order. | +
Values: | +The list of values in the same order as the names. + The values are packed (no inter-value padding) and the + size of each value is determined by the parent type. | +
The fill value message stores a single data point value which is returned to the application when an uninitialized data point is read from the dataset. The fill value is interpretted with - the same data type as the dataset. If no fill value message is + the same datatype as the dataset. If no fill value message is present then a fill value of all zero is assumed.
@@ -2591,14 +2851,13 @@ each high-level object.
This message indicates that the data for the data object is stored within the current HDF file by including the actual - data within the header data for this message. The data is + data as the header data for this message. The data is stored internally in - the "normal" format, i.e. in one chunk, un-compressed, etc. + the normal format, i.e. in one chunk, uncompressed, etc. -
Note that one and only one of the "Data Storage" headers can be +
Note that one and only one of the Data Storage headers can be stored for each data object.
Format of Data: The message data is actually composed of dataset data, so the format will be determined by the dataset format. +
Purpose and Description: The Attribute message is used to list objects in the HDF file which are used as attributes, or "meta-data" about the current object. An - attribute is a small dataset; it has a name, a data type, a data + attribute is a small dataset; it has a name, a datatype, a data space, and raw data. Since attributes are stored in the object header they must be relatively small (<64kb) and can be associated with any type of object which has an object header @@ -3189,6 +3459,12 @@ each high-level object. version 1 of attribute messages.
Type: 0x000D
Length: varies
@@ -3259,7 +3529,7 @@ each high-level object.
Purpose and Description: The object name or comment is
designed to be a short description of an object. An object name
- is a sequence of non-zero ('\0') ASCII characters with no other
+ is a sequence of non-zero (\0
) ASCII characters with no other
formatting included by the library.
@@ -3298,8 +3568,7 @@ each high-level object.
Type: 0x000E
Length: fixed
@@ -3357,40 +3626,40 @@ each high-level object.
1998
. All fields of this message should be interpreted
as coordinated universal time (UTC)01
and December is 12
.
01
.
00
and 11:00pm is 23
.
00
and
+ the last is 59
.
00
+ and the last is 59
.
The symbol table message is formatted as follows: +
The group message is formatted as follows:
byte | |||||||
---|---|---|---|---|---|---|---|
B-Tree Address | +B-tree Address | ||||||
Heap Address | @@ -3579,7 +3850,7 @@ name heap which are pointed to by this message.
+Other HDF5 documents and links +Introduction to HDF5 + |
++ |
+HDF5 User Guide +HDF5 Reference Manual + |