<HTML><HEAD>
|
|
<TITLE>HDF5 Tutorial - Creating a Dataset
|
|
</TITLE>
|
|
</HEAD>
|
|
|
|
<body bgcolor="#ffffff">
|
|
|
|
<!-- BEGIN MAIN BODY -->
|
|
|
|
|
|
[ <A HREF="title.html"><I>HDF5 Tutorial Top</I></A> ]
|
|
<H1>
|
|
<BIG><BIG><BIG><FONT COLOR="#c101cd">Creating a Dataset</FONT>
|
|
</BIG></BIG></BIG></H1>
|
|
|
|
<hr noshade size=1>
|
|
|
|
<BODY>
|
|
<H2>Contents:</H2>
|
|
<UL>
|
|
<LI> <A HREF="#def">What is a Dataset</A>?
|
|
<LI> Programming Example
|
|
<UL>
|
|
<LI> <A HREF="#desc">Description</A>
|
|
<LI> <A HREF="#rem">Remarks</A>
|
|
<LI> <A HREF="#fc">File Contents</A>
|
|
<LI> <A HREF="#ddl">Dataset Definition in DDL</A>
|
|
</UL>
|
|
</UL>
|
|
<HR>
|
|
<A NAME="def">
|
|
<H2>What is a Dataset?</h2>
|
|
<P>
|
|
A dataset is a multidimensional array of data elements, together with
supporting metadata. To create a dataset, the application program must specify
the location at which to create the dataset, the dataset name, the datatype
and dataspace of the data array, and the dataset creation property list.
|
|
<P>
|
|
<H3> Datatypes</H3>
|
|
A datatype is a collection of datatype properties, all of which can
be stored on disk, and which, when taken as a whole, provide complete
information for data conversion to or from that datatype.
|
|
<P>
|
|
There are two categories of datatypes in HDF5: atomic and compound
datatypes. An <i>atomic datatype</i> is a datatype which cannot be
decomposed into smaller datatype units at the API level.
Atomic datatypes include the integer, float, date and time, string,
bitfield, and opaque datatypes.
A <i>compound datatype</i> is a collection of one or more
atomic datatypes and/or small arrays of such datatypes.
|
|
<P>
|
|
Figure 5.1 shows the HDF5 datatypes. Some of the HDF5 predefined
|
|
atomic datatypes are listed in Figures 5.2a and 5.2b.
|
|
In this tutorial, we consider only HDF5 predefined integers.
|
|
For further information on datatypes, see
|
|
<a href="../Datatypes.html">The Datatype Interface (H5T)</a> in the
|
|
<cite>HDF5 User's Guide</cite>.
|
|
<P>
|
|
<B>Fig 5.1</B> <I>HDF5 datatypes</I>
|
|
<PRE>

                                          +--  integer
                                          +--  floating point
                        +---- atomic  ----+--  date and time
                        |                 +--  character string
       HDF5 datatypes --|                 +--  bitfield
                        |                 +--  opaque
                        |
                        +---- compound

</PRE>
|
|
|
|
<table width="100%" border="0" cellpadding="4">
|
|
<tr><td valign=top>
|
|
|
|
<B>Fig. 5.2a</B> <I>Examples of HDF5 predefined datatypes</I>
|
|
|
|
<table width="95%" border="1" cellpadding="0">
|
|
<tr bgcolor="#ffcc99" bordercolor="#FFFFFF">
|
|
<td width="20%"><b>Datatype</b></td>
|
|
<td width="80%"><b>Description</b></td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><code>H5T_STD_I32LE</code></td>
|
|
<td width="80%">Four-byte, little-endian, signed, two's complement integer</td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><code>H5T_STD_U16BE</code></td>
|
|
<td width="80%">Two-byte, big-endian, unsigned integer</td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><code>H5T_IEEE_F32BE</code></td>
|
|
<td width="80%">Four-byte, big-endian, IEEE floating point</td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><code>H5T_IEEE_F64LE</code></td>
|
|
<td width="80%">Eight-byte, little-endian, IEEE floating point</td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><code>H5T_C_S1</code></td>
|
|
<td width="80%">One-byte, null-terminated string of eight-bit characters</td>
|
|
</tr>
|
|
</table>
|
|
|
|
</td><td valign=top>
|
|
|
|
<B>Fig. 5.2b</B> <I>Examples of HDF5 predefined native datatypes</I>
|
|
<table width="95%" border="1" cellpadding="4">
|
|
<tr bgcolor="#ffcc99" bordercolor="#FFFFFF">
|
|
<td width="20%"><b>Native Datatype</b></td>
|
|
<td width="80%"><b>Corresponding C or FORTRAN Type</b></td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><B>C:</B></td>
|
|
<td width="80%"> </td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><code>H5T_NATIVE_INT</code></td>
|
|
<td width="80%">int</td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><code>H5T_NATIVE_FLOAT</code></td>
|
|
<td width="80%">float</td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><code>H5T_NATIVE_CHAR</code></td>
|
|
<td width="80%">char</td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><code>H5T_NATIVE_DOUBLE</code></td>
|
|
<td width="80%">double</td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><code>H5T_NATIVE_LDOUBLE</code></td>
|
|
<td width="80%">long double</td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><B>FORTRAN:</B></td>
|
|
<td width="80%"> </td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><code>H5T_NATIVE_INT</code></td>
|
|
<td width="80%">integer</td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><code>H5T_NATIVE_REAL</code></td>
|
|
<td width="80%">real</td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><code>H5T_NATIVE_DOUBLE</code></td>
|
|
<td width="80%">double precision</td>
|
|
</tr>
|
|
<tr bordercolor="#FFFFFF">
|
|
<td bgcolor="#99cccc" width="20%"><code>H5T_NATIVE_CHAR</code></td>
|
|
<td width="80%">character</td>
|
|
</tr>
|
|
</table>
|
|
|
|
</table>
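<P>
The byte orders named in Figure 5.2a can be illustrated without the HDF5
library. The following is a minimal sketch in plain C, not HDF5 code: the
helper names <code>encode_i32_le</code>, <code>encode_i32_be</code>, and
<code>decode_i32_be</code> are hypothetical and exist only for this
illustration. It shows the little-endian layout described for
<code>H5T_STD_I32LE</code> and the corresponding big-endian layout for a
four-byte, two's complement integer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write a signed 32-bit value as four bytes, least significant byte first
 * (the little-endian layout described for H5T_STD_I32LE). */
void encode_i32_le(int32_t value, unsigned char out[4])
{
    uint32_t bits;
    assert(out != NULL);
    memcpy(&bits, &value, sizeof bits);  /* reuse the two's complement bits */
    out[0] = (unsigned char)(bits & 0xFFu);
    out[1] = (unsigned char)((bits >> 8) & 0xFFu);
    out[2] = (unsigned char)((bits >> 16) & 0xFFu);
    out[3] = (unsigned char)((bits >> 24) & 0xFFu);
}

/* Write the same value most significant byte first (big-endian layout). */
void encode_i32_be(int32_t value, unsigned char out[4])
{
    uint32_t bits;
    assert(out != NULL);
    memcpy(&bits, &value, sizeof bits);
    out[0] = (unsigned char)((bits >> 24) & 0xFFu);
    out[1] = (unsigned char)((bits >> 16) & 0xFFu);
    out[2] = (unsigned char)((bits >> 8) & 0xFFu);
    out[3] = (unsigned char)(bits & 0xFFu);
}

/* Rebuild a signed 32-bit value from four big-endian bytes. */
int32_t decode_i32_be(const unsigned char in[4])
{
    uint32_t bits;
    int32_t value;
    assert(in != NULL);
    bits = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8) | (uint32_t)in[3];
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

For example, the value <code>0x01020304</code> is stored as the bytes
<code>04 03 02 01</code> in the little-endian layout and as
<code>01 02 03 04</code> in the big-endian layout.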
|
|
|
|
<H3> Datasets and Dataspaces</H3>
|
|
|
|
A dataspace describes the dimensionality of the data array. A dataspace
is either a regular N-dimensional array of data points, called a simple
dataspace, or a more general collection of data points organized in
another manner, called a complex dataspace. Figure 5.3 shows HDF5 dataspaces.
In this tutorial, we consider only simple dataspaces.
|
|
<P>
|
|
<B>Fig 5.3</B> <I>HDF5 dataspaces</I>
|
|
<PRE>

                         +-- simple
       HDF5 dataspaces --|
                         +-- complex

</PRE>
|
|
The dimensions of a dataset can be fixed (unchanging), or they may be
unlimited, which means that they are extensible. A dataspace can also
describe a portion of a dataset, making it possible to do partial I/O
operations on selections.
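<P>
The relationship between rank, current dimension sizes, and maximum
dimension sizes can be sketched in plain C. This is an illustration only and
does not call the HDF5 library: the names <code>element_count</code>,
<code>dims_within_max</code>, and <code>DIM_UNLIMITED</code> are hypothetical,
and <code>DIM_UNLIMITED</code> is merely a stand-in for however the library
marks an unlimited dimension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an "unlimited" maximum dimension size. */
#define DIM_UNLIMITED ((unsigned long long)-1)

/* Number of elements in a simple dataspace: the product of its current
 * dimension sizes. A rank of 0 yields 1 (the empty product). */
unsigned long long element_count(int rank, const unsigned long long *dims)
{
    unsigned long long n = 1;
    int i;
    assert(rank >= 0);
    for (i = 0; i < rank; i++)
        n *= dims[i];
    return n;
}

/* Return 1 if every current size fits within its maximum, 0 otherwise.
 * A NULL maxdims means the maximum sizes equal the current sizes. */
int dims_within_max(int rank, const unsigned long long *dims,
                    const unsigned long long *maxdims)
{
    int i;
    assert(rank >= 0);
    if (maxdims == NULL)
        return 1;
    for (i = 0; i < rank; i++) {
        if (maxdims[i] != DIM_UNLIMITED && dims[i] > maxdims[i])
            return 0;
    }
    return 1;
}
```

With <code>rank = 2</code> and <code>dims = {4, 6}</code>, the 4x6 array used
later in this tutorial, <code>element_count</code> returns 24.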
|
|
|
|
<h3>Dataset Creation Property Lists</H3>
|
|
|
|
When creating a dataset, HDF5 allows the user to specify how raw data is
organized and/or compressed on disk. This information is
stored in a dataset creation property list and passed to the dataset
interface. The raw data on disk can be stored contiguously (in the same
linear way that it is organized in memory), partitioned into chunks,
stored externally, etc. In this tutorial, we use the
default dataset creation property list; that is, contiguous storage layout
and no compression are used. For more information about
dataset creation property lists,
see <a href="../Datasets.html">The Dataset Interface (H5D)</a>
in the <cite>HDF5 User's Guide</cite>.
|
|
|
|
<P>
|
|
In HDF5, datatypes and dataspaces are independent objects which are created
separately from any dataset that they might be attached to. Because of this,
the creation of a dataset requires definition of the datatype and dataspace.
In this tutorial, we use HDF5 predefined datatypes (integer) and consider
only simple dataspaces. Because the datatype is predefined, only the
dataspace object has to be created explicitly.
|
|
<P>
|
|
|
|
To create an empty dataset (no data written), the following steps need to be
taken:
<OL>
<LI> Obtain the location identifier where the dataset is to be created.
<LI> Define the dataset characteristics and the dataset creation property list.
<UL>
<LI> Define a datatype.
<LI> Define a dataspace.
<LI> Specify the dataset creation property list.
</UL>
<LI> Create the dataset.
<LI> Close the datatype, the dataspace, and the property list if necessary.
<LI> Close the dataset.
</OL>
|
|
To create a simple dataspace, the calling program must contain
calls to create and close the dataspace. For example:
|
|
<P>
|
|
<I>C</I>:
|
|
<PRE>
   space_id = H5Screate_simple (rank, dims, maxdims);
   status = H5Sclose (space_id);
</PRE>
|
|
<I>FORTRAN</I>:
|
|
<PRE>
   CALL h5screate_simple_f (rank, dims, space_id, hdferr, maxdims=max_dims)
        <i>or</i>
   CALL h5screate_simple_f (rank, dims, space_id, hdferr)

   CALL h5sclose_f (space_id, hdferr)
</PRE>
|
|
|
|
To create a dataset, the calling program must contain calls to create
|
|
and close the dataset. For example:
|
|
<P>
|
|
<I>C</I>:
|
|
<PRE>
   dset_id = H5Dcreate (loc_id, name, type_id, space_id, creation_prp);
   status = H5Dclose (dset_id);
</PRE>
|
|
<I>FORTRAN</I>:
|
|
<PRE>
   CALL h5dcreate_f (loc_id, name, type_id, space_id, dset_id, &
                     hdferr, creation_prp=creat_plist_id)
        <i>or</i>
   CALL h5dcreate_f (loc_id, name, type_id, space_id, dset_id, hdferr)

   CALL h5dclose_f (dset_id, hdferr)
</PRE>
|
|
If using the predefined datatypes in FORTRAN, calls must
be made to initialize and terminate access to the predefined datatypes:
|
|
<PRE>
   CALL h5init_types_f (hdferr)
   CALL h5close_types_f (hdferr)
</PRE>
|
|
<code>h5init_types_f</code> must be called before any HDF5 library
|
|
subroutine calls are made;
|
|
<code>h5close_types_f</code> must be called after the final HDF5 library
|
|
subroutine call.
|
|
See the programming example below for an illustration of the use of
|
|
these calls.
|
|
|
|
<P>
|
|
<H2> Programming Example</H2>
|
|
<A NAME="desc">
|
|
<H3><U>Description</U></H3>
|
|
The following example shows how to create an empty dataset.
|
|
It creates a file called <code>dset.h5</code> in the C version
|
|
(<code>dsetf.h5</code> in Fortran), defines the dataset dataspace, creates a
|
|
dataset which is a 4x6 integer array, and then closes the dataspace,
|
|
the dataset, and the file. <BR>
|
|
<UL>
|
|
[ <A HREF="examples/h5_crtdat.c">C Example</A> ]
|
|
-- <code>h5_crtdat.c</code><BR>
|
|
[ <A HREF="examples/dsetexample.f90">Fortran Example</A> ]
|
|
-- <code>dsetexample.f90</code><BR>
|
|
[ <A HREF="examples/java/CreateDataset.java">Java Example</A> ]
|
|
-- <code>CreateDataset.java</code>
|
|
</UL>
|
|
|
|
<B>NOTE:</B> To download a tar file of the examples, including a Makefile,
|
|
please go to the <A HREF="references.html">References</A> page of this tutorial.
|
|
|
|
<A NAME="rem">
|
|
<H3><U>Remarks</U></H3>
|
|
<UL>
|
|
<LI><code>H5Screate_simple</code>/<code>h5screate_simple_f</code>
|
|
creates a new simple dataspace and returns a dataspace identifier.
|
|
<PRE>
<I>C</I>:
  hid_t H5Screate_simple (int rank, const hsize_t * dims,
                          const hsize_t * maxdims)
<I>FORTRAN</I>:
  h5screate_simple_f (rank, dims, space_id, hdferr, maxdims)

            rank        INTEGER
            dims(*)     INTEGER(HSIZE_T)
            space_id    INTEGER(HID_T)
            hdferr      INTEGER
                        (Valid values: 0 on success and -1 on failure)
            maxdims(*)  INTEGER(HSIZE_T), OPTIONAL
</PRE>
|
|
<UL>
|
|
<LI> The <I>rank</I> parameter specifies the rank, i.e., the number of
|
|
dimensions, of the dataset.
|
|
|
|
<LI> The <I>dims</I> parameter specifies the size of the dataset.
|
|
|
|
<LI>The <I>maxdims</I> parameter specifies the upper limit on the
|
|
size of the dataset.
|
|
If this parameter is NULL in C (or not specified in FORTRAN),
|
|
then the upper limit is the same as the dimension
|
|
sizes specified by the <I>dims</I> parameter.
|
|
<LI>The function returns the dataspace identifier in C if successful;
|
|
otherwise it returns a negative value.
|
|
In FORTRAN, the dataspace identifier
|
|
is returned in the <I>space_id</I> parameter. If the call is successful,
then a 0 is returned in <I>hdferr</I>; otherwise a -1 is returned.
|
|
</UL>
|
|
<P>
|
|
<LI><code>H5Dcreate</code>/<code>h5dcreate_f</code> creates a dataset
|
|
at the specified location and returns a dataset identifier.
|
|
<PRE>
<I>C</I>:
  hid_t H5Dcreate (hid_t loc_id, const char *name, hid_t type_id,
                   hid_t space_id, hid_t creation_prp)
<I>FORTRAN</I>:
  h5dcreate_f (loc_id, name, type_id, space_id, dset_id, &
               hdferr, creation_prp)

            loc_id        INTEGER(HID_T)
            name          CHARACTER(LEN=*)
            type_id       INTEGER(HID_T)
            space_id      INTEGER(HID_T)
            dset_id       INTEGER(HID_T)
            hdferr        INTEGER
                          (Valid values: 0 on success and -1 on failure)
            creation_prp  INTEGER(HID_T), OPTIONAL
</PRE>
|
|
<UL>
|
|
<LI> The <I>loc_id</I> parameter is the location identifier.
|
|
<P>
|
|
<LI> The <I>name</I> parameter is the name of the dataset to create.
|
|
|
|
<P>
|
|
<LI> The <I>type_id</I> parameter specifies the datatype identifier.
|
|
|
|
<P>
|
|
<LI> The <I>space_id</I> parameter is the dataspace identifier.
|
|
|
|
<P>
|
|
<LI> The <I>creation_prp</I> parameter specifies the
|
|
dataset creation property list.
|
|
<code>H5P_DEFAULT</code> in C and <code>H5P_DEFAULT_F</code> in FORTRAN
|
|
specify the default dataset creation property list.
|
|
This parameter is optional in FORTRAN; if it is omitted,
|
|
the default dataset creation property list will be used.
|
|
<P>
|
|
<LI> The C function returns the dataset identifier if successful and
|
|
a negative value otherwise. The FORTRAN call returns the
|
|
dataset identifier in <I>dset_id</I>. If it is successful, then 0 is
|
|
returned in <I>hdferr</I>; otherwise a -1 is returned.
|
|
|
|
</UL>
|
|
<P>
|
|
<LI><code>H5Dcreate</code>/<code>h5dcreate_f</code> creates an empty array
|
|
and initializes the data to 0.
|
|
<P>
|
|
<LI> When a dataset is no longer accessed by a program,
<code>H5Dclose</code>/<code>h5dclose_f</code> must be called to release
the resources used by the dataset. This call is mandatory.
|
|
<PRE>
<I>C</I>:
  herr_t H5Dclose (hid_t dset_id)
<I>FORTRAN</I>:
  h5dclose_f (dset_id, hdferr)

            dset_id     INTEGER(HID_T)
            hdferr      INTEGER
                        (Valid values: 0 on success and -1 on failure)
</PRE>
|
|
</UL>
|
|
|
|
<A NAME="fc">
|
|
<H3><U>File Contents</U></H3>
|
|
The contents of the file <code>dset.h5</code> (<code>dsetf.h5</code>
|
|
for FORTRAN) are shown in <B>Figure 5.4</B> and <B>Figures 5.5a </B>
|
|
and <B>5.5b</B>.
|
|
<P>
|
|
<table border="0">
|
|
<tr align=left><td>
|
|
<B>Figure 5.4</B> <I>Contents of <code>dset.h5</code> ( <code>dsetf.h5</code>)</i>
|
|
</td></tr><tr align=center><td>
|
|
<IMG src="img002.gif">
|
|
</td></tr></table>
|
|
|
|
<table width="100%" border="1" cellspacing="4" bordercolor="#FFFFFF">
|
|
<tr bordercolor="#FFFFFF">
|
|
<td width="50%"><b>Figure 5.5a</b> <i><code>dset.h5</code> in DDL</i> </td>
|
|
<td width="50%"><b>Figure 5.5b</b> <i><code>dsetf.h5</code> in DDL</i> </td>
|
|
</tr>
|
|
<tr bordercolor="#000000">
|
|
<td width="35%">
|
|
<PRE>
HDF5 "dset.h5" {
GROUP "/" {
   DATASET "dset" {
      DATATYPE { H5T_STD_I32BE }
      DATASPACE { SIMPLE ( 4, 6 ) / ( 4, 6 ) }
      DATA {
         0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0
      }
   }
}
}
</PRE>
|
|
</td>
|
|
<td width="35%">
|
|
<pre>
HDF5 "dsetf.h5" {
GROUP "/" {
   DATASET "dset" {
      DATATYPE { H5T_STD_I32BE }
      DATASPACE { SIMPLE ( 6, 4 ) / ( 6, 4 ) }
      DATA {
         0, 0, 0, 0,
         0, 0, 0, 0,
         0, 0, 0, 0,
         0, 0, 0, 0,
         0, 0, 0, 0,
         0, 0, 0, 0
      }
   }
}
}
</pre>
|
|
</td>
|
|
</tr>
|
|
</table>
|
|
|
|
<p>
|
|
Note in Figures 5.5a and 5.5b that
<code>H5T_STD_I32BE</code>, a 32-bit big-endian integer,
is an HDF5 atomic datatype.
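<P>
Figures 5.5a and 5.5b report the dimensions in opposite order,
( 4, 6 ) for the C file and ( 6, 4 ) for the FORTRAN file. The text does not
say why. A likely explanation, stated here as an assumption, is that the
FORTRAN interface reverses the dimension order so that a column-major FORTRAN
array occupies the same positions as the corresponding row-major array in the
file. The plain C sketch below, which does not use the HDF5 library and whose
function names are hypothetical, checks that the two addressing schemes agree.

```c
#include <assert.h>
#include <stddef.h>

/* Linear offset of element [i][j] in a row-major (C order) d0 x d1 array. */
size_t row_major_offset(size_t d0, size_t d1, size_t i, size_t j)
{
    assert(i < d0 && j < d1);
    return i * d1 + j;
}

/* Linear offset of zero-based element (i, j) in a column-major
 * (FORTRAN order) n0 x n1 array. */
size_t col_major_offset(size_t n0, size_t n1, size_t i, size_t j)
{
    assert(i < n0 && j < n1);
    return i + j * n0;
}

/* Return 1 if every element (i, j) of a column-major n0 x n1 array has the
 * same offset as element [j][i] of a row-major n1 x n0 array. */
int layouts_match(size_t n0, size_t n1)
{
    size_t i, j;
    for (i = 0; i < n0; i++) {
        for (j = 0; j < n1; j++) {
            if (col_major_offset(n0, n1, i, j) !=
                row_major_offset(n1, n0, j, i))
                return 0;
        }
    }
    return 1;
}
```

Under that assumption, a FORTRAN array declared with dimensions (4, 6) and a
row-major array with dimensions ( 6, 4 ) place every element at the same
linear position, which is consistent with the dataspace shown in Figure 5.5b.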
|
|
|
|
|
|
<A NAME="ddl">
|
|
<h3><U>Dataset Definition in DDL</U></H3>
|
|
The following is the simplified DDL dataset definition:
|
|
<P>
|
|
<B>Fig. 5.6</B> <I>HDF5 Dataset Definition</I>
|
|
<PRE>
|
|
  &lt;dataset&gt; ::= DATASET "&lt;dataset_name&gt;" { &lt;datatype&gt;
                                           &lt;dataspace&gt;
                                           &lt;data&gt;
                                           &lt;dataset_attribute&gt;* }

  &lt;datatype&gt; ::= DATATYPE { &lt;atomic_type&gt; }

  &lt;dataspace&gt; ::= DATASPACE { SIMPLE &lt;current_dims&gt; / &lt;max_dims&gt; }

  &lt;dataset_attribute&gt; ::= &lt;attribute&gt;
|
|
</PRE>
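<P>
To make the <code>dataspace</code> rule of the grammar concrete, the following
plain C sketch builds a dataspace line in the form shown in Figure 5.5a. It is
an illustration only and is not part of any HDF5 tool: the function names
<code>append</code> and <code>format_dataspace</code> are hypothetical, and
the spacing inside the parentheses is simply copied from the figure.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Append formatted text at buf + *len; return 0 on success, -1 if the
 * text does not fit. Requires *len < cap on entry. */
static int append(char *buf, size_t cap, size_t *len, const char *fmt, ...)
{
    va_list ap;
    int n;
    va_start(ap, fmt);
    n = vsnprintf(buf + *len, cap - *len, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= cap - *len)
        return -1;
    *len += (size_t)n;
    return 0;
}

/* Write "DATASPACE { SIMPLE ( current dims ) / ( max dims ) }" into buf.
 * Returns 0 on success and -1 if buf is too small. */
int format_dataspace(char *buf, size_t cap, int rank,
                     const unsigned long long *dims,
                     const unsigned long long *maxdims)
{
    const unsigned long long *lists[2];
    size_t len = 0;
    int k, i;

    assert(buf != NULL && cap > 0 && rank > 0);
    lists[0] = dims;      /* current dimension sizes */
    lists[1] = maxdims;   /* maximum dimension sizes */

    if (append(buf, cap, &len, "DATASPACE { SIMPLE ") != 0)
        return -1;
    for (k = 0; k < 2; k++) {
        if (append(buf, cap, &len, "%s( ", k ? " / " : "") != 0)
            return -1;
        for (i = 0; i < rank; i++) {
            if (append(buf, cap, &len, "%llu%s", lists[k][i],
                       i + 1 < rank ? ", " : " )") != 0)
                return -1;
        }
    }
    return append(buf, cap, &len, " }");
}
```

Called with <code>rank = 2</code> and both dimension arrays set to
<code>{4, 6}</code>, the function produces
<code>DATASPACE { SIMPLE ( 4, 6 ) / ( 4, 6 ) }</code>.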
|
|
|
|
|
|
<!-- BEGIN FOOTER INFO -->
|
|
|
|
<P><hr noshade size=1>
|
|
<font face="arial,helvetica" size="-1">
|
|
<a href="http://www.ncsa.uiuc.edu/"><img border=0
|
|
src="footer-ncsalogo.gif"
|
|
width=78 height=27 alt="NCSA"><br>
|
|
The National Center for Supercomputing Applications</A><br>
|
|
<a href="http://www.uiuc.edu/">University of Illinois
|
|
at Urbana-Champaign</a><br>
|
|
<br>
|
|
<!-- <A HREF="helpdesk.mail.html"> -->
|
|
<A HREF="mailto:hdfhelp@ncsa.uiuc.edu">
hdfhelp@ncsa.uiuc.edu</A>
|
|
<br>
|
|
<BR> <H6>Last Modified: June 22, 2001</H6><BR>
|
|
<!-- modified by Barbara Jones - bljones@@ncsa.uiuc.edu -->
|
|
<!-- modified by Frank Baker - fbaker@@ncsa.uiuc.edu -->
|
|
</FONT>
|
|
<BR>
|
|
<!-- <A HREF="mailto:hdfhelp@@ncsa.uiuc.edu"> -->
|
|
|
|
</BODY>
|
|
</HTML>