mirror of
https://github.com/HDFGroup/hdf5.git
synced 2024-12-27 08:01:04 +08:00
204a1404a4
Removed H5Pset_compression (commented it out, actually) and changed example which used it to use H5Pset_deflate. H5.format.html Driver Identification block: Added clarification regarding the representation of the version in the driver identification string.
942 lines
36 KiB
HTML
942 lines
36 KiB
HTML
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
|
|
<html>
|
|
<head>
|
|
<title>Dataset Interface (H5D)</title>
|
|
</head>
|
|
|
|
<body bgcolor="#FFFFFF">
|
|
|
|
<hr>
|
|
<center>
|
|
<table border=0 width=98%>
|
|
<tr><td valign=top align=left>
|
|
<a href="H5.intro.html">Introduction to HDF5</a> <br>
|
|
<a href="RM_H5Front.html">HDF5 Reference Manual</a> <br>
|
|
<a href="index.html">Other HDF5 documents and links</a> <br>
|
|
<!--
|
|
<a href="Glossary.html">Glossary</a><br>
|
|
-->
|
|
</td>
|
|
<td valign=top align=right>
|
|
And in this document, the
|
|
<a href="H5.user.html"><strong>HDF5 User's Guide:</strong></a>
|
|
<br>
|
|
<a href="Files.html">Files</a>
|
|
Datasets
|
|
<a href="Datatypes.html">Datatypes</a>
|
|
<a href="Dataspaces.html">Dataspaces</a>
|
|
<a href="Groups.html">Groups</a>
|
|
<br>
|
|
<a href="References.html">References</a>
|
|
<a href="Attributes.html">Attributes</a>
|
|
<a href="Properties.html">Property Lists</a>
|
|
<a href="Errors.html">Error Handling</a>
|
|
<br>
|
|
<a href="Filters.html">Filters</a>
|
|
<a href="Palettes.html">Palettes</a>
|
|
<a href="Caching.html">Caching</a>
|
|
<a href="Chunking.html">Chunking</a>
|
|
<a href="MountingFiles.html">Mounting Files</a>
|
|
<br>
|
|
<a href="Performance.html">Performance</a>
|
|
<a href="Debugging.html">Debugging</a>
|
|
<a href="Environment.html">Environment</a>
|
|
<a href="ddl.html">DDL</a>
|
|
<br>
|
|
<a href="Ragged.html">Ragged Arrays</a>
|
|
</td></tr>
|
|
</table>
|
|
</center>
|
|
<hr>
|
|
|
|
<h1>The Dataset Interface (H5D)</h1>
|
|
|
|
<h2>1. Introduction</h2>
|
|
|
|
<p>The purpose of the dataset interface is to provide a mechanism
|
|
to describe properties of datasets and to transfer data between
|
|
memory and disk. A dataset is composed of a collection of raw
|
|
data points and four classes of meta data to describe the data
|
|
points. The interface is hopefully designed in such a way as to
|
|
allow new features to be added without disrupting current
|
|
applications that use the dataset interface.
|
|
|
|
<p>The four classes of meta data are:
|
|
|
|
<dl>
|
|
<dt>Constant Meta Data
|
|
<dd>Meta data that is created when the dataset is created and
|
|
exists unchanged for the life of the dataset. For instance,
|
|
the datatype of stored array elements is defined when the
|
|
dataset is created and cannot be subsequently changed.
|
|
|
|
<dt>Persistent Meta Data
|
|
<dd>Meta data that is an integral and permanent part of a
|
|
dataset but can change over time. For instance, the size in
|
|
any dimension can increase over time if such an increase is
|
|
allowed when the dataset was created.
|
|
|
|
<dt>Memory Meta Data
|
|
<dd>Meta data that exists to describe how raw data is organized
|
|
in the application's memory space. For instance, the data
|
|
type of elements in an application array might not be the same
|
|
as the datatype of those elements as stored in the HDF5 file.
|
|
|
|
<dt>Transport Meta Data
|
|
<dd>Meta data that is used only during the transfer of raw data
|
|
from one location to another. For instance, the number of
|
|
processes participating in a collective I/O request or hints
|
|
to the library to control caching of raw data.
|
|
</dl>
|
|
|
|
<p>Each of these classes of meta data is handled differently by
|
|
the library although the same API might be used to create them.
|
|
For instance, the datatype exists as constant meta data and as
|
|
memory meta data; the same API (the <code>H5T</code> API) is
|
|
used to manipulate both pieces of meta data but they're handled
|
|
by the dataset API (the <code>H5D</code> API) in different
|
|
manners.
|
|
|
|
|
|
|
|
<h2>2. Storage Layout Properties</h2>
|
|
|
|
<p>The dataset API partitions these terms on three orthogonal axes
|
|
(layout, compression, and external storage) and uses a
|
|
<em>dataset creation property list</em> to hold the various
|
|
settings and pass them through the dataset interface. This is
|
|
similar to the way HDF5 files are created with a file creation
|
|
property list. A dataset creation property list is always
|
|
derived from the default dataset creation property list (use
|
|
<code>H5Pcreate()</code> to get a copy of the default property
|
|
list) by modifying properties with various
|
|
<code>H5Pset_<em>property</em>()</code> functions.
|
|
|
|
<dl>
|
|
<dt><code>herr_t H5Pset_layout (hid_t <em>plist_id</em>,
|
|
H5D_layout_t <em>layout</em>)</code>
|
|
<dd>The storage layout is a piece of constant meta data that
|
|
describes what method the library uses to organize the raw
|
|
data on disk. The default layout is contiguous storage.
|
|
|
|
<br><br>
|
|
<dl>
|
|
<dt><code>H5D_COMPACT</code> <i><b>(Not yet implemented.)</b></i>
|
|
<dd>The raw data is presumably small and can be stored
|
|
directly in the object header. Such data is
|
|
non-extendible, non-compressible, non-sparse, and cannot
|
|
be stored externally. Most of these restrictions are
|
|
arbitrary but are enforced because of the small size of
|
|
the raw data. Storing data in this format eliminates the
|
|
disk seek/read request normally necessary to read raw
|
|
data.
|
|
|
|
<br><br>
|
|
<dt><code>H5D_CONTIGUOUS</code>
|
|
<dd>The raw data is large, non-extendible, non-compressible,
|
|
non-sparse, and can be stored externally. This is the
|
|
default value for the layout property. The term
|
|
<em>large</em> means that it may not be possible to hold
|
|
the entire dataset in memory. The non-compressibility is
|
|
a side effect of the data being large, contiguous, and
|
|
fixed-size at the physical level, which could cause
|
|
partial I/O requests to be extremely expensive if
|
|
compression were allowed.
|
|
|
|
<br><br>
|
|
<dt><code>H5D_CHUNKED</code>
|
|
<dd>The raw data is large and can be extended in any
|
|
dimension at any time (provided the data space also allows
|
|
the extension). It may be sparse at the chunk level (each
|
|
chunk is non-sparse, but there might only be a few chunks)
|
|
and each chunk can be compressed and/or stored externally.
|
|
A dataset is partitioned into chunks so each chunk is the
|
|
same logical size. The chunks are indexed by a B-tree and
|
|
are allocated on demand (although it might be useful to be
|
|
able to preallocate storage for parts of a chunked array
|
|
to reduce contention for the B-tree in a parallel
|
|
environment). The chunk size must be defined with
|
|
<code>H5Pset_chunk()</code>.
|
|
|
|
<br><br>
|
|
<dt><em>others...</em>
|
|
<dd>Other layout types may be defined later without breaking
|
|
existing code. However, to be able to correctly read or
|
|
modify data stored with one of these new layouts, the
|
|
application will need to be linked with a new version of
|
|
the library. This happens automatically on systems with
|
|
dynamic linking.
|
|
</dl>
|
|
</dl>
|
|
|
|
<a name="Dataset_PSetChunk">
|
|
<p>Once the general layout is defined, the user can define
|
|
</a>
|
|
properties of that layout. Currently, the only layout that has
|
|
user-settable properties is the <code>H5D_CHUNKED</code> layout,
|
|
which needs to know the dimensionality and chunk size.
|
|
<dl>
|
|
<dt><code>herr_t H5Pset_chunk (hid_t <em>plist_id</em>, int
|
|
<em>ndims</em>, hsize_t <em>dim</em>[])</code>
|
|
<dd>This function defines the logical size of a chunk for
|
|
chunked layout. If the layout property is set to
|
|
<code>H5D_CHUNKED</code> and the chunk size is set to
|
|
<em>dim</em>. The number of elements in the <em>dim</em> array
|
|
is the dimensionality, <em>ndims</em>. One need not call
|
|
<code>H5Dset_layout()</code> when using this function since
|
|
the chunked layout is implied.
|
|
</dl>
|
|
|
|
<p>
|
|
<center>
|
|
<table border align=center width="100%">
|
|
<caption align=bottom><h4>Example: Chunked Storage</h4></caption>
|
|
<tr>
|
|
<td>
|
|
<p>This example shows how a two-dimensional dataset
|
|
is partitioned into chunks. The library can manage file
|
|
memory by moving the chunks around, and each chunk could be
|
|
compressed. The chunks are allocated in the file on demand
|
|
when data is written to the chunk.
|
|
<center>
|
|
<img alt="Chunked Storage" src="chunk1.gif">
|
|
</center>
|
|
|
|
<p><code><pre>
|
|
size_t hsize[2] = {1000, 1000};
|
|
plist = H5Pcreate (H5P_DATASET_CREATE);
|
|
H5Pset_chunk (plist, 2, size);
|
|
</pre></code>
|
|
</td>
|
|
</tr>
|
|
</table>
|
|
</center>
|
|
|
|
|
|
<p>Although it is most efficient if I/O requests are aligned on chunk
|
|
boundaries, this is not a constraint. The application can perform I/O
|
|
on any set of data points as long as the set can be described by the
|
|
data space. The set on which I/O is performed is called the
|
|
<em>selection</em>.
|
|
|
|
<h2>3. Compression Properties</h2>
|
|
|
|
<p>Chunked data storage
|
|
(see <a href="#Dataset_PSetChunk"><code>H5Pset_chunk</code></a>)
|
|
allows data compression as defined by the function
|
|
<code>H5Pset_deflate</code>.
|
|
|
|
<!--
|
|
<dl>
|
|
<dt><code>herr_t H5Pset_compression (hid_t <em>plist_id</em>,
|
|
H5Z_method_t <em>method</em>)</code>
|
|
<dt><code>H5Z_method_t H5Pget_compression (hid_t
|
|
<em>plist_id</em>)</code>
|
|
<dd>These functions set and query the compression method that
|
|
is used to compress the raw data of a dataset. The
|
|
<em>plist_id</em> is a dataset creation property list. The
|
|
possible values for the compression method are:
|
|
|
|
<br><br>
|
|
<dl>
|
|
<dt><code>H5Z_NONE</code>
|
|
<dd>This is the default and specifies that no compression is
|
|
to be performed.
|
|
|
|
<br><br>
|
|
<dt><code>H5Z_DEFLATE</code>
|
|
<dd>This specifies that a variation of the Lempel-Ziv 1977
|
|
(LZ77) encoding is used, the same encoding used by the
|
|
free GNU <code>gzip</code> program.
|
|
</dl>
|
|
-->
|
|
|
|
<br><br>
|
|
<dt><code>herr_t H5Pset_deflate (hid_t <em>plist_id</em>,
|
|
int <em>level</em>)</code>
|
|
<dt><code>int H5Pget_deflate (hid_t <em>plist_id</em>)</code>
|
|
<dd>These functions set or query the deflate level of
|
|
dataset creation property list <em>plist_id</em>. The
|
|
<code>H5Pset_deflate()</code> sets the compression method to
|
|
<code>H5Z_DEFLATE</code> and sets the compression level to
|
|
some integer between one and nine (inclusive). One results in
|
|
the fastest compression while nine results in the best
|
|
compression ratio. The default value is six if
|
|
<code>H5Pset_deflate()</code> isn't called. The
|
|
<code>H5Pget_deflate()</code> returns the compression level
|
|
for the deflate method, or negative if the method is not the
|
|
deflate method.
|
|
</dl>
|
|
|
|
<h2>4. External Storage Properties</h2>
|
|
|
|
<p>Some storage formats may allow storage of data across a set of
|
|
non-HDF5 files. Currently, only the <code>H5D_CONTIGUOUS</code> storage
|
|
format allows external storage. A set segments (offsets and sizes) in
|
|
one or more files is defined as an external file list, or <em>EFL</em>,
|
|
and the contiguous logical addresses of the data storage are mapped onto
|
|
these segments.
|
|
|
|
<dl>
|
|
<dt><code>herr_t H5Pset_external (hid_t <em>plist</em>, const
|
|
char *<em>name</em>, off_t <em>offset</em>, hsize_t
|
|
<em>size</em>)</code>
|
|
<dd>This function adds a new segment to the end of the external
|
|
file list of the specified dataset creation property list. The
|
|
segment begins a byte <em>offset</em> of file <em>name</em> and
|
|
continues for <em>size</em> bytes. The space represented by this
|
|
segment is adjacent to the space already represented by the external
|
|
file list. The last segment in a file list may have the size
|
|
<code>H5F_UNLIMITED</code>, in which case the external file may be
|
|
of unlimited size and no more files can be added to the external files list.
|
|
|
|
<br><br>
|
|
<dt><code>int H5Pget_external_count (hid_t <em>plist</em>)</code>
|
|
<dd>Calling this function returns the number of segments in an
|
|
external file list. If the dataset creation property list has no
|
|
external data then zero is returned.
|
|
|
|
<br><br>
|
|
<dt><code>herr_t H5Pget_external (hid_t <em>plist</em>, int
|
|
<em>idx</em>, size_t <em>name_size</em>, char *<em>name</em>, off_t
|
|
*<em>offset</em>, hsize_t *<em>size</em>)</code>
|
|
<dd>This is the counterpart for the <code>H5Pset_external()</code>
|
|
function. Given a dataset creation property list and a zero-based
|
|
index into that list, the file name, byte offset, and segment size are
|
|
returned through non-null arguments. At most <em>name_size</em>
|
|
characters are copied into the <em>name</em> argument which is not
|
|
null terminated if the file name is longer than the supplied name
|
|
buffer (this is similar to <code>strncpy()</code>).
|
|
</dl>
|
|
|
|
<p>
|
|
<center>
|
|
<table border align=center width="100%">
|
|
<caption align=bottom><h4>Example: Multiple Segments</h4></caption>
|
|
<tr>
|
|
<td>
|
|
<p>This example shows how a contiguous, one-dimensional dataset
|
|
is partitioned into three parts and each of those parts is
|
|
stored in a segment of an external file. The top rectangle
|
|
represents the logical address space of the dataset
|
|
while the bottom rectangle represents an external file.
|
|
<center>
|
|
<img alt="Multiple Segments" src="extern1.gif">
|
|
</center>
|
|
|
|
<p><code><pre>
|
|
plist = H5Pcreate (H5P_DATASET_CREATE);
|
|
H5Pset_external (plist, "velocity.data", 3000, 1000);
|
|
H5Pset_external (plist, "velocity.data", 0, 2500);
|
|
H5Pset_external (plist, "velocity.data", 4500, 1500);
|
|
</pre></code>
|
|
|
|
<p>One should note that the segments are defined in order of the
|
|
logical addresses they represent, not their order within the
|
|
external file. It would also have been possible to put the
|
|
segments in separate files. Care should be taken when setting
|
|
up segments in a single file since the library doesn't
|
|
automatically check for segments that overlap.
|
|
</td>
|
|
</tr>
|
|
</table>
|
|
</center>
|
|
|
|
<p>
|
|
<center>
|
|
<table border align=center width="100%">
|
|
<caption align=bottom><h4>Example: Multi-Dimensional</h4></caption>
|
|
<tr>
|
|
<td>
|
|
<p>This example shows how a contiguous, two-dimensional dataset
|
|
is partitioned into three parts and each of those parts is
|
|
stored in a separate external file. The top rectangle
|
|
represents the logical address space of the dataset
|
|
while the bottom rectangles represent external files.
|
|
<center>
|
|
<img alt="Multiple Dimensions" src="extern2.gif">
|
|
</center>
|
|
|
|
<p><code><pre>
|
|
plist = H5Pcreate (H5P_DATASET_CREATE);
|
|
H5Pset_external (plist, "scan1.data", 0, 24);
|
|
H5Pset_external (plist, "scan2.data", 0, 24);
|
|
H5Pset_external (plist, "scan3.data", 0, 16);
|
|
</pre></code>
|
|
|
|
<p>The library maps the multi-dimensional array onto a linear
|
|
address space like normal, and then maps that address space
|
|
into the segments defined in the external file list.
|
|
</td>
|
|
</tr>
|
|
</table>
|
|
</center>
|
|
|
|
<p>The segments of an external file can exist beyond the end of the
|
|
file. The library reads that part of a segment as zeros. When writing
|
|
to a segment that exists beyond the end of a file, the file is
|
|
automatically extended. Using this feature, one can create a segment
|
|
(or set of segments) which is larger than the current size of the
|
|
dataset, which allows to dataset to be extended at a future time
|
|
(provided the data space also allows the extension).
|
|
|
|
<p>All referenced external data files must exist before performing raw
|
|
data I/O on the dataset. This is normally not a problem since those
|
|
files are being managed directly by the application, or indirectly
|
|
through some other library.
|
|
|
|
|
|
<h2>5. Datatype</h2>
|
|
|
|
<p>Raw data has a constant datatype which describes the datatype
|
|
of the raw data stored in the file, and a memory datatype that
|
|
describes the datatype stored in application memory. Both data
|
|
types are manipulated with the <a
|
|
href="Datatypes.html"><code>H5T</code></a> API.
|
|
|
|
<p>The constant file datatype is associated with the dataset when
|
|
the dataset is created in a manner described below. Once
|
|
assigned, the constant datatype can never be changed.
|
|
|
|
<p>The memory datatype is specified when data is transferred
|
|
to/from application memory. In the name of data sharability,
|
|
the memory datatype must be specified, but can be the same
|
|
type identifier as the constant datatype.
|
|
|
|
<p>During dataset I/O operations, the library translates the raw
|
|
data from the constant datatype to the memory datatype or vice
|
|
versa. Structured datatypes include member offsets to allow
|
|
reordering of struct members and/or selection of a subset of
|
|
members and array datatypes include index permutation
|
|
information to allow things like transpose operations (<b>the
|
|
prototype does not support array reordering</b>) Permutations
|
|
are relative to some extrinsic descritpion of the dataset.
|
|
|
|
|
|
|
|
<h2>6. Data Space</h2>
|
|
|
|
<p>The dataspace of a dataset defines the number of dimensions
|
|
and the size of each dimension and is manipulated with the
|
|
<code>H5S</code> API. The <em>simple</em> dataspace consists of
|
|
maximum dimension sizes and actual dimension sizes, which are
|
|
usually the same. However, maximum dimension sizes can be the
|
|
constant <code>H5D_UNLIMITED</code> in which case the actual
|
|
dimension size can be incremented with calls to
|
|
<code>H5Dextend()</code>. The maximium dimension sizes are
|
|
constant meta data while the actual dimension sizes are
|
|
persistent meta data. Initial actual dimension sizes are
|
|
supplied at the same time as the maximum dimension sizes when
|
|
the dataset is created.
|
|
|
|
<p>The dataspace can also be used to define partial I/O
|
|
operations. Since I/O operations have two end-points, the raw
|
|
data transfer functions take two data space arguments: one which
|
|
describes the application memory data space or subset thereof
|
|
and another which describes the file data space or subset
|
|
thereof.
|
|
|
|
|
|
<h2>7. Setting Constant or Persistent Properties</h2>
|
|
|
|
<p>Each dataset has a set of constant and persistent properties
|
|
which describe the layout method, pre-compression
|
|
transformation, compression method, datatype, external storage,
|
|
and data space. The constant properties are set as described
|
|
above in a dataset creation property list whose identifier is
|
|
passed to <code>H5Dcreate()</code>.
|
|
|
|
<dl>
|
|
<dt><code>hid_t H5Dcreate (hid_t <em>file_id</em>, const char
|
|
*<em>name</em>, hid_t <em>type_id</em>, hid_t
|
|
<em>space_id</em>, hid_t <em>create_plist_id</em>)</code>
|
|
<dd>A dataset is created by calling <code>H5Dcreate</code> with
|
|
a file identifier, a dataset name, a datatype, a dataspace,
|
|
and constant properties. The datatype and dataspace are the
|
|
type and space of the dataset as it will exist in the file,
|
|
which may be different than in application memory.
|
|
Dataset names within a group must be unique:
|
|
<code>H5Dcreate</code> returns an error if a dataset with the
|
|
name specified in <code><em>name</em></code> already exists
|
|
at the location specified in <code><em>file_id</em></code>.
|
|
The <em>create_plist_id</em> is a <code>H5P_DATASET_CREATE</code>
|
|
property list created with <code>H5Pcreate()</code> and
|
|
initialized with the various functions described above.
|
|
<code>H5Dcreate()</code> returns a dataset handle for success
|
|
or negative for failure. The handle should eventually be
|
|
closed by calling <code>H5Dclose()</code> to release resources
|
|
it uses.
|
|
|
|
<br><br>
|
|
<dt><code>hid_t H5Dopen (hid_t <em>file_id</em>, const char
|
|
*<em>name</em>)</code>
|
|
<dd>An existing dataset can be opened for access by calling this
|
|
function. A dataset handle is returned for success or a
|
|
negative value is returned for failure. The handle should
|
|
eventually be closed by calling <code>H5Dclose()</code> to
|
|
release resources it uses.
|
|
|
|
<br><br>
|
|
<dt><code>herr_t H5Dclose (hid_t <em>dataset_id</em>)</code>
|
|
<dd>This function closes a dataset handle and releases all
|
|
resources it might have been using. The handle should not be
|
|
used in subsequent calls to the library.
|
|
|
|
<br><br>
|
|
<dt><code>herr_t H5Dextend (hid_t <em>dataset_id</em>,
|
|
hsize_t <em>dim</em>[])</code>
|
|
<dd>This function extends a dataset by increasing the size in
|
|
one or more dimensions. Not all datasets can be extended.
|
|
</dl>
|
|
|
|
|
|
|
|
<h2>8. Querying Constant or Persistent Properties</h2>
|
|
|
|
<p>Constant or persistent properties can be queried with a set of
|
|
three functions. Each function returns an identifier for a copy
|
|
of the requested properties. The identifier can be passed to
|
|
various functions which modify the underlying object to derive a
|
|
new object; the original dataset is completely unchanged. The
|
|
return values from these functions should be properly destroyed
|
|
when no longer needed.
|
|
|
|
<dl>
|
|
<dt><code>hid_t H5Dget_type (hid_t <em>dataset_id</em>)</code>
|
|
<dd>Returns an identifier for a copy of the dataset permanent
|
|
datatype or negative for failure.
|
|
|
|
<dt><code>hid_t H5Dget_space (hid_t <em>dataset_id</em>)</code>
|
|
<dd>Returns an identifier for a copy of the dataset permanent
|
|
data space, which also contains information about the current
|
|
size of the dataset if the data set is extendable with
|
|
<code>H5Dextend()</code>.
|
|
|
|
<dt><code>hid_t H5Dget_create_plist (hid_t
|
|
<em>dataset_id</em>)</code>
|
|
<dd>Returns an identifier for a copy of the dataset creation
|
|
property list. The new property list is created by examining
|
|
various permanent properties of the dataset. This is mostly a
|
|
catch-all for everything but type and space.
|
|
</dl>
|
|
|
|
|
|
|
|
<h2>9. Setting Memory and Transfer Properties</h2>
|
|
|
|
<p>A dataset also has memory properties which describe memory
|
|
within the application, and transfer properties that control
|
|
various aspects of the I/O operations. The memory can have a
|
|
datatype different than the permanent file datatype (different
|
|
number types, different struct member offsets, different array
|
|
element orderings) and can also be a different size (memory is a
|
|
subset of the permanent dataset elements, or vice versa). The
|
|
transfer properties might provide caching hints or collective
|
|
I/O information. Therefore, each I/O operation must specify
|
|
memory and transfer properties.
|
|
|
|
<p>The memory properties are specified with <em>type_id</em> and
|
|
<em>space_id</em> arguments while the transfer properties are
|
|
specified with the <em>transfer_id</em> property list for the
|
|
<code>H5Dread()</code> and <code>H5Dwrite()</code> functions
|
|
(these functions are described below).
|
|
|
|
<dl>
|
|
<dt><code>herr_t H5Pset_buffer (hid_t <em>xfer_plist</em>,
|
|
size_t <em>max_buf_size</em>, void *<em>tconv_buf</em>, void
|
|
*<em>bkg_buf</em>)</code>
|
|
<dt><code>size_t H5Pget_buffer (hid_t <em>xfer_plist</em>, void
|
|
**<em>tconv_buf</em>, void **<em>bkg_buf</em>)</code>
|
|
<dd>Sets or retrieves the maximum size in bytes of the temporary
|
|
buffer used for datatype conversion in the I/O pipeline. An
|
|
application-defined buffer can also be supplied as the
|
|
<em>tconv_buf</em> argument, otherwise a buffer will be
|
|
allocated and freed on demand by the library. A second
|
|
temporary buffer <em>bkg_buf</em> can also be supplied and
|
|
should be the same size as the <em>tconv_buf</em>. The
|
|
default values are 1MB for the maximum buffer size, and null
|
|
pointers for each buffer indicating that they should be
|
|
allocated on demand and freed when no longer needed. The
|
|
<code>H5Pget_buffer()</code> function returns the maximum
|
|
buffer size or zero on error.
|
|
</dl>
|
|
|
|
<p>If the maximum size of the temporary I/O pipeline buffers is
|
|
too small to hold the entire I/O request, then the I/O request
|
|
will be fragmented and the transfer operation will be strip
|
|
mined. However, certain restrictions apply to the strip
|
|
mining. For instance, when performing I/O on a hyperslab of a
|
|
simple data space the strip mining is in terms of the slowest
|
|
varying dimension. So if a 100x200x300 hyperslab is requested,
|
|
the temporary buffer must be large enough to hold a 1x200x300
|
|
sub-hyperslab.
|
|
|
|
<p>To prevent strip mining from happening, the application should
|
|
use <code>H5Pset_buffer()</code> to set the size of the
|
|
temporary buffer so it's large enough to hold the entire
|
|
request.
|
|
|
|
<p>
|
|
<center>
|
|
<table border align=center width="100%">
|
|
<caption align=bottom><h4>Example</h4></caption>
|
|
<tr>
|
|
<td>
|
|
<p>This example shows how to define a function that sets
|
|
a dataset transfer property list so that strip mining
|
|
does not occur. It takes an (optional) dataset transfer
|
|
property list, a dataset, a data space that describes
|
|
what data points are being transfered, and a datatype
|
|
for the data points in memory. It returns a (new)
|
|
dataset transfer property list with the temporary
|
|
buffer size set to an appropriate value. The return
|
|
value should be passed as the fifth argument to
|
|
<code>H5Dread()</code> or <code>H5Dwrite()</code>.
|
|
<p><code><pre>
|
|
1 hid_t
|
|
2 disable_strip_mining (hid_t xfer_plist, hid_t dataset,
|
|
3 hid_t space, hid_t mem_type)
|
|
4 {
|
|
5 hid_t file_type; /* File datatype */
|
|
6 size_t type_size; /* Sizeof larger type */
|
|
7 size_t size; /* Temp buffer size */
|
|
8 hid_t xfer_plist; /* Return value */
|
|
9
|
|
10 file_type = H5Dget_type (dataset);
|
|
11 type_size = MAX(H5Tget_size(file_type), H5Tget_size(mem_type));
|
|
12 H5Tclose (file_type);
|
|
13 size = H5Sget_npoints(space) * type_size;
|
|
14 if (xfer_plist<0) xfer_plist = H5Pcreate (H5P_DATASET_XFER);
|
|
15 H5Pset_buffer(xfer_plist, size, NULL, NULL);
|
|
16 return xfer_plist;
|
|
17 }
|
|
</pre></code>
|
|
</td>
|
|
</tr>
|
|
</table>
|
|
</center>
|
|
|
|
|
|
|
|
<h2>10. Querying Memory or Transfer Properties</h2>
|
|
|
|
<p>Unlike constant and persistent properties, a dataset cannot be
|
|
queried for it's memory or transfer properties. Memory
|
|
properties cannot be queried because the application already
|
|
stores those properties separate from the buffer that holds the
|
|
raw data, and the buffer may hold multiple segments from various
|
|
datasets and thus have more than one set of memory properties.
|
|
The transfer properties cannot be queried from the dataset
|
|
because they're associated with the transfer itself and not with
|
|
the dataset (but one can call
|
|
<code>H5Pget_<em>property</em>()</code> to query transfer
|
|
properties from a tempalate).
|
|
|
|
|
|
<h2>11. Raw Data I/O</h2>
|
|
|
|
<p>All raw data I/O is accomplished through these functions which
|
|
take a dataset handle, a memory datatype, a memory data space,
|
|
a file data space, transfer properties, and an application
|
|
memory buffer. They translate data between the memory datatype
|
|
and space and the file datatype and space. The data spaces can
|
|
be used to describe partial I/O operations.
|
|
|
|
<dl>
|
|
<dt><code>herr_t H5Dread (hid_t <em>dataset_id</em>, hid_t
|
|
<em>mem_type_id</em>, hid_t <em>mem_space_id</em>, hid_t
|
|
<em>file_space_id</em>, hid_t <em>xfer_plist_id</em>,
|
|
void *<em>buf</em>/*out*/)</code>
|
|
<dd>Reads raw data from the specified dataset into <em>buf</em>
|
|
converting from file datatype and space to memory datatype
|
|
and space.
|
|
|
|
<br><br>
|
|
<dt><code>herr_t H5Dwrite (hid_t <em>dataset_id</em>, hid_t
|
|
<em>mem_type_id</em>, hid_t <em>mem_space_id</em>, hid_t
|
|
<em>file_space_id</em>, hid_t <em>xfer_plist_id</em>,
|
|
const void *<em>buf</em>)</code>
|
|
<dd>Writes raw data from an application buffer <em>buf</em> to
|
|
the specified dataset converting from memory datatype and
|
|
space to file datatype and space.
|
|
</dl>
|
|
|
|
|
|
<p>In the name of sharability, the memory datatype must be
|
|
supplied. However, it can be the same identifier as was used to
|
|
create the dataset or as was returned by
|
|
<code>H5Dget_type()</code>; the library will not implicitly
|
|
derive memory datatypes from constant datatypes.
|
|
|
|
<p>For complete reads of the dataset one may supply
|
|
<code>H5S_ALL</code> as the argument for the file data space.
|
|
If <code>H5S_ALL</code> is also supplied as the memory data
|
|
space then no data space conversion is performed. This is a
|
|
somewhat dangerous situation since the file data space might be
|
|
different than what the application expects.
|
|
|
|
|
|
|
|
<h2>12. Examples</h2>
|
|
|
|
<p>The examples in this section illustrate some common dataset
|
|
practices.
|
|
|
|
|
|
<p>This example shows how to create a dataset which is stored in
|
|
memory as a two-dimensional array of native <code>double</code>
|
|
values but is stored in the file in Cray <code>float</code>
|
|
format using LZ77 compression. The dataset is written to the
|
|
HDF5 file and then read back as a two-dimensional array of
|
|
<code>float</code> values.
|
|
|
|
<p>
|
|
<center>
|
|
<table border align=center width="100%">
|
|
<caption align=bottom><h4>Example 1</h4></caption>
|
|
<tr>
|
|
<td>
|
|
<p><code><pre>
|
|
1 hid_t file, data_space, dataset, properties;
|
|
2 double dd[500][600];
|
|
3 float ff[500][600];
|
|
4 hsize_t dims[2], chunk_size[2];
|
|
5
|
|
6 /* Describe the size of the array */
|
|
7 dims[0] = 500;
|
|
8 dims[1] = 600;
|
|
9 data_space = H5Screate_simple (2, dims);
|
|
10
|
|
11
|
|
12 /*
|
|
13 * Create a new file using with read/write access,
|
|
14 * default file creation properties, and default file
|
|
15 * access properties.
|
|
16 */
|
|
17 file = H5Fcreate ("test.h5", H5F_ACC_RDWR, H5P_DEFAULT,
|
|
18 H5P_DEFAULT);
|
|
19
|
|
20 /*
|
|
21 * Set the dataset creation plist to specify that
|
|
22 * the raw data is to be partitioned into 100x100 element
|
|
23 * chunks and that each chunk is to be compressed with
|
|
24 * LZ77.
|
|
25 */
|
|
26 chunk_size[0] = chunk_size[1] = 100;
|
|
27 properties = H5Pcreate (H5P_DATASET_CREATE);
|
|
28 H5Pset_chunk (properties, 2, chunk_size);
|
|
29 H5Pset_deflate (properties, 9);
|
|
30
|
|
31 /*
|
|
32 * Create a new dataset within the file. The datatype
|
|
33 * and data space describe the data on disk, which may
|
|
34 * be different than the format used in the application's
|
|
35 * memory.
|
|
36 */
|
|
37 dataset = H5Dcreate (file, "dataset", H5T_CRAY_FLOAT,
|
|
38 data_space, properties);
|
|
39
|
|
40 /*
|
|
41 * Write the array to the file. The datatype and data
|
|
42 * space describe the format of the data in the `dd'
|
|
43 * buffer. The raw data is translated to the format
|
|
44 * required on disk defined above. We use default raw
|
|
45 * data transfer properties.
|
|
46 */
|
|
47 H5Dwrite (dataset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL,
|
|
48 H5P_DEFAULT, dd);
|
|
49
|
|
50 /*
|
|
51 * Read the array as floats. This is similar to writing
|
|
52 * data except the data flows in the opposite direction.
|
|
53 */
|
|
54 H5Dread (dataset, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL,
|
|
55 H5P_DEFAULT, ff);
|
|
56
|
|
64 H5Dclose (dataset);
|
|
65 H5Sclose (data_space);
|
|
66 H5Pclose (properties);
|
|
67 H5Fclose (file);
|
|
</pre></code>
|
|
</td>
|
|
</tr>
|
|
</table>
|
|
</center>
|
|
|
|
<p>This example uses the file created in Example 1 and reads a
|
|
hyperslab of the 500x600 file dataset. The hyperslab size is
|
|
100x200 and it is located beginning at element
|
|
<200,200>. We read the hyperslab into an 200x400 array in
|
|
memory beginning at element <0,0> in memory. Visually,
|
|
the transfer looks something like this:
|
|
|
|
<center>
|
|
<img alt="Raw Data Transfer" src="dataset_p1.gif">
|
|
</center>
|
|
|
|
<p>
|
|
<center>
|
|
<table border align=center width="100%">
|
|
<caption align=bottom><h4>Example 2</h4></caption>
|
|
<tr>
|
|
<td>
|
|
<p><code><pre>
|
|
1 hid_t file, mem_space, file_space, dataset;
|
|
2 double dd[200][400];
|
|
3 hssize_t offset[2];
|
|
4 hsize size[2];
|
|
5
|
|
6 /*
|
|
7 * Open an existing file and its dataset.
|
|
8 */
|
|
9 file = H5Fopen ("test.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
|
|
10 dataset = H5Dopen (file, "dataset");
|
|
11
|
|
12 /*
|
|
13 * Describe the file data space.
|
|
14 */
|
|
15 offset[0] = 200; /*offset of hyperslab in file*/
|
|
16 offset[1] = 200;
|
|
17 size[0] = 100; /*size of hyperslab*/
|
|
18 size[1] = 200;
|
|
19 file_space = H5Dget_space (dataset);
|
|
20 H5Sset_hyperslab (file_space, 2, offset, size);
|
|
21
|
|
22 /*
|
|
23 * Describe the memory data space.
|
|
24 */
|
|
25 size[0] = 200; /*size of memory array*/
|
|
26 size[1] = 400;
|
|
27 mem_space = H5Screate_simple (2, size);
|
|
28
|
|
29 offset[0] = 0; /*offset of hyperslab in memory*/
|
|
30 offset[1] = 0;
|
|
31 size[0] = 100; /*size of hyperslab*/
|
|
32 size[1] = 200;
|
|
33 H5Sset_hyperslab (mem_space, 2, offset, size);
|
|
34
|
|
35 /*
|
|
36 * Read the dataset.
|
|
37 */
|
|
38 H5Dread (dataset, H5T_NATIVE_DOUBLE, mem_space,
|
|
39 file_space, H5P_DEFAULT, dd);
|
|
40
|
|
41 /*
|
|
42 * Close/release resources.
|
|
43 */
|
|
44 H5Dclose (dataset);
|
|
45 H5Sclose (mem_space);
|
|
46 H5Sclose (file_space);
|
|
47 H5Fclose (file);
|
|
</pre></code>
|
|
</td>
|
|
</tr>
|
|
</table>
|
|
</center>
|
|
|
|
<p>If the file contains a compound data structure one of whose
|
|
members is a floating point value (call it "delta") but the
|
|
application is interested in reading an array of floating point
|
|
values which are just the "delta" values, then the application
|
|
should cast the floating point array as a struct with a single
|
|
"delta" member.
|
|
|
|
<p>
|
|
<center>
|
|
<table border align=center width="100%">
|
|
<caption align=bottom><h4>Example 3</h4></caption>
|
|
<tr>
|
|
<td>
|
|
<p><code><pre>
|
|
1 hid_t file, dataset, type;
|
|
2 double delta[200];
|
|
3
|
|
4 /*
|
|
5 * Open an existing file and its dataset.
|
|
6 */
|
|
7 file = H5Fopen ("test.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
|
|
8 dataset = H5Dopen (file, "dataset");
|
|
9
|
|
10 /*
|
|
11 * Describe the memory datatype, a struct with a single
|
|
12 * "delta" member.
|
|
13 */
|
|
14 type = H5Tcreate (H5T_COMPOUND, sizeof(double));
|
|
15 H5Tinsert (type, "delta", 0, H5T_NATIVE_DOUBLE);
|
|
16
|
|
17 /*
|
|
18 * Read the dataset.
|
|
19 */
|
|
20 H5Dread (dataset, type, H5S_ALL, H5S_ALL,
|
|
21 H5P_DEFAULT, dd);
|
|
22
|
|
23 /*
|
|
24 * Close/release resources.
|
|
25 */
|
|
26 H5Dclose (dataset);
|
|
27 H5Tclose (type);
|
|
28 H5Fclose (file);
|
|
</pre></code>
|
|
</td>
|
|
</tr>
|
|
</table>
|
|
</center>
|
|
|
|
|
|
<hr>
|
|
<center>
|
|
<table border=0 width=98%>
|
|
<tr><td valign=top align=left>
|
|
<a href="H5.intro.html">Introduction to HDF5</a> <br>
|
|
<a href="RM_H5Front.html">HDF5 Reference Manual</a> <br>
|
|
<a href="index.html">Other HDF5 documents and links</a> <br>
|
|
<!--
|
|
<a href="Glossary.html">Glossary</a><br>
|
|
-->
|
|
</td>
|
|
<td valign=top align=right>
|
|
And in this document, the
|
|
<a href="H5.user.html"><strong>HDF5 User's Guide:</strong></a>
|
|
<br>
|
|
<a href="Files.html">Files</a>
|
|
Datasets
|
|
<a href="Datatypes.html">Datatypes</a>
|
|
<a href="Dataspaces.html">Dataspaces</a>
|
|
<a href="Groups.html">Groups</a>
|
|
<br>
|
|
<a href="References.html">References</a>
|
|
<a href="Attributes.html">Attributes</a>
|
|
<a href="Properties.html">Property Lists</a>
|
|
<a href="Errors.html">Error Handling</a>
|
|
<br>
|
|
<a href="Filters.html">Filters</a>
|
|
<a href="Palettes.html">Palettes</a>
|
|
<a href="Caching.html">Caching</a>
|
|
<a href="Chunking.html">Chunking</a>
|
|
<a href="MountingFiles.html">Mounting Files</a>
|
|
<br>
|
|
<a href="Performance.html">Performance</a>
|
|
<a href="Debugging.html">Debugging</a>
|
|
<a href="Environment.html">Environment</a>
|
|
<a href="ddl.html">DDL</a>
|
|
<br>
|
|
<a href="Ragged.html">Ragged Arrays</a>
|
|
</td></tr>
|
|
</table>
|
|
</center>
|
|
|
|
|
|
|
|
<hr>
|
|
<address>
|
|
<a href="mailto:hdfhelp@ncsa.uiuc.edu">HDF Help Desk</a>
|
|
</address>
|
|
|
|
<!-- Created: Tue Dec 2 09:17:09 EST 1997 -->
|
|
<!-- hhmts start -->
|
|
Last modified: 14 October 1999
|
|
<!-- hhmts end -->
|
|
|
|
</body>
|
|
</html>
|