2000-12-22 21:32:40 +08:00
|
|
|
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
|
|
|
|
<html>
|
|
|
|
<head>
|
|
|
|
<title>Performance</title>
|
|
|
|
|
2003-07-03 05:28:11 +08:00
|
|
|
<!-- #BeginLibraryItem "/ed_libs/styles_UG.lbi" -->
|
|
|
|
<!--
|
|
|
|
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
|
|
|
|
* Copyright by the Board of Trustees of the University of Illinois. *
|
|
|
|
* All rights reserved. *
|
|
|
|
* *
|
|
|
|
* This file is part of HDF5. The full HDF5 copyright notice, including *
|
|
|
|
* terms governing use, modification, and redistribution, is contained in *
|
|
|
|
* the files COPYING and Copyright.html. COPYING can be found at the root *
|
|
|
|
* of the source code distribution tree; Copyright.html can be found at the *
|
|
|
|
* root level of an installed copy of the electronic HDF5 document set and *
|
|
|
|
* is linked from the top-level documents page. It can also be found at *
|
|
|
|
* http://hdf.ncsa.uiuc.edu/HDF5/doc/Copyright.html. If you do not have *
|
|
|
|
* access to either file, you may request a copy from hdfhelp@ncsa.uiuc.edu. *
|
|
|
|
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
|
|
|
|
-->
|
|
|
|
|
|
|
|
<link href="ed_styles/UGelect.css" rel="stylesheet" type="text/css">
|
2003-03-15 03:12:34 +08:00
|
|
|
<!-- #EndLibraryItem --></head>
|
2000-12-22 21:32:40 +08:00
|
|
|
|
2003-03-15 03:12:34 +08:00
|
|
|
<body bgcolor="#FFFFFF">
|
|
|
|
|
|
|
|
|
|
|
|
<!-- #BeginLibraryItem "/ed_libs/NavBar_UG.lbi" --><hr>
|
2000-12-22 21:32:40 +08:00
|
|
|
<center>
|
|
|
|
<table border=0 width=98%>
|
|
|
|
<tr><td valign=top align=left>
|
2003-03-15 03:12:34 +08:00
|
|
|
<a href="index.html">HDF5 documents and links</a> <br>
|
|
|
|
<a href="H5.intro.html">Introduction to HDF5</a> <br>
|
|
|
|
<a href="RM_H5Front.html">HDF5 Reference Manual</a> <br>
|
2003-07-04 03:35:53 +08:00
|
|
|
<a href="http://hdf.ncsa.uiuc.edu/HDF5/doc/UG/index.html">HDF5 User's Guide for Release 1.6</a> <br>
|
2003-03-15 03:12:34 +08:00
|
|
|
<!--
|
|
|
|
<a href="Glossary.html">Glossary</a><br>
|
|
|
|
-->
|
2000-12-22 21:32:40 +08:00
|
|
|
</td>
|
|
|
|
<td valign=top align=right>
|
2003-03-15 03:12:34 +08:00
|
|
|
And in this document, the
|
2003-07-04 03:35:53 +08:00
|
|
|
<a href="H5.user.html"><strong>HDF5 User's Guide from Release 1.4.5:</strong></a>
|
2003-03-15 03:12:34 +08:00
|
|
|
<br>
|
|
|
|
<a href="Files.html">Files</a>
|
|
|
|
<a href="Datasets.html">Datasets</a>
|
|
|
|
<a href="Datatypes.html">Datatypes</a>
|
|
|
|
<a href="Dataspaces.html">Dataspaces</a>
|
|
|
|
<a href="Groups.html">Groups</a>
|
|
|
|
<br>
|
|
|
|
<a href="References.html">References</a>
|
|
|
|
<a href="Attributes.html">Attributes</a>
|
|
|
|
<a href="Properties.html">Property Lists</a>
|
|
|
|
<a href="Errors.html">Error Handling</a>
|
|
|
|
<br>
|
|
|
|
<a href="Filters.html">Filters</a>
|
|
|
|
<a href="Caching.html">Caching</a>
|
|
|
|
<a href="Chunking.html">Chunking</a>
|
|
|
|
<a href="MountingFiles.html">Mounting Files</a>
|
|
|
|
<br>
|
|
|
|
<a href="Performance.html">Performance</a>
|
|
|
|
<a href="Debugging.html">Debugging</a>
|
|
|
|
<a href="Environment.html">Environment</a>
|
|
|
|
<a href="ddl.html">DDL</a>
|
2000-12-22 21:32:40 +08:00
|
|
|
</td></tr>
|
|
|
|
</table>
|
|
|
|
</center>
|
|
|
|
<hr>
|
2003-03-15 03:12:34 +08:00
|
|
|
<!-- #EndLibraryItem --><h1>Performance Analysis and Issues</h1>
|
2000-12-22 21:32:40 +08:00
|
|
|
|
|
|
|
<h2>1. Introduction</h2>
|
|
|
|
|
|
|
|
<p>This section includes brief discussions of performance issues
|
|
|
|
in HDF5 and performance analysis tools for HDF5 or pointers to
|
|
|
|
such discussions.
|
|
|
|
|
|
|
|
<h2>2. Dataset Chunking</h2>
|
|
|
|
|
2001-07-12 06:01:45 +08:00
|
|
|
Appropriate dataset chunking can make a siginificant difference
|
|
|
|
in HDF5 performance. This topic is discussed in
|
|
|
|
<a href="Chunking.html">Dataset Chunking Issues</a> elsewhere
|
|
|
|
in this <cite>User's Guide</cite>.
|
|
|
|
|
2001-08-03 00:10:11 +08:00
|
|
|
<a name="Freespace">
|
2001-07-12 06:01:45 +08:00
|
|
|
<h2>3. Freespace Management</h2>
|
2001-08-03 00:10:11 +08:00
|
|
|
</a>
|
2001-07-12 06:01:45 +08:00
|
|
|
|
|
|
|
<p>HDF5 does not yet manage freespace as effectively as it might.
|
|
|
|
While a file is opened, the library actively tracks and re-uses
|
|
|
|
<em>freespace</em>, i.e., space that is freed (or released)
|
|
|
|
during the run.
|
|
|
|
But the library does not yet manage freespace across the
|
|
|
|
closing and reopening of a file; when a file is closed,
|
|
|
|
all knowledge of available freespace is lost.
|
|
|
|
What was freespace becomes an unusable <em>hole</em> in the file.
|
|
|
|
|
|
|
|
<p>There are several circumstances that can result in freespace
|
|
|
|
in an HDF5 file:
|
|
|
|
<ul>
|
|
|
|
<li>Reading then rewriting a dataset or compressed dataset
|
|
|
|
chunk.<sup><a href="#footcchunk">1</a></sup>
|
|
|
|
<ul>
|
|
|
|
<li>If the rewritten dataset or compressed chunk is the same
|
|
|
|
size as or smaller than the original, it will be written
|
|
|
|
to the same file location.
|
|
|
|
<li>If, however, the dataset or compressed chunk is larger
|
|
|
|
than the original, it will be written contiguously elsewhere
|
|
|
|
in the file, leaving freespace at the original location.
|
|
|
|
<li>If the rewritten dataset or compressed chunk is
|
|
|
|
substantially smaller than the original, the remaining
|
|
|
|
space will be released and identified as freespace.
|
|
|
|
</ul>
|
|
|
|
<li>Deleting (or unlinking) a dataset or group.
|
|
|
|
<ul>
|
|
|
|
<li>If an object, such as a dataset, group, or named datatype,
|
|
|
|
is deleted (normally with <code>H5Gunlink</code>),
|
|
|
|
the space previously occupied by the object is released
|
|
|
|
and identified as freespace.
|
|
|
|
</ul>
|
|
|
|
</ul>
|
|
|
|
|
|
|
|
<p>As stated above, freespace is not managed across the
|
|
|
|
closing and reopening of an HDF5 file; file space that was
|
|
|
|
known freespace while the file remained open becomes an
|
|
|
|
inaccessible hole when the file is closed.
|
|
|
|
Thus, if a file is often closed and reopened, datasets
|
|
|
|
frequently rewritten, or groups and/or datasets frequently
|
|
|
|
added and deleted, that file can develop large numbers of
|
|
|
|
holes and grow unnecessarily large. This can, in turn,
|
|
|
|
seriously impair application or library performance
|
|
|
|
as the file ages.
|
|
|
|
|
|
|
|
<p>An <code>h5pack</code> utility would enable <em>packing</em>
|
|
|
|
a file to remove the holes, but writing such a utility to
|
|
|
|
universally pack the file correctly is a complex task and the
|
|
|
|
HDF5 development team has not to date had the resources to
|
|
|
|
complete the task.
|
|
|
|
|
|
|
|
<p>For application developers or researchers who find themselves
|
|
|
|
working with files that become bloated in this manner, there
|
|
|
|
are, at this time, two remedies:
|
|
|
|
<ul>
|
|
|
|
<li><code>H5view</code>, an HDF5 Java tool, allows the user
|
|
|
|
to open a file and, using the <code>Save As...</code> feature,
|
|
|
|
save the file under a new filename. The new file can then
|
|
|
|
be closed and will be a packed version of the original file.
|
|
|
|
This approach is reasonably reliable, but with two caveats:
|
|
|
|
<ul>
|
|
|
|
<li>It is not automated.
|
|
|
|
<li>This ability is a side-effect of the tool's design;
|
|
|
|
it was not designed for this purpose and this approach
|
|
|
|
to file packing has not been exhaustively tested.
|
|
|
|
</ul>
|
|
|
|
<li>An application developer or researcher can write a utility
|
|
|
|
that is tuned to their data and file structures. This
|
|
|
|
untility can then read in a file, copy the structures and
|
|
|
|
datasets to a new file, and write the new file to storage.
|
|
|
|
This will eliminate the holes, making the new file a
|
|
|
|
fully-packed version of the original file.
|
|
|
|
</ul>
|
|
|
|
|
|
|
|
<a name="footcchunk">
|
|
|
|
<p></a>
|
|
|
|
<sup>1</sup>
|
|
|
|
<font size=-1>
|
|
|
|
This is a problem only with compressed chunks.
|
|
|
|
The compression ratio of data is highly dependent on the data
|
|
|
|
itself; regardless of whether the <em>size</em> of the data
|
|
|
|
changes, the size of the compressed data change substantially
|
|
|
|
as the data changes. Uncompressed chunks do not vary in size,
|
|
|
|
so this issue does not arise.
|
|
|
|
</font>
|
|
|
|
|
|
|
|
<h2>4. Use of the Pablo Instrumentation of HDF5</h2>
|
2000-12-22 21:32:40 +08:00
|
|
|
|
|
|
|
Pablo HDF5 Trace software provides a means of measuring the
|
|
|
|
performance of programs using HDF5.
|
|
|
|
|
|
|
|
<p>The Pablo software consists
|
|
|
|
of an instrumented copy of the HDF5 library, the Pablo Trace and
|
|
|
|
Trace Extensions libraries, and some utilities for processing the
|
|
|
|
output. The instrumented version of the HDF5 library has hooks
|
|
|
|
inserted into the HDF5 code which call routines in the Pablo Trace
|
|
|
|
library just after entry to each instrumented HDF5 routine and
|
|
|
|
just prior to exit from the routine. The Pablo Trace Extension
|
|
|
|
library has programs that track the I/O activity between the
|
|
|
|
entry and exit of the HDF5 routine during execution.
|
|
|
|
|
|
|
|
<p>A few lines of code must be inserted in the user's main program
|
|
|
|
to enable tracing and to specify which HDF5 procedures are to be
|
|
|
|
traced. The program is linked with the special HDF5 and Pablo
|
|
|
|
libraries to produce an executable. Running this executable on
|
|
|
|
a single processor produces an output file called the trace file
|
|
|
|
which contains records, called Pablo Self-Defining Data Format
|
|
|
|
(SDDF) records, which can later be analyzed using the
|
|
|
|
HDF5 Analysis Utilities. The HDF5 Analysis Utilites can be used
|
|
|
|
to interpret the SDDF records in the trace files to produce a
|
|
|
|
report describing the HDF5 IO activity that occurred during
|
|
|
|
execution.
|
|
|
|
|
|
|
|
<p>For further instructions, see the file <code>READ_ME</code>
|
|
|
|
in the <code> $(toplevel)/hdf5/pablo/ </code> subdirectory of
|
|
|
|
the HDF5 source code distribution.
|
|
|
|
|
|
|
|
<p>For further information about Pablo and the
|
|
|
|
Self-Defining Data Format, visit the Pablo website at
|
2003-03-15 03:12:34 +08:00
|
|
|
<code><a href="http://www-pablo.cs.uiuc.edu/">http://www-pablo.cs.uiuc.edu/</a></code>.</p>
|
2000-12-22 21:32:40 +08:00
|
|
|
|
|
|
|
|
2003-03-15 03:12:34 +08:00
|
|
|
<!-- #BeginLibraryItem "/ed_libs/NavBar_UG.lbi" --><hr>
|
2000-12-22 21:32:40 +08:00
|
|
|
<center>
|
|
|
|
<table border=0 width=98%>
|
|
|
|
<tr><td valign=top align=left>
|
2003-03-15 03:12:34 +08:00
|
|
|
<a href="index.html">HDF5 documents and links</a> <br>
|
|
|
|
<a href="H5.intro.html">Introduction to HDF5</a> <br>
|
|
|
|
<a href="RM_H5Front.html">HDF5 Reference Manual</a> <br>
|
2003-07-04 03:35:53 +08:00
|
|
|
<a href="http://hdf.ncsa.uiuc.edu/HDF5/doc/UG/index.html">HDF5 User's Guide for Release 1.6</a> <br>
|
2003-03-15 03:12:34 +08:00
|
|
|
<!--
|
|
|
|
<a href="Glossary.html">Glossary</a><br>
|
|
|
|
-->
|
2000-12-22 21:32:40 +08:00
|
|
|
</td>
|
|
|
|
<td valign=top align=right>
|
2003-03-15 03:12:34 +08:00
|
|
|
And in this document, the
|
2003-07-04 03:35:53 +08:00
|
|
|
<a href="H5.user.html"><strong>HDF5 User's Guide from Release 1.4.5:</strong></a>
|
2003-03-15 03:12:34 +08:00
|
|
|
<br>
|
|
|
|
<a href="Files.html">Files</a>
|
|
|
|
<a href="Datasets.html">Datasets</a>
|
|
|
|
<a href="Datatypes.html">Datatypes</a>
|
|
|
|
<a href="Dataspaces.html">Dataspaces</a>
|
|
|
|
<a href="Groups.html">Groups</a>
|
|
|
|
<br>
|
|
|
|
<a href="References.html">References</a>
|
|
|
|
<a href="Attributes.html">Attributes</a>
|
|
|
|
<a href="Properties.html">Property Lists</a>
|
|
|
|
<a href="Errors.html">Error Handling</a>
|
|
|
|
<br>
|
|
|
|
<a href="Filters.html">Filters</a>
|
|
|
|
<a href="Caching.html">Caching</a>
|
|
|
|
<a href="Chunking.html">Chunking</a>
|
|
|
|
<a href="MountingFiles.html">Mounting Files</a>
|
|
|
|
<br>
|
|
|
|
<a href="Performance.html">Performance</a>
|
|
|
|
<a href="Debugging.html">Debugging</a>
|
|
|
|
<a href="Environment.html">Environment</a>
|
|
|
|
<a href="ddl.html">DDL</a>
|
2000-12-22 21:32:40 +08:00
|
|
|
</td></tr>
|
|
|
|
</table>
|
|
|
|
</center>
|
|
|
|
<hr>
|
2003-03-15 03:12:34 +08:00
|
|
|
<!-- #EndLibraryItem --><!-- #BeginLibraryItem "/ed_libs/Footer.lbi" --><address>
|
|
|
|
<a href="mailto:hdfhelp@ncsa.uiuc.edu">HDF Help Desk</a>
|
|
|
|
<br>
|
2003-07-04 03:35:53 +08:00
|
|
|
Describes HDF5 Release 1.4.5, February 2003
|
2003-07-03 05:28:11 +08:00
|
|
|
</address><!-- #EndLibraryItem --><!-- Created: Thu Oct 14 16:46:00 CDT 1999 -->
|
2000-12-22 21:32:40 +08:00
|
|
|
<!-- hhmts start -->
|
2003-03-15 03:12:34 +08:00
|
|
|
Last modified: 2 August 2001
|
2000-12-22 21:32:40 +08:00
|
|
|
<!-- hhmts end -->
|
|
|
|
|
2003-03-15 03:12:34 +08:00
|
|
|
</body>
|
2000-12-22 21:32:40 +08:00
|
|
|
</html>
|