mirror of
https://github.com/HDFGroup/hdf5.git
synced 2025-04-12 17:31:09 +08:00
[svn-r4193] Purpose:
    New section -- "Freespace Management"
Description:
    Added "Freespace Management" section. Minor formatting.
Platforms tested:
    IE 5
parent 4b218c6a58
commit 7c706d9d14
@@ -58,12 +58,100 @@

<h2>2. Dataset Chunking</h2>

Appropriate dataset chunking can make a significant difference
in HDF5 performance. This topic is discussed in
<a href="Chunking.html">Dataset Chunking Issues</a> elsewhere
in this <cite>User's Guide</cite>.

<h2>3. Freespace Management</h2>

<p>HDF5 does not yet manage freespace as effectively as it might.
While a file is open, the library actively tracks and reuses
<em>freespace</em>, i.e., space that is freed (or released)
during the run.
But the library does not yet manage freespace across the
closing and reopening of a file; when a file is closed,
all knowledge of available freespace is lost.
What was freespace becomes an unusable <em>hole</em> in the file.

<p>There are several circumstances that can result in freespace
in an HDF5 file:
<ul>
  <li>Reading then rewriting a dataset or compressed dataset
      chunk.<sup><a href="#footcchunk">1</a></sup>
      <ul>
        <li>If the rewritten dataset or compressed chunk is the same
            size as or smaller than the original, it will be written
            to the same file location.
        <li>If, however, the rewritten dataset or compressed chunk is
            larger than the original, it will be written contiguously
            elsewhere in the file, leaving freespace at the original
            location.
        <li>If the rewritten dataset or compressed chunk is
            substantially smaller than the original, the remaining
            space will be released and identified as freespace.
      </ul>
  <li>Deleting (or unlinking) a dataset, group, or named datatype.
      <ul>
        <li>If an object, such as a dataset, group, or named datatype,
            is deleted (normally with <code>H5Gunlink</code>),
            the space previously occupied by the object is released
            and identified as freespace.
      </ul>
</ul>
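
<p>The rewrite rules in the first list item can be sketched as a small
decision function. This is a toy model for illustration only, not HDF5
library code: the name <code>place_rewrite</code>, the
<code>min_release</code> threshold standing in for "substantially
smaller", and the choice to relocate a larger block to the end of the
file are all assumptions.

```python
def place_rewrite(offset, old_size, new_size, end_of_file, min_release=16):
    """Toy model of where a rewritten block lands.

    Returns (new_offset, freed_extents, new_end_of_file), where each freed
    extent is an (offset, size) pair. min_release is a hypothetical cutoff
    for "substantially smaller"; the real library's threshold is not stated.
    """
    if new_size > old_size:
        # Larger than the original: write it elsewhere (here, assumed to be
        # the end of the file) and free the whole original extent.
        return end_of_file, [(offset, old_size)], end_of_file + new_size

    # Same size or smaller: rewrite at the original location.
    freed = []
    tail = old_size - new_size
    if tail >= min_release:
        # Substantially smaller: release the unused tail as freespace.
        freed.append((offset + new_size, tail))
    return offset, freed, end_of_file
```

<p>In this model a block at offset 100 that grows from 50 to 80 bytes
moves to the end of the file and frees its original 50 bytes, while one
that shrinks from 50 to 10 bytes stays put and frees the 40-byte tail.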

<p>As stated above, freespace is not managed across the
closing and reopening of an HDF5 file; file space that was
known freespace while the file remained open becomes an
inaccessible hole when the file is closed.
Thus, if a file is often closed and reopened, and in between
datasets are frequently rewritten or groups and datasets are
frequently added and deleted, that file can develop large numbers
of holes and grow unnecessarily large. This can, in turn,
seriously impair application or library performance
as the file ages.
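
<p>The effect of losing the freespace list at close can be illustrated
with a toy allocator. The sketch below is a model of the behavior
described in this section, not the HDF5 implementation: the
<code>ToyFile</code> class, the first-fit reuse policy, and the
<code>churn</code> helper are hypothetical and exist only for this
example.

```python
class ToyFile:
    """Toy model of one file: its size persists, its free list does not."""

    def __init__(self):
        self.size = 0    # bytes occupied on disk; survives a close
        self.free = []   # (offset, size) extents; held in memory only
        self.holes = 0   # bytes whose free-list entry was lost at close

    def alloc(self, nbytes):
        # Reuse the first free extent that fits (an assumed policy),
        # otherwise extend the file.
        for i, (off, sz) in enumerate(self.free):
            if sz >= nbytes:
                if sz == nbytes:
                    del self.free[i]
                else:
                    self.free[i] = (off + nbytes, sz - nbytes)
                return off
        off = self.size
        self.size += nbytes
        return off

    def release(self, off, nbytes):
        # Freed space is remembered only while the file stays open.
        self.free.append((off, nbytes))

    def close_and_reopen(self):
        # Closing discards the free list; every free extent becomes a hole.
        self.holes += sum(sz for _, sz in self.free)
        self.free = []


def churn(cycles, reopen, nbytes=100):
    """Create and delete one object per cycle; return (file size, hole bytes)."""
    f = ToyFile()
    for _ in range(cycles):
        off = f.alloc(nbytes)
        f.release(off, nbytes)
        if reopen:
            f.close_and_reopen()
    return f.size, f.holes


if __name__ == "__main__":
    print(churn(10, reopen=False))  # (100, 0)
    print(churn(10, reopen=True))   # (1000, 1000)
```

<p>With the file held open, the same 100 bytes are reused on every
cycle. With a close and reopen after each deletion, the model file grows
to 1000 bytes, all of it holes.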

<p>An <code>h5pack</code> utility would enable <em>packing</em>
a file to remove the holes, but writing such a utility to
pack any file correctly is a complex task, and the
HDF5 development team has not to date had the resources to
complete it.

<p>For application developers or researchers who find themselves
working with files that become bloated in this manner, there
are, at this time, two remedies:
<ul>
  <li><code>H5view</code>, an HDF5 Java tool, allows the user
      to open a file and, using the <code>Save As...</code> feature,
      save the file under a new filename. The new file can then
      be closed and will be a packed version of the original file.
      This approach is reasonably reliable, but with two caveats:
      <ul>
        <li>It is not automated.
        <li>This ability is a side effect of the tool's design;
            the tool was not designed for this purpose, and this
            approach to file packing has not been exhaustively tested.
      </ul>
  <li>An application developer or researcher can write a utility
      that is tuned to their data and file structures. This
      utility can then read in a file, copy the structures and
      datasets to a new file, and write the new file to storage.
      This will eliminate the holes, making the new file a
      fully-packed version of the original file.
</ul>
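
<p>The copy-to-a-new-file idea behind both remedies can be sketched on
a plain byte buffer. This is a conceptual model only and does not use
the HDF5 API: the <code>pack</code> function and the index of live
objects (a name mapped to an offset and size) are hypothetical stand-ins
for whatever a site-specific utility knows about its own file
structures.

```python
def pack(data, live):
    """Copy only the live extents of `data` into a new, hole-free buffer.

    data: bytes of the original (bloated) file.
    live: dict mapping object name -> (offset, size) in the original file.
    Returns (packed_bytes, new_index), where new_index maps each name to
    its (offset, size) in the packed buffer.
    """
    out = bytearray()
    new_index = {}
    # Walk live objects in file order so their relative order is preserved.
    for name, (off, size) in sorted(live.items(), key=lambda kv: kv[1][0]):
        new_index[name] = (len(out), size)
        out += data[off:off + size]
    return bytes(out), new_index


if __name__ == "__main__":
    old = b"AAAA....BB..CCC"  # '.' marks holes left by deleted objects
    live = {"a": (0, 4), "b": (8, 2), "c": (12, 3)}
    packed, index = pack(old, live)
    print(packed)  # b'AAAABBCCC'
    print(index)   # {'a': (0, 4), 'b': (4, 2), 'c': (6, 3)}
```

<p>Anything not reachable from the index of live objects is simply not
copied, which is why the new file contains no holes. A real utility
would walk the file's groups and datasets rather than a flat index.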

<a name="footcchunk">
<p></a>
<sup>1</sup>
<font size=-1>
This is a problem only with compressed chunks.
The compression ratio of data is highly dependent on the data
itself; regardless of whether the uncompressed <em>size</em> of the
data changes, the size of the compressed data can change substantially
as the data changes. Uncompressed chunks do not vary in size,
so this issue does not arise.
</font>

<h2>4. Use of the Pablo Instrumentation of HDF5</h2>

Pablo HDF5 Trace software provides a means of measuring the
performance of programs using HDF5.
@@ -147,7 +235,7 @@

<!-- Created: Thu Oct 14 16:46:00 CDT 1999 -->
<!-- hhmts start -->
Last modified: 11 July 2001
<!-- hhmts end -->

<br>