Purpose: New tool -- h5import Description: Added h5import. Platforms tested: Safari and IE 5
@@ -62,6 +62,8 @@ to convert files from HDF4 format to HDF5 format and vice versa.
A tool for listing specified features of HDF5 file contents
<li><a href="#Tools-Repart">h5repart</a> --
A tool for repartitioning a file, creating a family of files
<li><a href="#Tools-Import">h5import</a> --
A tool for importing data into an existing or new HDF5 file
<li><a href="#Tools-GIF2H5">gif2h5</a> --
A tool for converting a GIF file to an HDF5 file
<li><a href="#Tools-H52GIF">h52gif</a> --
@@ -539,6 +541,691 @@ to convert files from HDF4 format to HDF5 format and vice versa.
<dt><strong>Tool Name:</strong> <a name="Tools-Import">h5import</a>
<em>infile</em> <em>in_options</em>
[<em>infile</em> <em>in_options</em> <b>...</b>]
-o <em>outfile</em>
<em>infile</em> <em>in_options</em>
[<em>infile</em> <em>in_options</em> <b>...</b>]
-outfile <em>outfile</em>
<dd><code>h5import -h</code>
<dd><code>h5import -help</code>
<dd>Imports data into an existing or new HDF5 file.
<dd><code>h5import</code> converts data
from one or more ASCII or binary files, <code><i>infile</i></code>,
into the same number of HDF5 datasets
in the existing or new HDF5 file, <code><i>outfile</i></code>.
Data conversion is performed in accordance with the
user-specified type and storage properties
specified in <code><em>in_options</em></code>.
The primary objective of <code>h5import</code> is to
import floating point or integer data.
The utility's design allows for future versions that
accept ASCII text files and store the contents as a
compact array of one-dimensional strings,
but that capability is not implemented in HDF5 Release 1.6.
<b>Input data and options:</b><br>
Input data can be provided in one of the following forms:
<ul><li>As an ASCII, or plain-text, file containing either
floating point or integer data
<li>As a binary file containing either 32-bit or
64-bit native floating point data
<li>As a binary file containing native integer data,
signed or unsigned and
8-bit, 16-bit, 32-bit, or 64-bit.
<li>As an ASCII, or plain-text, file containing text data.
(This feature is not implemented in HDF5 Release 1.6.)
Each input file, <code><i>infile</i></code>,
contains a single <em>n</em>-dimensional
array of values of one of the above types expressed
in the order of fastest-changing dimensions first.
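<p>As an illustration of the binary input form, the following Python sketch packs a small 2x3 array as native 32-bit floating point values. It assumes that the expected element order is row-major (the last dimension varies fastest); that reading of the sentence above is an assumption, and the helper names <code>flatten_row_major</code> and <code>pack_native_float32</code> are hypothetical, not part of <code>h5import</code>.</p>

```python
import struct


def flatten_row_major(nested):
    """Flatten nested lists so that the last dimension varies fastest.

    Assumption: this row-major order is what h5import expects.
    """
    if isinstance(nested, (list, tuple)):
        flat = []
        for item in nested:
            flat.extend(flatten_row_major(item))
        return flat
    return [nested]


def pack_native_float32(nested):
    """Return the values as 32-bit floats in the machine's native byte order."""
    values = flatten_row_major(nested)
    # "=" selects native byte order with standard sizes (4 bytes per float).
    return struct.pack("=%df" % len(values), *values)


data = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0]]
raw = pack_native_float32(data)
```

<p>Writing <code>raw</code> to a file opened in binary mode would give an input file that could then be described with <code>-dims 2,3 -type FP -size 32</code>.</p>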
Floating point data in an ASCII input file must be
expressed in the fixed floating form (e.g., 323.56).
<code>h5import</code> is designed to accept scientific notation
(e.g., 3.23E+02) in an ASCII file, but that is not implemented in HDF5 Release 1.6.
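<p>Because scientific notation is not accepted in this release, data written that way has to be rewritten in fixed notation before it is imported. The Python sketch below shows one way this might be done; the function names are hypothetical, and it assumes whitespace-separated values.</p>

```python
from decimal import Decimal


def to_fixed_notation(token):
    """Rewrite one numeric token without an exponent."""
    # Decimal keeps the written digits exactly, so no binary rounding occurs.
    return format(Decimal(token), "f")


def convert_line(line):
    """Convert every whitespace-separated value on a line to fixed notation."""
    return " ".join(to_fixed_notation(tok) for tok in line.split())
```

<p>For example, <code>convert_line("3.2534E+02 1E-3")</code> yields <code>325.34 0.001</code>.</p>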
Each input file can be associated with options specifying
the datatype and storage properties.
These options can be specified either as
<em>command line arguments</em>
or in a <em>configuration file</em>.
Note that exactly one of these approaches must be used with a
single input file.
Command line arguments, best used with simple input files,
can be used to specify
the class, size, and dimensions of the input data and
a path identifying the output dataset.
The recommended means of specifying input data options
is in a configuration file; this is also the only means of
specifying advanced storage features.
See further discussion in "The configuration file" below.
The only required option for input data is dimension sizes;
defaults are available for all others.
<code>h5import</code> will accept up to 30 input files in a single call.
Other considerations, such as the maximum length of a command line,
may impose a more stringent limitation.
<b>Output data and options:</b><br>
The name of the output file is specified following
the <code>-o</code> or <code>-outfile</code> option
in <code><i>outfile</i></code>.
The data from each input file is stored as a separate dataset
in this output file.
<code><i>outfile</i></code> may be an existing file.
If it does not yet exist, <code>h5import</code> will create it.
Output dataset information and storage properties can be
specified only by means of a configuration file.
<table width=100% border=0>
<tr valign=top align=left><td width=30> </td><td>
Dataset path
</td><td>If the groups in the path leading to the dataset
do not exist, <code>h5import</code> will create them.<br>
If no group is specified, the dataset will be created
under the root group.<br>
If no dataset name is specified, the dataset will be created
as <code>dataset1</code>.<br>
<code>h5import</code> does not check for a pre-existing dataset
of the specified or default name; it overwrites any such dataset
without offering an opportunity to preserve it.
<tr valign=top align=left><td width=30> </td><td>
Output type
</td><td>Datatype parameters for output data
<tr valign=top align=left><td width=30> </td><td>
Output data class
</td><td>Signed or unsigned integer or floating point
<tr valign=top align=left><td width=30> </td><td>
Output data size
</td><td>8-, 16-, 32-, or 64-bit integer<br>
32- or 64-bit floating point
<tr valign=top align=left><td width=30> </td><td>
Output architecture
</td><td><code>NATIVE</code> (Default)<br>
Other architectures are included in the <code>h5import</code> design
but are not implemented in this release.
<tr valign=top align=left><td width=30> </td><td>
Output byte order
</td><td>Little- or big-endian.<br>
Relevant only if output architecture
is <code>IEEE</code>, <code>UNIX</code>, or <code>STD</code>;
fixed for other architectures.
<tr valign=top align=left><td width=30> </td><td>
Dataset layout and storage <br>
</td><td>Denote how raw data is to be organized on the disk.
If none of the following are specified,
the default configuration is contiguous layout with no compression.
<tr valign=top align=left><td width=30> </td><td>
</td><td>Contiguous (Default)<br>
<tr valign=top align=left><td width=30> </td><td>
External storage
</td><td>Allows raw data to be stored in a non-HDF5 file or in an
external HDF5 file.<br>
Requires contiguous layout.
<tr valign=top align=left><td width=30> </td><td>
</td><td>Sets the type of compression and the
level to which the dataset must be compressed.<br>
Requires chunked layout.
<tr valign=top align=left><td width=30> </td><td>
</td><td>Allows the dimensions of the dataset to increase over time
and/or to be unlimited.<br>
Requires chunked layout.
<tr valign=top align=left><td width=30> </td><td>
Compressed and<br>
</td><td>Requires chunked layout.
<tr valign=top align=left><td width=30> </td><td>
<b>Command-line arguments:</b><br>
The <code>h5import</code> syntax for the command-line arguments,
<code><em>in_options</em></code>, is as follows:
<table width=100% border=0>
<tr><td> </td><td>
<code>h5import <em>infile</em> -d <em>dim_list</em>
[-p <em>pathname</em>]
[-t <em>input_class</em>]
[-s <em>input_size</em>]
[<em>infile</em> ...]
-o <em>outfile</em></code><br>
<code>h5import <em>infile</em> -dims <em>dim_list</em>
[-path <em>pathname</em>]
[-type <em>input_class</em>]
[-size <em>input_size</em>]
[<em>infile</em> ...]
-outfile <em>outfile</em></code><br>
<code>h5import <em>infile</em> -c <em>config_file</em>
[<em>infile</em> ...]
-outfile <em>outfile</em></code>
Note the following:
If the <code>-c <em>config_file</em></code> option is used with
an input file, no other argument can be used with that input file.
If the <code>-c <em>config_file</em></code> option is not used with
an input data file, the <code>-d <em>dim_list</em></code> argument
(or <code>-dims <em>dim_list</em></code>)
must be used and any combination of the remaining options may be used.
Any arguments used must appear in <em>exactly</em> the order used
in the syntax declarations immediately above.
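<p>Since argument order matters, it can help to assemble the command line programmatically. The following Python sketch builds an argument list in the order given in the syntax declarations above and enforces the 30-input-file limit; <code>build_infile_args</code> and <code>build_command</code> are hypothetical helper names, not part of <code>h5import</code>.</p>

```python
MAX_INPUT_FILES = 30  # limit stated in this document


def build_infile_args(infile, dims, path=None, input_class=None, input_size=None):
    """Return one input file's arguments in the documented order."""
    if not dims:
        raise ValueError("-dims is required when no configuration file is used")
    args = [infile, "-dims", ",".join(str(d) for d in dims)]
    if path is not None:
        args += ["-path", path]
    if input_class is not None:
        args += ["-type", input_class]
    if input_size is not None:
        args += ["-size", str(input_size)]
    return args


def build_command(per_file_args, outfile):
    """Join the per-file argument lists and append the output file option."""
    if not 1 <= len(per_file_args) <= MAX_INPUT_FILES:
        raise ValueError("h5import accepts between 1 and 30 input files")
    command = ["h5import"]
    for args in per_file_args:
        command += args
    return command + ["-outfile", outfile]
```

<p>The resulting list could be passed to a process launcher such as Python's <code>subprocess.run</code>.</p>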
<b>The configuration file:</b><br>
A configuration file is specified with the
<code>-c <em>config_file</em></code> option:
<table border=0>
<tr><td> </td><td>
<code>h5import <em>infile</em> -c <em>config_file</em>
[<em>infile</em> -c <em>config_file2</em> ...]
-outfile <em>outfile</em></code>
The configuration file is an ASCII file and must be
organized as "Configuration_Keyword Value" pairs,
with one pair on each line.
For example, the line indicating that
the input data class (configuration keyword <code>INPUT-CLASS</code>)
is floating point in a text file (value <code>TEXTFP</code>)
would appear as follows:<br>
<code> INPUT-CLASS TEXTFP</code>
A configuration file may have the following keywords, each
followed by one of the following defined values.
One entry for each of the first two keywords,
<code>RANK</code> and <code>DIMENSION-SIZES</code>,
is required; all other keywords are optional.
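<p>A configuration file of this shape is easy to generate. The Python sketch below renders the two required entries followed by any optional ones; it assumes that each keyword and its value share a line, as in the <code>INPUT-CLASS TEXTFP</code> example above. The function name <code>render_config</code> and its keyword-argument convention are hypothetical.</p>

```python
def render_config(rank, dim_sizes, **optional):
    """Render 'KEYWORD value' lines; RANK and DIMENSION-SIZES come first."""
    if rank != len(dim_sizes):
        raise ValueError("the number of dimension sizes must match RANK")
    lines = [
        "RANK %d" % rank,
        "DIMENSION-SIZES " + " ".join(str(d) for d in dim_sizes),
    ]
    for name, value in optional.items():
        # input_class -> INPUT-CLASS, output_size -> OUTPUT-SIZE, and so on.
        keyword = name.replace("_", "-").upper()
        lines.append("%s %s" % (keyword, value))
    return "\n".join(lines) + "\n"
```

<p>For instance, <code>render_config(3, [5, 2, 4], input_class="TEXTFP")</code> produces three lines: <code>RANK 3</code>, <code>DIMENSION-SIZES 5 2 4</code>, and <code>INPUT-CLASS TEXTFP</code>.</p>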
<table width=100% border=0>
<tr align=left><th valign=top align=left>
<hr>Keyword <br><code> </code>Value
</th><th valign=top align=left><hr>Description
<tr valign=top align=left><td>
<hr><code>RANK </code>
</td><td><hr>The number of dimensions in the dataset. (Required)
<tr valign=top align=left><td>
<code> <em>rank</em></code>
</td><td>An integer specifying the number of dimensions in the dataset.<br>
Example: <code> 4 </code> for a 4-dimensional dataset.
<tr valign=top align=left><td>
<hr><code>DIMENSION-SIZES</code>
</td><td><hr>Sizes of the dataset dimensions. (Required)
<tr valign=top align=left><td>
<code> <em>dim_sizes</em></code>
</td><td>A string of space-separated integers
specifying the sizes of the dimensions in the dataset.
The number of sizes in this entry must match the value in
the <code>RANK</code> entry.<br>
Example: <code> 4 3 4 38 </code> for a 4x3x4x38 dataset.
<tr valign=top align=left><td>
<hr><code>PATH</code>
</td><td><hr>Path of the output dataset.
<tr valign=top align=left><td>
<code> <em>path</em></code>
</td><td>The full HDF5 pathname identifying the output dataset
relative to the root group within the output file.<br>
I.e., <code><em>path</em></code> is a string of optional group names,
each followed by a slash,
and ending with a dataset name.
If the groups in the path do not exist, they will be created.<br>
If <code>PATH</code> is not specified, the default
<code><em>path</em></code> is <code>/dataset1</code>.<br>
Example: The configuration file entry
<table border=0>
<tr><td> </td><td>
<code>PATH grp1/grp2/dataset1</code>
indicates that the output dataset <code>dataset1</code> will be written
in the group <code>grp2/</code> which is in the group <code>grp1/</code>,
a member of the root group in the output file.
<tr valign=top align=left><td>
<hr><code>INPUT-CLASS </code>
</td><td><hr>A string denoting the type of input data.
<tr valign=top align=left><td>
<code> TEXTIN</code>
</td><td>Input is signed integer data in an ASCII file.
<tr valign=top align=left><td>
<code> TEXTUIN</code>
</td><td>Input is unsigned integer data in an ASCII file.
<tr valign=top align=left><td>
<code> TEXTFP</code>
</td><td>Input is floating point data in fixed notation (e.g., 325.34)
in an ASCII file.
<tr valign=top align=left><td>
<code> TEXTFPE</code>
</td><td>Input is floating point data in scientific notation (e.g., 3.2534E+02)
in an ASCII file.<br>
(Not implemented in this release.)
<tr valign=top align=left><td>
<code> IN</code>
</td><td>Input is signed integer data in a binary file.
<tr valign=top align=left><td>
<code> UIN</code>
</td><td>Input is unsigned integer data in a binary file.
<tr valign=top align=left><td>
<code> FP</code>
</td><td>Input is floating point data in a binary file. (Default)
<tr valign=top align=left><td>
<code> STR</code>
</td><td>Input is character data in an ASCII file.
With this value, the configuration keywords
<code>RANK</code>, <code>DIMENSION-SIZES</code>,
<code>OUTPUT-CLASS</code>, <code>OUTPUT-SIZE</code>,
<code>OUTPUT-ARCHITECTURE</code>, and <code>OUTPUT-BYTE-ORDER</code>
will be ignored.<br>
(Not implemented in this release.)
<tr valign=top align=left><td>
<hr><code>INPUT-SIZE</code>
</td><td><hr>An integer denoting the size of the input data, in bits.
<tr valign=top align=left><td>
<code> 8</code><br>
<code> 16</code><br>
<code> 32</code><br>
<code> 64</code>
</td><td>For signed and unsigned integer data:
<code>TEXTIN</code>, <code>TEXTUIN</code>,
<code>IN</code>, or <code>UIN</code>.
(Default: <code> 32</code>)
<tr valign=top align=left><td>
<code> 32</code><br>
<code> 64</code>
</td><td>For floating point data:
<code>TEXTFP</code>, <code>TEXTFPE</code>,
or <code>FP</code>.
(Default: <code> 32</code>)
<tr valign=top align=left><td>
<hr><code>OUTPUT-CLASS </code>
</td><td><hr>A string denoting the type of output data.
<tr valign=top align=left><td>
<code> IN</code>
</td><td>Output is signed integer data.<br>
(Default if <code>INPUT-CLASS</code> is
<code>IN</code> or <code>TEXTIN</code>)
<tr valign=top align=left><td>
<code> UIN</code>
</td><td>Output is unsigned integer data.<br>
(Default if <code>INPUT-CLASS</code> is
<code>UIN</code> or <code>TEXTUIN</code>)
<tr valign=top align=left><td>
<code> FP</code>
</td><td>Output is floating point data.<br>
(Default if <code>INPUT-CLASS</code> is not specified or is
<code>FP</code>, <code>TEXTFP</code>, or <code>TEXTFPE</code>)
<tr valign=top align=left><td>
<code> STR</code>
</td><td>Output is character data,
to be written as a 1-dimensional array of strings.<br>
(Default if <code>INPUT-CLASS</code> is <code>STR</code>)<br>
(Not implemented in this release.)
<tr valign=top align=left><td>
<hr><code>OUTPUT-SIZE</code>
</td><td><hr>An integer denoting the size of the output data, in bits.
<tr valign=top align=left><td>
<code> 8</code><br>
<code> 16</code><br>
<code> 32</code><br>
<code> 64</code>
</td><td>For signed and unsigned integer data:
<code>IN</code> or <code>UIN</code>.
(Default: Same as <code>INPUT-SIZE</code>, else <code> 32</code>)
<tr valign=top align=left><td>
<code> 32</code><br>
<code> 64</code>
</td><td>For floating point data:
<code>FP</code>.
(Default: Same as <code>INPUT-SIZE</code>, else <code> 32</code>)
<tr valign=top align=left><td>
<hr><code>OUTPUT-ARCHITECTURE</code>
</td><td><hr>A string denoting the type of output architecture.
<tr valign=top align=left><td>
<code> STD</code><br>
<code> IEEE</code><br>
<code> INTEL</code> *<br>
<code> CRAY</code> *<br>
<code> MIPS</code> *<br>
<code> ALPHA</code> *<br>
<code> NATIVE</code><br>
<code> UNIX</code> *
</td><td>See the "Predefined Atomic Types" section
in the "HDF5 Datatypes" chapter
of the <cite>HDF5 User's Guide</cite>
for a discussion of these architectures.<br>
Values marked with an asterisk (*) are not implemented in this release.<br>
(Default: <code>NATIVE</code>)
<tr valign=top align=left><td>
<hr><code>OUTPUT-BYTE-ORDER</code>
</td><td><hr>A string denoting the output byte order.
This entry is ignored if the <code>OUTPUT-ARCHITECTURE</code>
is not specified or if it is not specified as <code>IEEE</code>,
<code>UNIX</code>, or <code>STD</code>.
<tr valign=top align=left><td>
<code> BE</code>
</td><td>Big-endian. (Default)
<tr valign=top align=left><td>
<code> LE</code>
</td><td>Little-endian.
<tr valign=top align=left><td colspan="2">
<hr>The following options are disabled by default, making
the default storage properties no chunking, no compression,
no external storage, and no extensible dimensions.
<tr valign=top align=left><td>
</td><td><hr>Dimension sizes of the chunk for chunked output data.
<br><b><i>BTW, is this</i></b> <code>CHUNKED-DIMENSION</code> <b><i>or</i></b> <code>CHUNKED-D...-SIZES</code><b><i>?</i></b>
<tr valign=top align=left><td>
<code> <em>chunk_dims</em></code>
</td><td>A string of space-separated integers specifying the
dimension sizes of the chunk for chunked output data.
The number of dimensions must correspond to the value
of <code>RANK</code>.<br>
The presence of this field indicates that the
output dataset is to be stored in chunked layout;
if this configuration field is absent,
the dataset will be stored in contiguous layout.
<tr valign=top align=left><td>
<hr><code>COMPRESSION-TYPE</code>
</td><td><hr>Type of compression to be used with chunked storage.
Requires that <code>CHUNKED-DIMENSION</code> be specified.
<tr valign=top align=left><td>
<code> GZIP</code>
</td><td>Gzip compression.<br>
Other compression algorithms are not implemented
in this release of <code>h5import</code>.
<tr valign=top align=left><td>
</td><td><hr>Compression level.
Required if <code>COMPRESSION-TYPE</code> is specified.
<b><i>Since there is a default, is "required" true?</i></b>
<tr valign=top align=left><td>
<code> 1</code> through <code>9</code>
</td><td>Gzip compression levels:
<code>1</code> will result in the fastest compression
while <code>9</code> will result in the best compression ratio.
(Default: 6)
<tr valign=top align=left><td>
</td><td><hr>Name of an external file in which to create the output dataset.
Cannot be used with <code>CHUNKED-DIMENSIONS</code>,
<tr valign=top align=left><td>
<code> <i>external_file</i> </code>
</td><td>A string specifying the name of an external file.
<tr valign=top align=left><td>
</td><td><hr>Maximum sizes of all dimensions.
Requires that <code>CHUNKED-DIMENSION</code> be specified.
<tr valign=top align=left><td>
<code> <em>max_dims</em></code>
</td><td>A string of space-separated integers specifying the
maximum size of each dimension of the output dataset.
A value of <code>-1</code> for any dimension implies
unlimited size for that particular dimension.<br>
The number of dimensions must correspond to the value
of <code>RANK</code>.<br>
<tr valign=top align=left><td><hr></td><td><hr></td></tr>
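<p>The storage options in the table above constrain one another: compression and maximum dimensions require chunking, external storage excludes it, and per-dimension lists must match <code>RANK</code>. The Python sketch below checks those rules before a configuration file is written. The function <code>check_storage</code> and its parameter names are hypothetical and stand in for the corresponding configuration entries.</p>

```python
def check_storage(rank, chunk_dims=None, compression=None, level=None,
                  external_file=None, max_dims=None):
    """Return a list of rule violations; an empty list means no conflict."""
    problems = []
    chunked = chunk_dims is not None
    if chunked and len(chunk_dims) != rank:
        problems.append("chunk dimensions must match RANK")
    if compression is not None:
        if not chunked:
            problems.append("compression requires chunked layout")
        if compression != "GZIP":
            problems.append("only GZIP compression is implemented")
        if level is not None and not 1 <= level <= 9:
            problems.append("GZIP level must be 1 through 9")
    if max_dims is not None:
        # A value of -1 marks an unlimited dimension, so it is not rejected.
        if not chunked:
            problems.append("maximum dimensions require chunked layout")
        if len(max_dims) != rank:
            problems.append("maximum dimensions must match RANK")
    if external_file is not None and chunked:
        problems.append("external storage cannot be combined with chunking")
    return problems
```

<p>Returning a list rather than raising on the first problem lets a caller report every conflict at once.</p>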
<b>The <code>help</code> option:</b><br>
The help option, expressed as one of
<table width=100% border=0>
<tr><td> </td><td>
<code>h5import -h</code><br>
<code>h5import -help</code><br>
<tr><td colspan="2">prints the <code>h5import</code> usage summary</td></tr>
<tr><td> </td><td>
<code>h5import -h[elp], OR<br>
h5import &lt;infile&gt; &lt;options&gt;
[&lt;infile&gt; &lt;options&gt;...]
-o[utfile] &lt;outfile&gt;</code>
<tr><td colspan="2">then exits.</td></tr>
<dt><strong>Options and Parameters:</strong>
<dd>Name of the input file(s).
<dd>Input options. Note that while only the <code>-dims</code> argument
is required, arguments must be used in the order in which they are listed below.
<dt><code>-d <em>dim_list</em></code>
<dt><code>-dims <em>dim_list</em></code>
<dd>Input data dimensions.
<code><em>dim_list</em></code> is a string of
comma-separated numbers with no spaces
describing the dimensions of the input data.
For example, a 50 x 100 2-dimensional array would be
specified as <code>-dims 50,100</code>.<br>
Required argument: if no configuration file is used,
this command-line argument is mandatory.
<dt><code>-p <em>pathname</em></code>
<dt><code>-path <em>pathname</em></code>
<dd><code><em>pathname</em></code> is a string consisting of
one or more strings separated by '/' specifying the path
of the dataset in the output file.
If the groups in the path do not exist, they will be created.<br>
Optional argument: if not specified,
the default path is <code>/dataset1</code>.<br>
<code>h5import</code> does not check for a pre-existing dataset
of the specified or default name; it overwrites any such dataset
without offering an opportunity to preserve it.
<dt><code>-t <em>input_class</em></code>
<dt><code>-type <em>input_class</em></code>
<dd><code><em>input_class</em></code> specifies the class of the
input data and determines the class of the output data.<br>
Valid values are as defined in the <code>INPUT-CLASS</code>
entry of "The configuration file" above.<br>
Optional argument: if not specified,
the default value is <code>FP</code>.
<dt><code>-s <em>input_size</em></code>
<dt><code>-size <em>input_size</em></code>
<dd><code><em>input_size</em></code> specifies the size in bits
of the input data and determines the size of the output data.<br>
Valid values for signed or unsigned integers are
<code>8</code>, <code>16</code>, <code>32</code>, and <code>64</code>.<br>
Valid values for floating point data are
<code>32</code> and <code>64</code>.<br>
Optional argument: if not specified,
the default value is <code>32</code>.
<dt><code>-c <em>config_file</em></code>
<dd><code><em>config_file</em></code> specifies a configuration file.<br>
This argument replaces all other arguments except
<code><em>infile</em></code> and <code>-o <em>outfile</em></code>.
<dd>Name of the HDF5 output file.
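<p>The way the input class and size determine the output class and size can be summarized in a few lines. The Python sketch below applies the defaults stated in this entry (<code>FP</code> and <code>32</code> when nothing is given; output size equal to input size); <code>resolve_defaults</code> is a hypothetical name, and the unimplemented <code>STR</code> class is deliberately left out.</p>

```python
# Output class implied by each input class, as listed in this entry.
OUTPUT_CLASS_FOR = {
    "IN": "IN", "TEXTIN": "IN",
    "UIN": "UIN", "TEXTUIN": "UIN",
    "FP": "FP", "TEXTFP": "FP", "TEXTFPE": "FP",
}


def resolve_defaults(input_class=None, input_size=None):
    """Fill in the documented defaults and derive the output class and size."""
    input_class = input_class or "FP"
    input_size = input_size or 32
    if input_class not in OUTPUT_CLASS_FOR:
        raise ValueError("unsupported input class: %s" % input_class)
    output_class = OUTPUT_CLASS_FOR[input_class]
    valid_sizes = (32, 64) if output_class == "FP" else (8, 16, 32, 64)
    if input_size not in valid_sizes:
        raise ValueError("invalid size %d for class %s" % (input_size, input_class))
    return {
        "input_class": input_class,
        "input_size": input_size,
        "output_class": output_class,
        "output_size": input_size,
    }
```

<p>Calling <code>resolve_defaults()</code> with no arguments describes 32-bit binary floating point input stored as 32-bit floating point output.</p>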
<dd><b>Using command-line arguments:</b>
<table width=100% border=0>
<tr><td colspan=2>
<code>h5import infile -dims 2,3,4 -type TEXTIN -size 32 -o out1</code>
</td></tr><tr><td> </td><td>
This command creates a file <code>out1</code> containing
a single 2x3x4 32-bit integer dataset.
Since no pathname is specified, the dataset is stored
in <code>out1</code> as <code>/dataset1</code>.
</td></tr><tr><td colspan=2>
<code>h5import infile -dims 20,50 -path bin1/dset1 -type FP -size 64 -o out2</code>
</td></tr><tr><td> </td><td>
This command creates a file <code>out2</code> containing
a single 20x50 64-bit floating point dataset.
The dataset is stored in <code>out2</code> as <code>/bin1/dset1</code>.
<dd><b>Sample configuration files:</b><br>
The first configuration file specifies the following:<br>
– The input data is a 5x2x4 floating point array in an ASCII file.<br>
– The output dataset will be saved in chunked layout,
with chunk dimension sizes of 2x2x2.<br>
– The output datatype will be 64-bit floating point, little-endian, IEEE.<br>
– The output dataset will be stored in <code><em>outfile</em></code>
at <code>/work/h5/pkamat/First-set</code>.<br>
– The maximum dimension sizes of the output dataset
will be 8x8x(unlimited).
PATH work h5 pkamat First-set
The next configuration file specifies the following:<br>
– The input data is a 6x3x5x2x4 integer array in a binary file.<br>
– The output dataset will be saved in chunked layout,
with chunk dimension sizes of 2x2x2x2x2.<br>
– The output datatype will be 32-bit integer in <code>NATIVE</code> format
(as the output architecture is not specified).<br>
– The output dataset will be compressed using Gzip compression
with a compression level of 7.<br>
– The output dataset will be stored in <code><em>outfile</em></code>
at <code>/Second-set</code>.
PATH Second-set
<dt><strong>Current Status:</strong>
<dt><strong>See Also:</strong>
<dt><strong>Tool Name:</strong> <a name="Tools-GIF2H5">gif2h5</a>
@@ -1081,7 +1768,7 @@ And in this document, the
Describes HDF5 Release 1.5, Unreleased Development Branch
</address><!-- #EndLibraryItem -->
Last modified: 6 May 2003
Last modified: 30 May 2003