1. Title:
    User Guide for SWMR Use Case Programs

2. Purpose:
    This is the User Guide for the SWMR use case programs. It describes
    the use case programs and explains how to run them.

2.1. Author and Dates:
    Version 2: By Albert Cheng (acheng@hdfgroup.org), 2013/06/18.
    Version 1: By Albert Cheng (acheng@hdfgroup.org), 2013/06/01.

%%%%Use Case 1.7%%%%

3. Use Case [1.7]:
    Appending a single chunk

3.1. Program name:
    use_append_chunk

3.2. Description:
    Appending a single chunk of raw data to a dataset along an
    unlimited dimension within a pre-created file and reading the new
    data back.

    It first creates a 3d dataset with chunked storage, where each
    chunk is a (1, chunksize, chunksize) square. The dataset is
    (unlimited, chunksize, chunksize), and the datatype is 2-byte
    integer. The dataset starts out "empty", i.e., its first dimension
    is 0.
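
    In outline, this creation step looks like the following sketch (a
    minimal illustration, not the actual program source; the file and
    dataset names are made up):

        #include "hdf5.h"

        #define CHUNKSIZE 256

        int main(void)
        {
            /* First dimension starts at 0 and is unlimited. */
            hsize_t dims[3]    = {0, CHUNKSIZE, CHUNKSIZE};
            hsize_t maxdims[3] = {H5S_UNLIMITED, CHUNKSIZE, CHUNKSIZE};
            hsize_t chunk[3]   = {1, CHUNKSIZE, CHUNKSIZE};
            hid_t   fapl, fid, sid, dcpl, did;

            /* SWMR requires the latest file format. */
            fapl = H5Pcreate(H5P_FILE_ACCESS);
            H5Pset_libver_bounds(fapl, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);

            fid = H5Fcreate("append_data.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
            sid = H5Screate_simple(3, dims, maxdims);

            /* Chunked storage, one plane per chunk. */
            dcpl = H5Pcreate(H5P_DATASET_CREATE);
            H5Pset_chunk(dcpl, 3, chunk);

            did = H5Dcreate2(fid, "dataset", H5T_NATIVE_SHORT, sid,
                             H5P_DEFAULT, dcpl, H5P_DEFAULT);

            H5Dclose(did);
            H5Pclose(dcpl);
            H5Sclose(sid);
            H5Fclose(fid);
            H5Pclose(fapl);
            return 0;
        }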

    The writer then appends planes, each of (1, chunksize, chunksize),
    to the dataset. It fills each plane with the plane number, writes
    it as the nth plane, increments the plane number, and repeats until
    the dataset reaches chunksize planes. The end product is a
    chunksize^3 cube.
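
    The append loop can be sketched as below (again an illustration
    with the same made-up names; H5Dflush, available in HDF5 1.10 and
    later, is what makes each new plane visible to a concurrent SWMR
    reader):

        /* Append CHUNKSIZE planes to the skeleton dataset, one at a time. */
        static void writer(hid_t fapl)
        {
            hid_t        fid, did, msid, fsid;
            static short plane[CHUNKSIZE][CHUNKSIZE];
            hsize_t      plane_dims[3] = {1, CHUNKSIZE, CHUNKSIZE};
            hsize_t      n, i, j;

            fid  = H5Fopen("append_data.h5",
                           H5F_ACC_RDWR | H5F_ACC_SWMR_WRITE, fapl);
            did  = H5Dopen2(fid, "dataset", H5P_DEFAULT);
            msid = H5Screate_simple(3, plane_dims, NULL);

            for (n = 0; n < CHUNKSIZE; n++) {
                hsize_t new_dims[3] = {n + 1, CHUNKSIZE, CHUNKSIZE};
                hsize_t start[3]    = {n, 0, 0};

                /* Fill the nth plane with its plane number. */
                for (i = 0; i < CHUNKSIZE; i++)
                    for (j = 0; j < CHUNKSIZE; j++)
                        plane[i][j] = (short)n;

                /* Grow the unlimited dimension, select the new plane,
                 * and write it. */
                H5Dset_extent(did, new_dims);
                fsid = H5Dget_space(did);
                H5Sselect_hyperslab(fsid, H5S_SELECT_SET, start, NULL,
                                    plane_dims, NULL);
                H5Dwrite(did, H5T_NATIVE_SHORT, msid, fsid, H5P_DEFAULT, plane);
                H5Sclose(fsid);

                /* Flush so a concurrent SWMR reader sees the new plane. */
                H5Dflush(did);
            }

            H5Sclose(msid);
            H5Dclose(did);
            H5Fclose(fid);
        }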

    The reader is a separate process, running in parallel with the
    writer. It reads planes from the dataset, expecting the dataset to
    be changing (growing). It checks the unlimited dimension
    (dimension[0]); whenever it increases, the reader reads in the new
    planes, one by one, and verifies the data's correctness. (The nth
    plane should contain all "n"s.) When the unlimited dimension grows
    to chunksize (the dataset becomes a cube), that is the expected end
    of the data, and the reader exits.
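
    A matching sketch of the reader's polling loop (same made-up names;
    H5Drefresh, also HDF5 1.10 and later, picks up the writer's latest
    extent):

        #include <stdio.h>

        /* Poll the growing dataset; verify each new plane as it appears. */
        static void reader(hid_t fapl)
        {
            hid_t        fid, did, msid, fsid;
            static short plane[CHUNKSIZE][CHUNKSIZE];
            hsize_t      plane_dims[3] = {1, CHUNKSIZE, CHUNKSIZE};
            hsize_t      nread = 0, dims[3], i, j;

            fid  = H5Fopen("append_data.h5",
                           H5F_ACC_RDONLY | H5F_ACC_SWMR_READ, fapl);
            did  = H5Dopen2(fid, "dataset", H5P_DEFAULT);
            msid = H5Screate_simple(3, plane_dims, NULL);

            while (nread < CHUNKSIZE) {
                /* Refresh the dataset metadata to get the current extent.
                 * A real reader would sleep briefly between polls. */
                H5Drefresh(did);
                fsid = H5Dget_space(did);
                H5Sget_simple_extent_dims(fsid, dims, NULL);
                H5Sclose(fsid);

                /* Read and verify every newly appended plane. */
                for (; nread < dims[0]; nread++) {
                    hsize_t start[3] = {nread, 0, 0};

                    fsid = H5Dget_space(did);
                    H5Sselect_hyperslab(fsid, H5S_SELECT_SET, start, NULL,
                                        plane_dims, NULL);
                    H5Dread(did, H5T_NATIVE_SHORT, msid, fsid, H5P_DEFAULT,
                            plane);
                    H5Sclose(fsid);

                    /* The nth plane should contain all "n"s. */
                    for (i = 0; i < CHUNKSIZE; i++)
                        for (j = 0; j < CHUNKSIZE; j++)
                            if (plane[i][j] != (short)nread)
                                fprintf(stderr, "plane %llu: bad value\n",
                                        (unsigned long long)nread);
                }
            }

            H5Sclose(msid);
            H5Dclose(did);
            H5Fclose(fid);
        }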

3.3. How to run the program:
    The simplest way is
    $ use_append_chunk

    It creates a skeleton dataset (0,256,256) of shorts, then forks off
    a process, which becomes the reader process and reads planes from
    the dataset, while the original process continues as the writer
    process and appends planes to the dataset.
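
    That fork-off structure amounts to something like the sketch below
    (reader() and writer() are the hypothetical routines sketched
    earlier, not the program's actual function names):

        #include <sys/wait.h>
        #include <unistd.h>

        static int run_both(hid_t fapl)
        {
            pid_t pid = fork();

            if (pid == 0) {    /* child: becomes the reader process */
                reader(fapl);
                _exit(0);
            }

            /* parent: continues as the writer process */
            writer(fapl);
            return (waitpid(pid, NULL, 0) < 0) ? -1 : 0;
        }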

    Other possible options:

    1. -z option: use a different chunksize. Default is 256.
    $ use_append_chunk -z 1024

    It uses (1,1024,1024) chunks to produce a 1024^3 cube, about 2GB in
    size.

    2. -f filename: use a different dataset file name.
    $ use_append_chunk -f /gpfs/tmp/append_data.h5

    The data file is /gpfs/tmp/append_data.h5. This allows two
    independent processes on separate compute nodes to access the data
    file on the shared /gpfs file system.

    3. -l option: launch only the reader or the writer process.
    $ use_append_chunk -f /gpfs/tmp/append_data.h5 -l w # in node X
    $ use_append_chunk -f /gpfs/tmp/append_data.h5 -l r # in node Y

    In node X, launch the writer process, which creates the data file
    and appends to it.
    In node Y, launch the reader process to read the data file.

    Note that you need to time the reader process to start AFTER the
    writer process has created the skeleton data file. Otherwise, the
    reader will encounter errors such as the data file not being found.

    4. -n option: number of planes to write/read. Default is the same
    as the chunksize specified by option -z.
    $ use_append_chunk -n 1000 # 1000 planes are written and read.

    5. -s option: use the SWMR file access mode or not. Default is yes.
    $ use_append_chunk -s 0

    It opens the HDF5 data file without the SWMR access mode (0 means
    off). This will likely result in errors. This option is provided
    for users to see the effect of the needed SWMR access mode for
    concurrent access.
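
    The difference comes down to the flags passed to H5Fopen; roughly
    (file name and fapl as in the earlier sketches):

        /* SWMR access (default, -s 1): safe concurrent access. */
        fid = H5Fopen("append_data.h5",
                      H5F_ACC_RDWR | H5F_ACC_SWMR_WRITE, fapl);  /* writer */
        fid = H5Fopen("append_data.h5",
                      H5F_ACC_RDONLY | H5F_ACC_SWMR_READ, fapl); /* reader */

        /* Non-SWMR access (-s 0): plain flags. Reading a file that
         * another process is writing is then unsupported and likely
         * to fail. */
        fid = H5Fopen("append_data.h5", H5F_ACC_RDWR, fapl);     /* writer */
        fid = H5Fopen("append_data.h5", H5F_ACC_RDONLY, fapl);   /* reader */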

3.4. Test Shell Script:
    The use case program is installed in the test/ directory and is
    compiled as part of the make process. A test script
    (test_usecases.sh) is installed in the same directory to test the
    use case programs. The test script is rather basic and is mostly
    meant to demonstrate how to use the programs.

%%%%Use Case 1.8%%%%

4. Use Case [1.8]:
    Appending a hyperslab of multiple chunks

4.1. Program name:
    use_append_mchunks

4.2. Description:
    Appending a hyperslab that spans several chunks of a dataset with
    unlimited dimensions within a pre-created file and reading the new
    data back.

    It first creates a 3d dataset with chunked storage, where each
    chunk is a (1, chunksize, chunksize) square. The dataset is
    (unlimited, 2*chunksize, 2*chunksize), so each plane consists of 4
    chunks, and the datatype is 2-byte integer. The dataset starts out
    "empty", i.e., its first dimension is 0.

    The writer then appends planes, each of (1, 2*chunksize,
    2*chunksize), to the dataset. It fills each plane with the plane
    number, writes it as the nth plane, increments the plane number,
    and repeats until the dataset reaches 2*chunksize planes. The end
    product is a (2*chunksize)^3 cube.
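
    Relative to use case 1.7, only the shapes change: each appended
    plane is a single hyperslab that spans 4 chunks (names as in the
    earlier sketches):

        hsize_t chunk[3]      = {1, CHUNKSIZE, CHUNKSIZE};         /* chunk shape */
        hsize_t plane_dims[3] = {1, 2 * CHUNKSIZE, 2 * CHUNKSIZE}; /* one plane   */
        hsize_t start[3]      = {n, 0, 0};

        /* One selection covers all 4 chunks of plane n; HDF5 updates
         * each of them during the write. */
        H5Sselect_hyperslab(fsid, H5S_SELECT_SET, start, NULL,
                            plane_dims, NULL);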

    The reader is a separate process, running in parallel with the
    writer. It reads planes from the dataset, expecting the dataset to
    be changing (growing). It checks the unlimited dimension
    (dimension[0]); whenever it increases, the reader reads in the new
    planes, one by one, and verifies the data's correctness. (The nth
    plane should contain all "n"s.) When the unlimited dimension grows
    to 2*chunksize (the dataset becomes a cube), that is the expected
    end of the data, and the reader exits.

4.3. How to run the program:
    The simplest way is
    $ use_append_mchunks

    It creates a skeleton dataset (0,512,512) of shorts, then forks off
    a process, which becomes the reader process and reads planes from
    the dataset, while the original process continues as the writer
    process and appends planes to the dataset.

    Other possible options:

    1. -z option: use a different chunksize. Default is 256.
    $ use_append_mchunks -z 512

    It uses (1,512,512) chunks to produce a 1024^3 cube, about 2GB in
    size.

    2. -f filename: use a different dataset file name.
    $ use_append_mchunks -f /gpfs/tmp/append_data.h5

    The data file is /gpfs/tmp/append_data.h5. This allows two
    independent processes on separate compute nodes to access the data
    file on the shared /gpfs file system.

    3. -l option: launch only the reader or the writer process.
    $ use_append_mchunks -f /gpfs/tmp/append_data.h5 -l w # in node X
    $ use_append_mchunks -f /gpfs/tmp/append_data.h5 -l r # in node Y

    In node X, launch the writer process, which creates the data file
    and appends to it.
    In node Y, launch the reader process to read the data file.

    Note that you need to time the reader process to start AFTER the
    writer process has created the skeleton data file. Otherwise, the
    reader will encounter errors such as the data file not being found.

    4. -n option: number of planes to write/read. Default is the same
    as the chunksize specified by option -z.
    $ use_append_mchunks -n 1000 # 1000 planes are written and read.

    5. -s option: use the SWMR file access mode or not. Default is yes.
    $ use_append_mchunks -s 0

    It opens the HDF5 data file without the SWMR access mode (0 means
    off). This will likely result in errors. This option is provided
    for users to see the effect of the needed SWMR access mode for
    concurrent access. (See the H5Fopen sketch under use case 1.7.)

4.4. Test Shell Script:
    The use case program is installed in the test/ directory and is
    compiled as part of the make process. A test script
    (test_usecases.sh) is installed in the same directory to test the
    use case programs. The test script is rather basic and is mostly
    meant to demonstrate how to use the programs.

%%%%Use Case 1.9%%%%

5. Use Case [1.9]:
    Appending n-1 dimensional planes

5.1. Program names:
    use_append_chunk and use_append_mchunks

5.2. Description:
    Appending n-1 dimensional planes or regions to a chunked dataset
    where the data does not fill the chunk.

    This means the chunks have multiple planes, and when a plane is
    written, only one of the planes in each chunk is written. This use
    case is achieved by extending use cases 1.7 and 1.8 so that the
    chunks are defined to have more than 1 plane. The -y option is
    implemented for both use_append_chunk and use_append_mchunks, as
    sketched below.
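
    A minimal illustration of what -y changes (shapes shown for
    use_append_chunk; illustrative only):

        hsize_t chunk[3]      = {5, CHUNKSIZE, CHUNKSIZE}; /* -y 5: 5 planes/chunk */
        hsize_t plane_dims[3] = {1, CHUNKSIZE, CHUNKSIZE}; /* still 1 plane/write  */

        /* The H5Dset_extent / H5Sselect_hyperslab / H5Dwrite loop is
         * unchanged; each write now updates only part of a chunk. */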

5.3. How to run the program:
    The simplest way is
    $ use_append_mchunks -y 5

    It creates a skeleton dataset (0,512,512), with storage chunks
    (5,512,512), of shorts. It then proceeds like use case 1.8 by
    forking off a reader process. The original process continues as the
    writer process, which writes 1 plane at a time, updating parts of
    the chunks involved. The reader reads 1 plane at a time, retrieving
    data from partial chunks.

    The other possible options work just like in the two use cases
    above.

5.4. Test Shell Script:
    Commands with the -y option are added to demonstrate how the two
    use case programs can be used for this use case.