<HTML>
|
|
<HEAD>
|
|
<!-- This HTML file has been created by texi2html 1.51
|
|
from VFL.texi on 18 November 1999 -->
|
|
|
|
<TITLE>HDF5 Virtual File Layer</TITLE>
|
|
</HEAD>
|
|
|
|
|
|
<!--
|
|
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
|
|
* Copyright by The HDF Group. *
|
|
* All rights reserved. *
|
|
* *
|
|
* This file is part of HDF5. The full HDF5 copyright notice, including *
|
|
* terms governing use, modification, and redistribution, is contained in *
|
|
* the COPYING file, which can be found at the root of the source code *
|
|
* distribution tree, or in https://www.hdfgroup.org/licenses. *
|
|
* If you do not have access to either file, you may request a copy from *
|
|
* help@hdfgroup.org. *
|
|
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
|
|
-->
|
|
|
|
|
|
<BODY>
|
|
|
|
<strong>Revision History</strong>
|
|
<p>Initial document, 18 November 1999.</p>
|
|
|
|
<p>Updated on 10/24/00, Quincey Koziol</p>
|
|
|
|
<p>Added the section “Programming Note for C++ Developers Using C
Functions,” 08/23/2012, Mark Evans</p>
|
|
|
|
|
|
|
|
<P>
|
|
<P><HR><P>
|
|
<H1>Table of Contents</H1>
|
|
<UL>
|
|
<LI><A NAME="TOC1" HREF="#SEC1">Introduction</A>
|
|
<LI><A NAME="TOC2" HREF="#SEC2">Using a File Driver</A>
|
|
<UL>
|
|
<LI><A NAME="TOC3" HREF="#SEC3">Driver Header Files</A>
|
|
<LI><A NAME="TOC4" HREF="#SEC4">Creating and Opening Files</A>
|
|
<LI><A NAME="TOC5" HREF="#SEC5">Performing I/O</A>
|
|
<LI><A NAME="TOC6" HREF="#SEC6">File Driver Interchangeability</A>
|
|
</UL>
|
|
<LI><A NAME="TOC7" HREF="#SEC7">Implementation of a Driver</A>
|
|
<UL>
|
|
<LI><A NAME="TOC8" HREF="#SEC8">Mode Functions</A>
|
|
<LI><A NAME="TOC9" HREF="#SEC9">File Functions</A>
|
|
<UL>
|
|
<LI><A NAME="TOC10" HREF="#SEC10">Opening Files</A>
|
|
<LI><A NAME="TOC11" HREF="#SEC11">Closing Files</A>
|
|
<LI><A NAME="TOC12" HREF="#SEC12">File Keys</A>
|
|
<LI><A NAME="TOC13" HREF="#SEC13">Saving Modes Across Opens</A>
|
|
</UL>
|
|
<LI><A NAME="TOC14" HREF="#SEC14">Address Space Functions</A>
|
|
<UL>
|
|
<LI><A NAME="TOC15" HREF="#SEC15">Userblock and Superblock</A>
|
|
<LI><A NAME="TOC16" HREF="#SEC16">Allocation of Format Regions</A>
|
|
<LI><A NAME="TOC17" HREF="#SEC17">Freeing Format Regions</A>
|
|
<LI><A NAME="TOC18" HREF="#SEC18">Querying Address Range</A>
|
|
</UL>
|
|
<LI><A NAME="TOC19" HREF="#SEC19">Data Functions</A>
|
|
<UL>
|
|
<LI><A NAME="TOC20" HREF="#SEC20">Contiguous I/O Functions</A>
|
|
<LI><A NAME="TOC21" HREF="#SEC21">Flushing Cached Data</A>
|
|
</UL>
|
|
<LI><A NAME="TOC22" HREF="#SEC22">Optimization Functions</A>
|
|
<LI><A NAME="TOC23" HREF="#SEC23">Registration of a Driver</A>
|
|
<ul>
|
|
<li><a name="TOCProgNote" href="#SECProgNote">
|
|
Programming Note for C++ Developers Using C Functions</a>
|
|
</li>
|
|
</ul>
|
|
<LI><A NAME="TOC24" HREF="#SEC24">Querying Driver Information</A>
|
|
</UL>
|
|
<LI><A NAME="TOC25" HREF="#SEC25">Miscellaneous</A>
|
|
</UL>
|
|
<P><HR><P>
|
|
|
|
|
|
<H1><A NAME="SEC1" HREF="#TOC1">Introduction</A></H1>
|
|
|
|
<P>
|
|
The HDF5 file format describes how HDF5 data structures and dataset raw
|
|
data are mapped to a linear <STRONG>format address space</STRONG> and the HDF5
|
|
library implements that bidirectional mapping in terms of an
|
|
API. However, the HDF5 format specifications do <EM>not</EM> indicate how
|
|
the format address space is mapped onto storage and HDF (version 5 and
|
|
earlier) simply mapped the format address space directly onto a single
|
|
file by convention.
|
|
|
|
</P>
|
|
<P>
|
|
Since early versions of HDF5 it has been apparent that users want the ability to
|
|
map the format address space onto different types of storage (a single file,
|
|
multiple files, local memory, global memory, network distributed global
|
|
memory, a network protocol, <I>etc</I>.) with various types of maps. For
|
|
instance, some users want to be able to handle very large format address
|
|
spaces on operating systems that support only 2GB files by partitioning the
|
|
format address space into equal-sized parts each served by a separate
|
|
file. Other users want the same multi-file storage capability but want to
|
|
partition the address space according to purpose (raw data in one file, object
|
|
headers in another, global heap in a third, <I>etc.</I>) in order to improve I/O
|
|
speeds.
|
|
|
|
</P>
|
|
<P>
|
|
In fact, the number of storage variations is probably larger than the
|
|
number of methods that the HDF5 team is capable of implementing and
|
|
supporting. Therefore, a <STRONG>Virtual File Layer</STRONG> API is being
|
|
implemented which will allow application teams or departments to design
|
|
and implement their own mapping between the HDF5 format address space
|
|
and storage, with each mapping being a separate <STRONG>file driver</STRONG>
|
|
(possibly written in terms of other file drivers). The HDF5 team will
|
|
provide a small set of useful file drivers which will also serve as
|
|
examples for those who wish to write their own:
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT><CODE>H5FD_SEC2</CODE>
|
|
<DD>
|
|
This is the default driver which uses POSIX file-system functions like
|
|
<CODE>read</CODE> and <CODE>write</CODE> to perform I/O to a single file. All I/O
|
|
requests are unbuffered although the driver does optimize file seeking
|
|
operations to some extent.
|
|
|
|
<DT><CODE>H5FD_STDIO</CODE>
|
|
<DD>
|
|
This driver uses functions from <TT>`stdio.h'</TT> to perform buffered I/O
|
|
to a single file.
|
|
|
|
<DT><CODE>H5FD_CORE</CODE>
|
|
<DD>
|
|
This driver performs I/O directly to memory and can be used to create small
|
|
temporary files that never exist on permanent storage. This type of storage is
|
|
generally very fast since the I/O consists only of memory-to-memory copy
|
|
operations.
|
|
|
|
<DT><CODE>H5FD_MPIO</CODE>
|
|
<DD>
|
|
This is the driver of choice for accessing files in parallel using MPI and
|
|
MPI-IO. It is only predefined if the library is compiled with parallel I/O
|
|
support.
|
|
|
|
<DT><CODE>H5FD_FAMILY</CODE>
|
|
<DD>
|
|
Large format address spaces are partitioned into more manageable pieces and
|
|
sent to separate storage locations using an underlying driver of the user's
|
|
choice. The <CODE>h5repart</CODE> tool can be used to change the sizes of the
|
|
family members when stored as files or to convert a family of files to a
|
|
single file or vice versa.
|
|
|
|
<DT><CODE>H5FD_SPLIT</CODE>
|
|
<DD>
|
|
The format address space is split into meta data and raw data and each is
|
|
mapped onto separate storage using underlying drivers of the user's
|
|
choice. The meta data storage can be read by itself (for limited
|
|
functionality) or both files can be accessed together.
|
|
</DL>
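<P>
The equal-sized partitioning used by the family driver can be illustrated with
a small, library-independent sketch. The type <CODE>family_loc_t</CODE> and the
function <CODE>family_locate</CODE> are hypothetical names chosen for this
illustration and are not part of the HDF5 API. The sketch only assumes that
member <I>n</I> holds the format addresses from <I>n</I>*size up to
(<I>n</I>+1)*size-1.
</P>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: which member holds an address, and where. */
typedef struct {
    uint64_t member;   /* zero-based index of the family member */
    uint64_t offset;   /* byte offset inside that member        */
} family_loc_t;

/* Map a format address onto (member, offset) for equal-sized members. */
static family_loc_t
family_locate(uint64_t addr, uint64_t memb_size)
{
    family_loc_t loc;

    assert(memb_size > 0);            /* a zero member size is meaningless */
    loc.member = addr / memb_size;
    loc.offset = addr % memb_size;
    return loc;
}
```

<P>
Because the mapping is a single division, locating the member for any I/O
request takes constant time regardless of how many members exist.
</P>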
|
|
|
|
|
|
|
|
<H1><A NAME="SEC2" HREF="#TOC2">Using a File Driver</A></H1>
|
|
|
|
<P>
|
|
Most application writers will use a driver defined by the HDF5 library or
|
|
contributed by another programming team. This chapter describes how existing
|
|
drivers are used.
|
|
|
|
</P>
|
|
|
|
|
|
|
|
<H2><A NAME="SEC3" HREF="#TOC3">Driver Header Files</A></H2>
|
|
|
|
<P>
|
|
Each file driver is defined in its own public header file which should
|
|
be included by any application which plans to use that driver. The
|
|
predefined drivers are in header files whose names begin with
|
|
<SAMP>`H5FD'</SAMP> followed by the driver name and <SAMP>`.h'</SAMP>. The <TT>`hdf5.h'</TT>
|
|
header file includes all the predefined driver header files.
|
|
|
|
</P>
|
|
<P>
|
|
Once the appropriate header file is included a symbol of the form
|
|
<SAMP>`H5FD_'</SAMP> followed by the upper-case driver name will be the driver
|
|
identification number.<A NAME="DOCF1" HREF="#FOOT1">(1)</A> However, the
|
|
value may change if the library is closed (<I>e.g.</I>, by calling
|
|
<CODE>H5close</CODE>) and the symbol is referenced again.
|
|
|
|
</P>
|
|
|
|
|
|
<H2><A NAME="SEC4" HREF="#TOC4">Creating and Opening Files</A></H2>
|
|
|
|
<P>
|
|
To create or open a file, the application must specify how the storage is
accessed<A NAME="DOCF2" HREF="#FOOT2">(2)</A>. It does so by creating a file access property list<A NAME="DOCF3" HREF="#FOOT3">(3)</A> which is passed to the <CODE>H5Fcreate</CODE> or
<CODE>H5Fopen</CODE> function. A default file access property list is created by
calling <CODE>H5Pcreate</CODE> and then the file driver information is inserted by
calling a driver initialization function such as <CODE>H5Pset_fapl_family</CODE>:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
|
|
hsize_t member_size = 100*1024*1024; /*100MB*/
|
|
H5Pset_fapl_family(fapl, member_size, H5P_DEFAULT);
|
|
hid_t file = H5Fcreate("foo%05d.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
|
|
H5Pclose(fapl);
|
|
</PRE>
|
|
|
|
<P>
|
|
Each file driver will have its own initialization function
|
|
whose name is <CODE>H5Pset_fapl_</CODE> followed by the driver name and which
|
|
takes a file access property list as the first argument followed by
|
|
additional driver-dependent arguments.
|
|
|
|
</P>
|
|
<P>
|
|
An alternative to using the driver initialization function is to set the
|
|
driver directly using the <CODE>H5Pset_driver</CODE> function.<A NAME="DOCF4" HREF="#FOOT4">(4)</A> Its second argument is the file driver identifier, which may
|
|
have a different numeric value from run to run depending on the order in which
|
|
the file drivers are registered with the library. The third argument
|
|
encapsulates the additional arguments of the driver initialization
|
|
function. This method only works if the file driver writer has made the
|
|
driver-specific property list structure a public datatype, which is
|
|
often not the case.
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
|
|
static H5FD_family_fapl_t fa = {100*1024*1024, H5P_DEFAULT};
|
|
H5Pset_driver(fapl, H5FD_FAMILY, &fa);
|
|
hid_t file = H5Fcreate("foo%05d.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
|
|
H5Pclose(fapl);
|
|
</PRE>
|
|
|
|
<P>
|
|
It is also possible to query the file driver information from a file access
|
|
property list by calling <CODE>H5Pget_driver</CODE> to determine the driver and then
|
|
calling a driver-defined query function to obtain the driver information:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
hid_t driver = H5Pget_driver(fapl);
|
|
if (H5FD_SEC2==driver) {
|
|
/*nothing further to get*/
|
|
} else if (H5FD_FAMILY==driver) {
|
|
hid_t member_fapl;
|
|
hsize_t member_size;
|
|
H5Pget_fapl_family(fapl, &member_size, &member_fapl);
|
|
} else if (....) {
|
|
....
|
|
}
|
|
</PRE>
|
|
|
|
|
|
|
|
<H2><A NAME="SEC5" HREF="#TOC5">Performing I/O</A></H2>
|
|
|
|
<P>
|
|
The <CODE>H5Dread</CODE> and <CODE>H5Dwrite</CODE> functions transfer data between
|
|
application memory and the file. They both take an optional data transfer
|
|
property list which has some general driver-independent properties and
|
|
optional driver-defined properties. An application will typically perform I/O
|
|
in one of three styles via the <CODE>H5Dread</CODE> or <CODE>H5Dwrite</CODE> function:
|
|
|
|
</P>
|
|
<P>
|
|
Like file access properties in the previous section, data transfer properties
|
|
can be set using a driver initialization function or a general purpose
|
|
function. For example, to set the MPI-IO driver to use independent access for
|
|
I/O operations one would say:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
hid_t dxpl = H5Pcreate(H5P_DATA_XFER);
|
|
H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_INDEPENDENT);
|
|
H5Dread(dataset, type, mspace, fspace, dxpl, buffer);
|
|
H5Pclose(dxpl);
|
|
</PRE>
|
|
|
|
<P>
|
|
The alternative is to initialize a driver defined C <CODE>struct</CODE> and pass it
|
|
to the <CODE>H5Pset_driver</CODE> function:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
hid_t dxpl = H5Pcreate(H5P_DATA_XFER);
|
|
static H5FD_mpio_dxpl_t dx = {H5FD_MPIO_INDEPENDENT};
|
|
H5Pset_driver(dxpl, H5FD_MPIO, &dx);
|
|
H5Dread(dataset, type, mspace, fspace, dxpl, buffer);
H5Pclose(dxpl);
|
|
</PRE>
|
|
|
|
<P>
|
|
The transfer property list can be queried in a manner similar to the file
|
|
access property list: the driver provides a function (or functions) to return
|
|
various information about the transfer property list:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
hid_t driver = H5Pget_driver(dxpl);
|
|
if (H5FD_MPIO==driver) {
|
|
H5FD_mpio_xfer_t xfer_mode;
|
|
H5Pget_dxpl_mpio(dxpl, &xfer_mode);
|
|
} else {
|
|
....
|
|
}
|
|
</PRE>
|
|
|
|
|
|
|
|
<H2><A NAME="SEC6" HREF="#TOC6">File Driver Interchangeability</A></H2>
|
|
|
|
<P>
|
|
The HDF5 specifications describe two things: the mapping of data onto a linear
|
|
<STRONG>format address space</STRONG> and the C API which performs the mapping.
|
|
However, the mapping of the format address space onto storage intentionally
|
|
falls outside the scope of the HDF5 specs. This is a direct result of the fact
|
|
that it is not generally possible to store information about how to access
|
|
storage inside the storage itself. For instance, given only the file name
|
|
<TT>`/arborea/1225/work/f%03d'</TT> the HDF5 library is unable to tell whether the
|
|
name refers to a file on the local file system, a family of files on the local
|
|
file system, a file on host <SAMP>`arborea'</SAMP> port 1225, a family of files on a
|
|
remote system, <I>etc</I>.
|
|
|
|
</P>
|
|
<P>
|
|
Two ways in which the library could figure out where the storage is located are:
|
|
storage access information can be provided by the user, or the library can try
|
|
all known file access methods. This implementation uses the former method.
|
|
|
|
</P>
|
|
<P>
|
|
In general, if a file was created with one driver then it isn't possible to
|
|
open it with another driver. There are of course exceptions: a file created
|
|
with MPIO could probably be opened with the sec2 driver, any file created
|
|
by the sec2 driver could be opened as a family of files with one member,
|
|
<I>etc</I>. In fact, sometimes a file must not only be opened with the same
|
|
driver but also with the same driver properties. The predefined drivers are
|
|
written in such a way that specifying the correct driver is sufficient for
|
|
opening a file.
|
|
|
|
</P>
|
|
|
|
|
|
<H1><A NAME="SEC7" HREF="#TOC7">Implementation of a Driver</A></H1>
|
|
|
|
<P>
|
|
A driver is simply a collection of functions and data structures which are
|
|
registered with the HDF5 library at runtime. The functions fall into these
|
|
categories:
|
|
|
|
</P>
|
|
|
|
<UL>
|
|
<LI>Functions which operate on modes
|
|
|
|
<LI>Functions which operate on files
|
|
|
|
<LI>Functions which operate on the address space
|
|
|
|
<LI>Functions which operate on data
|
|
|
|
<LI>Functions for driver initialization
|
|
|
|
<LI>Optimization functions
|
|
|
|
</UL>
|
|
|
|
|
|
|
|
<H2><A NAME="SEC8" HREF="#TOC8">Mode Functions</A></H2>
|
|
|
|
<P>
|
|
Some drivers need information about file access and data transfers which are
|
|
very specific to the driver. The information is usually implemented as a pair
|
|
of pointers to C structs which are allocated and initialized as part of an
|
|
HDF5 property list and passed down to various driver functions. There are two
|
|
classes of settings: file access modes that describe how to access the file
|
|
through the driver, and data transfer modes which are settings that control
|
|
I/O operations. Each file opened by a particular driver may have a different
|
|
access mode; each dataset I/O request for a particular file may have a
|
|
different data transfer mode.
|
|
|
|
</P>
|
|
<P>
|
|
Since each driver has its own particular requirements for various settings,
|
|
each driver is responsible for defining the mode structures that it
|
|
needs. Higher layers of the library treat the structures as opaque but must be
|
|
able to copy and free them. Thus, the driver provides either the size of the
|
|
structure or a pair of function pointers for each of the mode types.
|
|
|
|
</P>
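<P>
The choice between "a size" and "a pair of function pointers" can be sketched
without the library. In the sketch below, <CODE>mode_info_t</CODE>,
<CODE>mode_copy</CODE>, and <CODE>mode_free</CODE> are hypothetical names used
only for illustration and are not the library's own types. The sketch assumes
that a callback, when present, takes precedence over a flat byte copy.
</P>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of one mode type (file access or data transfer). */
typedef struct {
    size_t size;                      /* non-zero: a flat copy is sufficient */
    void *(*copy)(const void *src);   /* optional deep-copy callback         */
    int   (*release)(void *obj);      /* optional matching release callback  */
} mode_info_t;

/* Copy an opaque mode structure using whichever mechanism is provided. */
static void *
mode_copy(const mode_info_t *info, const void *src)
{
    void *dst;

    assert(info && src);
    if (info->copy) return info->copy(src);   /* driver knows the contents */
    if (0 == info->size) return NULL;         /* nothing to copy           */
    dst = malloc(info->size);
    if (dst) memcpy(dst, src, info->size);    /* shallow, byte-wise copy   */
    return dst;
}

/* Release an opaque mode structure; returns 0 on success. */
static int
mode_free(const mode_info_t *info, void *obj)
{
    assert(info);
    if (!obj) return 0;
    if (info->release) return info->release(obj);
    free(obj);
    return 0;
}
```

<P>
A structure that holds only plain values can rely on the size alone, whereas
one that holds other property list identifiers needs the callbacks so that
the nested objects are copied and closed as well.
</P>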
|
|
<P>
|
|
<STRONG>Example:</STRONG> The family driver needs to know how the format address
|
|
space is partitioned and the file access property list to use for the
|
|
family members.
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
/* Driver-specific file access properties */
|
|
typedef struct H5FD_family_fapl_t {
|
|
hsize_t memb_size; /*size of each member */
|
|
hid_t memb_fapl_id; /*file access property list of each memb*/
|
|
} H5FD_family_fapl_t;
|
|
|
|
/* Driver specific data transfer properties */
|
|
typedef struct H5FD_family_dxpl_t {
|
|
hid_t memb_dxpl_id; /*data xfer property list of each memb */
|
|
} H5FD_family_dxpl_t;
|
|
</PRE>
|
|
|
|
<P>
|
|
In order to copy or free one of these structures the member file access
|
|
or data transfer properties must also be copied or freed. This is done
|
|
by providing a copy and close function for each structure:
|
|
|
|
</P>
|
|
<P>
|
|
<STRONG>Example:</STRONG> The file access property list copy and close functions
|
|
for the family driver:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
static void *
|
|
H5FD_family_fapl_copy(const void *_old_fa)
|
|
{
|
|
const H5FD_family_fapl_t *old_fa = (const H5FD_family_fapl_t*)_old_fa;
|
|
H5FD_family_fapl_t *new_fa = malloc(sizeof(H5FD_family_fapl_t));
|
|
assert(new_fa);
|
|
|
|
memcpy(new_fa, old_fa, sizeof(H5FD_family_fapl_t));
|
|
new_fa->memb_fapl_id = H5Pcopy(old_fa->memb_fapl_id);
|
|
return new_fa;
|
|
}
|
|
|
|
static herr_t
|
|
H5FD_family_fapl_free(void *_fa)
|
|
{
|
|
H5FD_family_fapl_t *fa = (H5FD_family_fapl_t*)_fa;
|
|
H5Pclose(fa->memb_fapl_id);
|
|
free(fa);
|
|
return 0;
|
|
}
|
|
</PRE>
|
|
|
|
<P>
|
|
Generally when a file is created or opened the file access properties
|
|
for the driver are copied into the file pointer which is returned and
|
|
they may be modified from their original value (for instance, the file
|
|
family driver modifies the member size property when opening an existing
|
|
family). In order to support the <CODE>H5Fget_access_plist</CODE> function the
|
|
driver must provide a <CODE>fapl_get</CODE> callback which creates a copy of
|
|
the driver-specific properties based on a particular file.
|
|
|
|
</P>
|
|
<P>
|
|
<STRONG>Example:</STRONG> The file family driver copies the member size and the
member file access property list into the return value:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
static void *
|
|
H5FD_family_fapl_get(H5FD_t *_file)
|
|
{
|
|
H5FD_family_t *file = (H5FD_family_t*)_file;
|
|
H5FD_family_fapl_t *fa = calloc(1, sizeof(H5FD_family_fapl_t)); /*whole struct, not a pointer*/
if (!fa) return NULL;
|
|
|
|
fa->memb_size = file->memb_size;
|
|
fa->memb_fapl_id = H5Pcopy(file->memb_fapl_id);
|
|
return fa;
|
|
}
|
|
</PRE>
|
|
|
|
|
|
|
|
<H2><A NAME="SEC9" HREF="#TOC9">File Functions</A></H2>
|
|
|
|
<P>
|
|
The higher layers of the library expect files to have a name and allow the
|
|
file to be accessed in various modes. The driver must be able to create a new
|
|
file, replace an existing file, or open an existing file. Opening or creating
|
|
a file should return a handle, a pointer to a specialization of the
|
|
<CODE>H5FD_t</CODE> struct, which allows read-only or read-write access and which
|
|
will be passed to the other driver functions as they are
|
|
called.<A NAME="DOCF5" HREF="#FOOT5">(5)</A>
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
typedef struct {
|
|
/* Public fields */
|
|
H5FD_class_t *cls; /*class data defined below*/
|
|
|
|
/* Private fields -- driver-defined */
|
|
|
|
} H5FD_t;
|
|
</PRE>
|
|
|
|
<P>
|
|
<STRONG>Example:</STRONG> The family driver requires handles to the underlying
|
|
storage, the size of the members for this particular file (which might be
|
|
different than the member size specified in the file access property list if
|
|
an existing file family is being opened), the name used to open the file in
|
|
case additional members must be created, and the flags to use for creating
|
|
those additional members. The <CODE>eoa</CODE> member caches the size of the format
|
|
address space so the family members don't have to be queried in order to find
|
|
it.
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
/* The description of a file belonging to this driver. */
|
|
typedef struct H5FD_family_t {
|
|
H5FD_t pub; /*public stuff, must be first */
|
|
hid_t memb_fapl_id; /*file access property list for members */
|
|
hsize_t memb_size; /*maximum size of each member file */
|
|
int nmembs; /*number of family members */
|
|
int amembs; /*number of member slots allocated */
|
|
H5FD_t **memb; /*dynamic array of member pointers */
|
|
haddr_t eoa; /*end of allocated addresses */
|
|
char *name; /*name generator printf format */
|
|
unsigned flags; /*flags for opening additional members */
|
|
} H5FD_family_t;
|
|
</PRE>
|
|
|
|
<P>
|
|
<STRONG>Example:</STRONG> The sec2 driver needs to keep track of the underlying Unix
|
|
file descriptor and also the end of format address space and current Unix file
|
|
size. It also keeps track of the current file position and last operation
|
|
(read, write, or unknown) in order to optimize calls to <CODE>lseek</CODE>. The
|
|
<CODE>device</CODE> and <CODE>inode</CODE> fields are defined on Unix in order to uniquely
|
|
identify the file and will be discussed below.
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
typedef struct H5FD_sec2_t {
|
|
H5FD_t pub; /*public stuff, must be first */
|
|
int fd; /*the unix file */
|
|
haddr_t eoa; /*end of allocated region */
|
|
haddr_t eof; /*end of file; current file size*/
|
|
haddr_t pos; /*current file I/O position */
|
|
int op; /*last operation */
|
|
dev_t device; /*file device number */
|
|
ino_t inode; /*file i-node number */
|
|
} H5FD_sec2_t;
|
|
</PRE>
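<P>
The "public part must be first" convention used by both structures above is
ordinary C struct specialization. The following sketch shows the idea with
hypothetical stand-in types (<CODE>demo_file_t</CODE> and
<CODE>demo_mem_file_t</CODE>); it does not use the real <CODE>H5FD_t</CODE>
and is only meant to show why the cast between the two pointer types is
valid.
</P>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the public handle seen by upper layers. */
typedef struct demo_file_t {
    const char *driver_name;     /* plays the role of the class pointer */
} demo_file_t;

/* Hypothetical driver-specific handle. */
typedef struct demo_mem_file_t {
    demo_file_t pub;             /* public part, must be first */
    size_t      eoa;             /* driver-private field       */
} demo_mem_file_t;

/* "Open": allocate the full struct but return only the public view. */
static demo_file_t *
demo_mem_open(size_t eoa)
{
    demo_mem_file_t *file = calloc(1, sizeof(demo_mem_file_t));

    if (!file) return NULL;
    file->pub.driver_name = "demo_mem";
    file->eoa = eoa;
    return &file->pub;           /* same address as file itself */
}

/* A driver callback recovers its private fields by casting back. */
static size_t
demo_mem_get_eoa(const demo_file_t *_file)
{
    const demo_mem_file_t *file = (const demo_mem_file_t *)_file;

    assert(_file);
    return file->eoa;
}
```

<P>
C guarantees that a pointer to a structure and a pointer to its first member
refer to the same address, which is what makes the cast in the callback safe.
</P>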
|
|
|
|
|
|
|
|
<H3><A NAME="SEC10" HREF="#TOC10">Opening Files</A></H3>
|
|
|
|
<P>
|
|
All drivers must define a function for opening/creating a file. This
|
|
function should have a prototype which is:
|
|
|
|
</P>
|
|
<P>
|
|
<DL>
|
|
<DT><U>Function:</U> static H5FD_t * <B>open</B> <I>(const char *<VAR>name</VAR>, unsigned <VAR>flags</VAR>, hid_t <VAR>fapl</VAR>, haddr_t <VAR>maxaddr</VAR>)</I>
|
|
<DD><A NAME="IDX1"></A>
|
|
|
|
</P>
|
|
<P>
|
|
The file name <VAR>name</VAR> and file access property list <VAR>fapl</VAR> are
|
|
the same as were specified in the <CODE>H5Fcreate</CODE> or <CODE>H5Fopen</CODE>
|
|
call. The <VAR>flags</VAR> are the same as in those calls also except the
|
|
flag <CODE>H5F_ACC_CREAT</CODE> is also present if the call was to
|
|
<CODE>H5Fcreate</CODE> and they are documented in the <TT>`H5Fpublic.h'</TT>
|
|
file. The <VAR>maxaddr</VAR> argument is the maximum format address that the
|
|
driver should be prepared to handle (the minimum address is always
|
|
zero).
|
|
</DL>
|
|
|
|
</P>
|
|
<P>
|
|
<STRONG>Example:</STRONG> The sec2 driver opens a Unix file with the requested name
|
|
and saves information which uniquely identifies the file (the Unix device
|
|
number and inode).
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
static H5FD_t *
H5FD_sec2_open(const char *name, unsigned flags, hid_t fapl_id/*unused*/,
               haddr_t maxaddr)
{
    int          o_flags;      /*flags passed to open(2) */
    int          fd;
    struct stat  sb;
    H5FD_sec2_t *file=NULL;

    /* Check arguments */
    if (!name || !*name) return NULL;
    if (0==maxaddr || HADDR_UNDEF==maxaddr) return NULL;
    if (ADDR_OVERFLOW(maxaddr)) return NULL;

    /* Build the open flags */
    o_flags = (H5F_ACC_RDWR & flags) ? O_RDWR : O_RDONLY;
    if (H5F_ACC_TRUNC & flags) o_flags |= O_TRUNC;
    if (H5F_ACC_CREAT & flags) o_flags |= O_CREAT;
    if (H5F_ACC_EXCL & flags) o_flags |= O_EXCL;

    /* Open the file and obtain its size, device, and i-node */
    if ((fd=open(name, o_flags, 0666))<0) return NULL;
    if (fstat(fd, &sb)<0) {
        close(fd);
        return NULL;
    }

    /* Create the new file struct; release the descriptor on failure */
    file = calloc(1, sizeof(H5FD_sec2_t));
    if (!file) {
        close(fd);
        return NULL;
    }
    file->fd = fd;
    file->eof = sb.st_size;
    file->pos = HADDR_UNDEF;
    file->op = OP_UNKNOWN;
    file->device = sb.st_dev;
    file->inode = sb.st_ino;

    return (H5FD_t*)file;
}
|
|
</PRE>
|
|
|
|
|
|
|
|
<H3><A NAME="SEC11" HREF="#TOC11">Closing Files</A></H3>
|
|
|
|
<P>
|
|
Closing a file simply means that all cached data should be flushed to the next
|
|
lower layer, the file should be closed at the next lower layer, and all
|
|
file-related data structures should be freed. All information needed by the
|
|
close function is already present in the file handle.
|
|
|
|
</P>
|
|
<P>
|
|
<DL>
|
|
<DT><U>Function:</U> static herr_t <B>close</B> <I>(H5FD_t *<VAR>file</VAR>)</I>
|
|
<DD><A NAME="IDX2"></A>
|
|
|
|
</P>
|
|
<P>
|
|
The <VAR>file</VAR> argument is the handle which was returned by the <CODE>open</CODE>
|
|
function, and the <CODE>close</CODE> should free only memory associated with the
|
|
driver-specific part of the handle (the public parts will have already been released by HDF5's virtual file layer).
|
|
</DL>
|
|
|
|
</P>
|
|
<P>
|
|
<STRONG>Example:</STRONG> The sec2 driver just closes the underlying Unix file,
|
|
making sure that the actual file size is the same as that known to the
|
|
library by writing a zero to the last file position if it hasn't been
|
|
written by some previous operation (which happens in the same code which
|
|
flushes the file contents and is shown below).
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
static herr_t
|
|
H5FD_sec2_close(H5FD_t *_file)
|
|
{
|
|
H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
|
|
|
|
if (H5FD_sec2_flush(_file)<0) return -1;
|
|
if (close(file->fd)<0) return -1;
|
|
free(file);
|
|
return 0;
|
|
}
|
|
</PRE>
|
|
|
|
|
|
|
|
<H3><A NAME="SEC12" HREF="#TOC12">File Keys</A></H3>
|
|
|
|
<P>
|
|
Occasionally an application will attempt to open a single file more than one
|
|
time in order to obtain multiple handles to the file. HDF5 allows the files to
|
|
share information<A NAME="DOCF6" HREF="#FOOT6">(6)</A> but in order to
|
|
accomplish this HDF5 must be able to tell when two names refer to the same
|
|
file. It does this by associating a driver-defined key with each file opened
|
|
by a driver and comparing the key for an open request with the keys for all
|
|
other files currently open by the same driver.
|
|
|
|
</P>
|
|
<P>
|
|
<DL>
|
|
<DT><U>Function:</U> static int <B>cmp</B> <I>(const H5FD_t *<VAR>f1</VAR>, const H5FD_t *<VAR>f2</VAR>)</I>
|
|
<DD><A NAME="IDX3"></A>
|
|
|
|
</P>
|
|
<P>
|
|
The driver may provide a function which compares two files <VAR>f1</VAR> and
|
|
<VAR>f2</VAR> belonging to the same driver and returns a negative, positive, or
|
|
zero value <I>a la</I> the <CODE>strcmp</CODE> function.<A NAME="DOCF7" HREF="#FOOT7">(7)</A> If this
|
|
function is not provided then HDF5 assumes that all calls to the <CODE>open</CODE>
|
|
callback return unique files regardless of the arguments and it is up to the
|
|
application to avoid doing this if that assumption is incorrect.
|
|
</DL>
|
|
|
|
</P>
|
|
<P>
|
|
Each time a file is opened the library calls the <CODE>cmp</CODE> function to
|
|
compare that file with all other files currently open by the same driver and
|
|
if one of them matches (at most one can match) then the file which was just
|
|
opened is closed and the previously opened file is used instead.
|
|
|
|
</P>
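<P>
The search for an already-open file can be sketched as a linear scan that
applies a three-way comparison to each open file. The names
<CODE>demo_key_t</CODE>, <CODE>demo_cmp</CODE>, and
<CODE>demo_find_open</CODE> are hypothetical, and the real library may
organize its open files differently; the sketch only illustrates how a
<CODE>strcmp</CODE>-style result is used to detect a match.
</P>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical file key: a device number and an i-node number. */
typedef struct {
    long device;
    long inode;
} demo_key_t;

/* Three-way comparison in the style of strcmp(). */
static int
demo_cmp(const demo_key_t *a, const demo_key_t *b)
{
    if (a->device != b->device) return a->device < b->device ? -1 : 1;
    if (a->inode  != b->inode)  return a->inode  < b->inode  ? -1 : 1;
    return 0;
}

/* Return the index of the open file whose key matches, or -1 if none. */
static int
demo_find_open(const demo_key_t *open_files, size_t n, const demo_key_t *key)
{
    size_t i;

    assert(key && (open_files || 0 == n));
    for (i = 0; i < n; i++) {
        if (0 == demo_cmp(&open_files[i], key)) return (int)i;
    }
    return -1;
}
```

<P>
When a match is found the newly opened handle would be closed and the
existing one reused; when no match is found the new handle is kept.
</P>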
|
|
<P>
|
|
Opening a file twice with incompatible flags will result in failure. For
|
|
instance, opening a file with the truncate flag is a two step process which
|
|
first opens the file without truncation so keys can be compared, and if no
|
|
matching file is found already open then the file is closed and immediately
|
|
reopened with the truncation flag set (if a matching file is already open then
|
|
the truncating open will fail).
|
|
|
|
</P>
|
|
<P>
|
|
<STRONG>Example:</STRONG> The sec2 driver uses the Unix device and i-node as the
|
|
key. They were initialized when the file was opened.
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
static int
|
|
H5FD_sec2_cmp(const H5FD_t *_f1, const H5FD_t *_f2)
|
|
{
|
|
const H5FD_sec2_t *f1 = (const H5FD_sec2_t*)_f1;
|
|
const H5FD_sec2_t *f2 = (const H5FD_sec2_t*)_f2;
|
|
|
|
if (f1->device < f2->device) return -1;
|
|
if (f1->device > f2->device) return 1;
|
|
|
|
if (f1->inode < f2->inode) return -1;
|
|
if (f1->inode > f2->inode) return 1;
|
|
|
|
return 0;
|
|
}
|
|
</PRE>
|
|
|
|
|
|
|
|
<H3><A NAME="SEC13" HREF="#TOC13">Saving Modes Across Opens</A></H3>
|
|
|
|
<P>
|
|
Some drivers may also need to store certain information in the file superblock
|
|
in order to be able to reliably open the file at a later date. This is done by
|
|
three functions: one to determine how much space will be necessary to store
|
|
the information in the superblock, one to encode the information, and one to
|
|
decode the information. These functions are optional, but if any one is
|
|
defined then the other two must also be defined.
|
|
|
|
</P>
|
|
<P>
|
|
<DL>
|
|
<DT><U>Function:</U> static hsize_t <B>sb_size</B> <I>(H5FD_t *<VAR>file</VAR>)</I>
|
|
<DD><A NAME="IDX4"></A>
|
|
<DT><U>Function:</U> static herr_t <B>sb_encode</B> <I>(H5FD_t *<VAR>file</VAR>, char *<VAR>name</VAR>, unsigned char *<VAR>buf</VAR>)</I>
|
|
<DD><A NAME="IDX5"></A>
|
|
<DT><U>Function:</U> static herr_t <B>sb_decode</B> <I>(H5FD_t *<VAR>file</VAR>, const char *<VAR>name</VAR>, const unsigned char *<VAR>buf</VAR>)</I>
|
|
<DD><A NAME="IDX6"></A>
|
|
|
|
</P>
|
|
<P>
|
|
The <CODE>sb_size</CODE> function returns the number of bytes necessary to encode
information needed later if the file is reopened. The <CODE>sb_encode</CODE>
function encodes information from the file into buffer <VAR>buf</VAR>
allocated by the caller. It also writes an 8-character name (plus null
termination) into the <VAR>name</VAR> argument, which should be a unique
identification for the driver. The <CODE>sb_decode</CODE> function looks at
the <VAR>name</VAR>, decodes data from the buffer <VAR>buf</VAR>, and updates the
<VAR>file</VAR> argument with the new information.
|
|
</DL>
|
|
|
|
</P>
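<P>
As a rough, library-independent illustration of the size/encode/decode trio,
the sketch below stores a single 64-bit member size in little-endian order
under the driver name <CODE>DEMOFAM1</CODE>. Everything here is an assumption
made for illustration: the function names, the driver name, the byte layout,
and the simplified signatures (which take the value directly instead of an
<CODE>H5FD_t</CODE> handle) are not taken from any real driver.
</P>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_SB_SIZE 8            /* bytes needed for one 64-bit value */
#define DEMO_SB_NAME "DEMOFAM1"   /* hypothetical 8-character name     */

/* Number of bytes the caller must reserve for the encoded information. */
static size_t
demo_sb_size(void)
{
    return DEMO_SB_SIZE;
}

/* Encode memb_size into buf; name must have room for 9 bytes. */
static int
demo_sb_encode(uint64_t memb_size, char *name, unsigned char *buf)
{
    int i;

    assert(name && buf);
    memcpy(name, DEMO_SB_NAME, 9);                 /* 8 chars plus null */
    for (i = 0; i < DEMO_SB_SIZE; i++)
        buf[i] = (unsigned char)((memb_size >> (8 * i)) & 0xff);
    return 0;
}

/* Decode buf into *memb_size; fails if the name is not this driver's. */
static int
demo_sb_decode(uint64_t *memb_size, const char *name, const unsigned char *buf)
{
    int      i;
    uint64_t v = 0;

    assert(memb_size && name && buf);
    if (strcmp(name, DEMO_SB_NAME) != 0) return -1;
    for (i = DEMO_SB_SIZE - 1; i >= 0; i--)
        v = (v << 8) | buf[i];
    *memb_size = v;
    return 0;
}
```

<P>
Using a fixed byte order keeps the encoded information portable between
machines, and checking the name first lets a driver reject information that
was written by a different driver.
</P>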
|
|
<P>
|
|
The part of this which is somewhat tricky is that the file must be readable
|
|
before the superblock information is decoded. File access modes fall outside
|
|
the scope of the HDF5 file format, but they are placed inside the superblock
|
|
for convenience.<A NAME="DOCF8" HREF="#FOOT8">(8)</A>
|
|
|
|
</P>
|
|
<P>
|
|
<STRONG>Example:</STRONG> <EM>To be written later.</EM>
|
|
|
|
</P>
|
|
|
|
|
|
<H2><A NAME="SEC14" HREF="#TOC14">Address Space Functions</A></H2>
|
|
|
|
<P>
|
|
HDF5 does not assume that a file is a linear address space of bytes. Instead,
|
|
the library will call functions to allocate and free portions of the HDF5
|
|
format address space, which in turn map onto functions in the file driver to
|
|
allocate and free portions of file address space. The library tells the file
|
|
driver how much format address space it wants to allocate and the driver
|
|
decides what format address to use and how that format address is mapped onto
|
|
the file address space. Usually the format address is chosen so that the file
|
|
address can be calculated in constant time for data I/O operations (which are
|
|
always specified by format addresses).
|
|
|
|
</P>
|
|
|
|
|
|
|
|
<H3><A NAME="SEC15" HREF="#TOC15">Userblock and Superblock</A></H3>
|
|
|
|
<P>
|
|
The HDF5 format allows an optional userblock to appear before the actual HDF5
|
|
data in such a way that if the userblock is <STRONG>sucked out</STRONG> of the file and
|
|
everything remaining is shifted downward in the file address space, then the
|
|
file is still a valid HDF5 file. The userblock size can be zero or any
|
|
power of two greater than or equal to 512, and the file superblock begins
|
|
immediately after the userblock.
|
|
|
|
</P>
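<P>
The size rule can be checked with a few lines of C. The function name
<CODE>userblock_size_ok</CODE> is hypothetical, and the sketch assumes the
rule as the HDF5 library documents it for userblock sizes: zero, or a power
of two that is at least 512.
</P>

```c
#include <stdint.h>

/* Return 1 if size is an acceptable userblock size, 0 otherwise. */
static int
userblock_size_ok(uint64_t size)
{
    if (0 == size) return 1;              /* no userblock at all       */
    if (size < 512) return 0;             /* too small                 */
    return 0 == (size & (size - 1));      /* exactly one bit set       */
}
```

<P>
The expression <CODE>size &amp; (size - 1)</CODE> clears the lowest set bit,
so it is zero only when <CODE>size</CODE> has a single bit set.
</P>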
|
|
<P>
|
|
HDF5 allocates space for the userblock and superblock by calling an
|
|
allocation function defined below, which must return a chunk of memory at
|
|
format address zero on the first call.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC16" HREF="#TOC16">Allocation of Format Regions</A></H3>
|
|
|
|
<P>
|
|
The library makes many types of allocation requests:
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT><CODE>H5FD_MEM_SUPER</CODE>
|
|
<DD>
|
|
An allocation request for the userblock and/or superblock.
|
|
<DT><CODE>H5FD_MEM_BTREE</CODE>
|
|
<DD>
|
|
An allocation request for a node of a B-tree.
|
|
<DT><CODE>H5FD_MEM_DRAW</CODE>
|
|
<DD>
|
|
An allocation request for the raw data of a dataset.
|
|
<DT><CODE>H5FD_MEM_META</CODE>
|
|
<DD>
|
|
An allocation request for the raw data of a dataset which
|
|
the user has indicated will be relatively small.
|
|
<DT><CODE>H5FD_MEM_GROUP</CODE>
|
|
<DD>
|
|
An allocation request for a group leaf node (internal nodes of the group tree
|
|
are allocated as <CODE>H5FD_MEM_BTREE</CODE>).
|
|
<DT><CODE>H5FD_MEM_GHEAP</CODE>
|
|
<DD>
|
|
An allocation request for a global heap collection. Global heaps are used to
|
|
store certain types of references such as dataset region references. The set
|
|
of all global heap collections can become quite large.
|
|
<DT><CODE>H5FD_MEM_LHEAP</CODE>
|
|
<DD>
|
|
An allocation request for a local heap. Local heaps are used to store the
|
|
names which are members of a group. The combined size of all local heaps is a
|
|
function of the number of object names in the file.
|
|
<DT><CODE>H5FD_MEM_OHDR</CODE>
|
|
<DD>
|
|
An allocation request for (part of) an object header. Object headers are
|
|
relatively small and include meta information about objects (like the data
|
|
space and type of a dataset) and attributes.
|
|
</DL>
|
|
|
|
<P>
|
|
When a chunk of memory is freed the library adds it to a free list and
|
|
allocation requests are satisfied from the free list before requesting memory
|
|
from the file driver. Each type of allocation request enumerated above has its
|
|
own free list, but the file driver can specify that certain object types can
|
|
share a free list. It does so by providing an array which maps a request type
|
|
to a free list. If any value of the map is <CODE>H5FD_MEM_DEFAULT</CODE> (zero) then the
|
|
object's own free list is used. The special value <CODE>H5FD_MEM_NOLIST</CODE> indicates
|
|
that the library should not attempt to maintain a free list for that
|
|
particular object type, instead calling the file driver each time an object of
|
|
that type is freed.
|
|
|
|
</P>
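<P>
The lookup described above can be sketched as follows. The enum is a
hypothetical stand-in for the request types and the two special map values; it
only assumes that the default value is zero and that the no-list value is
distinct from every request type.
</P>

```c
/* Hypothetical stand-ins for the request types and special map values. */
typedef enum {
    MEM_NOLIST  = -1,  /* keep no free list: pass frees to the driver */
    MEM_DEFAULT = 0,   /* use the request type's own free list        */
    MEM_SUPER, MEM_BTREE, MEM_DRAW, MEM_GHEAP, MEM_LHEAP, MEM_OHDR,
    MEM_NTYPES
} mem_t;

/* Resolve which free list serves a request of the given type.
 * type must be a real request type (MEM_SUPER .. MEM_OHDR).
 * Returns MEM_NOLIST when no free list should be kept for it. */
static mem_t
resolve_free_list(const mem_t map[MEM_NTYPES], mem_t type)
{
    mem_t mapped = map[type];

    if (mapped == MEM_DEFAULT)
        return type;   /* the type's own list */
    return mapped;     /* a shared list, or MEM_NOLIST */
}
```

<P>
A map whose entries are all zero therefore gives every request type its own
free list, while pointing several entries at one type makes them share a list.
</P>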
|
|
<P>
|
|
Mappings predefined in the <TT>`H5FDpublic.h'</TT> file are:
|
|
<DL COMPACT>
|
|
|
|
<DT><CODE>H5FD_FLMAP_SINGLE</CODE>
|
|
<DD>
|
|
All memory usage types are mapped to a single free list.
|
|
<DT><CODE>H5FD_FLMAP_DICHOTOMY</CODE>
|
|
<DD>
|
|
Memory usage is segregated into meta data and raw data for the purposes of
|
|
memory management.
|
|
<DT><CODE>H5FD_FLMAP_DEFAULT</CODE>
|
|
<DD>
|
|
Each memory usage type has its own free list.
|
|
</DL>
|
|
|
|
<P>
|
|
<STRONG>Example:</STRONG> To make a map that manages object headers on one free list
|
|
and everything else on another free list one might initialize the map with the
|
|
following code: (the use of <CODE>H5FD_MEM_SUPER</CODE> is arbitrary)
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
H5FD_mem_t mt, map[H5FD_MEM_NTYPES];
|
|
|
|
for (mt=0; mt<H5FD_MEM_NTYPES; mt++) {
|
|
map[mt] = (H5FD_MEM_OHDR==mt) ? mt : H5FD_MEM_SUPER;
|
|
}
|
|
</PRE>
|
|
|
|
<P>
|
|
If an allocation request cannot be satisfied from the free list then one of
|
|
two things happens. If the driver defines an allocation callback then it is
|
|
used to allocate space; otherwise new memory is allocated from the end of the
|
|
format address space by incrementing the end-of-address marker.
|
|
|
|
</P>
|
|
<P>
|
|
<DL>
|
|
<DT><U>Function:</U> static haddr_t <B>alloc</B> <I>(H5FD_t *<VAR>file</VAR>, H5FD_mem_t <VAR>type</VAR>, hsize_t <VAR>size</VAR>)</I>
|
|
<DD><A NAME="IDX7"></A>
|
|
|
|
</P>
|
|
<P>
|
|
The <VAR>file</VAR> argument is the file from which space is to be allocated,
|
|
<VAR>type</VAR> is the type of memory being requested (from the list above) without
|
|
being mapped according to the freelist map and <VAR>size</VAR> is the number of
|
|
bytes being requested. The library is allowed to allocate large chunks of
|
|
storage and manage them in a layer above the file driver (although the current
|
|
library doesn't do that). The allocation function should return a format
|
|
address for the first byte allocated. The allocated region extends from that
|
|
address for <VAR>size</VAR> bytes. If the request cannot be honored then the
|
|
undefined address value is returned (<CODE>HADDR_UNDEF</CODE>). The first call to
|
|
this function for a file which has never had memory allocated <EM>must</EM>
|
|
return a format address of zero or <CODE>HADDR_UNDEF</CODE> since this is how the
|
|
library allocates space for the userblock and/or superblock.
|
|
</DL>
|
|
|
|
</P>
|
|
|
|
<P>
|
|
<STRONG>Example:</STRONG> <EM>To be written later.</EM>
|
|
|
|
</P>
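<P>
A minimal sketch of an <CODE>alloc</CODE> callback, assuming a driver that
simply hands out space at the end of the format address space, might look like
this. The <CODE>my_</CODE> types and <CODE>MY_HADDR_UNDEF</CODE> are
hypothetical stand-ins so that the fragment compiles without the HDF5 headers;
the <CODE>maxaddr</CODE> field is likewise an assumption.
</P>

```c
#include <stdint.h>

/* Hypothetical stand-ins for haddr_t, hsize_t and HADDR_UNDEF. */
typedef uint64_t my_haddr_t;
typedef uint64_t my_hsize_t;
#define MY_HADDR_UNDEF ((my_haddr_t)(-1))

typedef struct {
    my_haddr_t eoa;      /* first format address not yet allocated */
    my_haddr_t maxaddr;  /* largest EOA this driver can represent  */
} my_file_t;

/* Allocate size bytes at the current end of the format address space.
 * The request type is ignored here; a real driver might use it to choose
 * a region or a member file. Assumes file->eoa <= file->maxaddr. */
static my_haddr_t
my_alloc(my_file_t *file, int type, my_hsize_t size)
{
    my_haddr_t addr = file->eoa;

    (void)type;
    if (size > file->maxaddr - addr)  /* request would run past maxaddr */
        return MY_HADDR_UNDEF;
    file->eoa = addr + size;
    return addr;
}
```

<P>
On a file whose EOA marker is still zero the first call returns format address
zero, which is what the library needs for the userblock and/or superblock.
</P>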
|
|
|
|
|
|
<H3><A NAME="SEC17" HREF="#TOC17">Freeing Format Regions</A></H3>
|
|
|
|
<P>
|
|
When the library is finished using a certain region of the format address
|
|
space it will return the space to the free list according to the type of
|
|
memory being freed and the free list map described above. If the free list has
|
|
been disabled for a particular memory usage type (according to the free list
|
|
map) and the driver defines a <CODE>free</CODE> callback then it will be
|
|
invoked. The <CODE>free</CODE> callback is also invoked for all entries on the free
|
|
list when the file is closed.
|
|
|
|
</P>
|
|
<P>
|
|
<DL>
|
|
<DT><U>Function:</U> static herr_t <B>free</B> <I>(H5FD_t *<VAR>file</VAR>, H5FD_mem_t <VAR>type</VAR>, haddr_t <VAR>addr</VAR>, hsize_t <VAR>size</VAR>)</I>
|
|
<DD><A NAME="IDX8"></A>
|
|
|
|
</P>
|
|
<P>
|
|
The <VAR>file</VAR> argument is the file for which space is being freed; <VAR>type</VAR>
|
|
is the type of object being freed (from the list above) without being mapped
|
|
according to the freelist map; <VAR>addr</VAR> is the first format address to free;
|
|
and <VAR>size</VAR> is the size in bytes of the region being freed. The region
|
|
being freed may refer to just part of the region originally allocated and/or
|
|
may cross allocation boundaries provided all regions being freed have the same
|
|
usage type. However, the library will never attempt to free regions which have
|
|
already been freed or which have never been allocated.
|
|
</DL>
|
|
|
|
</P>
|
|
<P>
|
|
A driver may choose to not define the <CODE>free</CODE> function, in which case
|
|
format addresses will be leaked. This isn't normally a huge problem since the
|
|
library contains a simple free list of its own and freeing parts of the format
|
|
address space is not a common occurrence.
|
|
|
|
</P>
|
|
<P>
|
|
<STRONG>Example:</STRONG> <EM>To be written later.</EM>
|
|
|
|
</P>
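<P>
A minimal sketch of a <CODE>free</CODE> callback, assuming a driver that can
only reclaim space at the very end of the format address space, is shown
below. The <CODE>my_</CODE> names are hypothetical stand-ins, and leaving every
other region allocated is a simplification chosen for the example.
</P>

```c
#include <stdint.h>

/* Hypothetical stand-ins for haddr_t and hsize_t. */
typedef uint64_t my_haddr_t;
typedef uint64_t my_hsize_t;

typedef struct {
    my_haddr_t eoa;  /* first format address not yet allocated */
} my_file_t;

/* Release the region [addr, addr+size). Only a region that ends exactly at
 * the EOA marker is reclaimed; any other region stays allocated (leaked).
 * Returns 0 on success, -1 if the region lies outside allocated space. */
static int
my_free(my_file_t *file, int type, my_haddr_t addr, my_hsize_t size)
{
    (void)type;
    if (size > file->eoa || addr > file->eoa - size)
        return -1;
    if (addr + size == file->eoa)
        file->eoa = addr;  /* shrink the format address space */
    return 0;
}
```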
|
|
|
|
|
|
<H3><A NAME="SEC18" HREF="#TOC18">Querying Address Range</A></H3>
|
|
|
|
<P>
|
|
Each file driver must have some mechanism for setting and querying the end of
|
|
address, or <STRONG>EOA</STRONG>, marker. The EOA marker is the first format address
|
|
after the last format address ever allocated. If the last part of the
|
|
allocated address range is freed then the driver may optionally decrease the
|
|
EOA marker.
|
|
|
|
</P>
|
|
<P>
|
|
<DL>
|
|
<DT><U>Function:</U> static haddr_t <B>get_eoa</B> <I>(H5FD_t *<VAR>file</VAR>)</I>
|
|
<DD><A NAME="IDX9"></A>
|
|
|
|
</P>
|
|
<P>
|
|
This function returns the current value of the EOA marker for the specified
|
|
file.
|
|
</DL>
|
|
|
|
</P>
|
|
<P>
|
|
<STRONG>Example:</STRONG> The sec2 driver just returns the current EOA marker value
|
|
which is cached in the file structure:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
static haddr_t
|
|
H5FD_sec2_get_eoa(H5FD_t *_file)
|
|
{
|
|
H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
|
|
return file->eoa;
|
|
}
|
|
</PRE>
|
|
|
|
<P>
|
|
The EOA marker is initially zero when a file is opened and the library may set
|
|
it to some other value shortly after the file is opened (after the superblock
|
|
is read and the saved EOA marker is determined) or when allocating additional
|
|
memory in the absence of an <CODE>alloc</CODE> callback (described above).
|
|
|
|
</P>
|
|
<P>
|
|
<STRONG>Example:</STRONG> The sec2 driver simply caches the EOA marker in the file
|
|
structure and does not extend the underlying Unix file. When the file is
|
|
flushed or closed then the Unix file size is extended to match the EOA marker.
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
static herr_t
|
|
H5FD_sec2_set_eoa(H5FD_t *_file, haddr_t addr)
|
|
{
|
|
H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
|
|
file->eoa = addr;
|
|
return 0;
|
|
}
|
|
</PRE>
|
|
|
|
|
|
|
|
<H2><A NAME="SEC19" HREF="#TOC19">Data Functions</A></H2>
|
|
|
|
<P>
|
|
These functions operate on data, transferring a region of the format address
|
|
space between memory and files.
|
|
|
|
</P>
|
|
|
|
|
|
|
|
<H3><A NAME="SEC20" HREF="#TOC20">Contiguous I/O Functions</A></H3>
|
|
|
|
<P>
|
|
A driver must specify two functions to transfer data from the library to the
|
|
file and vice versa.
|
|
|
|
</P>
|
|
<P>
|
|
<DL>
|
|
<DT><U>Function:</U> static herr_t <B>read</B> <I>(H5FD_t *<VAR>file</VAR>, H5FD_mem_t <VAR>type</VAR>, hid_t <VAR>dxpl</VAR>, haddr_t <VAR>addr</VAR>, hsize_t <VAR>size</VAR>, void *<VAR>buf</VAR>)</I>
|
|
<DD><A NAME="IDX10"></A>
|
|
<DT><U>Function:</U> static herr_t <B>write</B> <I>(H5FD_t *<VAR>file</VAR>, H5FD_mem_t <VAR>type</VAR>, hid_t <VAR>dxpl</VAR>, haddr_t <VAR>addr</VAR>, hsize_t <VAR>size</VAR>, const void *<VAR>buf</VAR>)</I>
|
|
<DD><A NAME="IDX11"></A>
|
|
|
|
</P>
|
|
<P>
|
|
The <CODE>read</CODE> function reads data from file <VAR>file</VAR> beginning at address
|
|
<VAR>addr</VAR> and continuing for <VAR>size</VAR> bytes into the buffer <VAR>buf</VAR>
|
|
supplied by the caller. The <CODE>write</CODE> function transfers data in the
|
|
opposite direction. Both functions take a data transfer property list
|
|
<VAR>dxpl</VAR> which indicates the fine points of how the data is to be
|
|
transferred and which comes directly from the <CODE>H5Dread</CODE> or
|
|
<CODE>H5Dwrite</CODE> function. Both functions also receive the <VAR>type</VAR> of
|
|
data being transferred, which may allow a driver to tune its behavior for
|
|
different kinds of data.
|
|
</DL>
|
|
|
|
</P>
|
|
<P>
|
|
Both functions should return a negative value if they fail to transfer the
|
|
requested data, or non-negative if they succeed. The library will never
|
|
attempt to read from unallocated regions of the format address space.
|
|
|
|
</P>
|
|
<P>
|
|
<STRONG>Example:</STRONG> The sec2 driver just makes system calls. It tries not to
|
|
call <CODE>lseek</CODE> if the current operation is the same as the previous
|
|
operation and the file position is correct. It also fills the output buffer
|
|
with zeros when reading between the current EOF and EOA markers and restarts
|
|
system calls which were interrupted.
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
static herr_t
|
|
H5FD_sec2_read(H5FD_t *_file, H5FD_mem_t type/*unused*/, hid_t dxpl_id/*unused*/,
|
|
haddr_t addr, hsize_t size, void *buf/*out*/)
|
|
{
|
|
H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
|
|
ssize_t nbytes;
|
|
|
|
assert(file && file->pub.cls);
|
|
assert(buf);
|
|
|
|
/* Check for overflow conditions */
|
|
if (REGION_OVERFLOW(addr, size)) return -1;
|
|
if (addr+size>file->eoa) return -1;
|
|
|
|
/* Seek to the correct location */
|
|
if ((addr!=file->pos || OP_READ!=file->op) &&
|
|
file_seek(file->fd, (file_offset_t)addr, SEEK_SET)<0) {
|
|
file->pos = HADDR_UNDEF;
|
|
file->op = OP_UNKNOWN;
|
|
return -1;
|
|
}
|
|
|
|
/*
|
|
* Read data, being careful of interrupted system calls, partial results,
|
|
* and the end of the file.
|
|
*/
|
|
while (size>0) {
|
|
do nbytes = read(file->fd, buf, size);
|
|
while (-1==nbytes && EINTR==errno);
|
|
if (-1==nbytes) {
|
|
/* error */
|
|
file->pos = HADDR_UNDEF;
|
|
file->op = OP_UNKNOWN;
|
|
return -1;
|
|
}
|
|
if (0==nbytes) {
|
|
/* end of file but not end of format address space */
|
|
memset(buf, 0, size);
|
|
size = 0;
|
|
}
|
|
assert(nbytes>=0);
|
|
assert((hsize_t)nbytes<=size);
|
|
size -= (hsize_t)nbytes;
|
|
addr += (haddr_t)nbytes;
|
|
buf = (char*)buf + nbytes;
|
|
}
|
|
|
|
/* Update current position */
|
|
file->pos = addr;
|
|
file->op = OP_READ;
|
|
return 0;
|
|
}
|
|
</PRE>
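<P>
The <CODE>REGION_OVERFLOW</CODE> macro used in the listing is not defined in
this document. One plausible reading, assuming unsigned 64-bit addresses with
the all-ones value reserved as the undefined address, is sketched below; the
names are hypothetical and the real macro may test more than this.
</P>

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for haddr_t, hsize_t and HADDR_UNDEF. */
typedef uint64_t my_haddr_t;
typedef uint64_t my_hsize_t;
#define MY_HADDR_UNDEF ((my_haddr_t)(-1))

/* True when [addr, addr+size) is not a usable range: the start is the
 * undefined address, or the end would wrap around or land on the
 * undefined-address sentinel. */
static bool
region_overflows(my_haddr_t addr, my_hsize_t size)
{
    if (addr == MY_HADDR_UNDEF)
        return true;
    return size >= MY_HADDR_UNDEF - addr;  /* addr + size reaches the sentinel */
}
```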
|
|
|
|
<P>
|
|
<STRONG>Example:</STRONG> The sec2 <CODE>write</CODE> callback is similar except it updates
|
|
the file EOF marker when extending the file.
|
|
|
|
</P>
|
|
|
|
|
|
<H3><A NAME="SEC21" HREF="#TOC21">Flushing Cached Data</A></H3>
|
|
|
|
<P>
|
|
Some drivers may desire to cache data in memory in order to make larger I/O
|
|
requests to the underlying file and thus improve bandwidth. Such drivers
|
|
should register a cache flushing function so that the library can ensure that
|
|
data has been flushed out of the drivers in response to the application
|
|
calling <CODE>H5Fflush</CODE>.
|
|
|
|
</P>
|
|
<P>
|
|
<DL>
|
|
<DT><U>Function:</U> static herr_t <B>flush</B> <I>(H5FD_t *<VAR>file</VAR>)</I>
|
|
<DD><A NAME="IDX12"></A>
|
|
|
|
</P>
|
|
<P>
|
|
Flush all data for file <VAR>file</VAR> to storage.
|
|
</DL>
|
|
|
|
</P>
|
|
<P>
|
|
<STRONG>Example:</STRONG> The sec2 driver doesn't cache any data but it also doesn't
|
|
extend the Unix file as aggressively as it should. Therefore, when finalizing a
|
|
file it should write a zero to the last byte of the allocated region so that
|
|
when reopening the file later the EOF marker will be at least as large as the
|
|
EOA marker saved in the superblock (otherwise HDF5 will refuse to open the
|
|
file, claiming that the data appears to be truncated).
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
static herr_t
|
|
H5FD_sec2_flush(H5FD_t *_file)
|
|
{
|
|
H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
|
|
|
|
if (file->eoa>file->eof) {
|
|
if (-1==file_seek(file->fd, file->eoa-1, SEEK_SET)) return -1;
|
|
if (write(file->fd, "", 1)!=1) return -1;
|
|
file->eof = file->eoa;
|
|
file->pos = file->eoa;
|
|
file->op = OP_WRITE;
|
|
}
|
|
|
|
return 0;
|
|
}
|
|
</PRE>
|
|
|
|
|
|
|
|
<H2><A NAME="SEC22" HREF="#TOC22">Optimization Functions</A></H2>
|
|
|
|
<P>
|
|
The library is capable of performing several generic optimizations on I/O, but
|
|
these types of optimizations may not be appropriate for a given VFL driver.
|
|
</P>
|
|
|
|
<P>
|
|
Each driver may provide a query function to allow the library to query whether
|
|
to enable these optimizations. If a driver lacks a query function, the library
|
|
will disable all types of optimizations which can be queried.
|
|
</P>
|
|
|
|
<P>
|
|
<DL>
|
|
<DT><U>Function:</U> static herr_t <B>query</B> <I>(const H5FD_t *<VAR>file</VAR>, unsigned long *<VAR>flags</VAR>)</I>
|
|
<DD><A NAME="IDX17"></A>
|
|
</P>
|
|
<P>
|
|
This function is called by the library to query which optimizations to enable
|
|
for I/O to this driver. These are the flags which are currently defined:
|
|
|
|
<UL>
|
|
<DL>
|
|
<DT>H5FD_FEAT_AGGREGATE_METADATA (0x00000001)
|
|
<DD>Setting the H5FD_FEAT_AGGREGATE_METADATA flag for a VFL driver means that
|
|
the library will attempt to allocate a larger block for metadata and
|
|
then sub-allocate each metadata request from that larger block.
|
|
<DT>H5FD_FEAT_ACCUMULATE_METADATA (0x00000002)
|
|
<DD>Setting the H5FD_FEAT_ACCUMULATE_METADATA flag for a VFL driver means that
|
|
the library will attempt to cache metadata as it is written to the file
|
|
and build up a larger block of metadata to eventually pass to the VFL
|
|
'write' routine.
|
|
<DT>H5FD_FEAT_DATA_SIEVE (0x00000004)
|
|
<DD>Setting the H5FD_FEAT_DATA_SIEVE flag for a VFL driver means that
|
|
the library will attempt to cache raw data as it is read from/written to
|
|
a file in a "data sieve" buffer. See Rajeev Thakur's papers:
|
|
<UL>
|
|
<DL>
|
|
<DT>http://www.mcs.anl.gov/~thakur/papers/romio-coll.ps.gz
|
|
<DT>http://www.mcs.anl.gov/~thakur/papers/mpio-high-perf.ps.gz
|
|
</DL>
|
|
</UL>
|
|
</DL>
|
|
</UL>
|
|
</P>
|
|
|
|
</DL>
|
|
</P>
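<P>
A query callback can be very small. The sketch below assumes a driver backed
by an ordinary local file that is content to enable all three optimizations.
The flag macros repeat the values listed above under hypothetical local names,
and <CODE>my_query</CODE> takes a <CODE>const void *</CODE> in place of the
real file type so that the fragment compiles on its own.
</P>

```c
#include <stddef.h>

/* Flag values as listed above, under hypothetical local names;
 * a real driver would take the H5FD_FEAT_* names from the HDF5 headers. */
#define FEAT_AGGREGATE_METADATA  0x00000001UL
#define FEAT_ACCUMULATE_METADATA 0x00000002UL
#define FEAT_DATA_SIEVE          0x00000004UL

/* Report which library optimizations this driver accepts. The file
 * argument is unused because the answer does not depend on per-file
 * state. Returns a non-negative value on success. */
static int
my_query(const void *file, unsigned long *flags)
{
    (void)file;
    if (flags != NULL) {
        *flags = 0;
        *flags |= FEAT_AGGREGATE_METADATA;
        *flags |= FEAT_ACCUMULATE_METADATA;
        *flags |= FEAT_DATA_SIEVE;
    }
    return 0;
}
```

<P>
A driver for which one of these optimizations is inappropriate would simply
leave the corresponding bit clear.
</P>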
|
|
|
|
<H2><A NAME="SEC23" HREF="#TOC23">Registration of a Driver</A></H2>
|
|
|
|
<P>
|
|
Before a driver can be used the HDF5 library needs to be told of its
|
|
existence. This is done by registering the driver, which results in a driver
|
|
identification number. Instead of passing many arguments to the registration
|
|
function, the driver information is entered into a structure and the address
|
|
of the structure is passed to the registration function where it is
|
|
copied. This allows the HDF5 API to be extended while providing backward
|
|
compatibility at the source level.
|
|
|
|
</P>
|
|
<P>
|
|
<DL>
|
|
<DT><U>Function:</U> hid_t <B>H5FDregister</B> <I>(H5FD_class_t *<VAR>cls</VAR>)</I>
|
|
<DD><A NAME="IDX13"></A>
|
|
|
|
</P>
|
|
<P>
|
|
The driver described by struct <VAR>cls</VAR> is registered with the library and an
|
|
ID number for the driver is returned.
|
|
</DL>
|
|
|
|
</P>
|
|
<P>
|
|
The <CODE>H5FD_class_t</CODE> type is a struct with the following fields:
|
|
|
|
</P>
|
|
<DL COMPACT>
|
|
|
|
<DT><CODE>const char *name</CODE>
|
|
<DD>
|
|
A pointer to a constant, null-terminated driver name to be used for debugging
|
|
purposes.
|
|
<DT><CODE>size_t fapl_size</CODE>
|
|
<DD>
|
|
The size in bytes of the file access mode structure or zero if the driver
|
|
supplies a copy function or doesn't define the structure.
|
|
<DT><CODE>void *(*fapl_copy)(const void *fapl)</CODE>
|
|
<DD>
|
|
An optional function which copies a driver-defined file access mode structure.
|
|
This field takes precedence over <CODE>fapl_size</CODE> when both are defined.
|
|
<DT><CODE>void (*fapl_free)(void *fapl)</CODE>
|
|
<DD>
|
|
An optional function to free the driver-defined file access mode structure. If
|
|
null, then the library calls the C <CODE>free</CODE> function to free the
|
|
structure.
|
|
<DT><CODE>size_t dxpl_size</CODE>
|
|
<DD>
|
|
The size in bytes of the data transfer mode structure or zero if the driver
|
|
supplies a copy function or doesn't define the structure.
|
|
<DT><CODE>void *(*dxpl_copy)(const void *dxpl)</CODE>
|
|
<DD>
|
|
An optional function which copies a driver-defined data transfer mode
|
|
structure. This field takes precedence over <CODE>dxpl_size</CODE> when both are
|
|
defined.
|
|
<DT><CODE>void (*dxpl_free)(void *dxpl)</CODE>
|
|
<DD>
|
|
An optional function to free the driver-defined data transfer mode
|
|
structure. If null, then the library calls the C <CODE>free</CODE> function to
|
|
free the structure.
|
|
<DT><CODE>H5FD_t *(*open)(const char *name, unsigned flags, hid_t fapl, haddr_t maxaddr)</CODE>
|
|
<DD>
|
|
The function which opens or creates a new file.
|
|
<DT><CODE>herr_t (*close)(H5FD_t *file)</CODE>
|
|
<DD>
|
|
The function which ends access to a file.
|
|
<DT><CODE>int (*cmp)(const H5FD_t *f1, const H5FD_t *f2)</CODE>
|
|
<DD>
|
|
An optional function to determine whether two open files have the same key. If
|
|
this function is not present then the library assumes that two files will
|
|
never be the same.
|
|
<DT><CODE>herr_t (*query)(const H5FD_t *f, unsigned long *flags)</CODE>
|
|
<DD>
|
|
An optional function to determine which library optimizations a driver can
|
|
support.
|
|
<DT><CODE>haddr_t (*alloc)(H5FD_t *file, H5FD_mem_t type, hsize_t size)</CODE>
|
|
<DD>
|
|
An optional function to allocate space in the file.
|
|
<DT><CODE>herr_t (*free)(H5FD_t *file, H5FD_mem_t type, haddr_t addr, hsize_t size)</CODE>
|
|
<DD>
|
|
An optional function to free space in the file.
|
|
<DT><CODE>haddr_t (*get_eoa)(H5FD_t *file)</CODE>
|
|
<DD>
|
|
A function to query how much of the format address space has been allocated.
|
|
<DT><CODE>herr_t (*set_eoa)(H5FD_t *file, haddr_t addr)</CODE>
|
|
<DD>
|
|
A function to set the end of address space.
|
|
<DT><CODE>haddr_t (*get_eof)(H5FD_t *file)</CODE>
|
|
<DD>
|
|
A function to return the current end-of-file marker value.
|
|
<DT><CODE>herr_t (*read)(H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, void *buffer)</CODE>
|
|
<DD>
|
|
A function to read data from a file.
|
|
<DT><CODE>herr_t (*write)(H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, const void *buffer)</CODE>
|
|
<DD>
|
|
A function to write data to a file.
|
|
<DT><CODE>herr_t (*flush)(H5FD_t *file)</CODE>
|
|
<DD>
|
|
A function which flushes cached data to the file.
|
|
<DT><CODE>H5FD_mem_t fl_map[H5FD_MEM_NTYPES]</CODE>
|
|
<DD>
|
|
An array which maps a file allocation request type to a free list.
|
|
</DL>
|
|
|
|
<P>
|
|
<STRONG>Example:</STRONG> The sec2 driver would be registered as:
|
|
|
|
</P>
|
|
|
|
<PRE>
|
|
static const H5FD_class_t H5FD_sec2_g = {
|
|
"sec2", /*name */
|
|
MAXADDR, /*maxaddr */
|
|
NULL, /*sb_size */
|
|
NULL, /*sb_encode */
|
|
NULL, /*sb_decode */
|
|
0, /*fapl_size */
|
|
NULL, /*fapl_get */
|
|
NULL, /*fapl_copy */
|
|
NULL, /*fapl_free */
|
|
0, /*dxpl_size */
|
|
NULL, /*dxpl_copy */
|
|
NULL, /*dxpl_free */
|
|
H5FD_sec2_open, /*open */
|
|
H5FD_sec2_close, /*close */
|
|
H5FD_sec2_cmp, /*cmp */
|
|
H5FD_sec2_query, /*query */
|
|
NULL, /*alloc */
|
|
NULL, /*free */
|
|
H5FD_sec2_get_eoa, /*get_eoa */
|
|
H5FD_sec2_set_eoa, /*set_eoa */
|
|
H5FD_sec2_get_eof, /*get_eof */
|
|
H5FD_sec2_read, /*read */
|
|
H5FD_sec2_write, /*write */
|
|
H5FD_sec2_flush, /*flush */
|
|
H5FD_FLMAP_SINGLE, /*fl_map */
|
|
};
|
|
|
|
hid_t
|
|
H5FD_sec2_init(void)
|
|
{
|
|
if (!H5FD_SEC2_g) {
|
|
H5FD_SEC2_g = H5FDregister(&H5FD_sec2_g);
|
|
}
|
|
return H5FD_SEC2_g;
|
|
}
|
|
</PRE>
|
|
|
|
<P>
|
|
A driver can be removed from the library by unregistering it:
|
|
|
|
</P>
|
|
<P>
|
|
<DL>
|
|
<DT><U>Function:</U> herr_t <B>H5FDunregister</B> <I>(hid_t <VAR>driver</VAR>)</I>
|
|
<DD><A NAME="IDX14"></A>
|
|
Where <VAR>driver</VAR> is the ID number returned when the driver was registered.
|
|
</DL>
|
|
|
|
</P>
|
|
<P>
|
|
Unregistering a driver makes it unusable for creating new file access or data
|
|
transfer property lists but doesn't affect any property lists or files that
|
|
already use that driver.
|
|
|
|
</P>
|
|
|
|
|
|
|
|
|
|
<H3><A NAME="SECProgNote" HREF="#TOCProgNote">Programming Note
|
|
for C++ Developers Using C Functions</A></H3>
|
|
|
|
<p>If a C routine that takes a function pointer as an argument is
|
|
called from within C++ code, the C routine should be allowed to return
|
|
normally. </p>
|
|
|
|
<p>Examples of this kind of routine include callbacks such as
|
|
<code>H5Pset_elink_cb</code> and <code>H5Pset_type_conv_cb</code>
|
|
and functions such as <code>H5Tconvert</code> and
|
|
<code>H5Ewalk2</code>.</p>
|
|
|
|
<p>Exiting the routine in its normal fashion allows the HDF5 C
|
|
Library to clean up its work properly. In other words, if the C++
|
|
application jumps out of the routine back to the C++
|
|
“catch” statement, the library is not given the
|
|
opportunity to close any temporary data structures that were set
|
|
up when the routine was called. The C++ application should save
|
|
some state as the routine is started so that any problem that
|
|
occurs might be diagnosed.</p>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<H2><A NAME="SEC24" HREF="#TOC24">Querying Driver Information</A></H2>
|
|
|
|
<P>
|
|
<DL>
|
|
<DT><U>Function:</U> void * <B>H5Pget_driver_data</B> <I>(hid_t <VAR>fapl</VAR>)</I>
|
|
<DD><A NAME="IDX15"></A>
|
|
<DT><U>Function:</U> void * <B>H5Pget_driver_data</B> <I>(hid_t <VAR>fxpl</VAR>)</I>
|
|
<DD><A NAME="IDX16"></A>
|
|
|
|
</P>
|
|
<P>
|
|
This function is intended to be used by driver functions, not applications.
|
|
It returns a pointer directly into the file access property list
|
|
<CODE><VAR>fapl</VAR></CODE> which is a copy of the driver's file access mode originally
|
|
provided to the <CODE>H5Pset_driver</CODE> function. If its argument is a data
|
|
transfer property list <CODE>fxpl</CODE> then it returns a pointer to the
|
|
driver-specific data transfer information instead.
|
|
</DL>
|
|
|
|
</P>
|
|
|
|
|
|
|
|
<H1><A NAME="SEC25" HREF="#TOC25">Miscellaneous</A></H1>
|
|
|
|
<P>
|
|
The various private <CODE>H5F_low_*</CODE> functions will be replaced by public
|
|
<CODE>H5FD*</CODE> functions so they can be called from drivers.
|
|
|
|
</P>
|
|
<P>
|
|
All private functions <CODE>H5F_addr_*</CODE> which operate on addresses will be
|
|
renamed as public functions by removing the first underscore so they can be
|
|
called by drivers.
|
|
|
|
</P>
|
|
<P>
|
|
The <CODE>haddr_t</CODE> address data type will be passed by value throughout the
|
|
library. The original intent was that this type would eventually be a union of
|
|
file address types for the various drivers and may become quite large, but
|
|
that was back when drivers were part of HDF5. It will become an alias for an
|
|
unsigned integer type (32 or 64 bits depending on how the library was
|
|
configured).
|
|
|
|
</P>
|
|
<P>
|
|
The various <CODE>H5F*.c</CODE> driver files will be renamed <CODE>H5FD*.c</CODE> and each
|
|
will have a corresponding header file. All driver functions except the
|
|
initializer and API will be declared static.
|
|
|
|
</P>
|
|
<P>
|
|
This documentation didn't cover optimization functions which would be useful
|
|
to drivers like MPI-IO. Some drivers may be able to perform data pipeline
|
|
operations more efficiently than HDF5 and need to be given a chance to
|
|
override those parts of the pipeline. The pipeline would be designed to call
|
|
various H5FD optimization functions at various points which return one of
|
|
three values: the operation is not implemented by the driver, the operation is
|
|
implemented but failed in a non-recoverable manner, the operation is
|
|
implemented and succeeded.
|
|
|
|
</P>
|
|
<P>
|
|
Various parts of HDF5 check only the top-level file driver and do
|
|
something special if it is the MPI-IO driver. However, we might want to be
|
|
able to put the MPI-IO driver under other drivers such as the raw part of a
|
|
split driver or under a debug driver whose sole purpose is to accumulate
|
|
statistics as it passes all requests through to the MPI-IO driver. Therefore
|
|
we will probably need a function which takes a format address and/or object
|
|
type and returns the driver which would have been used at the lowest level to
|
|
process the request.
|
|
|
|
</P>
|
|
|
|
<P><HR><P>
|
|
<H1>Footnotes</H1>
|
|
<H3><A NAME="FOOT1" HREF="#DOCF1">(1)</A></H3>
|
|
<P>The driver name is by convention and might
|
|
not apply to drivers which are not distributed with HDF5.
|
|
<H3><A NAME="FOOT2" HREF="#DOCF2">(2)</A></H3>
|
|
<P>The access method also indicates how to translate
|
|
the storage name to a storage server such as a file, network protocol, or
|
|
memory.
|
|
<H3><A NAME="FOOT3" HREF="#DOCF3">(3)</A></H3>
|
|
<P>The term
|
|
"<EM>file</EM> access property list" is a misnomer since storage isn't
|
|
required to be a file.
|
|
<H3><A NAME="FOOT4" HREF="#DOCF4">(4)</A></H3>
|
|
<P>This
|
|
function is overloaded to operate on data transfer property lists also, as
|
|
described below.
|
|
<H3><A NAME="FOOT5" HREF="#DOCF5">(5)</A></H3>
|
|
<P>Read-only access is only appropriate when opening an existing
|
|
file.
|
|
<H3><A NAME="FOOT6" HREF="#DOCF6">(6)</A></H3>
|
|
<P>For instance, writing data to one handle will cause
|
|
the data to be immediately visible on the other handle.
|
|
<H3><A NAME="FOOT7" HREF="#DOCF7">(7)</A></H3>
|
|
<P>The ordering is
|
|
arbitrary as long as it's consistent within a particular file driver.
|
|
<H3><A NAME="FOOT8" HREF="#DOCF8">(8)</A></H3>
|
|
<P>File access modes do not describe data, but rather
|
|
describe how the HDF5 format address space is mapped to the underlying
|
|
file(s). Thus, in general the mapping must be known before the file superblock
|
|
can be read. However, the user usually knows enough about the mapping for the
|
|
superblock to be readable and once the superblock is read the library can fill
|
|
in the missing parts of the mapping.
|
|
<P><HR><P>
|
|
|
|
<?php include("../ed_libs/Footer2.htm"); ?>
|
|
|
|
</BODY>
|
|
</HTML>
|