More cleanups on the object header message handling code, to make it
easier to work with and move forward on the creation order coding.
Various other minor cleanups & warning fixes.
Tested on:
FreeBSD/32 6.2 (duty)
Finish moving object header message routines into their own source code
module, along with renaming them to have "H5O_msg_" prefix...
Tested on:
Mac OS X/32 10.4.8 (amazon)
FreeBSD/32 4.11 (sleipnir)
Linux/32 2.4 (heping)
AIX/32 5.? (copper)
Migrate more object header routines to use the H5O_msg_ prefix and put
them into the src/H5Omessage.c code module.
Tested on:
Mac OS X/32 10.4.8 (amazon)
FreeBSD/32 4.11 (sleipnir)
Linux/32 2.4 (heping)
AIX/32 5.? (copper)
Refactor object header code to separate process of creating an object
header message from the process of writing to an existing one.
Start renaming operations that deal with object header messages to have
"H5O_msg_" prefix...
Tested on:
Mac OS X/32 10.4.8 (amazon)
FreeBSD/32 4.11 (sleipnir)
Linux/32 2.4 (heping)
AIX/32 5.? (copper)
Propagate object creation properties up into group, dataset and named
datatype property lists, when those property lists are retrieved for
existing objects in a file.
Also, add H5Tget_create_plist() API routine, to allow named datatype
property lists to be retrieved for named datatypes.
Tested on:
FreeBSD/32 4.11 (sleipnir)
Linux/32 2.4 (heping)
Linux/64 2.4 (mir)
AIX/32 5.? (copper)
"make check-vfd" will now run all tests in the test directory with different
file drivers (at least, all of those tests that use the testing framework's
FAPL). Tests that fail will be skipped.
This is not a perfect fix, but is better than nothing.
Along with this change, check-vfd should be added to the Daily Tests.
New feature
Description:
Check in baseline for compact group revisions, which radically revises the
source code for managing groups and object headers.
WARNING!!!! WARNING!!!! WARNING!!!! WARNING!!!! WARNING!!!! WARNING!!!!
WARNING!!!! WARNING!!!! WARNING!!!! WARNING!!!! WARNING!!!! WARNING!!!!
This initiates the "unstable" phase of the 1.7.x branch, leading up
to the 1.8.0 release. Please test this code, but do _NOT_ keep files created
with it - the format will change again before the release and you will not
be able to read your old files!!!
WARNING!!!! WARNING!!!! WARNING!!!! WARNING!!!! WARNING!!!! WARNING!!!!
WARNING!!!! WARNING!!!! WARNING!!!! WARNING!!!! WARNING!!!! WARNING!!!!
Solution:
There's too many changes to really describe them all, but some of them
include:
- Stop abusing the H5G_entry_t structure and split it into two separate
structures for non-symbol table node use within the library: H5O_loc_t
for object locations in a file and H5G_name_t to store the path to
an opened object. H5G_entry_t is now only used for storing symbol
table entries on disk.
- Retire H5G_namei() in favor of a more general mechanism for traversing
group paths and issuing callbacks on objects located. This gets us out
of the business of hacking H5G_namei() for new features, generally.
- Revised H5O* routines to take a H5O_loc_t instead of H5G_entry_t
- Lots more...
Platforms tested:
h5committested and maybe another dozen configurations.... :-)
Code cleanup
Description:
Merge back changes from "compact group" work that improve the
infrastructure of the library and may impact others. In this round of
merging, that includes:
- Move datatype allocation into single internal routine, instead of
duplicated code that was spread out in a dozen or so places.
- Clean up guts of object header routines (H5O_*) to allow for some of
the fancieroperations that need to be performed on groups, along with
some general improvements.
- Added a new error code
- Some minor cleanups in other code....
Platforms tested:
FreeBSD 4.11 (sleipnir)
Linux 2.4
Mac OS X
Code cleanup
Description:
Trim trailing whitespace, which is making 'diff'ing the two branches
difficult.
Solution:
Ran this script in each directory:
foreach f (*.[ch] *.cpp)
sed 's/[[:blank:]]*$//' $f > sed.out && mv sed.out $f
end
Platforms tested:
FreeBSD 4.11 (sleipnir)
Too minor to require h5committest
Bug Fix/Code Cleanup/Doc Cleanup/Optimization/Branch Sync :-)
Description:
Generally speaking, this is the "signed->unsigned" change to selections.
However, in the process of merging code back, things got stickier and stickier
until I ended up doing a big "sync the two branches up" operation. So... I
brought back all the "infrastructure" fixes from the development branch to the
release branch (which I think were actually making some improvement in
performance) as well as fixed several bugs which had been fixed in one branch,
but not the other.
I've also tagged the repository before making this checkin with the label
"before_signed_unsigned_changes".
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel & fphdf5
FreeBSD 4.10 (sleipnir) w/threadsafe
FreeBSD 4.10 (sleipnir) w/backward compatibility
Solaris 2.7 (arabica) w/"purify options"
Solaris 2.8 (sol) w/FORTRAN & C++
AIX 5.x (copper) w/parallel & FORTRAN
IRIX64 6.5 (modi4) w/FORTRAN
Linux 2.4 (heping) w/FORTRAN & C++
Misc. update:
Code cleanup
Description:
Clean up minor warnings and align with release branch.
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel & FPH5
Solaris 2.7 (arabica) w/production mode
Linux 2.4 (heping) w/C++ and FORTRAN
Description: Restore 6 old error API functions back to the library to be backward
compatible with v1.6. They are H5Epush, H5Eprint, H5Ewalk, H5Eclear, H5Eset_auto,
H5Eget_auto. These functions do not have error stack as parameter.
Solution: Internally, these functions use default error stack.
Platforms tested: h5committest and fuss.
Misc. update: RELEASE.txt
Code cleanup
Description:
Clean up a bunch of warnings and bring new code better inline with current
library coding practice.
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel
Too minor to require h5committest
Misc. update:
Purpose:
Bug Fix
Description:
If an HDF5 file grows larger than its address space, it dies and is unable to
write any data. This is more likely to happen since users are able to change
the number of bytes used to store addresses in the file.
Solution:
HDF5 now throws an error instead of dying. In addition, it "reserves" address
space for the local heap and for object headers (which do not allocate space
immediately). This ensures that after the error occurs, there is enough address
space left to flush the entire file to disk, so no data is lost.
A more complete explanation is at /doc/html/TechNotes/ReservedFileSpace.html
Platforms tested:
sleipnir, copper (parallel), verbena, arabica, Windows (Visual Studio 7)
Solution:
Platforms tested:
Misc. update:
bug fix
Description:
1. windows cannot recongize long long. We agree to use long_long to represent all "long long" compatible data types.
2. windows test can not check error code, it depends on error messages to be returned. This will be fixed in the future. In odhr.c, somehow only error value1 is generated, there are no error message even if some tests are missing. So just add an error message when error code is 1.
Solution:
change long long to long_long in dtypes.c;
Add an error message when error code is not 0 in ohdr.c
Platforms tested:
eirene(fortran),
arabica(fortran)
Misc. update:
Bug fix
Description:
When too many messages were inserted into an object header, the library
had an internal pointer to the "new message" that was pointing to the incorrect
location when the array of messages was re-allocated.
In the worst case, this could cause a file to be corrupted.
Solution:
Update the internal pointer when the array is re-allocated.
Platforms tested:
FreeBSD 4.9 (sleipnir)
Too small to require h5committest
Bug fixes and code cleanup
Description:
Re-worked ohdr test to use modification time messages instead of symbol
table messages, now that the library correctly tries deleting the
local heap and B-trees for the symbol tables (which didn't exist and
caused the test to fail).
Added tests for using named datatypes in attributes to verify that the
reference counts are being tracked correctly, etc.
Platforms tested:
FreeBSD 4.9 (sleipnir)
h5committest
Code cleanup
Description:
Clean up miscellaneous warnings which have crept into the code.
Fix "_POSIX_C_SOURCE not defined" warning on FreeBSD.
Adjust gcc compiler flags to be more concise for production mode.
Refactor the H5O code so that there is a stronger boundary between code
in the H5O package and code in the library which just calls H5O routines.
Platforms tested:
Tested h5committest {arabica (fortran), eirene (fortran, C++)
modi4 (parallel, fortran)}
FreeBSD 4.7 (sleipnir) serial & parallel and gcc 2.95.4 & gcc 3.2.2
Misc. update:
Update MANIFEST if you add or remove any file.
Bug Fix
Description:
Metadata cache in parallel I/O can cause hangs in applications which
perform independent I/O on chunked datasets, because the metadata cache
can attempt to flush out dirty metadata from only a single process, instead
of collectively from all processes.
Solution:
Pass a dataset transfer property list down from every API function which
could possibly trigger metadata I/O.
Then, split the metadata cache into two sets of entries to allow dirty
metadata to be set aside when a hash table collision occurs during
independent I/O.
Platforms tested:
Tested h5committest {arabica (fortran), eirene (fortran, C++)
modi4 (parallel, fortran)}
FreeBSD 4.7 (sleipnir) serial & parallel
Misc. update:
Updated release_docs/RELEASE
Bug fix
Description:
Currently, when the library encounters an object header message that isn't
know, it fails to open that object in the file.
Solution:
Allow the library to skip over the unknown object header message and
continue to process the remaining messages, in the hope that the skipped
message isn't important later. If it is important, it will be caught at
a higher level of the library.
Platforms tested:
FreeBSD 4.7 (sleipnir)
Lots of performance improvements & a couple new internal API interfaces.
Description:
Performance Improvements:
- Cached file offset & length sizes in shared file struct, to avoid
constantly looking them up in the FCPL.
- Generic property improvements:
- Added "revision" number to generic property classes to speed
up comparisons.
- Changed method of storing properties from using a hash-table
to the TBBT routines in the library.
- Share the propery names between classes and the lists derived
from them.
- Removed redundant 'def_value' buffer from each property.
- Switching code to use a "copy on write" strategy for
properties in each list, where the properties in each list
are shared with the properties in the class, until a
property's value is changed in a list.
- Fixed error in layout code which was allocating too many buffers.
- Redefined public macros of the form (H5open()/H5check, <variable>)
internally to only be (<variable>), avoiding innumerable useless
calls to H5open() and H5check_version().
- Reuse already zeroed buffers in H5F_contig_fill instead of
constantly re-zeroing them.
- Don't write fill values if writing entire dataset.
- Use gettimeofday() system call instead of time() system when
checking the modification time of a dataset.
- Added reference counted string API and use it for tracking the
names of objects opening in a file (for the ID->name code).
- Removed redundant H5P_get() calls in B-tree routines.
- Redefine H5T datatype macros internally to the library, to avoid
calling H5check redundantly.
- Keep dataspace information for dataset locally instead of reading
from disk each time. Added new module to track open objects
in a file, to allow this (which will be useful eventually for
some FPH5 metadata caching issues).
- Remove H5AC_find macro which was inlining metadata cache lookups,
and call function instead.
- Remove redundant memset() calls from H5G_namei() routine.
- Remove redundant checking of object type when locating objects
in metadata cache and rely on the address only.
- Create default dataset object to use when default dataset creation
property list is used to create datasets, bypassing querying
for all the property list values.
- Use default I/O vector size when performing raw data with the
default dataset transfer property list, instead of querying for
I/O vector size.
- Remove H5P_DEFAULT internally to the library, replacing it with
more specific default property list based on the type of
property list needed.
- Remove redundant memset() calls in object header message (H5O*)
routines.
- Remove redunant memset() calls in data I/O routines.
- Split free-list allocation routines into malloc() and calloc()-
like routines, instead of one combined routine.
- Remove lots of indirection in H5O*() routines.
- Simplify metadata cache entry comparison routine (used when
flushing entire cache out).
- Only enable metadata cache statistics when H5AC_DEBUG is turned
on, instead of always tracking them.
- Simplify address comparison macro (H5F_addr_eq).
- Remove redundant metadata cache entry protections during dataset
creation by protecting the object header once and making all
the modifications necessary for the dataset creation before
unprotecting it.
- Reduce # of "number of element in extent" computations performed
by computing and storing the value during dataspace creation.
- Simplify checking for group location's file information, when file
has not been involving in file-mounting operations.
- Use binary encoding for modification time, instead of ASCII.
- Hoist H5HL_peek calls (to get information in a local heap)
out of loops in many group routine.
- Use static variable for iterators of selections, instead of
dynamically allocation them each time.
- Lookup & insert new entries in one step, avoiding traversing
group's B-tree twice.
- Fixed memory leak in H5Gget_objname_idx() routine (tangential to
performance improvements, but fixed along the way).
- Use free-list for reference counted strings.
- Don't bother copying object names into cached group entries,
since they are re-created when an object is opened.
The benchmark I used to measure these results created several thousand
small (2K) datasets in a file and wrote out the data for them. This is
Elena's "regular.c" benchmark.
These changes resulted in approximately ~4.3x speedup of the
development branch when compared to the previous code in the
development branch and ~1.4x speedup compared to the release
branch.
Additionally, these changes reduce the total memory used (code and
data) by the development branch by ~800KB, bringing the development
branch back into the same ballpark as the release branch.
I'll send out a more detailed description of the benchmark results
as a followup note.
New internal API routines:
Added "reference counted strings" API for tracking strings that get
used by multiple owners without duplicating the strings.
Added "ternary search tree" API for text->object mappings.
Platforms tested:
Tested h5committest {arabica (fortran), eirene (fortran, C++)
modi4 (parallel, fortran)}
Other platforms/configurations tested?
FreeBSD 4.7 (sleipnir) serial & parallel
Solaris 2.6 (baldric) serial
Update
Description:
Changed includes of the form:
#include <hdf5_file.h>
to
#include "hdf5_file.h"
so that gcc can pick them up easier without including the system
header files since we don't care about them.
Platforms tested:
Linux
Clean up warnings
Description:
The "FAILED" macro is defined by Windows and is causing warnings and
potential errors when compiled on that platform.
Solution:
Change our macro from FAILED to H5_FAILED.
Platforms tested:
FreeBSD 4.2 (hawkwind)
The "FILENAME" declared extern in h5test.h is not always used.
It was used in h5_cleanup to remove temporary files created
during tests. Not all tests codes have used this routine.
Indeed, quite a few of test programs do "#define FILENAME ".
Also, h5_cleanup needs to work in tandem with h5_fixname.
h5_fixname accepts an explicite base_name argument instead
of using the global variable FILENAME. That is cleaner.
Solution:
Added char *base_name[] as a new argument to h5_cleanup, in
the same style as h5_fixname. Removed "extern char *FILENAME..."
from use. Also, undo some unnecessary declaration of "char *FILENAME"
from some tests which don't use it at all (yet).
Platforms tested:
modi4-64(irix64), arabica(solari2.7), eirene(linux)
(arabica could not launch tests automatically. I had to hack
in LD_LIBRARY_PATH to make them run.)
----------------------
This extensive change is the virtual file layer implementation. I've
ported and tested the sec2, family, and core drivers and only ported
the mpio driver (Albert will test it). So if you need MPIO I would
recommend sticking with the previous version for a while.
You will get a few compile warnings about split and stdio drivers not
being implemented and possibly tracing information not inserted in
some of the drivers. You can safely ignore them but I plan to fix
them.
I'm still working on the split driver because I just realized that it
needs a part of the VFL that isn't written yet.
Documentation is being updated also because there were some minor
changes (mostly just name changes). It should be available on my web
site later this week.
./MANIFEST
./src/Makefile.in
./src/hdf5.h
./src/H5Flow.c [REMOVED]
./src/H5Fstdio.c [REMOVED]
./src/H5Fsec2.c [REMOVED]
./src/H5Fsplit.c [REMOVED]
./src/H5Fmpio.c [REMOVED]
./src/H5Ffamily.c [REMOVED]
./src/H5Fcore.c [REMOVED]
./src/H5MFpublic.h [REMOVED]
./src/H5FD.c [NEW]
./src/H5FDcore.c [NEW]
./src/H5FDcore.h [NEW]
./src/H5FDfamily.c [NEW]
./src/H5FDfamily.h [NEW]
./src/H5FDmpio.c [NEW]
./src/H5FDmpio.h [NEW]
./src/H5FDprivate.h [NEW]
./src/H5FDpublic.h [NEW]
./src/H5FDsec2.c [NEW]
./src/H5FDsec2.h [NEW]
Removed/added files for virtual file layer.
./bin/trace
./src/H5.c
Removed unused public datatypes and added new VFL public
datatypes.
Changed an error message.
./config/BlankForm
./config/dec-flags
./config/gnu-flags
./config/hpux10.20
./config/hpux9.03
./config/irix5.x
./config/irix6.x
./config/solaris2.x
./config/unicosmk
Removed the H5F_OPT_SEEK and H5F_LOW_DFLT constants from the
configuration since they're no longer applicable. The default
file driver is always the sec2 driver and it always optimizes
calls to lseek() or lseek64().
./config/depend.in
C preprocessor errors generated during automatic dependency
building are sent to /dev/null to prevent them from appearing
twice in the make output.
./src/H5AC.c
./src/H5B.c
./src/H5D.c
./src/H5F.c
./src/H5G.c
./src/H5Gent.c
./src/H5Gnode.c
./src/H5HG.c
./src/H5HL.c
./src/H5O.c
./src/H5Oattr.c
./src/H5Odtype.c
./src/H5Oefl.c
./src/H5Oshared.c
./src/H5T.c
./src/H5detect.c
./test/ohdr.c
Changed H5F_ADDR_UNDEF to HADDR_UNDEF to be more consistent
with the `haddr_t' datatype which is now a public type.
./src/H5D.c
./src/H5P.c
./src/H5Ppublic.h
./src/H5Tconv.c
./test/cmpd_dset.c
./test/dsets.c
./test/overhead.c
./test/tselect.c
./test/tvltypes.c
The H5P_DATASET_XFER constant was changed to H5P_DATA_XFER
because the properties apply to all types of I/O operations,
not just datasets.
./src/H5B.c
./src/H5Bprivate.h
./src/H5D.c
./src/H5Dpublic.h
./src/H5F.c
./src/H5Farray.c
./src/H5Fistore.c
./src/H5Fprivate.h
./src/H5Fpublic.h
./src/H5Gnode.c
./src/H5Gpkg.h
./src/H5HG.c
./src/H5HL.c
./src/H5O.c
./src/H5R.c
./src/H5Sall.c
./src/H5Shyper.c
./src/H5Smpio.c
./src/H5Spoint.c
./src/H5Sprivate.h
./test/big.c
./test/h5test.c
./test/istore.c
./testpar/t_dset.c
./testpar/t_file.c
./tools/h5debug.c
./tools/h5ls.c
Modified to work with the virtual file layer by calling H5FD_*
functions instead of H5F_low_* functions and by passing file
access and data transfer properties by object ID instead of
pointer.
Changed H5D_transfer_t to H5FD_mpio_xfer_t since the
COLLECTIVE vs. INDEPENDENT transfer mode is specific to the
MPIO file driver.
Moved MPIO-specific stuff into the MPIO driver.
./src/H5B.c
./src/H5D.c
./src/H5Fprivate.h
The H5F_mpio_* private functions were renamed and placed in
the H5FDmpio driver except those which appeared in H5Smpio.c.
./src/H5E.c
./src/H5Epublic.h
Added major error number H5E_VFL for virtual file layer
related errors.
./src/H5F.c
./src/H5Fprivate.h
Changed the logic that controls whether the boot block is
written. Instead of assuming that the first call to write the
boot block is only to allocate space, I've added a function
argument which makes this explicit.
Changed the way files are compared so that a driver-defined
comparison function can be called. Files which belong to
different drivers are always considered different.
Removed H5F_driver_t since file drivers are now identified by
object ID instead of a special non-user-extendible datatype.
Removed all the hard-coded low-level file properties which
have been replaced by the various file drivers.
./src/H5I.c
./src/H5Iprivate.h
Added the H5I_inc_ref() which was removed a few months ago
since we finally have a use for it.
./src/H5Ipublic.h
Added the H5I_VFL object ID type to identify file drivers in
the virtual file layer.
./src/H5MF.c
./src/H5MFprivate.h
Moved all the allocation/deallocation code into the virtual
file layer which allows file drivers to override much of it.
./src/H5P.c
./src/H5Ppublic.h
Moved file driver-specific code into the various file driver
files.
The H5Pcopy() and H5Pclose() functions make calls into the
virtual file driver to manage the memory for driver-specific
file access and data transfer properties.
./src/H5private.h
./src/H5public.h
The `haddr_t' type is now public.
./test/tfile.c
Added a few more comments.
----------------------
./config/irix64
The old (-32) compiler is now supported by setting envrionment
CC='cc -32'. The -64 compiler is the default or you can set
CC='cc -64'.
./src/H5A.c
./src/H5D.c
./src/H5F.c
./src/H5Fistore.c
./src/H5Flow.c
./src/H5G.c
./src/H5I.c
./src/H5Ocomp.c
./src/H5P.c
./src/H5R.c
./src/H5RA.c
./src/H5T.c
./src/H5Tbit.c
./src/H5Tconv.c
./src/H5Z.c
./src/H5detect.c
./test/big.c
./test/cmpd_dset.c
./test/dsets.c
./test/dtypes.c
./test/enum.c
./test/mtime.c
./test/ohdr.c
./tools/h5ls.c
Fixed lots of warnings on Irix64. Mailed a few remaining
warnings in H5S to Quincey and a few in the dumper to
Ruey-Hsia.
----------------------
./src/H5T.c
Fixed a typo in the registration of the `unsigned char' to
`unsigned long long' type conversion that caused it to not be
registered, falling back to software whenever that conversion
path was taken.
./MANIFEST
./test/Makefile.in
./test/testhdf5.c
./test/testhdf5.h
./test/theap.c [REMOVED]
./test/lheap.c [NEW]
./test/tohdr.c [REMOVED]
./test/ohdr.c [NEW]
./test/tstab.c [REMOVED]
./test/stab.c [NEW]
Removed the `t' from the front of these names and made each
test a stand-alone program following the format of most of the
other tests.
./test/big.c
Uses libh5test.a but always sets the low-level driver to 1GB
file family.
The `#if' near the top to set the data space to 8GB has been
simplified now that `long_long' is always defined and the
error message is improved when `long_long' isn't wide enough.
Cleanup code was added to the error handling.
./test/gheap.c
./test/istore.c
Uses libh5test.a. Added error cleanup code.
./test/dtypes.c
./test/h5test.c
Added 68 new tests that check hardware and software
conversions between `long long' and `unsigned long long' and
the other integer types. The tests only run on machines where
sizeof(long_long)!=sizeof(long). We test a total of 180
different integer conversions, half in hardware and half in
software.
Cut down the number of times each test is run from 5 to 1 so
it doesn't take so long. If you want to run more times
there's a constant that can be changed at the top of the file.
./test/extend.c
Removed unused variable.
./test/h5test.c
./test/h5test.h
./test/external.c
./test/fillval.c
The h5_cleanup() returns true/false so it can be used in an `if'
statement to clean up additional files.
./doc/html/Environment.html
Indented. Added HDF5_PREFIX and HDF5_DRIVER descriptions.
./src/H5P.c
Changed the trace type for the second argument from `Iu' to
`x' since it's an output parameter.
./INSTALL
Added a warning that the GNU zlib that comes with the latest
version of HDF4 is too old to use with HDF5 and must be
renamed so configure doesn't see it when `--enable-hdf4' is
used.