Clean up code (to align w/future sblock_mdc branch changes), tweak
tests for [slightly] easier debugging, fix memory leak when copying chunked
datasets with I/O filters, fix memory leak of free space section when it was
exactly the right size to use for extending an existing block in the file.
Tested on:
FreeBSD/32 6.3 (duty) in debug mode
FreeBSD/64 6.3 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/Intel compilers w/default API=1.6.x,
w/C++ & FORTRAN, in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
Linux/64-amd64 2.6 (abe) w/parallel, w/FORTRAN, in production mode
the file didn't have the data. It happened because each handle had its own
object structure, and the empty one overwrote the data with fill value. This is
fixed by making some attribute information like the data be shared in the
attribute structure.
Tested on smirom, kagiso, and linew.
for it, too. This round of checkin isn't well tarnished yet. I'll come back to work on Jan. 8. I'll revise it then.
Tested it on kagiso, smirom, linew.
Tested platform:
Kagiso only since it is only a comment block change. If it works in one
machine, it should work in all, I hope. Still need to check the parallel
build on copper.
version and size information into the superblock to eliminate a read when
loading it.
This is a file format change, and hopefully the last one (knock on wood).
Tested on kagiso and Windows (mostly just a SOHM change).
This feature is still in progress; Shared Object Header Messages are not
complete as a feature and are not thoroughly tested. There are still
"TODO" comments in the code (comments with the word "JAMES" in them,
so as not to be confused with other TODO comments).
Hopefully this checkin will reduce the liklihood of conflicts as I finish
implementing this feature.
All current tests pass on juniper, copper (parallel), heping, kagiso, and mir.
Break out a bunch of the misc. routines that were in src/H5.c into more
specific modules.
Add optimized fletcher32 checksum routine, for checksumming metadata as
well as raw data.
Tested On:
Linux/32 2.6 (chicago)
Linux/64 2.6 (chicago2)
Will test further after checkin...
Users can create external links using H5L_create_external(). These links
point to an object in another HDF5 file. Users can alter the behavior of
external links or create new kinds of links by registering callbacks
using the H5L interface.
Added tests, tools support, etc.
Also a number of other, minor changes have been made (some restructuring of
the H5L interface, for instance).
Additional documentation and examples are forthcoming.
Code cleanup
Description:
Trim trailing whitespace, which is making 'diff'ing the two branches
difficult.
Solution:
Ran this script in each directory:
foreach f (*.[ch] *.cpp)
sed 's/[[:blank:]]*$//' $f > sed.out && mv sed.out $f
end
Platforms tested:
FreeBSD 4.11 (sleipnir)
Too minor to require h5committest
Bug fix
Description:
tid.c is a test for user-defined hid_t's. It should be part of the
testhdf5 tests, but was not being run.
It also had a couple of minor bugs.
Solution:
Added id test to testhdf5 and fixed bugs.
Platforms tested:
mir, sleipnir, modi4
Feature
Description:
Added "support" for UTF-8 character encoding.
Solution:
Wrote tests to check that UTF-8 can be used in a number of places in
HDF5 (object names, data, etc.). These tests live in test/tunicode.c.
Added a new UTF-8 character encoding for datatypes.
Platforms tested:
mir, modi4, heping
Misc. update:
Remove feature
Description:
Retire threaded, balanced binary tree code from HDF5 use. Requiescat in
pace...
Also, regenerate dependencies files.
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel
Too minor to require full h5committesting (the code is already
disconnected from everything except its tests)
Add new internal data structure
Description:
Add an implementation of skip lists to the library (see comment in
src/H5SL.c for references to the papers describing them) as a potential
replacement for our current threaded, balanced binary tree container.
Skip lists are much simpler to implement and should be faster to use.
Also, added new error codes to release branch, so bump the minor version
number to indicate that the library is no longer perfectly compatible with
the 1.6.3 release.
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel
Solaris 2.7 (arabica)
Too minor to require further testing (the skip lists aren't actually
used by any library code yet)
tconfig.c:
Verified only the unsigned int types and did not detect inconsistencies
on signed type such as int_fast16_t.
Changed code to verify both signed and unsigned int types wherever
applicable. It also depends on signed and unsigned forms of an
int type must be of the same sizes.
testhdf5.c:
Shorten the configure test name to 'config'--easier to type.
Tested on LLNL Frost and Snow. Will test in Copper after commit.
feature
Description:
Another revamp of the test interface.
TestInit: is used to register Test Program name, test program specific
Usage and option parsing routines.
TestUsage: will invoke extra usage routine if provided.
TestParseCmdLine: will invoke extra option parsing routine if provided.
GetTestSummary() and GetTestCleanup() replaces the previous Summary and
CleanUp arguments of TestParseCmdLine.
test/testhdf5, test/ttsafe.c, testpar/t_mpi.c, testpar/testphdf5.c:
All have been updated to use the new Test Routines.
testpar/t_mpi.c:
Also a fix of a compiler optimization bug when pgcc in Linux is
used to compile it. Changed buf[] and expected to unsigned char
type to avoid a bug that failed to do sign-extension.
Platforms tested:
"h5committested"
Also tested thread-safe option in eirene.
Feature
Description:
Changed TestParseCmdLine to accept an optional extra_parse()
function provided by indidivual test application. Extra_parse()
can handle extra options special to that test application.
Updated testhdf5.c to use the new syntax of TestParseCmdLine().
Platforms tested:
On eirene both serial and parallel.
Feature
Description:
Added to AddTest() a generic parameters pointer argument to
allow some extra parameters for some tests. E.g., test file
names can be customized during runtime and passed into the
test routines.
Platforms tested:
"h5committested".
Also run compat test in eirene.
Code cleanup & reorganization
Description:
Move further in the testing framework cleanup, eliminating all the
global variables (moving them into testframe.c as static variables) from the
testing framework code and moving it into the libh5test.a.
Platforms tested:
FreeBSD 4.9 (sleipnir) w & w/o thread-safety, c++ & parallel
h5committested
Code cleanup
Description:
Refactor library testing framework (used for the testhdf5 & ttsafe tests)
to remove almost all of the duplicated code, moving the common code into a
new 'testframe.c' source file.
Platforms tested:
FreeBSD 4.9 (sleipnir) w & w/o thread-safety
h5committest
Fixed a dumb mistake.
Description:
Other tests barked at the changes.
Solution:
Move the declaration of Index and Test[] into h5test.c.
Platforms tested:
Tested in eirene.
Misc. update:
Code reorg.
Description:
Move the InitTest() from individual tests (testhdf5 and ttsafe) to
libh5test (h5test.c) so that it can be shared among all tests.
Platforms tested:
Only tested in Eirene via serial with thread-safe enabled.
No other platforms test since this is pretty trivial.
bug fix, update documentation
Description:
version 8 of Code Warrior has a bug on the open file function
that causes one of our tests to fail
documented this in the install instructions
removed a CW specific header file include , that was left from version 6
from the file testhdf5.c
Platforms tested:
Code Warrior
linux
Misc. update:
Update
Description:
Updated the Copyright statement
Platforms tested:
Linux (This change is only in the comments, so I just check that the
modules still compile)
Misc. update:
New internal feature
Description:
Add internal API for building and working with heaps (H5HP). This will be
used for the LRU algorithm in the new metadata cache code.
Platforms tested:
Tested h5committest {arabica (fortran), eirene (fortran, C++)
modi4 (parallel, fortran)}
FreeBSD 4.7 (sleipnir)
Lots of performance improvements & a couple new internal API interfaces.
Description:
Performance Improvements:
- Cached file offset & length sizes in shared file struct, to avoid
constantly looking them up in the FCPL.
- Generic property improvements:
- Added "revision" number to generic property classes to speed
up comparisons.
- Changed method of storing properties from using a hash-table
to the TBBT routines in the library.
- Share the propery names between classes and the lists derived
from them.
- Removed redundant 'def_value' buffer from each property.
- Switching code to use a "copy on write" strategy for
properties in each list, where the properties in each list
are shared with the properties in the class, until a
property's value is changed in a list.
- Fixed error in layout code which was allocating too many buffers.
- Redefined public macros of the form (H5open()/H5check, <variable>)
internally to only be (<variable>), avoiding innumerable useless
calls to H5open() and H5check_version().
- Reuse already zeroed buffers in H5F_contig_fill instead of
constantly re-zeroing them.
- Don't write fill values if writing entire dataset.
- Use gettimeofday() system call instead of time() system when
checking the modification time of a dataset.
- Added reference counted string API and use it for tracking the
names of objects opening in a file (for the ID->name code).
- Removed redundant H5P_get() calls in B-tree routines.
- Redefine H5T datatype macros internally to the library, to avoid
calling H5check redundantly.
- Keep dataspace information for dataset locally instead of reading
from disk each time. Added new module to track open objects
in a file, to allow this (which will be useful eventually for
some FPH5 metadata caching issues).
- Remove H5AC_find macro which was inlining metadata cache lookups,
and call function instead.
- Remove redundant memset() calls from H5G_namei() routine.
- Remove redundant checking of object type when locating objects
in metadata cache and rely on the address only.
- Create default dataset object to use when default dataset creation
property list is used to create datasets, bypassing querying
for all the property list values.
- Use default I/O vector size when performing raw data with the
default dataset transfer property list, instead of querying for
I/O vector size.
- Remove H5P_DEFAULT internally to the library, replacing it with
more specific default property list based on the type of
property list needed.
- Remove redundant memset() calls in object header message (H5O*)
routines.
- Remove redunant memset() calls in data I/O routines.
- Split free-list allocation routines into malloc() and calloc()-
like routines, instead of one combined routine.
- Remove lots of indirection in H5O*() routines.
- Simplify metadata cache entry comparison routine (used when
flushing entire cache out).
- Only enable metadata cache statistics when H5AC_DEBUG is turned
on, instead of always tracking them.
- Simplify address comparison macro (H5F_addr_eq).
- Remove redundant metadata cache entry protections during dataset
creation by protecting the object header once and making all
the modifications necessary for the dataset creation before
unprotecting it.
- Reduce # of "number of element in extent" computations performed
by computing and storing the value during dataspace creation.
- Simplify checking for group location's file information, when file
has not been involving in file-mounting operations.
- Use binary encoding for modification time, instead of ASCII.
- Hoist H5HL_peek calls (to get information in a local heap)
out of loops in many group routine.
- Use static variable for iterators of selections, instead of
dynamically allocation them each time.
- Lookup & insert new entries in one step, avoiding traversing
group's B-tree twice.
- Fixed memory leak in H5Gget_objname_idx() routine (tangential to
performance improvements, but fixed along the way).
- Use free-list for reference counted strings.
- Don't bother copying object names into cached group entries,
since they are re-created when an object is opened.
The benchmark I used to measure these results created several thousand
small (2K) datasets in a file and wrote out the data for them. This is
Elena's "regular.c" benchmark.
These changes resulted in approximately ~4.3x speedup of the
development branch when compared to the previous code in the
development branch and ~1.4x speedup compared to the release
branch.
Additionally, these changes reduce the total memory used (code and
data) by the development branch by ~800KB, bringing the development
branch back into the same ballpark as the release branch.
I'll send out a more detailed description of the benchmark results
as a followup note.
New internal API routines:
Added "reference counted strings" API for tracking strings that get
used by multiple owners without duplicating the strings.
Added "ternary search tree" API for text->object mappings.
Platforms tested:
Tested h5committest {arabica (fortran), eirene (fortran, C++)
modi4 (parallel, fortran)}
Other platforms/configurations tested?
FreeBSD 4.7 (sleipnir) serial & parallel
Solaris 2.6 (baldric) serial
Bug Fix
Description:
When file space was returned to the file space free-list for reuse,
occasionally raw data allocations which used space from the free-list
would overlap with the metadata accumulator and get over-written with
the cached information in the accumulator, corrupting the data.
Solution:
Check if the space about to be recycled on the free-list is going to be
used for raw data and also overlaps with the metadata accumulator cache,
avoiding using space that fits those criteria.
This fixes bug #701
Platforms tested:
FreeBSD 4.5 (sleipnir)
New feature.
Description:
Added a test to verify the correctness of information provided by
configure in H5config.h. Some information, such as SIZEOF some
types can be hardcoded by config/<machine>. This test verified
the information is indeed correct.
Currenly, only size of C language types are verified.
Platforms tested:
Eirene, regular, arabica.
Code cleanup (sorta)
Description:
When the first versions of the HDF5 library were designed, I remembered
vividly the difficulties of porting code from a 32-bit platform to a 16-bit
platform and asked that people use intn & uintn instead of int & unsigned
int, respectively. However, in hindsight, this was overkill and
unnecessary since we weren't going to be porting the HDF5 library to
16-bit architectures.
Currently, the extra uintn & intn typedefs are causing problems for users
who'd like to include both the HDF5 and HDF4 header files in one source
module (like Kent's h4toh5 library).
Solution:
Changed the uintn & intn's to unsigned and int's respectively.
Platforms tested:
FreeBSD 4.4 (hawkwind)
Update
Description:
Changed includes of the form:
#include <hdf5_file.h>
to
#include "hdf5_file.h"
so that gcc can pick them up easier without including the system
header files since we don't care about them.
Platforms tested:
Linux
Bug fix (sort of)
Description:
The RCSID string in H5public.h was causing the C++ code problem as it
was included multiple times and C++ did not like multiple definitions
of the same static variable.
Solution:
Since we don't really make use of the RCSID strings as we have not
installed it in all source files, we decided to remove it.
Platforms tested:
eirene (linux), modi4 (IRIX64-64) both serial and parallel modes.
New Feature
Description:
Added array datatype tests to the regression tests. These datatype
combinations are tested currently:
1-D array of atomic datatypes
3-D array of atomic datatypes
array of array of atomic datatypes
array of compound of atomic datatypes
array of compound of array datatypes
array of VL of atomic datatypes
array of VL of array datatypes
Also added a test to verify that the older style compound datatype with
array fields works correctly.
Platforms tested:
FreeBSD 4.1.1 (hawkwind)
Bug fix
Description:
"Time" datatypes (H5T_UNIX_D*) were not being stored and retrieved in
the datatype object header message correctly.
Solution:
Store endian-ness and precision in the datatype object header message and
added test to continue to track them working correctly.
This fixes bug #512.
Platforms tested:
FreeBSD 4.1.1 (hawkwind)
The "FILENAME" declared extern in h5test.h is not always used.
It was used in h5_cleanup to remove temporary files created
during tests. Not all tests codes have used this routine.
Indeed, quite a few of test programs do "#define FILENAME ".
Also, h5_cleanup needs to work in tandem with h5_fixname.
h5_fixname accepts an explicite base_name argument instead
of using the global variable FILENAME. That is cleaner.
Solution:
Added char *base_name[] as a new argument to h5_cleanup, in
the same style as h5_fixname. Removed "extern char *FILENAME..."
from use. Also, undo some unnecessary declaration of "char *FILENAME"
from some tests which don't use it at all (yet).
Platforms tested:
modi4-64(irix64), arabica(solari2.7), eirene(linux)
(arabica could not launch tests automatically. I had to hack
in LD_LIBRARY_PATH to make them run.)
----------------------
./src/H5T.c
Fixed a typo in the registration of the `unsigned char' to
`unsigned long long' type conversion that caused it to not be
registered, falling back to software whenever that conversion
path was taken.
./MANIFEST
./test/Makefile.in
./test/testhdf5.c
./test/testhdf5.h
./test/theap.c [REMOVED]
./test/lheap.c [NEW]
./test/tohdr.c [REMOVED]
./test/ohdr.c [NEW]
./test/tstab.c [REMOVED]
./test/stab.c [NEW]
Removed the `t' from the front of these names and made each
test a stand-alone program following the format of most of the
other tests.
./test/big.c
Uses libh5test.a but always sets the low-level driver to 1GB
file family.
The `#if' near the top to set the data space to 8GB has been
simplified now that `long_long' is always defined and the
error message is improved when `long_long' isn't wide enough.
Cleanup code was added to the error handling.
./test/gheap.c
./test/istore.c
Uses libh5test.a. Added error cleanup code.
./test/dtypes.c
./test/h5test.c
Added 68 new tests that check hardware and software
conversions between `long long' and `unsigned long long' and
the other integer types. The tests only run on machines where
sizeof(long_long)!=sizeof(long). We test a total of 180
different integer conversions, half in hardware and half in
software.
Cut down the number of times each test is run from 5 to 1 so
it doesn't take so long. If you want to run more times
there's a constant that can be changed at the top of the file.
./test/extend.c
Removed unused variable.
./test/h5test.c
./test/h5test.h
./test/external.c
./test/fillval.c
The h5_cleanup() returns true/false so it can be used in an `if'
statement to clean up additional files.
./doc/html/Environment.html
Indented. Added HDF5_PREFIX and HDF5_DRIVER descriptions.
./src/H5P.c
Changed the trace type for the second argument from `Iu' to
`x' since it's an output parameter.
./INSTALL
Added a warning that the GNU zlib that comes with the latest
version of HDF4 is too old to use with HDF5 and must be
renamed so configure doesn't see it when `--enable-hdf4' is
used.