Code cleanup/feature twist
Description:
Adjust recent H5AC routines to take H5F_t pointers instead of H5C_t
pointers, to match the rest of the H5AC routines.
This change propagated into a few of the tests, which also had some
compiler warnings cleaned up...
Platforms tested:
FreeBSD 4.11 (sleipnir) w/parallel
Linux 2.4/64 (mir) w/C++ & FORTRAN
Add pinned entry capability to cache.
Description:
For frequently accessed cache entries, the protect/unprotect overhead
is sometimes a bottleneck.
Solution:
Allow entries to be pinned in the cache. Pinned entries can't be
evicted, but can be flushed or modified.
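The pinning rule above can be illustrated with a toy model. This is only a sketch: the `toy_*` names and the entry layout are hypothetical and are not the H5C data structures or API. It shows a pinned entry surviving eviction while still being flushable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy cache entry; not the real H5C entry type. */
typedef struct {
    int  id;
    bool in_use;     /* slot currently holds an entry */
    bool is_dirty;   /* modified since the last flush */
    bool is_pinned;  /* pinned entries are skipped by eviction */
} toy_entry_t;

/* Pin an entry so that it stays resident in the cache. */
static void toy_pin(toy_entry_t *e)
{
    assert(e != NULL && e->in_use);
    e->is_pinned = true;
}

/* Unpin an entry so that it becomes evictable again. */
static void toy_unpin(toy_entry_t *e)
{
    assert(e != NULL && e->in_use);
    e->is_pinned = false;
}

/* Flushing is allowed on pinned entries: it only clears the dirty flag. */
static void toy_flush(toy_entry_t *e)
{
    assert(e != NULL && e->in_use);
    e->is_dirty = false;
}

/* Evict every unpinned entry and return the number of slots freed. */
static size_t toy_evict_all(toy_entry_t *entries, size_t n)
{
    size_t freed = 0;

    for (size_t i = 0; i < n; i++) {
        if (entries[i].in_use && !entries[i].is_pinned) {
            entries[i].in_use = false;
            freed++;
        }
    }
    return freed;
}
```

In this model a caller pins an entry once and then reads or modifies it directly, instead of paying for a protect/unprotect pair on every access.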
Platforms tested:
h5committested -- minus one small typo in test/cache.c whose fix was
tested on copper and heping only.
Misc. update:
Several bug fixes
Description:
Added config file for Cray X1 (previous file was misnamed)
Simplified some code in hyperslab.c test that seemed to have been
confusing compiler on Cray X1.
Fixed typo in config/commence.am
Cleaned up hl/src/Makefile.am
Solution:
All four fixes should be straightforward. The failure on Cray was
very difficult to debug, but involved arithmetic errors.
This change seems to fix it.
Platforms tested:
heping, copper, sol, some Cray X1 (more testing when system comes back up)
new feature
Description:
1) separated the HL library into "public" and "private" header files, with the same characteristics as the basic library
2) added the public headers to hdf5.h (with a conditional include macro, defined in configure.in)
3) added the path to HL in all Makefile.am's, because of the inclusion in hdf5.h
Solution:
Platforms tested:
linux 32, 64
AIX
solaris
with fortran and c++
(one packet table example fails)
Misc. update:
bug fix.
Description:
The MPE_Stop_log did not work in copper. It spewed out MPE errors
and ended in infinite looping.
Solution:
Changed to a smaller test size to avoid generating huge MPE log files
when MPE is configured in.
Platforms tested:
Copper (mpe)
Misc. update:
Bug fix.
Description:
When MPE is used, the test generates huge Clog files in /tmp that
could fill up the disk (like in Copper.)
Solution:
Turned off MPE logging.
Platforms tested:
heping (pp) and copper(pp)
Misc. update:
Adding parallel tests for optional collective chunk APIs
Description:
Three new APIs
"H5Pset_dxpl_mpio_chunk_opt_ratio
H5Pset_dxpl_mpio_chunk_opt_num
H5Pset_dxpl_mpio_chunk_opt"
for optional optimization choices from users
have been added to the libraries.
This check-in adds six tests to verify the functionality and correctness
of these APIs.
These tests need to be run with 3 or more processors and with the MPI-IO driver only.
Solution:
Use H5Pinsert, H5Pget, and H5Pset to verify that the library takes the branch we expect.
Use the H5_HAVE_INSTRUMENT macro to isolate these changes so that they cannot affect or be misused by applications.
Platforms tested:
h5committest (shanti still refused connections)
Parallel tests on heping were somehow skipped, so they were run manually on heping with
1, 2, 3, 4, and 5 processes.
Misc. update:
Bug fix
Description:
Previous checkin did a bad thing; 'make clean' failed in example directories.
Solution:
Fixed commence.am so that examples no longer break, and fixed a mistake
in conclude.am.
Platforms tested:
heping (minor makefile change)
Misc. update:
Add tests for optional APIs to support collective chunk IO
Description:
To test whether the library picks up the user's options,
the number of chunks needs to vary between processes, and
the number of processes selecting each chunk also
needs to vary.
Solution:
Create two cases:
1. Each chunk is selected by only one unique process. In this case the
library should use independent IO for the collective call.
2. One-third of the processes occupy the top half of the whole domain,
and the rest of the processes occupy the lower half of the domain.
The total number of chunks is fixed at 8.
Platforms tested:
Linux 2.4 with mpich 1.2.6(only)
This checkin only contains the code that handles the selection; no new tests have been added yet, so it won't affect any platforms.
Misc. update:
Bug fix
Description:
make check-clean didn't clean results of example tests
Solution:
Fixed Makefiles so that check-clean recurses into example directories.
Also a little Makefile cleanup.
Platforms tested:
mir, modi4, heping, copper
Change the array size used by the collective chunking parallel tests.
Description:
Previously, the array sizes for the collective optimization tests,
including
cchunk1,
cchunk2,
cchunk3,
cchunk4,
ccontw,
ccontr,
cschunkw,
cschunkr,
ccchunkw,
ccchunkr,
were fixed.
They were only valid for certain numbers of processors (1, 2, 3, 4, 6, 8, 12, 16, 24, 32, 48, etc.).
Recently there have been more requests for the parallel tests to be valid on odd numbers of processes such as 5, 7, 11, 13, etc.
Solution:
Changed the array size to be dynamic rather than static. The size of the fastest-changing dimension is now a function of mpi_size: dim2 = constant * mpi_size. After some tuning, the above tests should in theory be valid for any number of processors. However, other parallel tests still need to be tuned.
To verify the correctness of these tests, run mpirun -np 5 ./testphdf5 -b cchunk1 on heping.
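As a rough illustration of why dim2 = constant * mpi_size works for any process count, here is a minimal sketch. The constant value and the `toy_*` helper names are assumptions for illustration and are not taken from testphdf5. Because dim2 is always a multiple of mpi_size, every rank gets the same number of columns, even for counts such as 5 or 7.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-process column count; the real constant may differ. */
#define TOY_COLS_PER_PROC 8

/* Size of the fastest-changing dimension for a given number of processes. */
static size_t toy_dim2(int mpi_size)
{
    assert(mpi_size > 0);
    return (size_t)TOY_COLS_PER_PROC * (size_t)mpi_size;
}

/* First column owned by `rank` when dim2 is split evenly across ranks. */
static size_t toy_start_col(int rank, int mpi_size)
{
    assert(rank >= 0 && rank < mpi_size);
    return (size_t)rank * (toy_dim2(mpi_size) / (size_t)mpi_size);
}
```

With a fixed dim2, an even split like this only works when mpi_size divides the fixed size, which is why the old tests were limited to particular process counts.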
Platforms tested:
h5committest (shanti refused connections)
On heping, 5 and 7 processes were used to verify the correctness.
Misc. update:
Add new tests
Description:
Collective IO doesn't work for some platforms/mpio packages when more than
one process has no contributions to IO.
Solution:
1. Add a collective IO test to verify the correctness of the library when
more than one process has no contributions to IO.
2. Add a similar MPI-IO test in t_mpi to help us maintain the library on more platforms.
Platforms tested:
heping, mir, copper
Misc. update:
Improvement.
Description:
t_cache takes a very long time to run, and it tests HDF5 calls.
Moved it to the back so that the more basic tests run first and
basic features are tested first.
Platforms tested:
Tested in heping with pp.
code cleanup
Description:
remove two printf lines accidentally added for debugging at NCSA cobalt.
Solution:
Platforms tested:
No need to test.
Misc. update:
Enhance collective chunk IO support
Description:
Add a new test to check the correctness of the HDF5 library behavior for collective IO mode when one process doesn't have any contribution for IO.
Solution:
Platforms tested:
IBM AIX 5.2(copper)
Linux (heping) mpich-1.2.6
Misc. update:
Attempt to ensure that the parallel cache test runs at a reasonable
speed with large numbers of processors.
Description:
In some cases, the number of random locks and unlocks was a multiple of
the MPI rank.
Solution:
Use rank % 4 instead of simply rank.
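The effect of the change can be sketched as follows. The workload formula and the `toy_*` names are assumptions for illustration; the actual expressions in t_cache.c may differ. It shows that a factor of rank grows without bound as processes are added, while rank % 4 never exceeds 3.

```c
#include <assert.h>

/* Hypothetical "before": the operation count scales with the MPI rank. */
static int toy_ops_by_rank(int base, int rank)
{
    assert(rank >= 0);
    return base * rank;
}

/* Hypothetical "after": the multiplier is capped at 3, whatever the rank. */
static int toy_ops_by_rank_mod4(int base, int rank)
{
    assert(rank >= 0);
    return base * (rank % 4);
}
```

In this sketch rank 63 would perform 63 times the base number of operations under the old scheme, but only 3 times under the new one.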
Platforms tested:
copper
Misc. update:
Add a file that I forgot in my last checkin.
Description:
t_cache.c is the source file for the new parallel metadata cache test.
Solution:
See above.
Platforms tested:
h5committested before the last checkin. Will run another
h5committest shortly, but it shouldn't be necessary as there
are no new changes.
Misc. update:
1) Add parallel test for metadata cache
2) Split serial test for metadata cache into two parts
3) Fix bug in which cache was flushed needlessly when the
cache wasn't full.
4) Performance improvements
5) Update API for parallel cache coherency bug fix.
Description:
See above.
Solution:
See above.
Platforms tested:
h5committest
Misc. update:
bug fix 504
Description:
testpar/t_mpi would hang if $HDF5_NOCLEANUP is set. E.g.,
% env HDF5_NOCLEANUP=yes mpirun -np 3 ./t_mpi
This happened because the environment variables are not exported to all
MPI processes by the mpirun command. So some processes attempted to do cleanup
while others didn't, and some hung waiting for others to act.
Solution:
Instead of each individual program checking getenv, they all now just call h5_cleanup
unconditionally. h5_cleanup now uses getenv_all to check $HDF5_NOCLEANUP if
it is in parallel mode.
Platforms tested:
h5committested, tested pp in heping too.
Feature.
Description:
Modified it so that it can be compiled outside of HDF5 library as a standalone
program. e.g., mpicc -DSTANDALONE prog.c.
Platforms tested:
Tested in Red storm and heping.
Patch.
Description:
Copper would fail with a message of
0032-113 Out of memory in routine unknown, task 0
when run with 3 processes and size 1MB in MPI-IO tests.
It seems to be a copper MPIO error.
Solution:
Reduced the upper bound of default write size to 1/2MB (but
tests only go to 1/4MB) for now, pending a permanent fix from
Copper.
Platforms tested:
Copper.
Minor bug fixes.
Description:
1. Changed free() calls to HDfree()
2. Corrected behavior of -m command-line parameter
3. Changed return value to always return 0.
Solution:
2. The -m flag tells the test to run only the MPI IO tests. However, it would incorrectly make
the test run both the MPI and POSIX tests (same as the default behavior).
3. This test is known to fail on many platforms and, even on platforms where it usually passes,
it is known to suffer transient failures (especially with small test file sizes). Its outcome is
also very dependent on the filesystem on which the test file is created. Corrected
the program to always return success, so that it doesn't interfere with the daily tests.
The motivation for this is that this test has nothing to do with the HDF library and is an
auxiliary test. Failures in t_posix_compliant do not necessarily mean that parallel HDF will fail,
but simply indicate something to look into, especially on new platforms. This is now an
"output only" test, and any errors will only be visible in the output.
Platforms tested:
copper (all of these were minor changes)
Misc. update:
bug fix.
Description:
Fixed the segmentation faults on modi4, copper and tg-login.
They were caused by trying to realloc a pointer returned by
getenv_all, which is not allowed.
Also rearranged the code so that the command-line option is checked first,
then the environment variable, and then the default setup is used. This
removes the need to realloc at all.
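The lookup order described above can be sketched as follows. This is a minimal single-process illustration: it uses plain `getenv` in place of getenv_all, and the function name, the default value, and the environment variable name passed in are all hypothetical. The returned pointer is only ever read, never passed to realloc or free.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fallback used when neither option nor variable is set. */
#define TOY_DEFAULT_PREFIX "."

/*
 * Choose a prefix in priority order: command-line option, then
 * environment variable, then built-in default. The result may point
 * into the environment, so callers must treat it as read-only.
 */
static const char *toy_choose_prefix(const char *opt_value, const char *env_name)
{
    const char *env;

    if (opt_value != NULL && opt_value[0] != '\0')
        return opt_value;

    env = getenv(env_name);
    if (env != NULL && env[0] != '\0')
        return env;

    return TOY_DEFAULT_PREFIX;
}
```

If the caller needs a modifiable string, it should copy the result into its own buffer rather than resize the returned pointer.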
Platforms tested:
Heping, modi4, shanti, copper (copper now shows a different error.)
Bug fix.
Description:
For some strange reason, getopt() does not appear to be defined in unistd.h
on colonelk when the source is compiled with -D_POSIX_SOURCE.
Solution:
Inserted some extern's for the missing variables to make the compiler happy.
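A minimal sketch of this kind of workaround is shown below. The parser function is hypothetical and only handles a `-m` flag for illustration. The extern declarations repeat what `<unistd.h>` normally provides, which is harmless on platforms that already declare them.

```c
#include <unistd.h>

/*
 * Explicit declarations for platforms whose <unistd.h> hides these
 * under -D_POSIX_SOURCE. They are redundant but compatible elsewhere.
 */
extern char *optarg;
extern int   optind, opterr, optopt;

/*
 * Hypothetical parser: returns 1 if -m was given, 0 if it was not,
 * and -1 if an unknown option was seen.
 */
static int toy_saw_m_flag(int argc, char *argv[])
{
    int c;
    int seen = 0;

    optind = 1; /* restart scanning at the first argument */
    opterr = 0; /* keep getopt from printing its own diagnostics */

    while ((c = getopt(argc, argv, "m")) != -1) {
        if (c == 'm')
            seen = 1;
        else
            return -1; /* getopt returned '?' for an unknown option */
    }
    return seen;
}
```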
Platforms tested:
colonelk, copper
Misc. update:
Feature
Description:
Added blurb about future todo's for this test.
Added support for HDF5_PARAPREFIX to determine the directory where the test file
is stored.
Solution:
Used getenv_all to get the value of HDF5_PARAPREFIX. Note that, if a command-line
parameter is passed to the program to specify a path, it will override the value of
HDF5_PARAPREFIX.
Platforms tested:
copper, colonelk
Misc. update:
Added t_posix_compliant to the rest of the build and patched up minor
compile bugs/warnings encountered on other platforms.
Description:
It seems that <getopt.h> needs to be included to get the file to build, even though the man page
seems to indicate that <unistd.h> should be sufficient.
Solution:
Platforms tested:
copper, colonelk, sol
Feature
Description:
Added posix compliance tests.
Solution:
These tests do increasingly complicated sets of writes followed by reads.
POSIX standards say that any read that can be proven to occur after a write
must include the data in that write. These tests attempt to verify whether the
underlying filesystem and i/o layer provide such guarantees.
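The simplest form of such a check can be sketched as follows. This is a single-process illustration with a hypothetical function name, not the code in t_posix_compliant, which performs more complicated sets of writes and reads. It writes a buffer at an offset, reads the same range back, and compares.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write `len` bytes at `offset`, read them back, and compare.
 * Returns 0 if the read saw the written data, 1 if it did not,
 * and -1 on an I/O error or if `len` exceeds the local buffer.
 */
static int toy_write_then_read(const char *path, const char *buf,
                               size_t len, off_t offset)
{
    char    readback[256];
    ssize_t n;
    int     fd;
    int     result;

    if (len > sizeof(readback))
        return -1;

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    n = pwrite(fd, buf, len, offset);
    if (n != (ssize_t)len) {
        close(fd);
        return -1;
    }

    /* This read happens strictly after the write above has returned. */
    n = pread(fd, readback, len, offset);
    if (n < 0)
        result = -1;
    else if (n != (ssize_t)len || memcmp(buf, readback, len) != 0)
        result = 1;
    else
        result = 0;

    close(fd);
    return result;
}
```

A single process on a local filesystem should always get 0 here. The interesting cases are writes and reads issued by different processes through a parallel filesystem.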
Platforms tested:
copper, colonelk, red storm
Misc. update:
Bug fix
Description:
Fortran type generation was broken in two ways. Fixed both.
Solution:
Firstly, there were a couple of path problems. Fixed a typo and
specified the full path of a file.
Secondly, the dependencies weren't right when building with HDF5-specific
commands (make lib, make check-s, etc.). Tweaked dependencies
to fix the problem.
Platforms tested:
mir, modi4, sleipnir
Configure feature
Description:
Added 'make trace' target.
Solution:
Added tracing to 1.7. This was done automatically in 1.6, but left out
of 1.7 until now (oops!).
Tracing in 1.7 only happens manually, when the user types 'make trace.'
Tracing automatically requires more framework than it's worth.
I also fixed a couple of tracing bugs and ran trace.
Platforms tested:
mir, sleipnir, modi4
Misc. update:
Bug fix
Description:
Before this checkin, 'gmake check-s' would fail if there was a file in
the current directory named 'check-s'.
This is fixed under gmake (not sure how to fix for other makes).
Solution:
check, progs, install, etc. are what gmake calls "phony" targets,
which means that no file should be created. These targets can be
specified by a line of the form
.PHONY: check progs install ...
Automake adds this line for targets it knows about, but HDF5 has a
lot of custom rules. This checkin adds a .PHONY line for those rules.
I believe that only gmake recognizes the .PHONY line (at least, pmake
doesn't seem to), but a partial solution is better than none.
This error should occur very rarely anyway (the user has to manually
create files with names like 'build-check-s' or '_test').
Platforms tested:
mir, sleipnir, modi4
Bug fix/feature
Description:
Added support for -shlib in h5fc and h5c++.
Made check-install use -shlib when only shared libraries have been installed.
Solution:
h5fc and h5c++ didn't recognize -shlib. Stole code from h5cc to link against
shared libraries.
When static libraries are disabled, the examples Makefiles will automatically
use the -shlib option to link against shared libraries. Thus,
--disable-static and make check-install should work together.
Platforms tested:
heping(disable-static, enable-static, fortran, c++), modi4 (disable-static, fortran, c++, parallel, enable-static)
Bug fix
Description:
Failed parallel tests now cause make to exit with an error.
Solution:
Edited config/conclude.am to throw an error if parallel test programs fail.
Platforms tested:
heping, modi4
Adding comments and code clean-up for code that tests collective irregular selection
Description:
For easier future maintenance, added comments listing the
(start,count,block,stride) values used for the irregular selections, so that
future collective chunk IO feature development can be tested effectively.
Solution:
Platforms tested:
Linux 2.4(heping), mostly comments, no need to test on other platforms.
Misc. update:
Bug fix
Description:
Changed configure.in to use an environment variable TR to set the path
to the tr utility.
Solution:
There are two kinds of tr on Solaris with slightly different syntax.
HDF5's configure relies on the "standard" tr. Traditionally, HDF5ers
have needed to make sure that the "right" tr was found before the
wrong one in their path; now they can use an environment variable.
Platforms tested:
mir, shanti, sol
Misc. update:
Forgot to update release notes. Off to do that now.
Code clean-up for collective regular chunk IO tests.
Description:
Add descriptions for each test for future maintenance.
Solution:
Platforms tested:
Mostly comments, No need to use h5committest.
heping(linux 2.4)
Misc. update:
Makefile bug fix
Description:
Previously, automake didn't output rules to build perform/mpi-perf or
the test/gen_* programs.
Now these can be built by typing 'make mpi-perf' (or 'make foo') or by
configuring with --enable-build-all.
Solution:
Automake doesn't like having rules for programs it doesn't build. Tricked
it by having these programs built "sometimes"--whenever the user configures with
--enable-build-all. This should be used mostly for testing and to ensure that
these helper programs compile.
***IMPORTANT***
These programs do *not* currently compile. When --enable-build-all is used
(not the default), gen_new_fill fails because it uses an old API. This is
an existing "bug" that has simply been exposed by this checkin.
Platforms tested:
sleipnir, modi4, sol
Misc. update:
bug fix.
Description:
The complex derived datatype test assumed that the fill value would be 0. This is not
the case on all systems.
Solution:
Modified the test to check against a known value in the outbuf array, instead of the fill value.
Platforms tested:
heping and MCR.
Misc. update:
bug fix.
Description:
Fixed typo in a comment. The word "file" was supposed to be "fill." The explanation of
how the complex derived datatype test works is much clearer now.
Solution:
Platforms tested:
minor change.
Misc. update:
bug fix.
Description:
When a parallel test script failed, make would continue because of the
way the tests were set up inside a for loop. Fixed it by issuing an exit 1 inside
the loop.
There was also a typo in the newer-command comparison: it
must be $${chkname} in order to be valid. Also, the test script itself
was not checked in the newer lists. All fixed.
Platforms tested:
h5committested and also hand tested in heping pp mode.
Repair synchronization bug in the metadata cache in PHDF5
Also repair numerous other bugs that surfaced in testing the
bug fix.
Description:
While operations modifying metadata must be collective, we allow
independent reads. This allows metadata caches on different processes
to adjust to different sizes, and to place the entries on their dirty
lists in different orders. Since only process 0 actually writes
metadata to disk (all other processes thought they did, but the writes
were discarded on the theory that they had to be collective), this made
it possible for another process to modify metadata, flush it, and then
read it back in its original (pre-modification) form. The
possibilities for file corruption should be obvious.
Solution:
Make the policy that only process 0 can write to file explicit, and
visible to the metadata caches. Thus only process 0 may flush dirty
entries -- all other caches must retain dirty entries until they are
informed by process 0 that the entries are clean.
Synchronization is handled by counting the bytes of dirty cache entries
created, and then synching up between the caches whenever the sum
exceeds an (eventually user specified) limit. Dirty metadata creation
is consistent across all processes because all operations modifying
metadata must be collective.
This change uncovered many bugs which are repaired in this checkin.
It also required modification of H5HL and H5O to allocate file space
on insertion rather than on flush from cache.
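The byte-counting trigger described above can be sketched as a small counter. The structure and function names are hypothetical and this is not the H5AC implementation. Each process adds the size of every newly dirtied entry and reaches a sync point once the running total exceeds the limit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-process sync state; not the real H5AC structures. */
typedef struct {
    size_t   dirty_bytes; /* dirty metadata bytes created since last sync */
    size_t   threshold;   /* sync once dirty_bytes exceeds this limit */
    unsigned sync_count;  /* number of sync points reached so far */
} toy_sync_t;

/*
 * Record `nbytes` of newly dirtied metadata. Returns true when the
 * running total exceeds the threshold, which is where the caches
 * would synchronize, and resets the counter for the next round.
 */
static bool toy_note_dirty(toy_sync_t *s, size_t nbytes)
{
    s->dirty_bytes += nbytes;
    if (s->dirty_bytes > s->threshold) {
        s->dirty_bytes = 0;
        s->sync_count++;
        return true;
    }
    return false;
}
```

Because metadata modifications are collective, every process feeds the same sequence of sizes into its counter, so all counters cross the threshold on the same call without extra communication.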
Platforms tested:
H5committest, heping(parallel & serial)
Misc. update:
A bug fix
Description:
MPI_STATUS_IGNORE is treated as a NULL pointer by mpich 1.2.4 and similar MPI packages.
This caused a segmentation fault in the MPI derived datatype test.
Solution:
Define MPI_Status status,
and pass &status into MPI_File_read and MPI_File_write.
Platforms tested:
too trivial to test.
Misc. update:
Improvement.
Description:
The test may hang if system failures leave some processors
not working.
Solution:
Added ALARM calls so that all tests must finish within the default
alarm time. Even if a process hangs, the ALARM signal
will terminate it.
Platforms tested:
tg-login2 of NCSA.
Misc. update:
bug fix
Description:
MPI_File_get_size returns a different value in one of the tests on Windows;
commented out the code so that the test is not run on Windows.
a ULL suffix on the hardcoded return VRY return number is needed on AIX
Solution:
Platforms tested:
Windows
AIX
Misc. update:
new features
Description:
add support for compiling the library and testphdf5 in Windows
Solution:
Platforms tested:
Linux
AIX
Solaris
Windows VC6
Misc. update:
New feature.
Description:
Added the time command to the make check target to report the time usage
of each test and test script. This gives us some idea of
how long each test takes and a rough idea of whether it is compute bound or
not.
powerpc-ibm-aix5.x:
Changed the $RUNPARALLEL default setting to allow it to be invoked by the
time command.
Platforms tested:
h5committested.
Feature
Description:
Added H5_CFLAGS, etc. to 1.7 branch.
Now compilation flags can be put in H5_*FLAGS and they'll be used when
building hdf5 but not in h5cc.
Platforms tested:
mir, sleipnir, modi4
Misc. update:
Bug fix
Description:
Disabled C++ shared libraries for Sun Workshop compiler.
Solution:
This bug only seems to happen when using the -xarch=v9 flag to compile in
64-bit mode, but disabling shared libraries entirely for this compiler is
an easier fix (I don't know how to detect 64 bit mode from the command line).
The framework for disabling shared libraries for other C++ compilers is
in place.
Platforms tested:
sol, mir, sleipnir, modi4