Bug Fix (Bug 544)
Description:
SGI Altix's MPI_File_get_size overflowed at 2GB and more.
Replaced h5_mpi_get_file_size calls by h5_get_file_size.
Tested:
Cobalt.
To activite this test,
add the command option -i.
For example, at IBM AIX, type "poe testphdf5 -i" will test the library with independent IO with file setview. It simply replaces all the collective IO tests with independent IO with file setview.
Repair synchronization bug in the metadata cache in PHDF5
Also repair numerous other bugs that surfaced in testing the
bug fix.
Description:
While operations modifying metadata must be collective, we allow
independant reads. This allows metadata caches on different processes
to adjust to different sizes, and to place the entries on their dirty
lists in different orders. Since only process 0 actually writes
metadata to disk (all other processes thought they did, but the writes
were discarded on the theory that they had to be collective), this made
it possible for another process to modify metadata, flush it, and then
read it back in in its original form (pre-modification) form. The
possibilities for file corruption should be obvious.
Solution:
Make the policy that only process 0 can write to file explicit, and
visible to the metadata caches. Thus only process 0 may flush dirty
entries -- all other caches must retain dirty entries until they are
informed by process 0 that the entries are clean.
Synchronization is handled by counting the bytes of dirty cache entries
created, and then synching up between the caches whenever the sum
exceeds an (eventually user specified) limit. Dirty metadata creation
is consistent across all processes because all operations modifying
metadata must be collective.
This change uncovered may bugs which are repaired in this checkin.
It also required modification of H5HL and H5O to allocate file space
on insertion rather than on flush from cache.
Platforms tested:
H5committest, heping(parallel & serial)
Misc. update:
bug fix
Description:
the MPI_File_get_size returns a different value in one of the tests in Windows
comment the code and not run the test in windows
a ULL suffix on the harcoded return VRY return number is needed on AIX
Solution:
Platforms tested:
Windows
AIX
Misc. update:
new features
Description:
add support for compiling the library and testphdf5 in Windows
Solution:
Platforms tested:
Linux
AIX
Solaris
Windows VC6
Misc. update:
Code cleanup
Description:
Trim trailing whitespace, which is making 'diff'ing the two branches
difficult.
Solution:
Ran this script in each directory:
foreach f (*.[ch] *.cpp)
sed 's/[[:blank:]]*$//' $f > sed.out && mv sed.out $f
end
Platforms tested:
FreeBSD 4.11 (sleipnir)
Too minor to require h5committest
Bug Fix/Code Cleanup/Doc Cleanup/Optimization/Branch Sync :-)
Description:
Generally speaking, this is the "signed->unsigned" change to selections.
However, in the process of merging code back, things got stickier and stickier
until I ended up doing a big "sync the two branches up" operation. So... I
brought back all the "infrastructure" fixes from the development branch to the
release branch (which I think were actually making some improvement in
performance) as well as fixed several bugs which had been fixed in one branch,
but not the other.
I've also tagged the repository before making this checkin with the label
"before_signed_unsigned_changes".
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel & fphdf5
FreeBSD 4.10 (sleipnir) w/threadsafe
FreeBSD 4.10 (sleipnir) w/backward compatibility
Solaris 2.7 (arabica) w/"purify options"
Solaris 2.8 (sol) w/FORTRAN & C++
AIX 5.x (copper) w/parallel & FORTRAN
IRIX64 6.5 (modi4) w/FORTRAN
Linux 2.4 (heping) w/FORTRAN & C++
Misc. update:
Bug fix
Description:
Relax restrictions on parallel I/O to allow compressed, chunked datasets
to be read in parallel (collective access will be degraded to independent
access, but will retrieve the information still).
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel
Solaris 2.7 (arabica)
IRIX64 6.5 (modi4)
h5committest
Add test to verify the fix of the parallel I/O mode confusion bug.
Description:
While the parallel I/O mode confusion bug is fixed, an automated
regression test for this bug would be useful.
Solution:
Added a modified version of the original bug demonstration program
to testphdf5.
Platforms tested:
copper
h5committested
eirene (parallel)
Misc. update:
Fix of bug/feature which caused testphdf5 to fail when run with more than
32 processes. This fix was originally applied to the 1.6 branch, and is
now ported to the 1.7 branch.
Description:
32 process limit was a hard coded constant.
Solution:
Modified most of the routines in t_mdset.c to adapt dynamically to
the current number of processes. In passing, I also tidied up a
few memory leaks.
Note that one new routine had been added to the 1.7 version of
t_mdset.c since the 1.6/1.7 split. I applied changes to this
routine as well.
Platforms tested:
h5committested
Tested on eirene with 4 and 8 processes. Couldn't go higher.
I would have like to repeat my 32 - 64 process tests on copper,
but was unable to do so as I don't have access to cu12t at
present. Perhaps I will be able to manage this in the next
few days. However, Albert wanted these changes checked in to
the 1.7 branch so he could work with them.
Misc. update:
Code cleanup.
Removed bunch of old options (r,w,v,i,b,e) that are no longer valid
or useful after adopting the general test interface. Moved the test
of sizeof MPI_Offset into the test routine itself.
Platforms tested:
Eirene and Sol using pp mode.
feature
Description:
Change testphdf5 to use the common test program syntax.
Needed to change the protocols of all test programs to
fit the requirement of the common test syntax.
Platforms tested:
"h5committested".
Also tested in sol with PP mode.
Improvement.
Description:
Complete change of the verbose control to use the routines provided by
the test/libh5test.a.
Also put in a temporary fix for the H5Eset_auto() and H5Eget_auto()
so that the Compat code are isolated in one place rather than all over
the source file.
Platforms tested:
Tested in Eirene (parallel).
Misc. update:
Added a test of fill value before any data is written to a dataset.
Rename short_dataset() as dataset_fillvalue() as it reflects better
the tests. Also removed the option of -S since the fill value test
will be tested always.
Platforms tested:
"h5committested"
Misc. update:
Feature.
Description:
Added the short_dataset test (was in v1.6 first.)
Platforms tested:
Tested in eirene (pp) only since these have been tested in v1.6 already.
Misc. update:
Bug/feature fix.
Description:
Relax restriction on parallel writing to compact datasets to allow partial
I/O.
Updates to reference manual mentioning the issues involved are delayed until
reference manual 'lock' is removed later this week.
Platforms tested:
FreeBSD 4.9 (sleipnir)
too minor to require h5committest
Bug fix
Description:
The MPI_File_set_size() routine on ASCI Red is not able to extend files
so that they are larger than 2GB.
Solution:
Add an extra macro which controls whether MPI_File_set_size() can handle
>2GB offsets or if our "older" way of reading a byte, then writing a byte at
the appropriate offset should be used.
Platforms tested:
FreeBSD 4.9 (sleipnir)
h5committest
Feature Add
Description:
Added knob so that the programmer can enable or disable GPFS
hints during runtime instead of having it only enabled at
configure/compile time. Some of the public APIs were changed
to add an extra parameter for this option...
Platforms tested:
Blue (LLNL). It only affects the MPI/POSIX driver, so no need
to test it on non-GPFS platforms.
Misc. update:
simple code cleanup.
Description:
While debug a problem in multiple_group_write(), noticed some returned
values were not checked. Added code to check on all returned code.
Platforms tested:
h5committested.
Misc. update:
Update
Description:
Updated (and in some cases added) the copyright statement.
Platforms tested:
Linux (Comment changes...only tested if they compile)
Misc. update:
Purpose:
More test.
Description:
Test independent read of groups and chunked dataset.
Solution:
This test is similar to multiple group test. So just add it in the
testphdf5.c,h.
Platforms tested:
modi4, eirene.
API name change
Description:
Change all "space time" references to "alloc time", including API functions
and macro definitions, etc.
Platforms tested:
FreeBSD 4.6 (sleipnir) w/C++
Solaris 2.7 (arabica) w/FORTRAN
IRIX64 6.5 (modi4) w/parallel & FORTRAN
Additional test
Description:
Add in a fill-value to one of the tests, to make certain that they are
handled correctly.
Platforms tested:
FreeBSD 4.6 (sleipnir) w/serial & parallel
Code cleanup
Description:
Cleaned up some compiler warnings.
Platforms tested:
FreeBSD 4.6 (sleipnir) w/serial & parallel. Will be testing on IRIX64
6.5 (modi4) in serial & parallel shortly.
Purpose:
Design for compact dataset
Description:
Compact dataset is stored in the header message for dataset layout.
Platforms tested:
arabica, eirene.
Code cleanup
Description:
Changed ifdef name from "VERBOSE" to "BARRIER_CHECKS", to better describe
what it affects.
Platforms tested:
IRIX64 6.5 (modi4) w/parallel
Bug Fix
Description:
Currently, only process 0 is writing attribute data to a file. This is
incorrect, because the raw data for attributes is cached in memory until
the object header is written and other processes are not able to read the
correct attribute information.
Solution:
Have all processes participate in writing the attribute data.
Platforms tested:
IRIX64 6.5 (modi4)
New test feature
Description:
Added create_faccess_plist() that create just MPIO or split+MPIO
file-access property list. This in turn can run parallel tests
with just MPIO or with Split-file VFD too.
Added -s option for split-file Plus MPIO tests.
For testphdf5.c: removed a bunch of old debug code that got left
in by mistake.
Platforms tested:
Modi4 and eirene parallel.
But it has uncovered errors in the library. The test program
is correct though. Checking the test program in so that it won't
get lost and can be used for debugging. Also, the -s is not used
by default during test. At least it won't abort "make check".
Purpose:
Added attribute test
Description:
attribute test is added into t_mdset.c. At this moment, testing failed
on SunOS 5.7 and Linux. But I believe the problem is from dataset
collective writing in t_dset.c.
Platforms tested:
SGI MPI, MPICH(IRIX64 N32, IRIX64 64, IRIX, Linux, SunOS).
Purpose:
Multiple-group testing
Description:
Added multiple groups under root group and multiple subgroups of
certain levels. Also writes several datasets in each group in
parallel.
Solution:
Platforms tested:
MPICH(IRIX 6.5, IRIX64 6.5(64), IRIX64 6.5(N32), Linux, SunOS 5.6)
and SGI MPI(IRIX64 6.5).
New features
Description:
Some testers found the filename lengths too short.
Changed it to use the FILENAME_MAX usually defined in stdio.h.
If not, set it to 512 which should be sufficient for users
but should not exceed any system limits.
Also added a new test parameters of ndatasets so that the tester
can specific a different number of datasets for the multiple
datasets tests.
Changed the datatype of datasets created to DOUBLE. This eliminates
the current racing conditions. But the racing bugs during conversion
still need to be tracked down and squashed.
Platforms tested:
Modi4 -64.
Simple changes
Description:
testphdf5.h:
Call MPI_Abort when error is detected. MPI_Finalized was used
before but it might hang if the test has already encountered errors.
Also, it does not do the H5Eprint any more since auto report is on.
t_mdest.c:
Changed the variable name of rank and nprocs to mpi_rank and mpi_size
so that it is the same with the other tests and can use the VRFY macro
call.
Platforms tested:
modi4-64.
Bug fix, cleanup,...
Description:
The test was doing the hyperslab select incorrectly (thinking
count was the block length.
Solution:
Fixed it to do the correct hyperslab selection.
Changed it to calculate different data for different datasets.
Changed output by rows instead by cols. It tests the purpose
of creating multiple datasets the same but runs faster.
Platforms tested:
modi4-64.
Added features
Description:
There were no automatic tests for transfering zero elements.
Solution:
t_dset.c:
Added two new patterns of ZROW (zero rows for process 0)
and ZCOL(zero columns for process 0).
ZROW test was added but it failed because the current library
does not accept it. Not compiled in now. Need to fix the
library before turning it back on again and also to add the
ZCOL test.
t_mdset.c:
Added statement to show progress. Also the MPI_Barrier() call
get processes synchornoized. It eliminates the racing condition
but this is not a permenant solution. The library code needs to
be fixed.
testphdf5.c:
Added a bunch of MPI_Type_XXX debug code. Added the -md
option to skip the multiple datasets tests. Changed the cosmitic
appearance of the banner messages.
testphdf5.h:
When an error is detected, the old way was to call MPI_Finalize()
before exiting. This sometimes hangs because some processes
may be waiting for a message of a different tag. Changed to
call MPI_Abort() for now so that the whole MPI job would
abort rather than hanging due resource limits exceeded.
Added the definition of ZROW and ZCOL.
Platforms tested:
Modi4 -64.
Increase the test size to 32. Put in a check to make sure number of
processes are not bigger than SIZE.
testphdf5.c:
Fixed a mistake in the prototype of pause_proc to reflect no arguments.