Remove all traces of MPI-POSIX VFD and GPFS detection/code.
Remove remaining traces of stream VFD.
Remove testpar/t_posix_compliant test (it's not actually verifying anything).
Clean up H5D__mpio_opt_possible() further.
Moved environment variable that disables MPI collective operations into
MPI-IO VFD (instead of it being in src/H5S.c).
A few other small code cleanups.
Tested on:
Mac OSX/64 10.9.2 (amazon) w/parallel & serial
or 2 processes.
First bug is in testpar/t_mdset.c, where the test reports an error in
addition to skipping the test if there are fewer than three procs. Fix
to just skip the test.
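The intended behaviour can be sketched as follows. This is a small Python model rather than the C test code; MIN_PROCS, run_test and the status strings are hypothetical names chosen for illustration.

```python
# Hypothetical model of the t_mdset.c fix: too few processes means "skip",
# never "error". MIN_PROCS and the status strings are illustrative names.
MIN_PROCS = 3


def run_test(mpi_size, test_body):
    """Return (status, nerrors) for one test run on mpi_size processes."""
    if mpi_size < MIN_PROCS:
        # The old behaviour also counted an error here; the fix only skips.
        return ("SKIPPED", 0)
    nerrors = test_body(mpi_size)
    return ("PASSED" if nerrors == 0 else "FAILED", nerrors)
```

With this shape, a run on 1 or 2 processes contributes nothing to the error count.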
Second bug is in testpar/t_dset.c in actual_io_mode tests, where an
incorrect expected value for the IO mode was set if the number of procs
running the test is 1.
Tested with h5committest.
Stop aliasing property to indicate internal collective metadata operations
with property to perform collective raw data operations from the application.
Tested on:
Mac OSX/64 10.8.3 (amazon) w/parallel
HDFFV-8146 - Remove "multi-chunk IO without optimization" sub-feature from MPI I/O optimization for chunked dataset feature
Description:
The “multi-chunk IO without optimization” feature is removed, and the related xfer property (H5FD_MPIO_CHUNK_MULTI_IO) now goes directly to the “multi-chunk-io” feature.
Also updated, fixed and cleaned up the tests (chunk collective IO and actual chunk opt mode) accordingly.
Tested:
jam (linux32-LE), koala (linux64-LE), ostrich (linuxppc64-BE), fred (mac64-LE), Windows (32-LE cmake), cmake (jam)
HDFFV-8143 Provide a routine(s) for telling the user why the library broke collective data access
Description:
Added the H5Pget_mpio_no_collective_cause() function, which retrieves the reasons why collective I/O was broken during read/write I/O access.
Reasons to break collective I/O:
- SET_INDEPENDENT
- DATATYPE_CONVERSION
- DATA_TRANSFORMS
- MPIPOSIX
- NOT_SIMPLE_OR_SCALAR_DATASPACES (NULL Space)
- POINT_SELECTIONS
- NOT_CONTIGUOUS_OR_CHUNKED_DATASET (Compact or External-Storage)
- FILTERS
Tested:
jam (linux32-LE), koala (linux64-LE), ostrich (linuxppc64-BE), tejeda (mac32-LE), linew (solaris-BE)
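A minimal sketch of how a caller might decode such a cause value, assuming the reasons listed above are reported as bits in a mask. This is a Python model, not the C API; the bit values below are illustrative and are not copied from the HDF5 headers, and describe() is a hypothetical helper.

```python
from enum import IntFlag


class NoCollectiveCause(IntFlag):
    # Bit positions are illustrative only, not the values used by HDF5.
    SET_INDEPENDENT = 0x01
    DATATYPE_CONVERSION = 0x02
    DATA_TRANSFORMS = 0x04
    MPIPOSIX = 0x08
    NOT_SIMPLE_OR_SCALAR_DATASPACES = 0x10
    POINT_SELECTIONS = 0x20
    NOT_CONTIGUOUS_OR_CHUNKED_DATASET = 0x40
    FILTERS = 0x80


def describe(cause):
    """Return the names of every reason whose bit is set in the mask."""
    return [flag.name for flag in NoCollectiveCause if cause & flag]
```

A mask of zero decodes to an empty list, which in this model means collective I/O was not broken.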
Check in "actual I/O mode" feature to trunk. Will merge back to 1.8 branch
after it bakes over the weekend.
Tested on:
FreeBSD/32 8.2 (loyalty) w/gcc4.6, w/C++ & FORTRAN, in debug mode
FreeBSD/64 8.2 (freedom) w/gcc4.6, w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (jam) w/PGI compilers, w/default API=1.8.x,
w/C++ & FORTRAN, w/threadsafe, in debug mode
Linux/64-amd64 2.6 (koala) w/Intel compilers, w/default API=1.6.x,
w/C++ & FORTRAN, in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, w/threadsafe, in production mode
Linux/PPC 2.6 (heiwa) w/C++ & FORTRAN, w/threadsafe, in debug mode
Linux/64-ia64 2.6 (ember) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Linux/64-amd64 2.6 (abe) w/parallel, w/FORTRAN, in debug mode
Mac OS X/32 10.6.8 (amazon) in debug mode
Mac OS X/32 10.6.8 (amazon) w/C++ & FORTRAN, w/threadsafe,
in production mode
error and wanted to exit the test program. This was not good since if only a
subset of processes called MPI_Finalize(), the other processes would likely
hang. That happened on AIX, where it would wait until the alarm signal killed
the processes. Definitely a waste of time.
Solution: Changed it to call MPI_Abort.
That showed another problem. HDF5 sets up atexit post-processing to try to close
unclosed objects, release resources, etc. But if the MPI processes have
encountered an error and have been aborted, it is not likely that any more MPI calls
can function properly. E.g., it would attempt to free some communicators in
the HDF5 MPIO file handle. It would again hang.
Solution: need to call H5dont_atexit() to disable any atexit post-processing.
This must be done early, like before calling H5open. This is added to each
parallel test main program.
testphdf5.h:
Changed macros VRFY and MESG. Added comments too.
testphdf5.c:
t_mpi.c:
t_cache.c:
t_shapesame.c:
Added H5dont_atexit.
Tested: h5committest.
Moved the two shape same tests from testphdf5 to a separate executable,
named t_shapesame. The shape same tests run too long for testphdf5.
In a separate executable, it will be easier to separate any errors in
testphdf5 sub-tests from the shape same tests.
t_shapesame.c:
Contains the shape same tests (cloned from t_rank_projection.c) plus
a duplicate of "testphdf5.c" for now. After verifying it is correct, more
cleanup is needed.
testphdf5.c:
Removed the two shape same tests (chsssdrpio & cbhsssdrpio).
Makefile.am:
Makefile.in:
Added t_shapesame as a new test executable.
Removed t_rank_projections.c from part of testphdf5.
testph5.sh.in:
Temporarily added the "t_shapesame -p" test for testing shape same tests
with MPIO-Posix VFD.
Tested: h5committested, plus serial jam.
of the H5Ocache.c code to update its image of the on disk representation
of the object header on a call to the clear callback.
This wasn't an issue as long as all flushes of the object header were
made from the same process, but if an object header is modified, and
then flushed on one process and cleared on the rest, the changes were
not reflected in the images of the on disk representation on all
processes where the object header was cleared rather than flushed.
If one of these processes did the next flush, the changes were lost in
the on disk representation.
Fixed this by causing all dirty messages to be written to the copy
of the on disk image maintained by the object header code on both flush
and clear.
Also added associated test code in t_mdset.c.
Also checking in some cache debug code developed while chasing this bug.
Commit tested and tested (parallel) on phoenix.
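The failure mode described above can be reproduced with a toy model. This Python sketch is not the H5Ocache.c code; ObjectHeader, simulate and the message names are hypothetical, and the whole image is written on flush as a simplifying assumption.

```python
class ObjectHeader:
    """Toy per-process object header that keeps a copy of the on-disk image."""

    def __init__(self, messages, update_image_on_clear):
        self.messages = dict(messages)   # in-memory message values
        self.image = dict(messages)      # this process's copy of the disk image
        self.dirty = set()               # names of modified messages
        self.update_image_on_clear = update_image_on_clear

    def modify(self, name, value):
        self.messages[name] = value
        self.dirty.add(name)

    def _write_dirty_to_image(self):
        for name in self.dirty:
            self.image[name] = self.messages[name]
        self.dirty.clear()

    def flush(self, disk):
        """Update the image from dirty messages, then write it to disk."""
        self._write_dirty_to_image()
        disk.update(self.image)

    def clear(self):
        """Mark the header clean; the fix also updates the image here."""
        if self.update_image_on_clear:
            self._write_dirty_to_image()
        else:
            self.dirty.clear()


def simulate(update_image_on_clear):
    """Two processes modify the header; a different one flushes each time."""
    disk = {"a": 0, "b": 0}
    p0 = ObjectHeader(disk, update_image_on_clear)
    p1 = ObjectHeader(disk, update_image_on_clear)
    for p in (p0, p1):
        p.modify("a", 1)
    p0.flush(disk)   # process 0 writes the change
    p1.clear()       # process 1 only marks its copy clean
    for p in (p0, p1):
        p.modify("b", 2)
    p1.flush(disk)   # process 1 does the next flush
    p0.clear()
    return disk
```

With update_image_on_clear set to False, the second flush writes a stale value for "a"; with True, both changes survive.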
Bring "shape same" changes from LBL branch to trunk. These changes
allow shapes that are the same, but projected into dataspaces with different
ranks to be detected correctly, and also contain code to project a dataspace
into greater/lesser number of dimensions, so the I/O can proceed in a faster
way.
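A rough sketch of the idea, as a Python model rather than the library's C code. It assumes that two selections of different rank count as the same shape when the extra leading (slowest-changing) dimensions of the higher-rank one all have extent 1; that rule, and the names shape_same and project, are assumptions made for illustration.

```python
def shape_same(sel_a, sel_b):
    """Compare two selection extents, allowing different ranks.

    Assumption for this sketch: the extra leading dimensions of the
    higher-rank selection must all have extent 1.
    """
    small, large = sorted((tuple(sel_a), tuple(sel_b)), key=len)
    extra = len(large) - len(small)
    return all(n == 1 for n in large[:extra]) and large[extra:] == small


def project(sel, rank):
    """Project a selection extent into `rank` dimensions, if possible."""
    sel = tuple(sel)
    if rank >= len(sel):
        # Going up in rank: pad with unit dimensions at the front.
        return (1,) * (rank - len(sel)) + sel
    dropped, kept = sel[:len(sel) - rank], sel[len(sel) - rank:]
    if any(n != 1 for n in dropped):
        raise ValueError("cannot drop a dimension whose extent is not 1")
    return kept
```

For example, a 10x10 selection and a 1x10x10 selection compare as the same shape here, and either can be projected into the other's rank.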
These changes also contain several bug fixes and _lots_ of code
cleanups to the MPI datatype creation code.
Many other misc. code cleanups are included as well...
Tested on:
FreeBSD/32 6.3 (duty) in debug mode
FreeBSD/64 6.3 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (jam) w/PGI compilers, w/default API=1.8.x,
w/C++ & FORTRAN, w/threadsafe, in debug mode
Linux/64-amd64 2.6 (amani) w/Intel compilers, w/default API=1.6.x,
w/C++ & FORTRAN, in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Linux/64-amd64 2.6 (abe) w/parallel, w/FORTRAN, in debug mode
Mac OS X/32 10.6.3 (amazon) in debug mode
Mac OS X/32 10.6.3 (amazon) w/C++ & FORTRAN, w/threadsafe,
in production mode
Make similar change to windows VFD as sec2 VFD, when converting from
a family file to a single file.
Tweak file sizes expected for parallel tests.
Tested on:
tg-login3, w/parallel
Windows (post facto)
Remove trailing whitespace from C/C++ source files, with the following
script:
foreach f (*.[ch] *.cpp)
sed 's/[[:blank:]]*$//' $f > sed.out && mv sed.out $f
end
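For readers without csh, the same substitution can be sketched in Python. The function name strip_trailing_blanks is hypothetical; the pattern mirrors the sed expression, treating [[:blank:]] as space and tab.

```python
import re

# Spaces and tabs at the end of any line (the sed script's [[:blank:]]*$).
_TRAILING_BLANKS = re.compile(r"[ \t]+$", re.MULTILINE)


def strip_trailing_blanks(text):
    """Remove trailing spaces and tabs from every line of text."""
    return _TRAILING_BLANKS.sub("", text)
```

Leading indentation and line endings are left untouched, so the change stays whitespace-only.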
Tested on:
Mac OS X/32 10.5.5 (amazon)
No need for h5committest, just whitespace changes...
Description:
As part of our Windows cleanup, we are trying to remove Windows-specific tweaks in the source code. There are many instances where Windows code is introduced via ifdef's. We re-evaluated whether they are still required, and found that many of them are not. Others we changed to "feature"-specific code, rather than Windows-specific.
Tested:
VS2005 on WinXP
VS.NET on WinXP
h5committest (kagisopp, smirom, linew)
1. In H5Dwrite and H5Dread, let the data buffer point to a fake address if the application passes
in an empty buffer. This is mainly for MPIO programs in which some processes may not have any
data to write or read but still participate in the I/O. This works around problems in some MPI
implementations, such as ChaMPIon on tungsten, which do not support an empty buffer.
2. ChaMPIon on tungsten does not correctly support complex derived MPI datatypes, nor collective
I/O when some processes have no data to write or read. Detect the compiler
"cmpicc" in the system-specific config file and set the variables for these two cases to false.
The PHDF5 library already has a way to switch collective chunked I/O to independent
I/O in these two cases.
3. A bug fix - During the optimization work for compound data I/O, the case for switching
collective chunked I/O to independent I/O was left out. Fixed it by adding I/O caching to it in
H5D_multi_chunk_collective_io in H5Dmpio.c.
Tested on tungsten, cobalt, and kagiso for parallel; on linew and smirom for serial.
Tested platform:
Kagiso only since it is only a comment block change. If it works in one
machine, it should work in all, I hope. Still need to check the parallel
build on copper.
The h5_mpi_get_file_size() routine is no longer used. The unused code caused some
compiler warning messages. Removed the whole routine.
Tested in heping pp mode.
Bug Fix (Bug 544)
Description:
SGI Altix's MPI_File_get_size overflowed for files of 2GB or more.
Put in a temporary patch to use stat() instead so that Cobalt
passes this test (bigdset). A better fix (such as detecting whether
MPI_File_get_size works before using it) is preferred.
Tested:
Cobalt and Heping.
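The symptom and the workaround can be illustrated with a short Python model. It assumes the overflow behaves like a signed 32-bit wraparound, which the message does not state explicitly; as_int32, stat_size and reported_size_is_usable are hypothetical helper names.

```python
import os

INT32_MAX = 2**31 - 1


def as_int32(n):
    """Reinterpret n as a signed 32-bit integer (two's-complement wrap)."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n > INT32_MAX else n


def stat_size(path):
    """File size in bytes as reported by stat()."""
    return os.stat(path).st_size


def reported_size_is_usable(reported, path):
    """Check a size reported by another layer against stat() before use."""
    return reported == stat_size(path)
```

Under that assumption a file of exactly 2GB would be reported with a negative size, which is the kind of mismatch the check is meant to catch.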
To activate this test,
add the command option -i.
For example, on IBM AIX, typing "poe testphdf5 -i" tests the library with independent IO with file setview. It simply replaces all the collective IO tests with independent IO with file setview.
Code cleanup
Description:
Trim trailing whitespace in Makefile.am and C/C++ source files to make
diffing changes easier.
Platforms tested:
None necessary, whitespace only change
Adding parallel tests for optional collective chunk APIs
Description:
Three new APIs
"H5Pset_dxpl_mpio_chunk_opt_ratio
H5Pset_dxpl_mpio_chunk_opt_num
H5Pset_dxpl_mpio_chunk_opt"
for optional optimization choices from users
have been added to the libraries.
This check-in adds six tests to verify the functionality and correctness
of these APIs.
These tests need to be verified with 3 or more processors and with MPI-IO driver only.
Solution:
Using H5Pinsert, H5Pget, H5Pset to verify that the library indeed goes into the branch we hope for.
Using H5_HAVE_INSTRUMENT macro to isolate these changes so that it won't affect or be misused by the application.
Platforms tested:
h5committest(shanti still refused to be connected)
Parallel tests on heping somehow are skipped. Manually testing at heping. Have checked
1,2,3,4,5 processes.
Misc. update:
code cleanup
Description:
remove two printf lines accidentally added for debugging at NCSA cobalt.
Solution:
Platforms tested:
No need to test.
Misc. update:
Enhance collective chunk IO support
Description:
Add a new test to check the correctness of the HDF5 library behavior in collective IO mode when one process does not contribute any data to the IO.
Solution:
Platforms tested:
IBM AIX 5.2(copper)
Linux (heping) mpich-1.2.6
Misc. update:
bug fix 504
Description:
testpar/t_mpi would hang if $HDF5_NOCLEANUP is set. E.g.,
% env HDF5_NOCLEANUP=yes mpirun -np 3 ./t_mpi
This happened because the environment variables are not exported to all
MPI processes by the mpirun command. So, some processes attempted to do cleanup
while others did not, and some hung waiting for others to act.
Solution:
Instead of each program checking getenv individually, they all now just call
h5_cleanup unconditionally. h5_cleanup now uses getenv_all to check $HDF5_NOCLEANUP if
it is in parallel mode.
Platforms tested:
h5committested, tested pp in heping too.
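The point of the fix is that every process must reach the same decision. A minimal Python model of that idea follows; it assumes one rank reads the variable and its value is shared with all the others, which is an assumption about how getenv_all behaves, and the list of dicts stands in for per-rank environments.

```python
def getenv_all(environments, name, root=0):
    """Give every rank the value that `root` sees for `name`.

    `environments` holds one dict per rank. A real implementation would
    broadcast the root's value over MPI instead of indexing a list.
    """
    value = environments[root].get(name)    # only the root consults its env
    return [value for _ in environments]    # the same answer on every rank


def should_cleanup(environments):
    """Collective decision: clean up unless the root sees HDF5_NOCLEANUP."""
    return [v is None for v in getenv_all(environments, "HDF5_NOCLEANUP")]
```

Because all ranks receive the same value, none of them is left waiting for a cleanup step the others have skipped.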
Code clean-up for collective regular chunk IO tests.
Description:
Add descriptions for each test for future maintenance.
Solution:
Platforms tested:
Mostly comments, No need to use h5committest.
heping(linux 2.4)
Misc. update:
new features
Description:
add support for compiling the library and testphdf5 in Windows
Solution:
Platforms tested:
Linux
AIX
Solaris
Windows VC6
Misc. update:
Bug #281
Description:
A dataset created in serial mode with H5D_ALLOC_TIME_INCR allocation setting
was not extendible, either explicitly by H5Dextend or implicitly by writing
to unallocated chunks. This was because parallel mode expects the allocation
mode to be H5D_ALLOC_TIME_EARLY only.
Solution:
Modified the library to allocate more space when needed or directed if the
file is opened in parallel mode, independent of what the dataset allocation
mode is.
Platforms tested:
Heping pp.
Code cleanup
Description:
Trim trailing whitespace, which is making 'diff'ing the two branches
difficult.
Solution:
Ran this script in each directory:
foreach f (*.[ch] *.cpp)
sed 's/[[:blank:]]*$//' $f > sed.out && mv sed.out $f
end
Platforms tested:
FreeBSD 4.11 (sleipnir)
Too minor to require h5committest
Modified collective chunk IO test
Description:
One test (collective chunk IO test 4) is no longer needed. Commented it out for now.
The code should be cleaned up later.
Solution:
Platforms tested:
Misc. update:
Updating phase 2 work of collective IO
Description:
The current collective IO tests cannot run with more than a certain number of
processors. This change lifts that restriction. However, the tests may be slower.
Solution:
Platforms tested:
linux 2.4, AIX 5.1, Linux 2.4 IA64 and IRIX 6.5
(I haven't tested with a large number of processors because of the limits of the machines)
Misc. update:
Bug fix.
Description:
The irregular chunk IO tests do not work for process counts larger than 3.
Added a check of the number of processes and skip the irregular chunk IO
tests if the number of processes is larger than 3.
Revamped the collective chunk IO tests too.
Platforms tested:
Tested in mir.
Misc. update:
Support collective IO for irregular selection.
Description:
Solution:
Platforms tested:
Linux with MPICH
AIX with mpcc_r
Linux with ChaMPIon
Altix with intel
Misc. update:
typo fix and small improvement.
Description:
t_coll_chunk.c:
ccdataset_vrfy() was using a wrong routine name to identify itself.
testphdf5.c:
Add a definition of NFILENAME to be the common dimension size of
FILENAME[] and filenames[][] since they must have the same first
dimension size.
Platforms tested:
h5committested.
Bug Fix/Code Cleanup/Doc Cleanup/Optimization/Branch Sync :-)
Description:
Generally speaking, this is the "signed->unsigned" change to selections.
However, in the process of merging code back, things got stickier and stickier
until I ended up doing a big "sync the two branches up" operation. So... I
brought back all the "infrastructure" fixes from the development branch to the
release branch (which I think were actually making some improvement in
performance) as well as fixed several bugs which had been fixed in one branch,
but not the other.
I've also tagged the repository before making this checkin with the label
"before_signed_unsigned_changes".
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel & fphdf5
FreeBSD 4.10 (sleipnir) w/threadsafe
FreeBSD 4.10 (sleipnir) w/backward compatibility
Solaris 2.7 (arabica) w/"purify options"
Solaris 2.8 (sol) w/FORTRAN & C++
AIX 5.x (copper) w/parallel & FORTRAN
IRIX64 6.5 (modi4) w/FORTRAN
Linux 2.4 (heping) w/FORTRAN & C++
Misc. update: