Check in Mohamad's changes to support collective I/O on point selections,
along with some other minor cleanups.
Tested on:
Mac OSX/64 10.9.2 (amazon) w/parallel & serial
(h5committest forthcoming)
Stop aliasing property to indicate internal collective metadata operations
with property to perform collective raw data operations from the application.
Tested on:
Mac OSX/64 10.8.3 (amazon) w/parallel
HDFFV-8146 - Remove "multi-chunk IO without optimization" sub-feature from MPI I/O optimization for chunked dataset feature
Description:
The “multi-chunk IO without optimization” feature is removed, and the related xfer property (H5FD_MPIO_CHUNK_MULTI_IO) now goes directly to the “multi-chunk-io” feature.
Also updated, fixed, and cleaned up the tests (chunk collective IO and actual chunk opt mode) accordingly.
Tested:
jam (linux32-LE), koala (linux64-LE), ostrich (linuxppc64-BE), fred (mac64-LE), Windows (32-LE cmake), cmake (jam)
HDFFV-8143 Provide a routine(s) for telling the user why the library broke collective data access
Description:
Added the H5Pget_mpio_no_collective_cause() function, which retrieves the reasons why collective I/O was broken during read/write I/O access.
Reasons to break collective I/O:
- SET_INDEPENDENT
- DATATYPE_CONVERSION
- DATA_TRANSFORMS
- MPIPOSIX
- NOT_SIMPLE_OR_SCALAR_DATASPACES (NULL Space)
- POINT_SELECTIONS
- NOT_CONTIGUOUS_OR_CHUNKED_DATASET (Compact or External-Storage)
- FILTERS
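A minimal sketch of how an application might turn such reasons into a readable message. It assumes the reasons are reported as bit flags that can be combined; the CAUSE_* names, their bit values, and describe_no_collective_cause() are stand-ins invented for this illustration and are not the library's own constants or API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-ins for the reasons listed above. The names and bit
 * values are assumptions for this sketch, not the library's constants. */
enum {
    CAUSE_SET_INDEPENDENT                   = 0x01,
    CAUSE_DATATYPE_CONVERSION               = 0x02,
    CAUSE_DATA_TRANSFORMS                   = 0x04,
    CAUSE_MPIPOSIX                          = 0x08,
    CAUSE_NOT_SIMPLE_OR_SCALAR_DATASPACES   = 0x10,
    CAUSE_POINT_SELECTIONS                  = 0x20,
    CAUSE_NOT_CONTIGUOUS_OR_CHUNKED_DATASET = 0x40,
    CAUSE_FILTERS                           = 0x80
};

typedef struct {
    uint32_t    bit;
    const char *name;
} cause_name_t;

static const cause_name_t cause_names[] = {
    { CAUSE_SET_INDEPENDENT,                   "SET_INDEPENDENT" },
    { CAUSE_DATATYPE_CONVERSION,               "DATATYPE_CONVERSION" },
    { CAUSE_DATA_TRANSFORMS,                   "DATA_TRANSFORMS" },
    { CAUSE_MPIPOSIX,                          "MPIPOSIX" },
    { CAUSE_NOT_SIMPLE_OR_SCALAR_DATASPACES,   "NOT_SIMPLE_OR_SCALAR_DATASPACES" },
    { CAUSE_POINT_SELECTIONS,                  "POINT_SELECTIONS" },
    { CAUSE_NOT_CONTIGUOUS_OR_CHUNKED_DATASET, "NOT_CONTIGUOUS_OR_CHUNKED_DATASET" },
    { CAUSE_FILTERS,                           "FILTERS" }
};

/* Write a comma-separated list of the reasons set in `cause` into `out`
 * (truncating if `out` is too small) and return how many reasons were set.
 * When no reason is set, `out` receives "COLLECTIVE". */
int describe_no_collective_cause(uint32_t cause, char *out, size_t out_size)
{
    int    count = 0;
    size_t i;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';

    for (i = 0; i < sizeof(cause_names) / sizeof(cause_names[0]); i++) {
        if (cause & cause_names[i].bit) {
            if (count > 0)
                strncat(out, ", ", out_size - strlen(out) - 1);
            strncat(out, cause_names[i].name, out_size - strlen(out) - 1);
            count++;
        }
    }
    if (count == 0)
        strncat(out, "COLLECTIVE", out_size - 1);
    return count;
}
```

In a real program the flag value would come from the property list query added in this commit rather than being built by hand.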
Tested:
jam (linux32-LE), koala (linux64-LE), ostrich (linuxppc64-BE), tejeda (mac32-LE), linew (solaris-BE)
Check in "actual I/O mode" feature to trunk. Will merge back to 1.8 branch
after it bakes over the weekend.
Tested on:
FreeBSD/32 8.2 (loyalty) w/gcc4.6, w/C++ & FORTRAN, in debug mode
FreeBSD/64 8.2 (freedom) w/gcc4.6, w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (jam) w/PGI compilers, w/default API=1.8.x,
w/C++ & FORTRAN, w/threadsafe, in debug mode
Linux/64-amd64 2.6 (koala) w/Intel compilers, w/default API=1.6.x,
w/C++ & FORTRAN, in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, w/threadsafe, in production mode
Linux/PPC 2.6 (heiwa) w/C++ & FORTRAN, w/threadsafe, in debug mode
Linux/64-ia64 2.6 (ember) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Linux/64-amd64 2.6 (abe) w/parallel, w/FORTRAN, in debug mode
Mac OS X/32 10.6.8 (amazon) in debug mode
Mac OS X/32 10.6.8 (amazon) w/C++ & FORTRAN, w/threadsafe,
in production mode
The shape same tests ran too long. Break them into smaller sub-tests
so that each one finishes in a shorter time. This makes it easier to tell
which sub-test is taking too much time and whether errors occur in
one of the sub-tests.
This one breaks the contig_hyperslab_dr_pio_test() into 4 smaller
sub-tests.
Tested: h5committest
John Mainzer fixed the bug and added a test which wrote a file and flushed it a few
times, closed the file, then opened it in serial and read a simple structure. I
changed the test into two parts running in parallel, ..._writer and ..._reader,
and have the reader verify the file after every flush by the writer.
Tested: parallel in Jam and Amani.
of the H5Ocache.c code to update its image of the on disk representation
of the object header on a call to the clear callback.
This wasn't an issue as long as all flushes of the object header were
made from the same process, but if an object header is modified, and
then flushed on one process and cleared on the rest, the changes were
not reflected in the images of the on disk representation on all
processes where the object header was cleared rather than flushed.
If one of these processes did the next flush, the changes were lost in
the on disk representation.
Fixed this by causing all dirty messages to be written to the copy
of the on disk image maintained by the object header code on both flush
and clear.
Also added associated test code in t_mdset.c.
Also checking in some cache debug code developed while chasing this bug.
Commit tested and tested (parallel) on phoenix.
Bring "shape same" changes from LBL branch to trunk. These changes
allow shapes that are the same, but projected into dataspaces with different
ranks to be detected correctly, and also contain code to project a dataspace
into greater/lesser number of dimensions, so the I/O can proceed in a faster
way.
These changes also contain several bug fixes and _lots_ of code
cleanups to the MPI datatype creation code.
Many other misc. code cleanups are included as well...
Tested on:
FreeBSD/32 6.3 (duty) in debug mode
FreeBSD/64 6.3 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (jam) w/PGI compilers, w/default API=1.8.x,
w/C++ & FORTRAN, w/threadsafe, in debug mode
Linux/64-amd64 2.6 (amani) w/Intel compilers, w/default API=1.6.x,
w/C++ & FORTRAN, in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Linux/64-amd64 2.6 (abe) w/parallel, w/FORTRAN, in debug mode
Mac OS X/32 10.6.3 (amazon) in debug mode
Mac OS X/32 10.6.3 (amazon) w/C++ & FORTRAN, w/threadsafe,
in production mode
1. In H5Dwrite and H5Dread, let the data buffer point to a fake address if the application passes
in an empty buffer. This is mainly for MPIO programs in which some processes have no
data to write or read but still participate in the I/O. This works around MPI implementations,
such as ChaMPIon on tungsten, that do not support an empty buffer.
2. ChaMPIon on tungsten does not correctly support complex derived MPI datatypes, nor collective
I/O when some processes have no data to write or read. Detect the compiler
"cmpicc" in the system-specific config file and set the variables for these two cases to false.
The PHDF5 library already has a way to switch collective chunked I/O to independent I/O
in these two cases.
3. A bug fix - During the optimization work for compound data I/O, the case for switching
collective chunked I/O to independent I/O was left out. Fixed it by adding I/O caching to it in
H5D_multi_chunk_collective_io in H5Dmpio.c.
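The fake-address idea in item 1 can be sketched as follows. This is only an illustration under stated assumptions: io_buffer_or_fake() and fake_buf_g are hypothetical names, and the real change lives inside H5Dwrite and H5Dread rather than in a helper like this.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder object whose address stands in for an empty application
 * buffer. Hypothetical name; it is never read or written through. */
static char fake_buf_g;

/* Return a pointer that is safe to hand to the I/O layer:
 *   - the caller's buffer when one was supplied;
 *   - the placeholder's address when the caller passed NULL and has
 *     nothing to transfer (it still participates in the I/O call);
 *   - NULL when the caller passed NULL but claims to have data, which
 *     remains an error for the caller to report. */
const void *io_buffer_or_fake(const void *buf, size_t nelmts)
{
    if (buf != NULL)
        return buf;
    if (nelmts == 0)
        return &fake_buf_g;
    return NULL;
}
```

Keeping the NULL-with-data case as an error means the substitution only hides the empty-buffer situation and does not mask genuine caller mistakes.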
Tested on tungsten, cobalt, and kagiso for parallel; on linew and smirom for serial.
Tested platform:
Kagiso only, since it is only a comment block change. If it works on one
machine, it should work in all, I hope. Still need to check the parallel
build on copper.
To activate this test,
add the command option -i.
For example, on IBM AIX, typing "poe testphdf5 -i" will test the library with independent IO with file setview. It simply replaces all the collective IO tests with independent IO with file setview.
Adding parallel tests for optional collective chunk APIs
Description:
Three new APIs,
H5Pset_dxpl_mpio_chunk_opt_ratio,
H5Pset_dxpl_mpio_chunk_opt_num, and
H5Pset_dxpl_mpio_chunk_opt,
which let users choose optional optimizations,
have been added to the library.
This check-in adds six tests to verify the functionality and correctness
of these APIs.
These tests need to be verified with 3 or more processors and with MPI-IO driver only.
Solution:
Use H5Pinsert, H5Pget, and H5Pset to verify that the library indeed takes the branch we expect.
Use the H5_HAVE_INSTRUMENT macro to isolate these changes so that they cannot affect or be misused by applications.
Platforms tested:
h5committest (shanti still refused connections)
Parallel tests on heping are somehow skipped, so tested manually on heping with
1, 2, 3, 4, and 5 processes.
Misc. update:
Add tests for optional APIs to support collective chunk IO
Description:
In order to test whether the library picks up the user's options,
the number of chunks needs to vary across processes, and
the number of processes selecting a given chunk also
needs to vary.
Solution:
Create two cases:
1. Each chunk is selected by only one unique process; in this case the
library should use independent I/O for the collective call.
2. One-third of the processes occupy the top half of the whole domain and
the rest of the processes occupy the lower half of the domain.
The total number of chunks is fixed at 8.
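The split in the second case can be sketched in C. The commit message does not say which ranks make up the one-third group, so this sketch assumes it is the first mpi_size / 3 ranks; domain_half_t and half_for_rank() are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Which half of the domain a process selects from. */
typedef enum { HALF_TOP, HALF_BOTTOM } domain_half_t;

/* Assign the first third of the ranks (rounded down) to the top half of
 * the domain and all remaining ranks to the lower half. At least three
 * processes are required so that the top group is not empty. */
domain_half_t half_for_rank(int rank, int mpi_size)
{
    int top_count;

    assert(mpi_size >= 3);
    assert(rank >= 0 && rank < mpi_size);

    top_count = mpi_size / 3;
    return (rank < top_count) ? HALF_TOP : HALF_BOTTOM;
}
```

With 6 processes, ranks 0 and 1 select from the top half and ranks 2 through 5 from the lower half, so the number of processes selecting a chunk differs between the two halves.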
Platforms tested:
Linux 2.4 with mpich 1.2.6(only)
Since I only checked in the code that handles the selection and haven't added any new tests yet, it won't affect any platforms.
Misc. update:
change the array size of collective chunking features of parallel tests.
Description:
Previously the array sizes for the collective optimization tests,
including
cchunk1,
cchunk2,
cchunk3,
cchunk4,
ccontw,
ccontr,
cschunkw,
cschunkr,
ccchunkw,
ccchunkr,
were fixed.
They were only valid for certain numbers of processors (1,2,3,4,6,8,12,16,24,32,48 etc).
Recently there have been more requests for the parallel tests to be valid on odd numbers of processes such as 5,7,11,13 etc.
Solution:
Changed the array size to be dynamic rather than static. The size of the fastest-changing dimension is now a function of mpi_size: dim2 = constant * mpi_size. After some tuning, the above tests should in theory be valid for any number of processors. However, other parallel tests still need to be tuned.
To verify the correctness of these tests, ran mpirun -np 5 ./testphdf5 -b cchunk1 on heping.
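A minimal sketch of the dim2 = constant * mpi_size sizing described above. The constant COLS_PER_PROC, its value, and the helper names fastest_dim() and block_start() are assumptions made for this illustration; the actual test code may compute its selections differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-process width along the fastest-changing dimension. */
#define COLS_PER_PROC 4

/* Size of the fastest-changing dimension: dim2 = constant * mpi_size.
 * Because dim2 is a multiple of mpi_size, it divides evenly among the
 * processes for any process count, including 5, 7, 11, or 13. */
size_t fastest_dim(int mpi_size)
{
    assert(mpi_size > 0);
    return (size_t)COLS_PER_PROC * (size_t)mpi_size;
}

/* First index of the block owned by `rank`, assuming each process takes
 * one equal, contiguous block of the fastest-changing dimension. */
size_t block_start(int rank, int mpi_size)
{
    assert(rank >= 0 && rank < mpi_size);
    return (size_t)rank * (fastest_dim(mpi_size) / (size_t)mpi_size);
}
```

The trade-off is that the dataset grows with the process count, so runs with many processes move more data than the old fixed-size tests did.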
Platforms tested:
h5committest (shanti refused connections)
On heping, 5 and 7 processes were used to verify the correctness.
Misc. update:
Enhance collective chunk IO support
Description:
Add a new test to check that the HDF5 library behaves correctly in collective IO mode when one process contributes nothing to the IO.
Solution:
Platforms tested:
IBM AIX 5.2(copper)
Linux (heping) mpich-1.2.6
Misc. update:
Bug #281
Description:
A dataset created in serial mode with H5D_ALLOC_TIME_INCR allocation setting
was not extendible, either explicitly by H5Dextend or implicitly by writing
to unallocated chunks. This was because parallel mode expects the allocation
mode to be H5D_ALLOC_TIME_EARLY only.
Solution:
Modified library to allocate more space when needed or directed if the
file is opened in parallel mode, independent of what the dataset allocation
mode is.
Platforms tested:
Heping pp.
Code cleanup
Description:
Trim trailing whitespace, which is making 'diff'ing the two branches
difficult.
Solution:
Ran this script in each directory:
    foreach f (*.[ch] *.cpp)
        sed 's/[[:blank:]]*$//' $f > sed.out && mv sed.out $f
    end
Platforms tested:
FreeBSD 4.11 (sleipnir)
Too minor to require h5committest
Updating phase 2 work of collective IO
Description:
The current collective IO tests cannot run with more processors than
certain limits; this change lifts those restrictions. However, the tests may be slower.
Solution:
Platforms tested:
linux 2.4, AIX 5.1, Linux 2.4 IA64 and IRIX 6.5
(I haven't tested with large numbers of processors because of machine restrictions)
Misc. update:
Support collective IO for irregular selection.
Description:
Solution:
Platforms tested:
Linux with MPICH
AIX with mpcc_r
Linux with ChaMPIon
Altix with intel
Misc. update:
Feature--to provide a standalone mode for t_mpi.c so that it can
be built outside of PHDF5 environment.
Description:
Move definitions that are common to all parallel test programs
to a new header file called testpar.h.
Leave only definitions related to the Parallel HDF5 tests in testphdf5.h.
Platforms tested:
heping (pp) and modi4(PP). Copper was down.
Misc. update:
Bug Fix/Code Cleanup/Doc Cleanup/Optimization/Branch Sync :-)
Description:
Generally speaking, this is the "signed->unsigned" change to selections.
However, in the process of merging code back, things got stickier and stickier
until I ended up doing a big "sync the two branches up" operation. So... I
brought back all the "infrastructure" fixes from the development branch to the
release branch (which I think were actually making some improvement in
performance) as well as fixed several bugs which had been fixed in one branch,
but not the other.
I've also tagged the repository before making this checkin with the label
"before_signed_unsigned_changes".
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel & fphdf5
FreeBSD 4.10 (sleipnir) w/threadsafe
FreeBSD 4.10 (sleipnir) w/backward compatibility
Solaris 2.7 (arabica) w/"purify options"
Solaris 2.8 (sol) w/FORTRAN & C++
AIX 5.x (copper) w/parallel & FORTRAN
IRIX64 6.5 (modi4) w/FORTRAN
Linux 2.4 (heping) w/FORTRAN & C++
Misc. update:
Adding routines to test irregular hyperslab selection inside one chunk.
Description:
For debugging purposes, the tests are turned off for now.
Solution:
Platforms tested:
AIX 5.1 and Linux 2.4 with parallel enabled.
Misc. update:
Adding general MPI derived datatype testing code.
Description:
This testing code is not run as part of the tests. It is checked in to make
later debugging easier. The routine and daily HDF5 tests should not be affected by it.
Solution:
Platforms tested:
Copper(AIX 5.1),
Heping(Linux 2.4 + MPICH 1.2.6).
Misc. update:
Bug fix
Description:
Relax restrictions on parallel I/O to allow compressed, chunked datasets
to be read in parallel (collective access will be degraded to independent
access, but will retrieve the information still).
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel
Solaris 2.7 (arabica)
IRIX64 6.5 (modi4)
h5committest
Cleanup
Description:
Remove old hack for H5Eget_auto() which is not needed any more.
Revert H5E_set_auto_stack(H5EDEFAULT,...) to the previous
code of H5E_set_auto(...). Same for H5E_get_auto_stack.
Platforms tested:
Only tested in Eirene PP as the change is pretty
straightforward.
Misc. update:
Add test to verify the fix of the parallel I/O mode confusion bug.
Description:
While the parallel I/O mode confusion bug is fixed, an automated
regression test for this bug would be useful.
Solution:
Added a modified version of the original bug demonstration program
to testphdf5.
Platforms tested:
copper
h5committested
eirene (parallel)
Misc. update:
Code cleanup
Description:
Tweak recent "forward compatibility" changes to the H5E* API (which allowed
for the old H5E API functions to remain unchanged) by allowing for the error
stack callback function (H5E_auto_t) to also remain unchanged from the 1.6
branch. This required changing the H5E{get|set}_auto routines to have the
old style H5E_auto_t type (which didn't have a stack ID parameter) and the new
H5E{get|set}_auto_stack routines to have a newer "H5E_auto_stack_t" type (which
has a stack ID parameter). This should make the H5E API changes as forwardly
compatible as possible.
One side effect of this change was that it was impossible to determine if
the current auto error callback was the old style (H5E_auto_t) or the new style
(H5E_auto_stack_t) of callback, so a new API function (H5Eauto_is_stack) was
added to query this.
Platforms tested:
FreeBSD 4.10 (sleipnir)
IRIX64 6.5 (modi4)
h5committest
Make collective chunk IO test more general
Description:
Previously the collective chunk IO test only worked with 4 processors and
a small dimension size; when people tried to test
with more than 4 processors (5, 6 or more), the test failed.
Solution:
To make the test case more general, the dimension size of the data is set to be large (right now 288 for each dimension) and the disjoint hyperslab selection is recalculated. Now the test cases should pass with 5, 6 or 12 processors. Note that there is nothing wrong with the library implementation; it is the test case that caused the failure when the number of processors was greater than 4.
Platforms tested:
Only at eirene, since only the test code is modified a little and it is very slow to test the parallel case.
Misc. update:
Code cleanup
Description:
Fix another batch of minor differences between the development and release
branches.
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel
Too minor to require h5committest
Code cleanup
Description:
Clean up collective chunking code a bit.
Also, add '--enable-instrument' configure flag to have a mechanism for
determining that optimized operations happened correctly in the library (instead
of just the "normal" way) by allowing 'flag' properties to be set outside the
library and set when the "right" thing happens. This is mainly for debugging
and regression checks, so we make certain we don't break optimized I/O by
accident. It's enabled by default when --enable-debug is on (which is on by
default in the development branch and off by default in the release branch),
but can also be independently controlled with its own configure flag.
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel
IBM p690 (copper) w/parallel
To test collective chunk IO properly.
Description:
See the previous message.
Solution:
See the previous message.
Platforms tested:
arabica(Sol 2.7), eirene(Linux), copper(AIX)
Misc. update:
To add collective chunk IO tests.
Description:
Three tests are added.
1. Only one hyperslab for each process, and this hyperslab fits in exactly one chunk.
2. Non-contiguous hyperslabs in each process; these hyperslabs fit in one chunk.
3. Single hyperslab for each process, with a smaller chunk size assigned. The number of chunks for
every process is equal.
Solution:
The dataset size is set to be very small; it will be enlarged later.
Platforms tested:
AIX 5.1(copper)
Misc. update:
Bug fix
Description:
Fix error in chunked dataset I/O where data written out wasn't read
correctly from a chunked, extendible dataset after the dataset was extended.
Also, fix parallel I/O tests to gather error results from all processes,
in order to detect errors that only occur on one process.
Solution:
Bypass chunk cache for reads as well as writes, if parallel I/O driver is
used and file is opened for writing.
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel
Too minor to require h5committest
feature
Description:
Change testphdf5 to use the common test program syntax.
Needed to change the protocols of all test programs to
fit the requirement of the common test syntax.
Platforms tested:
"h5committested".
Also tested in sol with PP mode.
Improvement
Description:
Changed parsing of verbose level by the common test library routine.
Change t_mpi.c to use the Verbose control better.
Platforms tested:
verena (pp).
Misc. update:
Improvement.
Description:
Complete change of the verbose control to use the routines provided by
the test/libh5test.a.
Also put in a temporary fix for the H5Eset_auto() and H5Eget_auto()
so that the Compat code is isolated in one place rather than all over
the source file.
Platforms tested:
Tested in Eirene (parallel).
Misc. update:
Added a test of fill value before any data is written to a dataset.
Rename short_dataset() as dataset_fillvalue() as it better reflects
the tests. Also removed the -S option since the fill value test
will always be run.
Platforms tested:
"h5committested"
Misc. update:
Code cleanup.
Description:
The H5Eclear() calls in the VRFY and INFO macros are not needed.
After removing them, there is no need to have a separate
v1.6 Compat version.
Platforms tested:
"h5committested"
Misc. update:
Code cleanup, bug fixes
Description:
Wrap up rest of changes necessary for fixing the "short" MPI-I/O read
problem that Robb reported.
Platforms tested:
FreeBSD 4.9 (sleipnir)
too minor to require h5committest
Bug fix
Description:
Clean up a couple more 1.6 compat bugs that showed up when the library
was compiled with parallel support.
Platforms tested:
FreeBSD 4.9 (sleipnir) w/parallel & 1.6 compat
config not tested with h5committest