[svn-r10117] Purpose:

Bug fix.

Description:
"testphdf5 -p" would fail with data verification errors.  The reason was
that the MPIPOSIX driver's file open and close routines, especially the
close routine, provide no "coordination" between processes.  The testphdf5
tests reuse the same file as the test data file by opening it using H5Fcreate
with the H5F_ACC_TRUNC option.  The test routines do not provide any
code to ensure that all processes have finished one test before moving
to the next test.  Some "faster" process would finish verifying
its portion of the data as correct and move on to the next test, which opens
the same file with truncation and so wipes out the previous data file.
But some "slower" processes are still verifying the "previous" data
file, which all of a sudden gets truncated by the "faster" process.
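
The failure mode described above can be reproduced without MPI. The
following is a minimal sketch, not the testphdf5 code: a single process
plays both the "slower" and the "faster" role using two POSIX file
descriptors. It assumes that creating the file with truncation ends up
as an open() with O_TRUNC; the function name and the file path are
hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the number of bytes a "slow" reader can
 * still read after a "fast" opener has re-created the same file with
 * O_TRUNC, or -1 on error.
 */
static long bytes_left_after_truncate(const char *path)
{
    static const char data[] = "data written by the previous test";
    char buf[sizeof data];
    int writer_fd, slow_fd, fast_fd;
    ssize_t n;

    assert(path != NULL);

    /* Previous test: create the file and write its data. */
    writer_fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);
    if (writer_fd < 0)
        return -1;
    if (write(writer_fd, data, sizeof data) != (ssize_t)sizeof data) {
        close(writer_fd);
        return -1;
    }
    close(writer_fd);

    /* "Slow" process: still has the file open to verify the data. */
    slow_fd = open(path, O_RDONLY);
    if (slow_fd < 0)
        return -1;

    /* "Fast" process: the next test re-creates the file with truncation. */
    fast_fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
    if (fast_fd < 0) {
        close(slow_fd);
        return -1;
    }

    /* The slow reader now sees an empty file instead of its data. */
    n = read(slow_fd, buf, sizeof buf);

    close(fast_fd);
    close(slow_fd);
    unlink(path);
    return (long)n;
}
```

Calling bytes_left_after_truncate() on a scratch path returns 0: the
reader hits end-of-file immediately because the data it meant to verify
is gone.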

Solution:
Technically, the test program should be fixed to ensure all processes
have finished one test before any is allowed to move to the next test.
OTOH, the MPIO VFD has no problem with this test because MPI-IO requires
file open and close to be called collectively, which ensures that the
calls are coordinated and return properly.
I chose to fix the MPIPOSIX close routine to provide some sort of
coordination between processes by requiring all processes to have
completed the close of a file before it returns to user space.
This makes the MPIPOSIX close routine behave more like the MPIO
close routine, thus providing more protection for user applications
that fail to code in the coordination.  But having the barrier
in the MPIPOSIX close routine would penalize applications where
it is "okay" for some processes to close their file handles and race
ahead to do other things: such a process is not going to access the file
again, so whether other processes are still using it is immaterial.

Maybe this protective coordination should be optional, so that it can be
turned off by confident users who do not need this sort of protection.
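
One way such an optional barrier could look is sketched below. This is
not the driver's code: demo_file_t, demo_close, the sync_on_close flag
and the barrier callback are all hypothetical names. The callback
stands in for MPI_Barrier so that the sketch compiles without an MPI
installation.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for a barrier; with MPI this would be a small
 * wrapper that calls MPI_Barrier() on the file's communicator. */
typedef int (*barrier_fn)(void *ctx);

/* Hypothetical file record, not the real MPIPOSIX driver struct. */
typedef struct {
    int        fd;             /* POSIX file descriptor */
    int        sync_on_close;  /* nonzero: wait for all processes on close */
    barrier_fn barrier;        /* coordination primitive */
    void      *barrier_ctx;    /* e.g. pointer to the communicator */
} demo_file_t;

/* Close the descriptor, then optionally wait until every process has
 * closed it too.  Returns 0 on success, -1 on failure. */
static int demo_close(demo_file_t *file)
{
    assert(file != NULL);

    if (close(file->fd) < 0)
        return -1;
    file->fd = -1;

    /* Confident callers clear sync_on_close and skip the barrier. */
    if (file->sync_on_close && file->barrier != NULL) {
        if (file->barrier(file->barrier_ctx) != 0)
            return -1;
    }
    return 0;
}

/* Counting stub used to exercise the sketch without MPI. */
static int counting_barrier(void *ctx)
{
    ++*(int *)ctx;
    return 0;
}
```

With sync_on_close set, demo_close() calls the barrier exactly once
after the descriptor is closed; with the flag cleared, the process
returns right away and may race ahead, which is the trade-off discussed
above.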

Platforms tested:
"h5committested", and tested on modi4 and tesla.

Misc. update:
Albert Cheng 2005-03-01 21:30:46 -05:00
parent 921d331fc8
commit 940955df65

@@ -824,6 +824,8 @@ H5FD_mpiposix_close(H5FD_t *_file)
     if (HDclose(file->fd)<0)
         HGOTO_ERROR(H5E_IO, H5E_CANTCLOSEFILE, FAIL, "unable to close file")
+    /* make sure all processes have closed the file before returning. */
+    MPI_Barrier(file->comm);
     /* Clean up other stuff */
     MPI_Comm_free(&file->comm);
     H5MM_xfree(file);