(1) Add/remove comments.

(2) A temporary fix to address the test/objcopy.c: test_copy_group_deep() test failure with
the family driver.  The test failure occurs with these configurations in objcopy.c:
--with shared src messages (CONFIG_SHARE_SRC)
--without shared dst messages (CONFIG_SHARE_DST)
--with latest format for source file (CONFIG_SRC_NEW_FORMAT)
--without dense attributes (CONFIG_DENSE)
--with latest format for destination file (CONFIG_DST_NEW_FORMAT)
The temporary fix is in src/H5MFaggr.c (see comments above #ifdef REPLACE/#endif).
This commit is contained in:
Vailin Choi 2019-06-27 16:04:48 -05:00
parent 565e3d7951
commit c69b1025f5
4 changed files with 102 additions and 188 deletions

View File

@ -307,6 +307,20 @@ HDfprintf(stderr, "%s: fspace->alloc_sect_size = %Hu, fspace->sect_size = %Hu\n"
* Purpose: Release the section info, either giving ownership back to
* the cache or letting the free space header keep it.
*
* Add the fix in this routine to resolve the potential infinite loop
* problem when allocating file space for the meta data of the
* self-referential free-space managers at file closing.
*
* On file close or flushing, when the section info is modified
* and protected/unprotected, does not allow the section info size
* to shrink:
* --if the current allocated section info size in fspace->sect_size is
* larger than the previous section info size in fpsace->alloc_sect_size,
* release the section info
* --Otherwise, set the fspace->sect_size to be the same as
* fpsace->alloc_sect_size. This means fspace->sect_size may be larger
* than what is actually needed.
*
* Return: SUCCEED/FAIL
*
* Programmer: Quincey Koziol
@ -2436,84 +2450,31 @@ done:
/*-------------------------------------------------------------------------
* Function: H5FS_vfd_alloc_hdr_and_section_info_if_needed
*
* Purpose: This function is part of a hack to patch over a design
* flaw in the free space managers for file space allocation.
* Specifically, if a free space manager allocates space for
* its own section info, it is possible for it to
* go into an infinite loop as it:
*
* 1) computes the size of the section info
*
* 2) allocates file space for the section info
*
* 3) notices that the size of the section info
* has changed
*
* 4) deallocates the section info file space and
* returns to 1) above.
*
* Similarly, if it allocates space for its own header, it
* can go into an infinite loop as it:
*
* 1) allocates space for the header
*
* 2) notices that the free space manager is empty
* and thus should not have file space allocated
* to its header
*
* 3) frees the space allocated to the header
*
* 4) notices that the free space manager is not
* empty and thus must have file space allocated
* to it, and thus returns to 1) above.
*
* In a nutshell, the solution applied in this hack is to
* deallocate file space for the free space manager(s) that
* handle FSM header and/or section info file space allocations,
* wait until all other allocation/deallocation requests have
* been handled, and then test to see if the free space manager(s)
* in question are empty. If they are, do nothing. If they
* are not, allocate space for them at end of file bypassing the
* usual file space allocation calls, and thus avoiding the
* potential infinite loops.
*
* The purpose of this function is to allocate file space for
* Purpose: The purpose of this function is to allocate file space for
* the header and section info of the target free space manager
* directly from the VFD if needed. In this case the function
* also re-inserts the header and section info in the metadata
* cache with this allocation.
* if they are not allocated yet.
*
* When paged allocation is not enabled, allocation of space
* for the free space manager header and section info is
* straight forward -- we simply allocate the space directly
* from file driver.
* The previous hack in this routine to avoid the potential infinite
* loops by allocating file space directly from the end of the file
* is removed. The allocation can now be done via the usual
* file space allocation call H5MF_alloc().
*
* Note that if f->shared->alignment > 1, and EOA is not a
* multiple of the alignment, it is possible that performing
* these allocations may generate a fragment of file space in
* addition to the space allocated for the section info. This
* excess space is dropped on the floor. As shall be seen,
* it will usually be reclaimed later.
*
* When page allocation is enabled, things are more difficult,
* as there is the possibility that page buffering will be
* enabled when the free space managers are read. To allow
* for this, we must ensure that space allocated for the
* free space manager header and section info is either larger
* than a page, or resides completely within a page.
*
* Do this by allocating space for the free space header and
* section info starting at page boundaries, and extending
* allocation to the next page boundary. This of course wastes
* space, but see below.
*
* On the first free space allocation / deallocation after the
* next file open, we will read the self referential free space
* managers, float them and reduce the EOA to its value prior
* to allocation of file space for the self referential free
* space managers on the preceeding file close. This EOA value
* is stored in the free space manager superblock extension
* message.
* The design flaw is addressed by not allowing the size
* of section info to shrink. This means, when trying to allocate
* section info size X via H5MF_alloc() and the section info size
* after H5MF_alloc() changes to Y:
* --if Y is larger than X, free the just allocated file space X
* via H5MF_xfree() and set fspace->sect_size to Y.
* This routine will be called again later from
* H5MF_settle_meta_data_fsm() to allocate section info with the
* larger fpsace->sect_size Y.
* --if Y is smaller than X, no further allocation is needed and
* fspace->sect_size and fspace->alloc_sect_size are set to X.
* This means fspace->sect_size may be larger than what is
* actually needed.
*
* This routine also re-inserts the header and section info in the
* metadata cache with this allocation.
*
* Return: Success: non-negative
* Failure: negative
@ -2530,8 +2491,6 @@ H5FS_vfd_alloc_hdr_and_section_info_if_needed(H5F_t *f, H5FS_t *fspace,
hsize_t hdr_alloc_size;
hsize_t sinfo_alloc_size;
haddr_t sect_addr = HADDR_UNDEF; /* address of sinfo */
haddr_t eoa_frag_addr = HADDR_UNDEF; /* Address of fragment at EOA */
hsize_t eoa_frag_size = 0; /* Size of fragment at EOA */
haddr_t eoa = HADDR_UNDEF; /* Initial EOA for the file */
herr_t ret_value = SUCCEED; /* Return value */

View File

@ -837,39 +837,40 @@ H5F__super_read(H5F_t *f, H5P_genplist_t *fa_plist, hbool_t initial_read)
if(f->shared->eoa_fsm_fsalloc != fsinfo.eoa_pre_fsm_fsalloc)
f->shared->eoa_fsm_fsalloc = fsinfo.eoa_pre_fsm_fsalloc;
/* f->shared->eoa_pre_fsm_fsalloc must always be HADDR_UNDEF
* in the absence of persistent free space managers.
*/
/* If the following two conditions are true:
/*
* If the following two conditions are true:
* (1) skipping EOF check (skip_eof_check)
* (2) dropping free-space to the floor (f->shared->null_fsm_addr)
* skip the asserts as "eoa_pre_fsm_fsalloc" may be undefined
* skip the asserts as "eoa_fsm_fsalloc" may be undefined
* for a crashed file with persistent free space managers.
* #1 and #2 are enabled when the tool h5clear --increment
* The above two conditions are enabled when the tool h5clear --increment
* option is used.
*/
if(!skip_eof_check && !f->shared->null_fsm_addr) {
if(!skip_eof_check && !f->shared->null_fsm_addr)
HDassert((!f->shared->fs_persist) || (f->shared->eoa_fsm_fsalloc != HADDR_UNDEF));
}
/* As "eoa_pre_fsm_fsalloc" may be undefined for a crashed file
* with persistent free space managers, therefore, set
* "first_alloc_dealloc" when the condition
* "dropping free-space to the floor is true.
* This will ensure that no action is done to settle things on file
* close via H5MF_settle_meta_data_fsm() and H5MF_settle_raw_data_fsm().
*/
/* The following check is removed:
/*
* A crashed file with persistent free-space managers may have
* undefined f->shared->eoa_fsm_fsalloc.
* eoa_fsm_fsalloc is the final eoa which is saved in the free-space
* info message's eoa_pre_fsm_fsalloc field for backward compatibility.
* If the tool h5clear sets to dropping free-space to the floor
* as indicated by f->shared->null_fsm_addr, we are not going to
* perform actions to settle things on file close in the routines
* H5MF_settle_meta_data_fsm() and H5MF_settle_raw_data_fsm().
*
* We remove the following check:
* if((f->shared->eoa_pre_fsm_fsalloc != HADDR_UNDEF || null_fsm_addr) &&
* (H5F_INTENT(f) & H5F_ACC_RDWR))
* f->shared->first_alloc_dealloc = TRUE;
*
* --there is no more eoa_pre_fsm_fsalloc
* Because:
* --there is no more f->shared->eoa_pre_fsm_fsalloc and
* f->shared->first_alloc_dealloc
* --the check for null_fsm_addr is directly done in H5MF_settle_meta_data_fsm()
* and H5MF_settle_raw_data_fsm()
*/
f->shared->fs_addr[0] = HADDR_UNDEF;
for(u = 1; u < NELMTS(f->shared->fs_addr); u++)
f->shared->fs_addr[u] = fsinfo.fs_addr[u - 1];

View File

@ -51,6 +51,7 @@
#define H5MF_FSPACE_EXPAND 120 /* Percent of "normal" size to expand serialized free space size */
#define H5MF_CHECK_FSM(FSM, CF) \
HDassert(*CF == FALSE); \
if(!H5F_addr_defined(FSM->addr) || !H5F_addr_defined(FSM->sect_addr)) \
*CF = TRUE;
@ -2644,8 +2645,9 @@ H5MF_settle_raw_data_fsm(H5F_t *f, hbool_t *fsm_settled)
HDassert(f->shared);
HDassert(fsm_settled);
/* Only need to settle things if we are persisting the free space info
* and allocation/deallocation has occurred.
/*
* Only need to settle things if we are persisting free space and
* the private property in f->shared->null_fsm_addr is not enabled.
*/
if(f->shared->fs_persist && !H5F_NULL_FSM_ADDR(f)) {
hbool_t fsm_opened[H5F_MEM_PAGE_NTYPES]; /* State of FSM */
@ -2987,7 +2989,6 @@ done:
/* NEED TO remove/modify the comments in this routine */
/*-------------------------------------------------------------------------
* Function: H5MF_settle_meta_data_fsm()
*
@ -3005,7 +3006,7 @@ done:
* On entry to this function, the raw data settle routine
* (H5MF_settle_raw_data_fsm()) should have:
*
* 1) Freed the aggregators.
* 1) Freed the aggregators.
*
* 2) Freed all file space allocated to the free space managers.
*
@ -3041,12 +3042,12 @@ done:
* 1) Verify that the free space manager(s) involved in file
* space allocation for free space managers are still floating.
*
* 2) Free the aggregators.
* 2) Free the aggregators.
*
* 3) Reduce the EOA to the extent possible, and make note
* 3) Reduce the EOA to the extent possible, and make note
* of the resulting value. This value will be stored
* in the fsinfo superblock extension message and be used
* in the subsequent file open.
* in the subsequent file open.
*
* 4) Re-allocate space for any free space manager(s) that:
*
@ -3067,21 +3068,15 @@ done:
* allocation has changed the size of the section info --
* forcing us to deallocate and start the loop over again.
*
* To avoid this, simply allocate file space for these
* FSM(s) directly from the VFD layer if allocation is
* indicated. This avoids the issue by bypassing the FSMs
* in this case.
*
* Note that this may increase the size of the file needlessly.
* A better solution would be to modify the FSM code to
* The solution is to modify the FSM code to
* save empty FSMs to file, and to allow section info blocks
* to be oversized. However, given that the FSM code is
* also used by the fractal heaps, and that we are under
* severe time pressure at the moment, the above brute
* force solution is attractive.
* to be oversized. That is, only allow section info to increase
* in size, not shrink. The solution is now implemented.
*
* 5) Make note of the EOA -- used for sanity checking on
* FSM shutdown.
* 5) Make note of the EOA -- used for sanity checking on
* FSM shutdown. This is saved as eoa_pre_fsm_fsalloc in
* the free-space info message for backward compatibility
* with the 1.10 library that has the hack.
*
* Return: SUCCEED/FAIL
*
@ -3114,8 +3109,9 @@ H5MF_settle_meta_data_fsm(H5F_t *f, hbool_t *fsm_settled)
HDassert(f->shared);
HDassert(fsm_settled);
/* Only need to settle things if we are persisting the free space info
* and allocation/deallocation has occurred.
/*
* Only need to settle things if we are persisting free space and
* the private property in f->shared->null_fsm_addr is not enabled.
*/
if(f->shared->fs_persist && !H5F_NULL_FSM_ADDR(f)) {
/* Sanity check */
@ -3221,7 +3217,6 @@ H5MF_settle_meta_data_fsm(H5F_t *f, hbool_t *fsm_settled)
* Note that the aggregators will not exist if paged aggregation
* is enabled -- don't attempt to free if this is the case.
*/
/* Vailin -- is this correct? */
/* (for space not at EOF, it may be put into free space managers) */
if((!H5F_PAGED_AGGR(f)) && (H5MF_free_aggrs(f) < 0))
HGOTO_ERROR(H5E_RESOURCE, H5E_CANTFREE, FAIL, "can't free aggregators")
@ -3230,21 +3225,7 @@ H5MF_settle_meta_data_fsm(H5F_t *f, hbool_t *fsm_settled)
if(H5MF__close_shrink_eoa(f) < 0)
HGOTO_ERROR(H5E_RESOURCE, H5E_CANTSHRINK, FAIL, "can't shrink eoa")
/* At this point, the EOA should be set to a value that contains
* the allocation for all user data, all non self referential FSMs,
* the superblock and all superblock extension messages.
*
* Make note of the current EOA. We will store this value in the
* free space manager superblock extension message. Since space for
* everything other than the self referential FSMs (and possibly the
* cache image) has been allocated at this point, this allows us to
* to float the self referential FSMs on the first file space allocation /
* deallocation and then set the EOA to this value before we handle
* the allocation / deallocation. (If a cache image exists, the
* first allocation / deallocation will be the deallocation of space
* for the cache image).
*
* WARNING: This approach settling the self referential free space
/* WARNING: This approach settling the self referential free space
* managers and allocating space for them in the file will
* not work as currently implemented with the split and
* multi file drivers, as the self referential free space
@ -3263,58 +3244,13 @@ H5MF_settle_meta_data_fsm(H5F_t *f, hbool_t *fsm_settled)
* We should be able to support the split file driver
* without a file format change. However, the code to
* do so does not exist at present.
* NOTE: not sure whether to remove or keep the above comments
*/
/* ******************* PROBLEM: ********************
*
* If the file has an alignment other than 1, and if
* the EOA is not a multiple of this alignment, allocating space
* for the section via the VFD info has the potential of generating
* a fragment that will be added to the free space manager. This
* of course undoes everything we have been doing here.
*
* Need a way around this. Obvious solution is to force the EOA to
* be a multiple of the alignment.
*
* Fortunately, alignment is typically 1, so this is a non-issue in
* most cases. In cases where the alignment is not 1, for now we
* have decided to drop the fragment on the floor.
*
* Eventually, we should fix this by modifying the on disk representations
* of free space managers to allow for empty space, so as to bypass the
* issues created by self-referential free space managers, and make
* this issue moot.
/*
* Continue allocating file space for the header and section info until
* they are all settled,
*/
/* HDassert(f->shared->alignment == 1); */
/* The free space manager(s) that handle space allocations for free
* space managers should be settled now, albeit without file space
* allocated to them. To avoid the possibility of changing the sizes
* of their section info blocks, allocate space for them now at the
* end of file via H5FD_alloc().
*
* In the past, this issue of allocating space without touching the
* free space managers has been dealt with by calling
* H5MF_aggr_vfd_alloc(), which in turn calls H5MF_aggr_alloc().
* This is problematic since (if I read the code correctly) it will
* re-constitute the metadata aggregator, which will add any leftover
* space to one of the free space managers when freed.
*
* This is a non-starter, since the entire objective is to settle the
* free space managers.
*
* Hence the decision to call H5FD_alloc() directly.
*
* As discussed in PROBLEM above, if f->shared->alignment is not 1,
* this has the possibility of generating a fragment of file space
* that would typically be inserted into one of the free space managers.
*
* This is isn't good, but due to schedule pressure, we will just drop
* the fragment on the floor for now.
*/
do {
continue_alloc_fsm = FALSE;
if(sm_hdr_fspace)
@ -3349,10 +3285,12 @@ H5MF_settle_meta_data_fsm(H5F_t *f, hbool_t *fsm_settled)
/* All free space managers should have file space allocated for them
* now, and should see no further allocations / deallocations. Store
* the pre and post file space allocation for self referential FSMs EOA
* for use when we actually write the free space manager superblock
* extension message.
* now, and should see no further allocations / deallocations.
* For backward compatibility, store the eoa in f->shared->eoa_fsm_fsalloc
* which will be set to fsinfo.eoa_pre_fsm_fsalloc when we actually write
* the free-space info message to the superblock extension.
* This will allow the 1.10 library with the hack to open the file with
* the new solution.
*/
/* Get the eoa after allocation of file space for the self referential
* free space managers. Assuming no cache image, this should be the
@ -3379,12 +3317,14 @@ done:
/*-------------------------------------------------------------------------
* Function: H5MF__continue_alloc_fsm
*
* Purpose: To determine whether any of the input FSMs has allocated its
* Purpose: To determine whether any of the input FSMs has allocated
* its "addr" and "sect_addr".
* Return TRUE or FALSE in *continue_alloc_fsm.
*
* Return: SUCCEED/FAIL
*
* Programmer: Vailin Choi
* 6/24/2019
*-------------------------------------------------------------------------
*/
static herr_t

View File

@ -183,7 +183,21 @@ HDfprintf(stderr, "%s: type = %u, size = %Hu\n", FUNC, (unsigned)type, size);
* allocate "generic" space and sub-allocate out of that, if possible.
* Otherwise just allocate through H5F__alloc().
*/
/*
* Replace the following line with the line in #ifdef REPLACE/#endif.
* The line in #ifdef REPLACE triggers the following problem:
* test/objcopy.c: test_copy_group_deep() test fails with the family driver
*
* When closing the destination file after H5Ocopy, the library flushes the fractal
* heap direct block via H5HF__cache_dblock_pre_serialize(). While doing so,
* the cache eventually adjusts/evicts ageout entries and ends up flushing out the
* same entry that is being serialized (flush_in_progress).
*/
if((f->shared->feature_flags & aggr->feature_flag) && f->shared->fs_strategy != H5F_FSPACE_STRATEGY_NONE && (!f->closing || !f->shared->fs_persist)) {
#ifdef REPLACE
if((f->shared->feature_flags & aggr->feature_flag) && f->shared->fs_strategy != H5F_FSPACE_STRATEGY_NONE && !f->closing) {
#endif
haddr_t aggr_frag_addr = HADDR_UNDEF; /* Address of aggregrator fragment */
hsize_t aggr_frag_size = 0; /* Size of aggregator fragment */
hsize_t alignment; /* Alignment of this section */