Mirror of https://git.postgresql.org/git/postgresql.git (synced 2024-12-27 08:39:28 +08:00)
Update nbtree LP_DEAD item deletion comments.
Comments about the consequences of clearing the BTP_HAS_GARBAGE page
flag bit that apply only to VACUUM were added to code that deals with
opportunistic deletion of LP_DEAD items by commit a760893d. The same
comment block was added to both _bt_delitems_vacuum() and
_bt_delitems_delete(). Correct _bt_delitems_delete()'s copy of the
comment block.
_bt_delitems_delete() reliably deletes items that were found by caller
to have their LP_DEAD bit set. There is no question about whether or
not unsetting the BTP_HAS_GARBAGE bit can miss some LP_DEAD items that
were set recently.
Also tweak a related section of the nbtree README.
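
For illustration only, here is a minimal sketch of the invariant the corrected comment relies on: when every item with its LP_DEAD bit set is deleted in the same step, clearing the page-level garbage flag cannot leave a stale LP_DEAD item behind. Everything in it (ToyPage, ToyItem, TOY_HAS_GARBAGE, toy_mark_dead, toy_delete_dead_items) is a hypothetical stand-in, not PostgreSQL's page layout or API. It also folds the caller's scan for LP_DEAD items into the deletion routine, which is a simplification of how the message describes _bt_delitems_delete() being called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; NOT PostgreSQL's definitions. */
#define TOY_HAS_GARBAGE 0x0040u   /* plays the role of BTP_HAS_GARBAGE */
#define TOY_MAX_ITEMS   16

typedef struct ToyItem
{
    int  key;
    bool lp_dead;                 /* plays the role of the LP_DEAD bit */
} ToyItem;

typedef struct ToyPage
{
    uint16_t flags;               /* page-level hint flags */
    size_t   nitems;
    ToyItem  items[TOY_MAX_ITEMS];
} ToyPage;

/* Mark one item dead and set the page-level hint that garbage exists. */
static void
toy_mark_dead(ToyPage *page, size_t idx)
{
    assert(page != NULL && idx < page->nitems);
    page->items[idx].lp_dead = true;
    page->flags |= TOY_HAS_GARBAGE;
}

/*
 * Delete every item whose lp_dead bit is set, compacting the array in
 * place, then clear the garbage hint.  Because all dead items are removed
 * in this one step, the cleared flag is accurate rather than approximate.
 * Returns the number of items removed.
 */
static size_t
toy_delete_dead_items(ToyPage *page)
{
    size_t kept = 0;
    size_t removed;

    assert(page != NULL && page->nitems <= TOY_MAX_ITEMS);

    for (size_t i = 0; i < page->nitems; i++)
    {
        if (!page->items[i].lp_dead)
            page->items[kept++] = page->items[i];
    }

    removed = page->nitems - kept;
    page->nitems = kept;
    page->flags &= (uint16_t) ~TOY_HAS_GARBAGE;
    return removed;
}
```

In this toy model, marking two of four items dead and then calling toy_delete_dead_items() leaves the two live items in their original order with the flag cleared. The VACUUM path is a different case and is not modeled here.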
parent b265aa1f39
commit fe97c61c87
@@ -559,15 +559,15 @@ writer cannot observe the incomplete split flag before the first writer
 finishes the split. If we let concurrent writers on the primary observe
 an incomplete split flag on the same page, each writer would attempt to
 complete the unfinished split, corrupting the parent page. (Similarly,
-replay of page deletion records does not hold a write lock on the leaf
-page throughout; only the primary needs to blocks out concurrent writers
-that insert on to the page being deleted.)
+replay of page deletion records does not hold a write lock on the target
+leaf page throughout; only the primary needs to block out concurrent
+writers that insert on to the page being deleted.)
 
 During recovery all index scans start with ignore_killed_tuples = false
 and we never set kill_prior_tuple. We do this because the oldest xmin
 on the standby server can be older than the oldest xmin on the master
 server, which means tuples can be marked LP_DEAD even when they are
-still visible on the standby. We don't WAL log tuple LP_DEAD bits, but
+still visible on the standby. We don't WAL log tuple LP_DEAD bits, but
 they can still appear in the standby because of full page writes. So
 we must always ignore them in standby, and that means it's not worth
 setting them either. (When LP_DEAD-marked tuples are eventually deleted
@@ -1074,15 +1074,8 @@ _bt_delitems_delete(Relation rel, Buffer buf,
 
 /*
  * Unlike _bt_delitems_vacuum, we *must not* clear the vacuum cycle ID,
- * because this is not called by VACUUM.
- */
-
-/*
- * Mark the page as not containing any LP_DEAD items. This is not
- * certainly true (there might be some that have recently been marked, but
- * weren't included in our target-item list), but it will almost always be
- * true and it doesn't seem worth an additional page scan to check it.
- * Remember that BTP_HAS_GARBAGE is only a hint anyway.
+ * because this is not called by VACUUM. Just clear the BTP_HAS_GARBAGE
+ * page flag, since we deleted all items with their LP_DEAD bit set.
  */
 opaque = (BTPageOpaque) PageGetSpecialPointer(page);
 opaque->btpo_flags &= ~BTP_HAS_GARBAGE;