Fix longstanding error in _bt_search(): should moveright at top of loop, not
bottom.  Otherwise we fail to moveright when the root page was split while
we were "in flight" to it.  This is not a significant problem when the root
is above the leaf level, but if the root was also a leaf (ie, a single-page
index just got split) we may return the wrong leaf page to the caller,
resulting in failure to find a key that is in fact present.  Bug has existed
at least since 7.1, probably forever.
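
The race described above can be reproduced in a toy model. What follows is a minimal sketch, not the nbtree code: ToyPage, toy_search() and the other toy_* names are hypothetical, pages are plain array slots with no locking, and the high key is treated as an inclusive upper bound. The array holds the state just after a single-page index has split, and the searcher starts from a stale root pointer that still names the old root (block 0).

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_KEYS 4
#define TOY_NO_PAGE  (-1)

/* Hypothetical toy page; this is not PostgreSQL's on-disk layout. */
typedef struct ToyPage
{
    bool is_leaf;
    int  nkeys;
    int  keys[TOY_MAX_KEYS];       /* leaf: stored keys; internal: separators */
    int  child[TOY_MAX_KEYS + 1];  /* internal pages only */
    int  high_key;                 /* inclusive upper bound; unused if rightmost */
    int  right;                    /* right sibling, or TOY_NO_PAGE if rightmost */
} ToyPage;

/* Follow right-links while the key is beyond this page's high key. */
int toy_move_right(const ToyPage *pages, int blkno, int key)
{
    while (pages[blkno].right != TOY_NO_PAGE && key > pages[blkno].high_key)
        blkno = pages[blkno].right;
    return blkno;
}

/* Pick the child whose key range covers 'key' on an internal page. */
int toy_pick_child(const ToyPage *p, int key)
{
    int i = 0;

    assert(!p->is_leaf);
    while (i < p->nkeys && key > p->keys[i])
        i++;
    return p->child[i];
}

/*
 * Descend from start_blkno to a leaf.  With moveright_at_top == false the
 * move-right step runs only after stepping down a level (the old loop
 * shape), so a start page that is already a leaf is never checked.
 */
int toy_search(const ToyPage *pages, int start_blkno, int key,
               bool moveright_at_top)
{
    int blkno = start_blkno;

    for (;;)
    {
        if (moveright_at_top)
            blkno = toy_move_right(pages, blkno, key);
        if (pages[blkno].is_leaf)
            break;
        blkno = toy_pick_child(&pages[blkno], key);
        if (!moveright_at_top)
            blkno = toy_move_right(pages, blkno, key);
    }
    return blkno;
}

/* Linear scan of a leaf for 'key'. */
bool toy_leaf_contains(const ToyPage *p, int key)
{
    for (int i = 0; i < p->nkeys; i++)
        if (p->keys[i] == key)
            return true;
    return false;
}

/*
 * State just after a single-page index split: block 0 is the old root
 * (now the left leaf), block 1 the new right leaf, block 2 the new root.
 */
void toy_build_after_root_split(ToyPage pages[3])
{
    pages[0] = (ToyPage) { .is_leaf = true, .nkeys = 2, .keys = {10, 20},
                           .high_key = 20, .right = 1 };
    pages[1] = (ToyPage) { .is_leaf = true, .nkeys = 2, .keys = {30, 40},
                           .right = TOY_NO_PAGE };
    pages[2] = (ToyPage) { .is_leaf = false, .nkeys = 1, .keys = {20},
                           .child = {0, 1}, .right = TOY_NO_PAGE };
}
```

In this model, toy_search(pages, 0, 30, false) stops on block 0, which does not hold key 30, while toy_search(pages, 0, 30, true) moves right to block 1, which does.  Starting from the current root (block 2), both loop shapes reach block 1.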
Tom Lane 2003-07-29 22:18:38 +00:00
parent 5e3c09a114
commit 892a51c367


@@ -8,7 +8,7 @@
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * IDENTIFICATION
-* $Header: /cvsroot/pgsql/src/backend/access/nbtree/nbtsearch.c,v 1.76 2003/07/28 00:09:14 tgl Exp $
+* $Header: /cvsroot/pgsql/src/backend/access/nbtree/nbtsearch.c,v 1.77 2003/07/29 22:18:38 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -62,6 +62,13 @@ _bt_search(Relation rel, int keysz, ScanKey scankey,
 BlockNumber par_blkno;
 BTStack new_stack;
+/*
+* Race -- the page we just grabbed may have split since we read
+* its pointer in the parent (or metapage). If it has, we may need
+* to move right to its new sibling. Do that.
+*/
+*bufP = _bt_moveright(rel, *bufP, keysz, scankey, BT_READ);
 /* if this is a leaf page, we're done */
 page = BufferGetPage(*bufP);
 opaque = (BTPageOpaque) PageGetSpecialPointer(page);
@@ -99,13 +106,6 @@ _bt_search(Relation rel, int keysz, ScanKey scankey,
 _bt_relbuf(rel, *bufP);
 *bufP = _bt_getbuf(rel, blkno, BT_READ);
-/*
-* Race -- the page we just grabbed may have split since we read
-* its pointer in the parent. If it has, we may need to move
-* right to its new sibling. Do that.
-*/
-*bufP = _bt_moveright(rel, *bufP, keysz, scankey, BT_READ);
 /* okay, all set to move down a level */
 stack_in = new_stack;
 }
@@ -599,8 +599,8 @@ _bt_first(IndexScanDesc scan, ScanDirection dir)
 /*
 * At this point we are positioned at the first item >= scan key, or
 * possibly at the end of a page on which all the existing items are
-* greater than the scan key and we know that everything on later
-* pages is less than or equal to scan key.
+* less than the scan key and we know that everything on later
+* pages is greater than or equal to scan key.
 *
 * We could step forward in the latter case, but that'd be a waste of
 * time if we want to scan backwards. So, it's now time to examine