mirror of https://git.postgresql.org/git/postgresql.git (synced 2025-01-12 18:34:36 +08:00)
Fix snapshot builds during promotion of hot standby node with 2PC

Some specific logic is done at the end of recovery when involving 2PC
transactions:
1) Call RecoverPreparedTransactions(), to recover the state of 2PC
transactions into memory (re-acquire locks, etc.).
2) ShutdownRecoveryTransactionEnvironment(), to move back to normal
operations, mainly cleaning up recovery locks and KnownAssignedXids
(including any 2PC transaction tracked previously).
3) Switch XLogCtl->SharedRecoveryState to RECOVERY_STATE_DONE, which is
the tipping point for any process calling RecoveryInProgress() to check
if the cluster is still in recovery or not.

Any snapshot taken between steps 2) and 3) would be empty, causing any
transaction relying on a snapshot at this point to potentially corrupt
data as there could still be some 2PC transactions to track, with
RecentXmin moving backwards on successive calls to GetSnapshotData() in
the same transaction.

As SharedRecoveryState is the point to take into account to know if it
is safe to discard KnownAssignedXids, this commit moves step 2) after
step 3), so as we can never finish with empty snapshots.

This exists since the introduction of hot standby, so backpatch all the
way down.

The window with incorrect snapshots is extremely small, but I have seen
it when running 023_pitr_prepared_xact.pl, as did buildfarm member
fairywren. Thomas Munro also found it independently. Special thanks
to Andres Freund for taking the time to analyze this issue.

Reported-by: Thomas Munro, Michael Paquier
Analyzed-by: Andres Freund
Discussion: https://postgr.es/m/20210422203603.fdnh3fu2mmfp2iov@alap3.anarazel.de
Backpatch-through: 9.6
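
The ordering problem described in the commit message can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not PostgreSQL code: ToyCluster, toy_snapshot_size() and the three step functions are hypothetical names invented here. The model assumes that a snapshot consults the recovery-time xid tracking while recovery is flagged as in progress, and the in-memory transaction state otherwise.

```c
#include <stdbool.h>

/*
 * Toy model only. None of these names exist in PostgreSQL; they are
 * hypothetical stand-ins for the state discussed in the commit message.
 */
typedef struct
{
    bool recovery_in_progress; /* stand-in for SharedRecoveryState != DONE */
    int  known_assigned_count; /* xids tracked by recovery (KnownAssignedXids) */
    int  in_memory_count;      /* xids with normal in-memory state (recovered 2PC) */
} ToyCluster;

/* How many in-progress xids a snapshot taken right now would contain. */
static int
toy_snapshot_size(const ToyCluster *c)
{
    return c->recovery_in_progress ? c->known_assigned_count
                                   : c->in_memory_count;
}

/* Step 1: rebuild in-memory state for the prepared transactions. */
static void
toy_recover_prepared(ToyCluster *c)
{
    c->in_memory_count = c->known_assigned_count;
}

/* Step 2: discard the recovery-time tracking. */
static void
toy_shutdown_recovery_env(ToyCluster *c)
{
    c->known_assigned_count = 0;
}

/* Step 3: flip the flag that readers use to pick a snapshot source. */
static void
toy_mark_recovery_done(ToyCluster *c)
{
    c->recovery_in_progress = false;
}

/*
 * Run the three steps with one prepared transaction pending and return
 * the smallest snapshot size observable after any step.
 * fixed_order = false models steps 1, 2, 3; true models steps 1, 3, 2.
 */
int
min_snapshot_during_promotion(bool fixed_order)
{
    ToyCluster c = { true, 1, 0 };
    int        min = toy_snapshot_size(&c);
    int        seen;

    toy_recover_prepared(&c);
    seen = toy_snapshot_size(&c);
    if (seen < min)
        min = seen;

    if (fixed_order)
        toy_mark_recovery_done(&c);
    else
        toy_shutdown_recovery_env(&c);
    seen = toy_snapshot_size(&c);
    if (seen < min)
        min = seen;

    if (fixed_order)
        toy_shutdown_recovery_env(&c);
    else
        toy_mark_recovery_done(&c);
    seen = toy_snapshot_size(&c);
    if (seen < min)
        min = seen;

    return min;
}
```

In this model, min_snapshot_during_promotion(false) returns 0, because the snapshot taken between steps 2) and 3) is empty. min_snapshot_during_promotion(true) returns 1, because the prepared transaction stays visible at every point.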
parent a0558cfa39
commit 8a4237908c
@@ -8111,13 +8111,6 @@ StartupXLOG(void)
 	/* Reload shared-memory state for prepared transactions */
 	RecoverPreparedTransactions();
 
-	/*
-	 * Shutdown the recovery environment. This must occur after
-	 * RecoverPreparedTransactions(), see notes for lock_twophase_recover()
-	 */
-	if (standbyState != STANDBY_DISABLED)
-		ShutdownRecoveryTransactionEnvironment();
-
 	/* Shut down xlogreader */
 	if (readFile >= 0)
 	{
@@ -8165,6 +8158,18 @@ StartupXLOG(void)
 	UpdateControlFile();
 	LWLockRelease(ControlFileLock);
 
+	/*
+	 * Shutdown the recovery environment. This must occur after
+	 * RecoverPreparedTransactions() (see notes in lock_twophase_recover())
+	 * and after switching SharedRecoveryState to RECOVERY_STATE_DONE so as
+	 * any session building a snapshot will not rely on KnownAssignedXids as
+	 * RecoveryInProgress() would return false at this stage. This is
+	 * particularly critical for prepared 2PC transactions, that would still
+	 * need to be included in snapshots once recovery has ended.
+	 */
+	if (standbyState != STANDBY_DISABLED)
+		ShutdownRecoveryTransactionEnvironment();
+
 	/*
 	 * If there were cascading standby servers connected to us, nudge any wal
 	 * sender processes to notice that we've been promoted.