Fix WAL file replacement during cascading replication on Windows.

When the startup process restores a WAL file from the archive, it deletes
any old file with the same name and renames the new file in its place. On
Windows, however, when a file is deleted, it still lingers as long as a
process holds a file handle open on it. With cascading replication, a
walsender process can hold the old file open, so the rename() in the startup
process would fail. To fix that, rename the old file to a temporary name, to
make the original file name available for reuse, before deleting the old
file.
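
The rename-then-delete idea described above can be sketched in isolation.
This is a minimal illustration, not the code from this commit: the names
SKETCH_MAXPATH, move_aside_and_remove() and replace_file() are hypothetical,
and it uses plain C library calls (rename, remove) where PostgreSQL uses its
own path macros and error reporting.

```c
#include <assert.h>
#include <stdio.h>

#define SKETCH_MAXPATH 1024		/* hypothetical stand-in for MAXPGPATH */

/*
 * Rename "path" to a unique temporary name, then delete it under that
 * name.  Even if the deleted file lingers because another process still
 * has it open, the original name is free again right after the rename.
 * Returns 0 on success, -1 on failure.
 */
static int
move_aside_and_remove(const char *path)
{
	/* counter keeps names unique if the same file is replaced repeatedly */
	static unsigned int deleted_counter = 1;
	char		aside[SKETCH_MAXPATH];
	int			n;

	assert(path != NULL);

	n = snprintf(aside, sizeof(aside), "%s.deleted%u",
				 path, deleted_counter++);
	if (n < 0 || (size_t) n >= sizeof(aside))
		return -1;				/* temporary name would be truncated */

	if (rename(path, aside) != 0)
		return -1;
	if (remove(aside) != 0)
		return -1;
	return 0;
}

/*
 * Put "newpath" in place of "path", moving any existing file out of the
 * way first.  Returns 0 on success, -1 on failure.
 */
static int
replace_file(const char *newpath, const char *path)
{
	FILE	   *probe = fopen(path, "rb");

	if (probe != NULL)
	{
		/* an old file exists: close our probe handle, then clear the name */
		fclose(probe);
		if (move_aside_and_remove(path) != 0)
			return -1;
	}
	return (rename(newpath, path) == 0) ? 0 : -1;
}
```

A caller would write the restored file under a temporary name and then call
replace_file(temporary_name, final_name). On Windows the first rename can
only succeed if the other process opened the old file in FILE_SHARE_DELETE
mode, which is the situation this commit addresses.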
Heikki Linnakangas 2012-09-05 18:10:15 -07:00
parent 2e0cc1f031
commit ab9a14e903

@@ -2787,7 +2787,33 @@ XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 		XLogFilePath(xlogfpath, tli, segno);
 		if (stat(xlogfpath, &statbuf) == 0)
 		{
-			if (unlink(xlogfpath) != 0)
+			char		oldpath[MAXPGPATH];
+#ifdef WIN32
+			static unsigned int deletedcounter = 1;
+			/*
+			 * On Windows, if another process (e.g a walsender process) holds
+			 * the file open in FILE_SHARE_DELETE mode, unlink will succeed,
+			 * but the file will still show up in directory listing until the
+			 * last handle is closed, and we cannot rename the new file in its
+			 * place until that. To avoid that problem, rename the old file to
+			 * a temporary name first. Use a counter to create a unique
+			 * filename, because the same file might be restored from the
+			 * archive multiple times, and a walsender could still be holding
+			 * onto an old deleted version of it.
+			 */
+			snprintf(oldpath, MAXPGPATH, "%s.deleted%u",
+					 xlogfpath, deletedcounter++);
+			if (rename(xlogfpath, oldpath) != 0)
+			{
+				ereport(ERROR,
+						(errcode_for_file_access(),
+						 errmsg("could not rename file \"%s\" to \"%s\": %m",
+								xlogfpath, oldpath)));
+			}
+#else
+			strncpy(oldpath, xlogfpath, MAXPGPATH);
+#endif
+			if (unlink(oldpath) != 0)
 				ereport(FATAL,
 						(errcode_for_file_access(),
 						 errmsg("could not remove file \"%s\": %m",