From 3efd14b9df347ae6ac7791a10285583a9b131359 Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Wed, 14 Apr 2010 21:31:33 +0000 Subject: [PATCH] Fix a problem introduced by my patch of 2010-01-12 that revised the way relcache reload works. In the patched code, a relcache entry in process of being rebuilt doesn't get unhooked from the relcache hash table; which means that if a cache flush occurs due to sinval queue overrun while we're rebuilding it, the entry could get blown away by RelationCacheInvalidate, resulting in crash or misbehavior. Fix by ensuring that an entry being rebuilt has positive refcount, so it won't be seen as a target for removal if a cache flush occurs. (This will mean that the entry gets rebuilt twice in such a scenario, but that's okay.) It appears that the problem can only arise within a transaction that has previously reassigned the relfilenode of a pre-existing table, via TRUNCATE or a similar operation. Per bug #5412 from Rusty Conover. Back-patch to 8.2, same as the patch that introduced the problem. I think that the failure can't actually occur in 8.2, since it lacks the rd_newRelfilenodeSubid optimization, but let's make it work like the later branches anyway. Patch by Heikki, slightly editorialized on by me. --- src/backend/utils/cache/relcache.c | 44 +++++++++++++++++++++--------- 1 file changed, 31 insertions(+), 13 deletions(-) diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index 99a7482fb9..8b79da6b67 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -8,7 +8,7 @@ * * * IDENTIFICATION - * $PostgreSQL: pgsql/src/backend/utils/cache/relcache.c,v 1.250.2.5 2010/01/13 23:07:28 tgl Exp $ + * $PostgreSQL: pgsql/src/backend/utils/cache/relcache.c,v 1.250.2.6 2010/04/14 21:31:33 tgl Exp $ * *------------------------------------------------------------------------- */ @@ -1635,18 +1635,33 @@ RelationDestroyRelation(Relation relation) * * Physically blow away a relation cache entry, or reset it and rebuild * it from scratch (that is, from catalog entries). The latter path is - * usually used when we are notified of a change to an open relation - * (one with refcount > 0). However, this routine just does whichever - * it's told to do; callers must determine which they want. + * used when we are notified of a change to an open relation (one with + * refcount > 0). * - * NB: when rebuilding, we'd better hold some lock on the relation. - * In current usages this is presumed true because it has refcnt > 0. + * NB: when rebuilding, we'd better hold some lock on the relation, + * else the catalog data we need to read could be changing under us. + * Also, a rel to be rebuilt had better have refcnt > 0. This is because + * an sinval reset could happen while we're accessing the catalogs, and + * the rel would get blown away underneath us by RelationCacheInvalidate + * if it has zero refcnt. + * + * The "rebuild" parameter is redundant in current usage because it has + * to match the relation's refcnt status, but we keep it as a crosscheck + * that we're doing what the caller expects. */ static void RelationClearRelation(Relation relation, bool rebuild) { Oid old_reltype = relation->rd_rel->reltype; + /* + * As per notes above, a rel to be rebuilt MUST have refcnt > 0; while + * of course it would be a bad idea to blow away one with nonzero refcnt. + */ + Assert(rebuild ? + !RelationHasReferenceCountZero(relation) : + RelationHasReferenceCountZero(relation)); + /* * Make sure smgr and lower levels close the relation's files, if they * weren't closed already. If the relation is not getting deleted, the @@ -1836,26 +1851,30 @@ RelationClearRelation(Relation relation, bool rebuild) static void RelationFlushRelation(Relation relation) { - bool rebuild; - if (relation->rd_createSubid != InvalidSubTransactionId) { /* * New relcache entries are always rebuilt, not flushed; else we'd * forget the "new" status of the relation, which is a useful * optimization to have. + * + * The rel could have zero refcnt here, so temporarily increment + * the refcnt to ensure it's safe to rebuild it. We can assume that + * the current transaction has some lock on the rel already. */ - rebuild = true; + RelationIncrementReferenceCount(relation); + RelationClearRelation(relation, true); + RelationDecrementReferenceCount(relation); } else { /* * Pre-existing rels can be dropped from the relcache if not open. */ - rebuild = !RelationHasReferenceCountZero(relation); - } + bool rebuild = !RelationHasReferenceCountZero(relation); - RelationClearRelation(relation, rebuild); + RelationClearRelation(relation, rebuild); + } } /* @@ -2152,7 +2171,6 @@ AtEOSubXact_RelationCache(bool isCommit, SubTransactionId mySubid, relation->rd_createSubid = parentSubid; else { - Assert(RelationHasReferenceCountZero(relation)); RelationClearRelation(relation, false); continue; }