If any error occurred while we were in the middle of reading a protocol
message from the client, we could lose sync, and incorrectly try to
interpret a part of another message as a new protocol message. That will
usually lead to an "invalid frontend message" error that terminates the
connection. However, this is a security issue because an attacker might
be able to deliberately cause an error, inject a Query message in what's
supposed to be just user data, and have the server execute it.
We were quite careful to not have CHECK_FOR_INTERRUPTS() calls or other
operations that could ereport(ERROR) in the middle of processing a message,
but a query cancel interrupt or statement timeout could nevertheless cause
it to happen. Also, the V2 fastpath and COPY handling were not so careful.
It's very difficult to recover in the V2 COPY protocol, so we will just
terminate the connection on error. In practice, that's what happened
previously anyway, as we lost protocol sync.
To fix, add a new variable in pqcomm.c, PqCommReadingMsg, that is set
whenever we're in the middle of reading a message. When it's set, we cannot
safely ERROR out and continue running, because we might've read only part
of a message. PqCommReadingMsg acts somewhat similarly to critical sections
in that if an error occurs while it's set, the error handler will force the
connection to be terminated, as if the error was FATAL. It's not
implemented by promoting ERROR to FATAL in elog.c, like ERROR is promoted
to PANIC in critical sections, because we want to be able to use
PG_TRY/CATCH to recover and regain protocol sync. pq_getmessage() takes
advantage of that to prevent an OOM error from terminating the connection.
To prevent unnecessary connection terminations, add a holdoff mechanism
similar to HOLD/RESUME_INTERRUPTS() that can be used to hold off query cancel
interrupts, but still allow die interrupts. The rules on which interrupts
are processed when are now a bit more complicated, so refactor
ProcessInterrupts() and the calls to it in signal handlers so that the
signal handlers always call it if ImmediateInterruptOK is set, and
ProcessInterrupts() can decide to not do anything if the other conditions
are not met.
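For illustration, the message-read path now follows a pattern along these
lines (a sketch only; the real code wraps the flag in pqcomm.c helpers and
the details differ):

    int             qtype;
    StringInfoData  input_message;

    initStringInfo(&input_message);

    PqCommReadingMsg = true;          /* an ERROR now terminates the connection */
    HOLD_CANCEL_INTERRUPTS();         /* block query cancel, still allow die */
    qtype = pq_getbyte();             /* message type byte */
    pq_getmessage(&input_message, 0); /* rest of the message */
    RESUME_CANCEL_INTERRUPTS();
    PqCommReadingMsg = false;         /* in sync again; ERROR is safe */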
Reported by Emil Lenngren. Patch reviewed by Noah Misch and Andres Freund.
Backpatch to all supported versions.
Security: CVE-2015-0244
This covers alterations to buffer sizing and zeroing made between imath
1.3 and imath 1.20. Valgrind Memcheck identified the buffer overruns
and reliance on uninitialized data; their exploit potential is unknown.
Builds specifying --with-openssl are unaffected, because they use the
OpenSSL BIGNUM facility instead of imath. Back-patch to 9.0 (all
supported versions).
Security: CVE-2015-0243
Most callers pass a stack buffer. The ensuing stack smash can crash the
server, and we have not ruled out the viability of attacks that lead to
privilege escalation. Back-patch to 9.0 (all supported versions).
Marko Tiikkaja
Security: CVE-2015-0243
Prevent port/snprintf() from overflowing its local fixed-size
buffer and pad to the desired number of digits with zeros, even
if the precision is beyond the ability of the native sprintf().
port/snprintf() is only used on systems that lack a native
snprintf().
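For illustration, the kind of call that could overflow the old code on
such systems (the values are hypothetical):

    char    buf[32];

    /* a precision beyond what the native sprintf() handles had to be
     * zero-padded by port/snprintf() itself, and the padding code could
     * overrun its local fixed-size buffer */
    snprintf(buf, sizeof(buf), "%.500f", 3.14);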
Reported by Bruce Momjian. Patch by Tom Lane. Backpatch to all
supported versions.
Security: CVE-2015-0242
Previously very long localized month and weekday strings could
overflow the allocated buffers, causing a server crash.
Reported and patch reviewed by Noah Misch. Backpatch to all
supported versions.
Security: CVE-2015-0241
Previously very long field masks for floats could access memory
beyond the existing buffer allocated to hold the result.
Reported by Andres Freund and Peter Geoghegan. Backpatch to all
supported versions.
Security: CVE-2015-0241
The previous wording claimed that the file was always in /etc, but of
course this varies with the installation layout. Write instead that it
can be found via `pg_config --sysconfdir`. Even though this is still
somewhat incorrect because it doesn't account for moved installations, it
at least conveys that the location depends on the installation.
"ECHO all" is ignored for interactive input, and has been for a very long
time, though possibly not for as long as the documentation has claimed the
opposite. Fix that, and also note that empty lines aren't echoed, which
while dubious is another longstanding behavior (it's embedded in our
regression test files for one thing). Per bug #12721 from Hans Ginzel.
In HEAD, also improve the code comments in this area, and suppress an
unnecessary fflush(stdout) when we're not echoing. That would likely
be safe to back-patch, but I'll not risk it mere hours before a release
wrap.
Coverity points out that mdc_finish returns a pointer to a local buffer
(which of course is gone as soon as the function returns), leaving open
a risk of misbehaviors possibly as bad as a stack overwrite.
In reality, the only possible call site is in process_data_packets()
which does not examine the returned pointer at all. So there's no
live bug, but nonetheless the code is confusing and risky. Refactor
to avoid the issue by letting process_data_packets() call mdc_finish()
directly instead of going through the pullf_read() API.
Although this is only cosmetic, it seems good to back-patch so that
the logic in pgp-decrypt.c stays in sync across all branches.
Marko Kreen
calc_rangesel() failed outright when comparing range variables to empty
constant ranges with < or >=, as a result of missing cases in a switch.
It also produced a bogus estimate for > comparison to an empty range.
On top of that, the >= and > cases were mislabeled throughout. For
nonempty constant ranges, they managed to produce the right answers
anyway as a result of counterbalancing typos.
Also, default_range_selectivity() omitted cases for elem <@ range,
range &< range, and range &> range, so that rather dubious defaults
were applied for these operators.
In passing, rearrange the code in rangesel() so that the elem <@ range
case is handled in a less opaque fashion.
Report and patch by Emre Hasegeli, some additional work by me
Since 9.3, when the --jobs option was introduced, using it together
with the --serializable-deferrable option generated multiple
errors. We can get correct behavior by allowing the connection
which acquires the snapshot to use SERIALIZABLE, READ ONLY,
DEFERRABLE and pass that to the workers running the other
connections using REPEATABLE READ, READ ONLY. This is a bit of a
kluge since the SERIALIZABLE behavior is achieved by running some
of the participating connections at a different isolation level,
but it is a simple and safe change, suitable for back-patching.
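Schematically, in pg_dump's C code (the snapshot id shown is a made-up
example):

    /* leader connection acquires the real serializable snapshot */
    ExecuteSqlStatement(AH, "BEGIN TRANSACTION ISOLATION LEVEL "
                        "SERIALIZABLE, READ ONLY, DEFERRABLE");

    /* each worker imports that snapshot at REPEATABLE READ */
    ExecuteSqlStatement(AH, "BEGIN TRANSACTION ISOLATION LEVEL "
                        "REPEATABLE READ, READ ONLY");
    ExecuteSqlStatement(AH, "SET TRANSACTION SNAPSHOT '00000003-0000001B-1'");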
This will be followed by a proposal for a more invasive fix with
some slight behavioral changes on just the master branch, based on
suggestions from Andres Freund, but the kluge will be applied to
master until something is agreed along those lines.
Back-patched to 9.3, where the --jobs option was added.
Based on report from Alexander Korotkov
In 804b6b6db4 we modified
BuildIndexValueDescription to pay attention to which columns are visible
to the user, but unfortunately that commit neglected to consider indexes
which are built on expressions.
Handle error-reporting of violations of constraint indexes based on
expressions by not returning any detail when the user does not have
table-level SELECT rights.
Backpatch to 9.0, as the prior commit was.
Pointed out by Tom.
connectby() didn't adequately check that the constructed SQL query returns
what it's expected to; in fact, since commit 08c33c426b it wasn't
checking that at all. This could result in a null-pointer-dereference
crash if the constructed query returns only one column instead of the
expected two. Less excitingly, it could also result in surprising data
conversion failures if the constructed query returned values that were
not I/O-conversion-compatible with the types specified by the query
calling connectby().
In all branches, insist that the query return at least two columns;
this seems like a minimal sanity check that can't break any reasonable
use-cases.
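A sketch of that check (illustrative, not the exact committed code):

    /* after executing the constructed query via SPI */
    if (SPI_tuptable->tupdesc->natts < 2)
        ereport(ERROR,
                (errcode(ERRCODE_DATATYPE_MISMATCH),
                 errmsg("invalid return type"),
                 errdetail("Query must return at least two columns.")));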
In HEAD, insist that the constructed query return the types specified by
the outer query, including checking for typmod incompatibility, which the
code never did even before it got broken. This is to hide the fact that
the implementation does a conversion to text and back; someday we might
want to improve that.
In back branches, leave that alone, since adding a type check in a minor
release is more likely to break things than make people happy. Type
inconsistencies will continue to work so long as the actual type and
declared type are I/O representation compatible, and otherwise will fail
the same way they used to.
Also, in all branches, be on guard for NULL results from the constructed
query, which formerly would cause null-pointer dereference crashes.
We now print the row with the NULL but don't recurse down from it.
In passing, get rid of the rather pointless idea that
build_tuplestore_recursively() should return the same tuplestore that's
passed to it.
Michael Paquier, adjusted somewhat by me
GetLockConflicts() has for a long time not properly terminated the
returned array. During normal processing the returned array is zero
initialized which, while not pretty, is sufficient to be recognized as
an invalid virtual transaction id. But the HotStandby case is more than
aesthetically broken: the allocated (and reused) array is neither
zeroed upon allocation, nor reinitialized, nor terminated.
Without a terminating element, the end of the array is not recognized,
and recovery conflict handling will thus read ahead into adjacent
memory, stopping only upon hitting memory content that happens to look
like an invalid virtual transaction id. Luckily this seems so far not
to have caused significant problems, besides making recovery conflict
handling more expensive.
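The fix is to write an explicit terminator every time, along these
lines (sketch):

    /* terminate the array with an invalid virtual transaction id, so
     * readers know where to stop */
    vxids[count].backendId = InvalidBackendId;
    vxids[count].localTransactionId = InvalidLocalTransactionId;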
Discussion: 20150127142713.GD29457@awork2.anarazel.de
Backpatch into all supported branches.
When gin_fuzzy_search_limit was used, we could jump out of startScan()
without calling startScanKey(). That was harmless in 9.3 and below, because
startScanKey() didn't do anything interesting, but in 9.4 it initializes
information needed for skipping entries (aka GIN fast scans), and you
readily get a segfault if it's not done. Nevertheless, it was clearly wrong
all along, so backpatch all the way to 9.1 where the early return was
introduced.
(AFAICS startScanKey() did nothing useful in 9.3 and below, because the
fields it initialized were already initialized in ginFillScanKey(), but I
don't dare to change that in a minor release. ginFillScanKey() is always
called in gingetbitmap() even though there's a check there to see if the
scan keys have already been initialized, because they never are; ginrescan()
frees them.)
In passing, remove an unnecessary if-check from the second inner loop in
startScan(). We already check in the first loop that the condition is true
for all entries.
Reported by Olaf Gawenda, bug #12694, Backpatch to 9.1 and above, although
AFAICS it causes a live bug only in 9.4.
Commit 804b6b6db4 added the build of a
range table in copy.c to initialize the EState es_range_table since it
can be needed in error paths. Unfortunately, that commit didn't
appreciate that some code paths might end up not initializing the rte
which is used to build the range table.
Fix that and clean up a couple of other things along the way: build it
only once, and don't explicitly set it on the !is_from path, as it
doesn't make any sense there (cstate is palloc0'd, so this isn't an
issue from an initialization standpoint either).
The prior commit went back to 9.0, but this only goes back to 9.1 as
prior to that the range table build happens immediately after building
the RTE and therefore doesn't suffer from this issue.
Pointed out by Robert.
While building error messages to return to the user,
BuildIndexValueDescription, ExecBuildSlotValueDescription and
ri_ReportViolation would happily include the entire key or entire row in
the result returned to the user, even if the user didn't have access to
view all of the columns being included.
Instead, include only those columns which the user is providing or which
the user has select rights on. If the user does not have any rights
to view the table or any of the columns involved then no detail is
provided and a NULL value is returned from BuildIndexValueDescription
and ExecBuildSlotValueDescription. Note that, for key cases, the user
must have access to all of the columns for the key to be shown; a
partial key will not be returned.
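The per-column test amounts to something like this (sketch):

    /* a column may appear in the error detail only if the user could
     * have SELECTed it anyway, via table or column privileges */
    if (pg_table_aclcheck(relid, GetUserId(), ACL_SELECT) != ACLCHECK_OK &&
        pg_attribute_aclcheck(relid, attnum, GetUserId(),
                              ACL_SELECT) != ACLCHECK_OK)
        return NULL;            /* suppress the detail entirely */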
Back-patch all the way, as column-level privileges are now in all
supported versions.
This has been assigned CVE-2014-8161, but since the issue and the patch
have already been publicized on pgsql-hackers, there's no point in trying
to hide this commit.
Commit 145343534c arranged to store numeric
NaN values as short-header numerics, but the field access macros did not
get the memo: they thought only "SHORT" numerics have short headers.
Most of the time this makes no difference because we don't access the
weight or dscale of a NaN; but numeric_send does that. As pointed out
by Andrew Gierth, this led to fetching uninitialized bytes.
AFAICS this could not have any worse consequences than that; in particular,
an unaligned stored numeric would have been detoasted by PG_GETARG_NUMERIC,
so that there's no risk of a fetch off the end of memory. Still, the code
is wrong on its own terms, and it's not hard to foresee future changes that
might expose us to real risks. So back-patch to all affected branches.
Some fields of the sinfo struct are modified within PG_TRY and then
referenced within PG_CATCH, so, as with the recent patch to async.c, "volatile"
is necessary for strict POSIX compliance; and that propagates to a couple
of subroutines as well as materializeQueryResult() itself. I think the
risk of actual issues here is probably higher than in async.c, because
storeQueryResult() is likely to get inlined into materializeQueryResult(),
leaving the compiler free to conclude that its stores into sinfo fields are
dead code.
The "callargs" variable is modified within PG_TRY and then referenced
within PG_CATCH, which is exactly the coding pattern we've now found
to be unsafe. Marking "callargs" volatile would be problematic because
it is passed by reference to some Tcl functions, so fix the problem
by not modifying it within PG_TRY. We can just postpone the free()
till we exit the PG_TRY construct, as is already done elsewhere in this
same file.
Also, fix failure to free(callargs) when exiting on too-many-arguments
error. This is only a minor memory leak, but a leak nonetheless.
In passing, remove some unnecessary "volatile" markings in the same
function. Those doubtless are there because gcc 2.95.3 whinged about
them, but we now know that its algorithm for complaining is many bricks
shy of a load.
This is certainly a live bug with compilers that optimize similarly
to current gcc, so back-patch to all active branches.
The "pos" variable is modified within PG_TRY and then referenced
within PG_CATCH, so for strict POSIX conformance it must be marked
volatile. Superficially the code looked safe because pos's address
was taken, which was sufficient to force it into memory ... but it's
not sufficient to ensure that the compiler applies updates exactly
where the program text says to. The volatility marking has to extend
into a couple of subroutines too, but I think that's probably a good
thing because the risk of out-of-order updates is mostly in those
subroutines, not asyncQueueReadAllNotifications() itself. In principle
the compiler could have re-ordered operations such that an error could
be thrown while "pos" had an incorrect value.
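The hazardous pattern, schematically (the helper names are hypothetical):

    volatile int pos = 0;       /* must be volatile: set in PG_TRY,
                                 * read in PG_CATCH */
    PG_TRY();
    {
        pos = advance_position(pos);      /* modified inside PG_TRY */
        risky_operation_that_may_error(); /* may longjmp to PG_CATCH */
    }
    PG_CATCH();
    {
        /* without volatile, "pos" may be a stale register copy here */
        cleanup_after_error(pos);
        PG_RE_THROW();
    }
    PG_END_TRY();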
It's unclear how real the risk is here, but for safety back-patch
to all active branches.
strncpy() has a well-deserved reputation for being unsafe, so make an
effort to get rid of nearly all occurrences in HEAD.
A large fraction of the remaining uses were passing length less than or
equal to the known strlen() of the source, in which case no null-padding
can occur and the behavior is equivalent to memcpy(), though doubtless
slower and certainly harder to reason about. So just use memcpy() in
these cases.
In other cases, use either StrNCpy() or strlcpy() as appropriate (depending
on whether padding to the full length of the destination buffer seems
useful).
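Schematically, the two replacement idioms (buffer names are illustrative):

    char    dst[NAMEDATALEN];

    /* len <= strlen(src): no null-padding can happen, so this is
     * exactly what strncpy(dst, src, len) did, minus the confusion */
    memcpy(dst, src, len);

    /* where silent truncation is wanted, strlcpy() always terminates */
    strlcpy(dst, src, sizeof(dst));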
I left a few strncpy() calls alone in the src/timezone/ code, to keep it
in sync with upstream (the IANA tzcode distribution). There are also a
few such calls in ecpg that could possibly do with more analysis.
AFAICT, none of these changes are more than cosmetic, except for the four
occurrences in fe-secure-openssl.c, which are in fact buggy: an overlength
source leads to a non-null-terminated destination buffer and ensuing
misbehavior. These don't seem like security issues, first because no stack
clobber is possible and second because if your values of sslcert etc are
coming from untrusted sources then you've got problems way worse than this.
Still, it's undesirable to have unpredictable behavior for overlength
inputs, so back-patch those four changes to all active branches.
Move random() and setseed() to a separate table, to have them grouped
together. Also add a notice that random() is not cryptographically secure.
Back-patch of commit 75fdcec145 into
all supported versions, per discussion of the need to document that
random() is just a wrapper around random(3).
This results in a very substantial reduction in disk space usage during
"make check-world", since that sequence involves creation of numerous
temporary installations. It should also help a bit in the buildfarm, even
though the buildfarm script doesn't create as many temp installations,
because the current script misses deleting some of them; and anyway it
seems better to do this once in one place rather than expecting that
script to get it right every time.
In 9.4 and HEAD, also undo the unwise choice in commit b1aebbb6a8
to report strerror(errno) after a rmtree() failure. rmtree has already
reported that, possibly for multiple failures with distinct errnos; and
what's more, by the time it returns there is no good reason to assume
that errno still reflects the last reportable error. So reporting errno
here is at best redundant and at worst badly misleading.
Back-patch to all supported branches, so that future revisions of the
buildfarm script can rely on this behavior.
Per discussion, change the log level of this message to be LOG not WARNING.
The main point of this change is to avoid causing buildfarm run failures
when the stats collector is exceptionally slow to respond, which it not
infrequently is on some of the smaller/slower buildfarm members.
This change does lose notice to an interactive user when his stats query
is looking at out-of-date stats, but the majority opinion (not necessarily
that of yours truly) is that WARNING messages would probably not get
noticed anyway on heavily loaded production systems. A LOG message at
least ensures that the problem is recorded somewhere where bulk auditing
for the issue is possible.
Also, instead of an untranslated "pgstat wait timeout" message, provide
a translatable and hopefully more understandable message "using stale
statistics instead of current ones because stats collector is not
responding". The original text was written hastily under the assumption
that it would never really happen in practice, which we now know to be
unduly optimistic.
Back-patch to all active branches, since we've seen the buildfarm issue
in all branches.
pg_dump.c:dumpDatabase() called ArchiveEntry() with the results of a
query that had been PQclear()ed a couple of lines earlier.
Backpatch to 9.2 where security labels for shared objects were
introduced.
Previously, the xml value resulting from an xpath query would not have
namespace declarations if the namespace declarations were attached to
an ancestor element in the input xml value. That means the output value
was not correct XML. Fix that by running the result value through
xmlCopyNode(), which produces the correct namespace declarations.
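The fix boils down to one libxml2 call (sketch):

    /* extended=1 requests a recursive copy, which pulls in namespace
     * declarations inherited from ancestor elements */
    copy = xmlCopyNode(cur, 1);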
Author: Ali Akbar <the.apaan@gmail.com>
The previous fix mapped the "Norwegian (Bokmål)" locale, which contains a non-ASCII
character, to the pure ASCII alias "norwegian-bokmal". However, it turns
out that more recent versions of the CRT library, in particular MSVCR110
(Visual Studio 2012), changed the behaviour of setlocale() so that if
you pass "norwegian-bokmal" to setlocale, it returns "Norwegian_Norway".
That meant trouble when setlocale(..., NULL) first returned
"Norwegian (Bokmål)_Norway", which we mapped to "norwegian-bokmal_Norway",
but another call to setlocale(..., "norwegian-bokmal_Norway") returned
"Norwegian_Norway". That caused PostgreSQL to think that they are different
locales, and therefore not compatible. That caused initdb to fail at
CREATE DATABASE.
Older CRT versions seem to accept "Norwegian_Norway" too, so change the
mapping to return "Norwegian_Norway" instead of "norwegian-bokmal".
Backpatch to 9.2 like the previous attempt. We haven't made a release that
includes the previous fix yet, so we don't need to worry about changing the
locale of existing clusters from "norwegian-bokmal" to "Norwegian_Norway".
(Doing any mapping like this at all requires changing the locale of
existing databases; the release notes need to include instructions for
that).
Commit 894459e59f revealed this option to
be broken for NLS builds on Darwin, but "make -C contrib/unaccent check"
and the buildfarm client rely on it. Fix that configuration by
redefining the option to imply LANG=C on Darwin. In passing, use LANG=C
instead of LANG=en on Windows; since only postmaster startup uses that
value, testers are unlikely to notice the change. Back-patch to 9.0,
like the predecessor commit.
Up to now, the "child" executor state trees generated for EvalPlanQual
rechecks have simply shared the ResultRelInfo arrays used for the original
execution tree. However, this leads to dangling-pointer problems, because
ExecInitModifyTable() is all too willing to scribble on some fields of the
ResultRelInfo(s) even when it's being run in one of those child trees.
This trashes those fields from the perspective of the parent tree, because
even if the generated subtree is logically identical to what was in use in
the parent, it's in a memory context that will go away when we're done
with the child state tree.
We do however want to share information in the direction from the parent
down to the children; in particular, fields such as es_instrument *must*
be shared or we'll lose the stats arising from execution of the children.
So the simplest fix is to make a copy of the parent's ResultRelInfo array,
but not copy any fields back at end of child execution.
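Roughly (sketch):

    /* give the child tree a private copy of the array; fields scribbled
     * on there must not be seen by the parent tree */
    estate->es_result_relations = (ResultRelInfo *)
        palloc(parentestate->es_num_result_relations * sizeof(ResultRelInfo));
    memcpy(estate->es_result_relations,
           parentestate->es_result_relations,
           parentestate->es_num_result_relations * sizeof(ResultRelInfo));
    estate->es_num_result_relations = parentestate->es_num_result_relations;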
Per report from Manuel Kniep. The added isolation test is based on his
example. In an unpatched memory-clobber-enabled build it will reliably
fail with "ctid is NULL" errors in all branches back to 9.1, as a
consequence of junkfilter->jf_junkAttNo being overwritten with $7f7f.
This test cannot be run as-is before that for lack of WITH syntax; but
I have no doubt that some variant of this problem can arise in older
branches, so apply the code change all the way back.
As of 9.3, ruleutils.c goes to some lengths to ensure that table and column
aliases used in its output are unique. Of course this takes more time than
was required before, which in itself isn't fatal. However, EXPLAIN was set
up so that recalculation of the unique aliases was repeated for each
subexpression printed in a plan. That results in O(N^2) time and memory
consumption for large plan trees, which did not happen in older branches.
Fortunately, the expensive work is the same across a whole plan tree,
so there is no need to repeat it; we can do most of the initialization
just once per query and re-use it for each subexpression. This buys
back most (not all) of the performance loss since 9.2.
We need an extra ExplainState field to hold the precalculated deparse
context. That's no problem in HEAD, but in the back branches, expanding
sizeof(ExplainState) seems risky because third-party extensions might
have local variables of that struct type. So, in 9.4 and 9.3, introduce
an auxiliary struct to keep sizeof(ExplainState) the same. We should
refactor the APIs to avoid such local variables in future, but that's
material for a separate HEAD-only commit.
Per gripe from Alexey Bashtanov. Back-patch to 9.3 where the issue
was introduced.
Previously, read() might have returned a length equal to the buffer
length, and then the subsequent store to buf[len] would write a
zero-byte one byte past the end. This doesn't seem likely to be
a security issue, but there's some chance it could result in
pg_standby misbehaving.
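Schematically (buffer and descriptor names are illustrative):

    char    buf[BUFSIZ];
    int     len;

    /* before: when read() fills the buffer, buf[len] is out of bounds */
    len = read(fd, buf, sizeof(buf));
    buf[len] = '\0';

    /* after: reserve one byte for the terminator */
    len = read(fd, buf, sizeof(buf) - 1);
    if (len >= 0)
        buf[len] = '\0';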
Spotted by Coverity; patch by Michael Paquier, reviewed by me.
Commit b94ce6e80 reordered postmaster's startup sequence so that the
tempfile directory is only cleaned up after all the necessary state
for pg_ctl is collected. Unfortunately the chosen location is after
the syslogger has been started, which normally is fine except for
!WIN32 EXEC_BACKEND builds, which pass information to children via
files in the temp directory.
Move the call to RemovePgTempFiles() to just before the syslogger is
started; that's the first child we fork.
Luckily EXEC_BACKEND is pretty much only used by end users on Windows,
which has a separate method to pass information to children. That
means the real world impact of this bug is very small.
Discussion: 20150113182344.GF12272@alap3.anarazel.de
Backpatch to 9.1, just as the previous commit was.
I noticed the "vacuum" regression test taking really significantly longer
than it used to on a slow machine. Investigation pointed the finger at
commit e415b469b3, which added creation of
an index using an extremely expensive index function. That function was
evidently meant to be applied only twice ... but the test re-used an
existing test table, which up till a couple lines before that had had over
two thousand rows. Depending on timing of the concurrent regression tests,
the intervening VACUUMs might have been unable to remove those
recently-dead rows, and then the index build would need to create index
entries for them too, leading to the wrap_do_analyze() function being
executed 2000+ times not twice. Avoid this by using a different table
that is guaranteed to have only the intended two rows in it.
Back-patch to 9.0, like the commit that created the problem.
The mechanism added in commit dbdf9679d7
for associating the correct translation domain with errcontext strings
potentially fails in cases where errcontext() is used within an ereport()
macro. Such usage was not originally envisioned for errcontext(), but we
do have a few places that do it. In this situation, the intended comma
expression becomes just a couple of arguments to errfinish(), which the
compiler might choose to evaluate right-to-left.
Fortunately, in such cases the textdomain for the errcontext string must
be the same as for the surrounding ereport. So we can fix this by letting
errstart initialize context_domain along with domain; then it will have
the correct value no matter which order the calls occur in. (Note that
error stack callback functions are not invoked until errfinish, so normal
usage of errcontext won't affect what happens for errcontext calls within
the ereport macro.)
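In errstart, the fix is essentially (sketch):

    /* set both domains up front, so an errcontext() evaluated before
     * errfinish() still picks up the right translation domain */
    edata->domain = domain ? domain : PG_TEXTDOMAIN("postgres");
    edata->context_domain = edata->domain;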
In passing, make sure that errcontext calls within the main backend set
context_domain to something non-NULL. This isn't a live bug because
NULL would select the current textdomain() setting which should be the
right thing anyway --- but it seems better to handle this completely
consistently with the regular domain field.
Per report from Dmitry Voronin. Backpatch to 9.3; before that, there
wasn't any attempt to ensure that errcontext strings were translated
in an appropriate domain.
Back in ed0b409, PGPROC was split and moved to static variables in
procarray.c, with procs in ProcArrayStruct replaced by an array of
integers representing process numbers (pgprocnos), with -1 indicating a
dead process which has yet to be removed. Access to procArray is
generally done under ProcArrayLock and therefore most code does not have
to concern itself with -1 entries.
However, MinimumActiveBackends intentionally does not take
ProcArrayLock, which means it has to be extra careful when accessing
procArray. Prior to ed0b409, this was handled by checking for a NULL
in the pointer array, but that check was no longer valid after the
split. Coverity pointed out that the check could never happen and so
it was removed in 5592eba. That didn't make anything worse, but it
didn't fix the issue either.
The correct fix is to check for pgprocno == -1 and skip over that entry
if it is encountered.
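That is, roughly (sketch):

    for (index = 0; index < arrayP->numProcs; index++)
    {
        int     pgprocno = arrayP->pgprocnos[index];

        /* skip over dead entries, which are marked with -1 */
        if (pgprocno == -1)
            continue;

        proc = &allProcs[pgprocno];
        /* ... existing checks on proc ... */
    }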
Back-patch to 9.2, since there can be attempts to access the arrays
prior to their start otherwise. Note that the changes prior to 9.4 will
look a bit different due to the change in 5592eba.
Note that MinimumActiveBackends only returns a bool for heuristic
purposes and any pre-array accesses are strictly read-only, so there
is no security implication, and the lack of complaints from the field
indicates it's very unlikely to run into issues due to this.
Pointed out by Noah.
For some reason I neglected to specify in GETTEXT_TRIGGERS which
argument should be read by gettext in 7fcbf6a405. This
will drop the translation percentages for the backend all the way back
to 9.3 ...
Problem reported by Heikki.
Darwin --enable-nls builds use a substitute setlocale() that may start a
thread. Buildfarm member orangutan experienced BackendList corruption
on account of different postmaster threads executing signal handlers
simultaneously. Furthermore, a multithreaded postmaster risks undefined
behavior from sigprocmask() and fork(). Emit LOG messages about the
problem and its workaround. Back-patch to 9.0 (all supported versions).
Typical server invocations already achieved that. Invalid locale
settings in the initial postmaster environment interfered, as could
malloc() failure. Setting "LC_MESSAGES=pt_BR.utf8 LC_ALL=invalid" in
the postmaster environment will now choose C-locale messages, not
Brazilian Portuguese messages. Most localized programs, including all
PostgreSQL frontend executables, do likewise. Users are unlikely to
observe changes involving locale categories other than LC_MESSAGES.
CheckMyDatabase() ensures that we successfully set LC_COLLATE and
LC_CTYPE; main() sets the remaining three categories to locale "C",
which almost cannot fail. Back-patch to 9.0 (all supported versions).
vacuum()'s static variable handling makes it non-reentrant; an ensuing
null pointer dereference crashed the backend. Back-patch to 9.0 (all
supported versions).
The corner case where a relcache invalidation tried to rebuild the
entry for a referenced relation but couldn't find it in the catalog
was not handled correctly.
The code tried to RelationCacheDelete/RelationDestroyRelation the
entry. That didn't work when assertions are enabled because the latter
contains an assertion ensuring the refcount is zero. It's also more
generally a bad idea, because by virtue of being referenced somebody
might actually look at the entry, which is possible if the error is
trapped and handled via a subtransaction abort.
Instead just error out, without deleting the entry. As the entry is
marked invalid, the worst that can happen is that the invalid (and at
some point unused) entry lingers in the relcache.
Discussion: 22459.1418656530@sss.pgh.pa.us
There should be no way to hit this case before 9.4, where logical
decoding introduced a bug that can hit it. But since the code for
handling the corner case is there, it should do something halfway sane,
so backpatch all the way back. The logical decoding bug will be
handled in a separate commit.
This never worked, I think. Per report from Marc Munro.
In passing, fix funny spacing in the COMMENT ON command as a result of
excess space in the "label" string.