This reverts commit 16304a0134, except
for its changes in src/port/snprintf.c; as well as commit
cac18a76bb which is no longer needed.
Fujii Masao reported that the previous commit caused failures in psql on
OS X, since if one exits the pager program early while viewing a query
result, psql sees an EPIPE error from fprintf --- and the wrapper function
thought that was reason to panic. (It's a bit surprising that the same
does not happen on Linux.) Further discussion among the security list
concluded that the risk of other such failures was far too great, and
that the one-size-fits-all approach to error handling embodied in the
previous patch is unlikely to be workable.
This leaves us again exposed to the possibility of the type of failure
envisioned in CVE-2015-3166. However, that failure mode is strictly
hypothetical at this point: there is no concrete reason to believe that
an attacker could trigger information disclosure through the supposed
mechanism. In the first place, the attack surface is fairly limited,
since so much of what the backend does with format strings goes through
stringinfo.c or psprintf(), and those already had adequate defenses.
In the second place, even granting that an unprivileged attacker could
control the occurrence of ENOMEM with some precision, it's a stretch to
believe that he could induce it just where the target buffer contains some
valuable information. So we concluded that the risk of non-hypothetical
problems induced by the patch greatly outweighs the security risks.
We will therefore revert, and instead undertake closer analysis to
identify specific calls that may need hardening, rather than attempt a
universal solution.
We have kept the portion of the previous patch that improved snprintf.c's
handling of errors when it calls the platform's sprintf(). That seems to
be an unalloyed improvement.
Security: CVE-2015-3166
The point of the assertion is to ensure that the arrays allocated in stack
are large enough, but the check was one item short.
This won't matter in practice because MaxIndexTuplesPerPage is an
overestimate, so you can't have that many items on a page in reality.
But let's be tidy.
Spotted by Anastasia Lubennikova. Backpatch to all supported versions, like
the patch that added the assertion.
In 9.1 and earlier, it is possible for index_getnext() to try to examine
a heap buffer for possible HOT-prune when in recovery; this causes a
problem when a multixact is found in a tuple's Xmax, because
GetMultiXactIdMembers refuses to run when in recovery, raising an error:
ERROR: cannot GetMultiXactIdMembers() during recovery
This can be solved easily by having MultiXactIdIsRunning always return
false when in recovery, which is reasonable because a HOT standby cannot
acquire further tuple locks nor update/delete tuples.
(Note: it doesn't look like this specific code path has a problem in
9.2, because instead of doing HeapTupleSatisfiesUpdate directly,
heap_hot_search_buffer uses HeapTupleIsSurelyDead instead. Still, there
may be other paths affected by the same bug, for instance in pgrowlocks,
and the multixact code hasn't changed; so apply the same fix
throughout.)
Apply this fix to 9.0 through 9.2. In 9.3 the multixact code has been
changed completely and is no longer subject to this problem.
Per report from Marko Tiikkaja,
https://www.postgresql.org/message-id/54EB3283.2080305@joh.to
Analysis by Andres Freund
This has been the predominant outcome. When the output of decrypting
with a wrong key coincidentally resembled an OpenPGP packet header,
pgcrypto could instead report "Corrupt data", "Not text data" or
"Unsupported compression algorithm". The distinct "Corrupt data"
message added no value. The latter two error messages misled when the
decrypted payload also exhibited fundamental integrity problems. Worse,
error message variance in other systems has enabled cryptologic attacks;
see RFC 4880 section "14. Security Considerations". Whether these
pgcrypto behaviors are likewise exploitable is unknown.
In passing, document that pgcrypto does not resist side-channel attacks.
Back-patch to 9.0 (all supported versions).
Security: CVE-2015-3167
PostgreSQL already checked the vast majority of these, missing this
handful that nearly cannot fail. If putenv() failed with ENOMEM in
pg_GSS_recvauth(), authentication would proceed with the wrong keytab
file. If strftime() returned zero in cache_locale_time(), using the
unspecified buffer contents could lead to information exposure or a
crash. Back-patch to 9.0 (all supported versions).
Other unchecked calls to these functions, especially those in frontend
code, pose negligible security concern. This patch does not address
them. Nonetheless, it is always better to check return values whose
specification provides for indicating an error.
In passing, fix an off-by-one error in strftime_win32()'s invocation of
WideCharToMultiByte(). Upon retrieving a value of exactly MAX_L10N_DATA
bytes, strftime_win32() would overrun the caller's buffer by one byte.
MAX_L10N_DATA is chosen to exceed the length of every possible value, so
the vulnerable scenario probably does not arise.
Security: CVE-2015-3166
All known standard library implementations of these functions can fail
with ENOMEM. A caller neglecting to check for failure would experience
missing output, information exposure, or a crash. Check return values
within wrappers and code, currently just snprintf.c, that bypasses the
wrappers. The wrappers do not return after an error, so their callers
need not check. Back-patch to 9.0 (all supported versions).
Popular free software standard library implementations do take pains to
bypass malloc() in simple cases, but they risk ENOMEM for floating point
numbers, positional arguments, large field widths, and large precisions.
No specification demands such caution, so this commit regards every call
to a printf family function as a potential threat.
Injecting the wrappers implicitly is a compromise between patch scope
and design goals. I would prefer to edit each call site to name a
wrapper explicitly. libpq and the ECPG libraries would, ideally, convey
errors to the caller rather than abort(). All that would be painfully
invasive for a back-patched security fix, hence this compromise.
Security: CVE-2015-3166
Reentering this function with the right timing caused a double free,
typically crashing the backend. By synchronizing a disconnection with
the authentication timeout, an unauthenticated attacker could achieve
this somewhat consistently. Call be_tls_close() solely from within
proc_exit_prepare(). Back-patch to 9.0 (all supported versions).
Benkocs Norbert Attila
Security: CVE-2015-3165
Previously, this prevented promoted standby servers from being upgraded
because of a missing WAL history file. (Timeline 1 doesn't need a
history file, and we don't copy WAL files anyway.)
Report by Christian Echerer(?), Alexey Klyukin
Backpatch through 9.0
This patch causes pg_upgrade to error out during its check phase if:
(1) template0 is marked connectable
or
(2) any other database is marked non-connectable
This is done because, in the first case, pg_upgrade would fail because
the pg_dumpall --globals restore would fail, and in the second case, the
database would not be restored, leading to data loss.
Report by Matt Landry (1), Stephen Frost (2)
Backpatch through 9.0
DST law changes in Egypt, Mongolia, Palestine.
Historical corrections for Canada and Chile.
Revised zone abbreviation for America/Adak (HST/HDT not HAST/HADT).
Commit 81c45081 introduced a new RBM_ZERO_AND_LOCK mode to ReadBuffer, which
takes a lock on the buffer before zeroing it. However, you cannot take a
lock on a local buffer, and you got a segfault instead. The version of that
patch committed to master included a check for !isLocalBuf, and therefore
didn't crash, but oddly I missed that in the back-patched versions. This
patch adds that check to the back-branches too.
RBM_ZERO_AND_LOCK mode is only used during WAL replay, and in hash indexes.
WAL replay only deals with shared buffers, so the only way to trigger the
bug is with a temporary hash index.
Reported by Artem Ignatyev, analysis by Tom Lane.
If a row that potentially violates a deferred exclusion constraint is
HOT-updated later in the same transaction, the exclusion constraint would
be reported as violated when the check finally occurs, even if the row(s)
the new row originally conflicted with have since been removed. This
happened because the wrong TID was passed to check_exclusion_constraint(),
causing the live HOT-updated row to be seen as a conflicting row rather
than recognized as the row-under-test.
Per bug #13148 from Evan Martin. It's been broken since exclusion
constraints were invented, so back-patch to all supported branches.
As discussed, the default setting of include_realm=0 can be dangerous in
multi-realm environments because it is then impossible to differentiate
users with the same username but who are from two different realms.
Recommend include_realm=1 and note that the default setting may change
in a future version of PostgreSQL and therefore users may wish to
explicitly set include_realm to avoid issues while upgrading.
The Service Control Manager should be notified regularly during a shutdown
that takes a long time. Previously we would increaes the counter, but forgot
to actually send the notification to the system. The loop counter was also
incorrectly initalized in the event that the startup of the system took long
enough for it to increase, which could cause the shutdown process not to wait
as long as expected.
Krystian Bigaj, reviewed by Michael Paquier
pg_win32_is_junction() was a typo for pgwin32_is_junction(). open()
was used not only in a two-argument form, which breaks on Windows,
but also where BasicOpenFile() should have been used.
Per reports from Andrew Dunstan and David Rowley.
Otherwise, if there's another crash, some writes from after the first
crash might make it to disk while writes from before the crash fail
to make it to disk. This could lead to data corruption.
Back-patch to all supported versions.
Abhijit Menon-Sen, reviewed by Andres Freund and slightly revised
by me.
An outer join appearing within the RHS of an antijoin can't commute with
the antijoin, but somehow I missed teaching make_outerjoininfo() about
that. In Teodor Sigaev's recent trouble report, this manifests as a
"could not find RelOptInfo for given relids" error within eqjoinsel();
but I think silently wrong query results are possible too, if the planner
misorders the joins and doesn't happen to trigger any internal consistency
checks. It's broken as far back as we had antijoins, so back-patch to all
supported branches.
Each of the libraries incorporates src/port files, which often check
FRONTEND. Build systems disagreed on whether to build libpgtypes this
way. Only libecpg incorporates files that rely on it today. Back-patch
to 9.0 (all supported versions) to forestall surprises.
When the startup process recovers transactions by scanning pg_twophase
directory, it should clear MyLockedGxact after it's done processing each
transaction. Like we do during normal operation, at PREPARE TRANSACTION.
Otherwise, if the startup process exits due to an error, it will try to
clear the locking_backend field of the last recovered transaction. That's
usually harmless, but if the error happens in MarkAsPreparing, while
holding TwoPhaseStateLock, the shmem-exit hook will try to acquire
TwoPhaseStateLock again, and deadlock with itself.
This fixes bug #13128 reported by Grant McAlister. The bug was introduced
by commit bb38fb0d, so backpatch to all supported versions like that
commit.
SLRU_SEGMENTS_PER_PAGE -> SLRU_PAGES_PER_SEGMENT
I introduced this ancient typo in subtrans.c and later propagated it to
multixact.c. I fixed the latter in f741300c, but only back to 9.3;
backpatch to all supported branches for consistency.
After a timeline switch, we would leave behind recycled WAL segments that
are in the future, but on the old timeline. After promotion, and after they
become old enough to be recycled again, we would notice that they don't have
a .ready or .done file, create a .ready file for them, and archive them.
That's bogus, because the files contain garbage, recycled from an older
timeline (or prealloced as zeros). We shouldn't archive such files.
This could happen when we're following a timeline switch during replay, or
when we switch to new timeline at end-of-recovery.
To fix, whenever we switch to a new timeline, scan the data directory for
WAL segments on the old timeline, but with a higher segment number, and
remove them. Those don't belong to our timeline history, and are most
likely bogus recycled or preallocated files. They could also be valid files
that we streamed from the primary ahead of time, but in any case, they're
not needed to recover to the new timeline.
It was previously possible to have the launcher re-execute its main loop
before shutting down if some other signal was received or an error
occurred after getting SIGTERM, as reported by Qingqing Zhou.
While investigating, Tom Lane further noticed that if autovacuum had
been disabled in the config file, it would misbehave by trying to start
a new worker instead of bailing out immediately -- it would consider
itself as invoked in emergency mode.
Fix both problems by checking the shutdown flag in a few more places.
These problems have existed since autovacuum was introduced, so
backpatch all the way back.
While gcc doesn't complain if you declare a function "static" and then
define it not-static, other compilers do; and in any case the code is
highly misleading this way. Add the missing "static" keywords to a
couple of recent patches. Per buildfarm member pademelon.
Previously we would re-use input subexpressions in all expression trees
attached to a Join plan node. However, if it's an outer join and the
subexpression appears in the nullable-side input, this is potentially
incorrect for apparently-matching subexpressions that came from above
the outer join (ie, targetlist and qpqual expressions), because the
executor will treat the subexpression value as NULL when maybe it should
not be.
The case is fairly hard to hit because (a) you need a non-strict
subexpression (else NULL is correct), and (b) we don't usually compute
expressions in the outputs of non-toplevel plan nodes. But we might do
so if the expressions are sort keys for a mergejoin, for example.
Probably in the long run we should make a more explicit distinction between
Vars appearing above and below an outer join, but that will be a major
planner redesign and not at all back-patchable. For the moment, just hack
set_join_references so that it will not match any non-Var expressions
coming from nullable inputs to expressions that came from above the join.
(This is somewhat overkill, in that a strict expression could still be
matched, but it doesn't seem worth the effort to check that.)
Per report from Qingqing Zhou. The added regression test case is based
on his example.
This has been broken for a very long time, so back-patch to all active
branches.
Commit ed9cc2b5df made it unnecessary to pass
start_nblkno to _hash_splitbucket(), and for that matter unnecessary to
have the internal nblkno variable either. My compiler didn't complain
about that, but some did. I also rearranged the use of oblkno a bit to
make that case more parallel.
Report and initial patch by Petr Jelinek, rearranged a bit by me.
Back-patch to all branches, like the previous patch.
psql was already accepting conninfo strings as the first parameter in
\connect, but the way it worked wasn't sane; some of the other
parameters would get the previous connection's values, causing it to
connect to a completely unexpected server or, more likely, not finding
any server at all because of completely wrong combinations of
parameters.
Fix by explicitely checking for a conninfo-looking parameter in the
dbname position; if one is found, use its complete specification rather
than mix with the other arguments. Also, change tab-completion to not
try to complete conninfo/URI-looking "dbnames" and document that
conninfos are accepted as first argument.
There was a weak consensus to backpatch this, because while the behavior
of using the dbname as a conninfo is nowhere documented for \connect, it
is reasonable to expect that it works because it does work in many other
contexts. Therefore this is backpatched all the way back to 9.0.
To implement this, routines previously private to libpq have been
duplicated so that psql can decide what looks like a conninfo/URI
string. In back branches, just duplicate the same code all the way back
to 9.2, where URIs where introduced; 9.0 and 9.1 have a simpler version.
In master, the routines are moved to src/common and renamed.
Author: David Fetter, Andrew Dunstan. Some editorialization by me
(probably earning a Gierth's "Sloppy" badge in the process.)
Reviewers: Andrew Gierth, Erik Rijkers, Pavel Stěhule, Stephen Frost,
Robert Haas, Andrew Dunstan.
You're required to write either RANGE or ROWS to start a frame clause,
but the documentation incorrectly implied this is optional. Noted by
David Johnston.
As with initdb these programs need to run with a restricted token, and
if they don't pg_upgrade will fail when run as a user with Adminstrator
privileges.
Backpatch to all live branches. On the development branch the code is
reorganized so that the restricted token code is now in a single
location. On the stable bramches a less invasive change is made by
simply copying the relevant code to pg_upgrade.c and pg_resetxlog.c.
Patches and bug report from Muhammad Asif Naeem, reviewed by Michael
Paquier, slightly edited by me.
_hash_splitbucket() obtained the base page of the new bucket by calling
_hash_getnewbuf(), but it held no exclusive lock that would prevent some
other process from calling _hash_getnewbuf() at the same time. This is
contrary to _hash_getnewbuf()'s API spec and could in fact cause failures.
In practice, we must only call that function while holding write lock on
the hash index's metapage.
An additional problem was that we'd already modified the metapage's bucket
mapping data, meaning that failure to extend the index would leave us with
a corrupt index.
Fix both issues by moving the _hash_getnewbuf() call to just before we
modify the metapage in _hash_expandtable().
Unfortunately there's still a large problem here, which is that we could
also incur ENOSPC while trying to get an overflow page for the new bucket.
That would leave the index corrupt in a more subtle way, namely that some
index tuples that should be in the new bucket might still be in the old
one. Fixing that seems substantially more difficult; even preallocating as
many pages as we could possibly need wouldn't entirely guarantee that the
bucket split would complete successfully. So for today let's just deal
with the base case.
Per report from Antonin Houska. Back-patch to all active branches.
Slow functions in index expressions might cause this loop to take long
enough to make it worth being cancellable. Probably it would be enough
to call CHECK_FOR_INTERRUPTS here, but for consistency with other
per-sample-row loops in this file, let's use vacuum_delay_point.
Report and patch by Jeff Janes. Back-patch to all supported branches.
ExecOpenScanRelation assumed that any relation listed in the ExecRowMark
list has been locked by InitPlan; but this is not true if the rel's
markType is ROW_MARK_COPY, which is possible if it's a foreign table.
In most (possibly all) cases, failure to acquire a lock here isn't really
problematic because the parser, planner, or plancache would have taken the
appropriate lock already. In principle though it might leave us vulnerable
to working with a relation that we hold no lock on, and in any case if the
executor isn't depending on previously-taken locks otherwise then it should
not do so for ROW_MARK_COPY relations.
Noted by Etsuro Fujita. Back-patch to all active versions, since the
inconsistency has been there a long time. (It's almost certainly
irrelevant in 9.0, since that predates foreign tables, but the code's
still wrong on its own terms.)
It's all very well to claim that a simplistic sort is fast in easy
cases, but O(N^2) in the worst case is not good ... especially if the
worst case is as easy to hit as "descending order input". Replace that
bit with our standard qsort.
Per bug #12866 from Maksym Boguk. Back-patch to all active branches.
GNU readline defines the return value of write_history() as "zero if OK,
else an errno code". libedit's version of that function used to have a
different definition (to wit, "-1 if error, else the number of lines
written to the file"). We tried to work around that by checking whether
errno had become nonzero, but this method has never been kosher according
to the published API of either library. It's reportedly completely broken
in recent Ubuntu releases: psql bleats about "No such file or directory"
when saving ~/.psql_history, even though the write worked fine.
However, libedit has been following the readline definition since somewhere
around 2006, so it seems all right to finally break compatibility with
ancient libedit releases and trust that the return value is what readline
specifies. (I'm not sure when the various Linux distributions incorporated
this fix, but I did find that OS X has been shipping fixed versions since
10.5/Leopard.)
If anyone is still using such an ancient libedit, they will find that psql
complains it can't write ~/.psql_history at exit, even when the file was
written correctly. This is no worse than the behavior we're fixing for
current releases.
Back-patch to all supported branches.
The SGML docs claimed that 1-byte integers could be sent or received with
the "isint" options, but no such behavior has ever been implemented in
pqGetInt() or pqPutInt(). The in-code documentation header for PQfn() was
even less in tune with reality, and the code itself used parameter names
matching neither the SGML docs nor its libpq-fe.h declaration. Do a bit
of additional wordsmithing on the SGML docs while at it.
Since the business about 1-byte integers is a clear documentation bug,
back-patch to all supported branches.
We were using "user mapping for user XYZ" as description for user mappings, but
that's ambiguous because users can have mappings on multiple foreign
servers; therefore change it to "for user XYZ on server UVW" instead.
Object identities for user mappings are also updated in the same way, in
branches 9.3 and above.
The incomplete description string was introduced together with the whole
SQL/MED infrastructure by commit cae565e503 of 8.4 era, so backpatch all
the way back.