The primary role of PL validators is to be called implicitly during
CREATE FUNCTION, but they are also normal functions that a user can call
explicitly. Add a permissions check to each validator to ensure that a
user cannot use explicit validator calls to achieve things he could not
otherwise achieve. Back-patch to 8.4 (all supported versions).
Non-core procedural language extensions ought to make the same two-line
change to their own validators.
Andres Freund, reviewed by Tom Lane and Noah Misch.
Security: CVE-2014-0061
Granting a role without ADMIN OPTION is supposed to prevent the grantee
from adding or removing members from the granted role. Issuing SET ROLE
before the GRANT bypassed that, because the role itself had an implicit
right to add or remove members. Plug that hole by recognizing that
implicit right only when the session user matches the current role.
Additionally, do not recognize it during a security-restricted operation
or during execution of a SECURITY DEFINER function. The restriction on
SECURITY DEFINER is not security-critical. However, it seems best for a
user testing his own SECURITY DEFINER function to see the same behavior
others will see. Back-patch to 8.4 (all supported versions).
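As an illustration only, the tightened rule can be modelled as a small predicate. This is a sketch of the logic described above, not the PostgreSQL source; the function name and all parameter names are hypothetical.

```python
def has_implicit_admin_right(session_user, current_role, target_role,
                             in_security_restricted_op=False,
                             in_security_definer=False):
    """Hypothetical model: may current_role add or remove members of
    target_role without having been granted ADMIN OPTION?"""
    if current_role != target_role:
        return False  # the implicit right only concerns the role itself
    if session_user != current_role:
        return False  # role was reached via SET ROLE: not recognized
    if in_security_restricted_op or in_security_definer:
        return False  # also not recognized in these contexts
    return True
```

Under this model, a session started as the role keeps the implicit right, while a session that reached the role through SET ROLE does not.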
The SQL standards do not conflate roles and users as PostgreSQL does;
only SQL roles have members, and only SQL users initiate sessions. An
application using PostgreSQL users and roles as SQL users and roles will
never attempt to grant membership in the role that is the session user,
so the implicit right to add or remove members will never arise.
The security impact was mostly that a role member could revoke access
from others, contrary to the wishes of his own grantor. Unapproved role
member additions are less notable, because the member can still largely
achieve that by creating a view or a SECURITY DEFINER function.
Reviewed by Andres Freund and Tom Lane. Reported, independently, by
Jonas Sundman and Noah Misch.
Security: CVE-2014-0060
DST law changes in Jordan; historical changes in Cuba.
Also, remove the zones Asia/Riyadh87, Asia/Riyadh88, and Asia/Riyadh89.
Per the upstream announcement:
The files solar87, solar88, and solar89 are no longer distributed.
They were a negative experiment -- that is, a demonstration that
tz data can represent solar time only with some difficulty and error.
Their presence in the distribution caused confusion, as Riyadh
civil time was generally not solar time in those years.
The documentation suggested using "echo | psql", but not the often-superior
alternative of a here-document. Also, be more direct about suggesting
that people avoid -c for multiple commands. Per discussion.
Adjust handleCopyOut() to stop trying to write data once it's failed
one time. For typical cases such as out-of-disk-space or broken-pipe,
additional attempts aren't going to do anything but waste time, and
in any case clean truncation of the output seems like a better behavior
than randomly dropping blocks in the middle.
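A minimal sketch of the "stop writing after the first failure" behaviour, assuming a sink whose write() raises OSError on failure. The names copy_out and FlakySink are hypothetical and do not correspond to psql's actual code.

```python
class FlakySink:
    """Hypothetical sink that fails on one specific write call."""
    def __init__(self, fail_on):
        self.fail_on = fail_on
        self.calls = 0
        self.written = []

    def write(self, data):
        self.calls += 1
        if self.calls == self.fail_on:
            raise OSError("simulated out-of-disk-space")
        self.written.append(data)


def copy_out(chunks, sink):
    """Consume every chunk, but attempt no further writes once one has
    failed, so the output is truncated cleanly. Returns True on success."""
    ok = True
    for chunk in chunks:
        if not ok:
            continue  # keep draining the source, discard the data
        try:
            sink.write(chunk)
        except OSError:
            ok = False
    return ok
```

With a sink that fails on its second write, only the first chunk is written and the third is never attempted, even though that write would have succeeded.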
Also remove dubious (and misleadingly documented) attempt to force our way
out of COPY_OUT state if libpq didn't do that. If we did have a situation
like that, it'd be a bug in libpq and would be better fixed there, IMO.
We can hope that commit fa4440f516 took care
of any such problems, anyway.
Also fix longstanding bug in handleCopyIn(): PQputCopyEnd() only supports
a non-null errormsg parameter in protocol version 3, and will actively
fail if one is passed in version 2. This would've made our attempts
to get out of COPY_IN state after a failure into infinite loops when
talking to pre-7.4 servers.
Back-patch the COPY_OUT state change business back to 9.2 where it was
introduced, and the other two fixes into all supported branches.
We used the length of the input string, not the de-escaped string, as
the trigger for NAMEDATALEN truncation. AFAICS this would only result
in sometimes printing a phony truncation warning; but it's just luck
that there was no worse problem, since we were violating the API spec
for truncate_identifier(). Per bug #9204 from Joshua Yanovski.
This has been wrong since the Unicode-identifier support was added,
so back-patch to all supported branches.
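To illustrate the difference between the two length checks, here is a simplified model. It assumes the default NAMEDATALEN of 64 and handles only four-hex-digit \XXXX escapes; the function names are hypothetical and this is not the scanner code.

```python
import re

NAMEDATALEN = 64  # assumed default build setting


def de_escape(raw):
    """Replace each \\XXXX escape (four hex digits) with its character."""
    return re.sub(r"\\([0-9A-Fa-f]{4})",
                  lambda m: chr(int(m.group(1), 16)), raw)


def buggy_needs_truncation(raw):
    """Old trigger: length of the input string, escapes included."""
    return len(raw) >= NAMEDATALEN


def needs_truncation(raw):
    """Corrected trigger: byte length of the de-escaped identifier."""
    return len(de_escape(raw).encode("utf-8")) >= NAMEDATALEN
```

An identifier written as twenty \0061 escapes is 100 characters of input but only 20 bytes once de-escaped, so the old trigger would report a truncation that never needs to happen.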
We have a practice of providing a "bread crumb" trail between the minor
versions where the migration section actually tells you to do something.
Historically that was just plain text, eg, "see the release notes for
9.2.4"; but if you're using a browser or PDF reader, it's a lot nicer
if it's a live hyperlink. So use "<xref>" instead. Any argument against
doing this vanished with the recent decommissioning of plain-text release
notes.
Vik Fearing
In pqSendSome, if the connection is already closed at entry, discard any
queued output data before returning. There is no possibility of ever
sending the data, and anyway this corresponds to what we'd do if we'd
detected a hard error while trying to send(). This avoids possible
indefinite bloat of the output buffer if the application keeps trying
to send data (or even just keeps trying to do PQputCopyEnd, as psql
indeed will).
Because PQputCopyEnd won't transition out of PGASYNC_COPY_IN state
until it's successfully queued the COPY END message, and pqPutMsgEnd
doesn't distinguish a queuing failure from a pqSendSome failure,
this omission allowed an infinite loop in psql if the connection closure
occurred when we had at least 8K queued to send. It might be worth
refactoring so that we can make that distinction, but for the moment
the other changes made here seem to offer adequate defenses.
To guard against other variants of this scenario, do not allow
PQgetResult to return a PGRES_COPY_XXX result if the connection is
already known dead. Make sure it returns PGRES_FATAL_ERROR instead.
Per report from Stephen Frost. Back-patch to all active branches.
In a database that's not yet reached consistency, it's possible that some
segments of a relation are not full-size but are not the last ones either.
Because of the way smgrnblocks() works, asking for a new page with P_NEW
will fill in the last not-full-size segment --- and if that makes it full
size, the apparent EOF of the relation will increase by more than one page,
so that the next P_NEW request will yield a page past the next consecutive
one. This breaks the relation-extension logic in XLogReadBufferExtended,
possibly allowing a page update to be applied to some page far past where
it was intended to go. This appears to be the explanation for reports of
table bloat on replication slaves compared to their masters, and probably
explains some corrupted-slave reports as well.
Fix the loop to check the page number it actually got, rather than merely
Assert()'ing that dead reckoning got it to the desired place. AFAICT,
there are no other places that make assumptions about exactly which page
they'll get from P_NEW.
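The jump in apparent EOF can be reproduced with a toy model. This is a sketch under simplifying assumptions: segments hold only 4 blocks, the block count stops at the first segment that is not full (as the description of smgrnblocks() above implies), and the function names are hypothetical.

```python
RELSEG_SIZE = 4  # blocks per segment; tiny value chosen for illustration


def nblocks(segments):
    """Apparent relation size: sum segment sizes, stopping at the first
    segment that is not full."""
    total = 0
    for size in segments:
        total += size
        if size < RELSEG_SIZE:
            break
    return total


def extend_with_p_new(segments):
    """Add one block at the apparent EOF and return its block number."""
    blkno = nblocks(segments)
    segno = blkno // RELSEG_SIZE
    if segno == len(segments):
        segments.append(0)  # start a new segment
    segments[segno] += 1
    return blkno
```

Starting from segment sizes [4, 2, 4, 1], three successive extensions return blocks 6, 7 and then 13: once the second segment fills up, the count runs on through the later segments. A loop that only counts iterations would believe it had reached block 8, which is why comparing against the block number actually returned matters.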
Problem identified by Greg Stark, though this is not the same as his
proposed patch.
It's been like this for a long time, so back-patch to all supported
branches.
Providing this information as plain text was doubtless worth the trouble
ten years ago, but it seems likely that hardly anyone reads it in this
format anymore. And the effort required to maintain these files (in the
form of extra-complex markup rules in the relevant parts of the SGML
documentation) is significant. So, let's stop doing that and rely solely
on the other documentation formats.
Per discussion, the plain-text INSTALL instructions might still be worth
their keep, so we continue to generate that file.
Rather than remove HISTORY and src/test/regress/README from distribution
tarballs entirely, replace them with simple stub files that tell the reader
where to find the relevant documentation. This is mainly to avoid possibly
breaking packaging recipes that expect these files to exist.
Back-patch to all supported branches, because simplifying the markup
requirements for release notes won't help much unless we do it in all
branches.
In p_isdigit and other character class test functions generated by the
p_iswhat macro, the code path for non-C locales with multibyte encodings
contained a bogus pointer cast that would accidentally fail to malfunction
if types wchar_t and wint_t have the same width. Apparently that is true
on most platforms, but not on recent Cygwin releases. Remove the cast,
as it seems completely unnecessary (I think it arose from a false analogy
to the need to cast to unsigned char when dealing with the <ctype.h>
functions). Per bug #8970 from Marco Atzeri.
In the same functions, the code path for C locale with a multibyte encoding
simply ANDed each wide character with 0xFF before passing it to the
corresponding <ctype.h> function. This could result in false positive
answers for some non-ASCII characters, so use a range test instead.
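A small model of why masking with 0xFF gives false positives. The function names are hypothetical, and the 0x7F bound in the range test is an assumption for illustration, not a quotation of the fix.

```python
def is_digit_masked(wc):
    """Old behaviour: keep only the low byte, then classify it."""
    return chr(wc & 0xFF) in "0123456789"


def is_digit_range_checked(wc):
    """Range test: only code points within ASCII are classified at all."""
    return wc <= 0x7F and chr(wc) in "0123456789"
```

For example, U+0130 has low byte 0x30, which is the ASCII digit '0', so the masked test wrongly reports it as a digit while the range test does not.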
Noted by me while investigating Marco's complaint.
Also, remove some useless though not actually buggy maskings and casts
in the hand-coded p_isalnum and p_isalpha functions, which evidently
got tested a bit more carefully than the macro-generated functions.
The preferred method is to use "cc -shared", and this allows binaries
to be rebased if required, unlike dllwrap.
Backpatch to 9.0 where we have buildfarm coverage.
There are still some issues with Cygwin, especially modern Cygwin, but
this helps us get closer to good support.
Marco Atzeri.
This has long been done by the MSVC build system, and has caused
confusion in the past when programs like psql have failed to start
because they can't find the DLL. If it's in the same directory, as it now
will be, they will find it.
Backpatch to all live branches.
Evidence from buildfarm member crake suggests that the new test_shm_mq
module is routinely crashing the server due to the arrival of a SIGUSR1
after the shared memory segment has been unmapped. Although processes
using the new dynamic background worker facilities are more likely to
receive a SIGUSR1 around this time, the problem is also possible on older
branches, so I'm back-patching the parts of this change that apply to
older branches as far as they apply.
It's already generally the case that code checks whether these pointers
are NULL before dereferencing them, so the important thing is mostly to
make sure that they do get set to NULL before they become invalid. But
in master, there's one case in procsignal_sigusr1_handler that lacks a
NULL guard, so add that.
Patch by me; review by Tom Lane.
Various places were supposing that errno could be expected to hold still
within an ereport() nest or similar contexts. This isn't true necessarily,
though in some cases it accidentally failed to fail depending on how the
compiler chanced to order the subexpressions. This class of thinko
explains recent reports of odd failures on clang-built versions, typically
missing or inappropriate HINT fields in messages.
Problem identified by Christian Kruse, who also submitted the patch this
commit is based on. (I fixed a few issues in his patch and found a couple
of additional places with the same disease.)
Back-patch as appropriate to all supported branches.
On platforms that don't support Unix-domain sockets, when
neither host nor hostaddr is specified, the default host
'localhost' is used to connect to the server and PQhost() must
return that, but it didn't. This patch fixes PQhost() so that
it returns the default host in that case.
Also, this patch fixes PQhost() so that it doesn't return the
Unix-domain socket directory path on platforms that don't
support Unix-domain sockets.
Back-patch to all supported versions.
A while back, 2c92edad48 allowed
type_func_name_keywords to be used in more places, including role
identifiers. Unfortunately, that commit missed out on cases where
name_list was used for lists-of-roles, eg: for DROP ROLE. This
resulted in the unfortunate situation that you could CREATE a role
with a type_func_name_keywords-allowed identifier, but not DROP it
(directly; ALTER could be used to rename it to something that
could be DROP'd).
This extends allowing type_func_name_keywords to places where role
lists can be used.
Back-patch to 9.0, as 2c92edad48 was.
We've always allowed CREATE TABLE to create tables in the database's default
tablespace without checking for CREATE permissions on that tablespace.
Unfortunately, the original implementation of ALTER TABLE ... SET TABLESPACE
didn't pick up on that exception.
This changes ALTER TABLE ... SET TABLESPACE to allow the database's default
tablespace without checking for CREATE rights on that tablespace, just as
CREATE TABLE works today. Users could always do this through a series of
commands (CREATE TABLE ... AS SELECT * FROM ...; DROP TABLE ...; etc), so
let's fix the oversight in SET TABLESPACE's original implementation.
The psql Makefile was not creating $(datadir) before installing
psqlrc.sample there.
In most cases, the directory would be created in some other way, but for
the documented from-source client-only installation procedure, it could
fail.
Reported-by: Mike Blackwell <mike.blackwell@rrd.com>
In ordinary operation, VACUUM must be careful to take a cleanup lock on
each leaf page of a btree index; this ensures that no indexscans could
still be "in flight" to heap tuples due to be deleted. (Because of
possible index-tuple motion due to concurrent page splits, it's not enough
to lock only the pages we're deleting index tuples from.) In Hot Standby,
the WAL replay process must likewise lock every leaf page. There were
several bugs in the code for that:
* The replay scan might come across unused, all-zero pages in the index.
While btree_xlog_vacuum itself did the right thing (ie, nothing) with
such pages, xlogutils.c supposed that such pages must be corrupt and
would throw an error. This accounts for various reports of replication
failures with "PANIC: WAL contains references to invalid pages". To
fix, add a ReadBufferMode value that instructs XLogReadBufferExtended
not to complain when we're doing this.
* btree_xlog_vacuum performed the extra locking if standbyState ==
STANDBY_SNAPSHOT_READY, but that's not the correct test: we won't open up
for hot standby queries until the database has reached consistency, and
we don't want to do the extra locking till then either, for fear of reading
corrupted pages (which bufmgr.c would complain about). Fix by exporting a
new function from xlog.c that will report whether we're actually in hot
standby replay mode.
* To ensure full coverage of the index in the replay scan, btvacuumscan
would emit a dummy WAL record for the last page of the index, if no
vacuuming work had been done on that page. However, if the last page
of the index is all-zero, that would result in corruption of said page,
since the functions called on it weren't prepared to handle that case.
There's no need to lock any such pages, so change the logic to target
the last normal leaf page instead.
The first two of these bugs were diagnosed by Andres Freund, the other one
by me. Fixes based on ideas from Heikki Linnakangas and myself.
This has been wrong since Hot Standby was introduced, so back-patch to 9.0.
Allow for the possibility that folding a string to lower case makes it
longer (due to replacing a character with a longer multibyte character).
This doesn't change the number of trigrams that will be extracted, but
it does affect the required size of an intermediate buffer in
generate_trgm(). Per bug #8821 from Ufuk Kayserilioglu.
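A concrete case of lower-casing lengthening a string in UTF-8, shown with Python's Unicode tables. This is illustrative only: PostgreSQL relies on the platform's locale functions, which may map characters differently.

```python
def utf8_len(s):
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))


upper = "\u023A"       # LATIN CAPITAL LETTER A WITH STROKE
lower = upper.lower()  # maps to U+2C65 in Python's Unicode data

# Same number of characters, but the lower-case form needs more bytes.
growth = utf8_len(lower) - utf8_len(upper)
```

The character count is unchanged, which matches the observation that the number of trigrams does not change, but a buffer sized from the original byte length would be one byte short here.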
Also install some checks that the input string length is not so large
as to cause overflow in the calculations of palloc request sizes.
Back-patch to all supported versions.
Per reports from Andres Freund and Luke Campbell, a server failure during
set_pglocale_pgservice results in a segfault rather than a useful error
message, because the infrastructure needed to use ereport hasn't been
initialized; specifically, MemoryContextInit hasn't been called.
One known cause of this is starting the server in a directory it
doesn't have permission to read.
We could try to prevent set_pglocale_pgservice from using anything that
depends on palloc or elog, but that would be messy, and the odds of future
breakage seem high. Moreover there are other things being called in main.c
that look likely to use palloc or elog too --- perhaps those things
shouldn't be there, but they are there today. The best solution seems to
be to move the call of MemoryContextInit to very early in the backend's
real main() function. I've verified that an elog or ereport occurring
immediately after that is now capable of sending something useful to
stderr.
I also added code to elog.c to print something intelligible rather than
just crashing if MemoryContextInit hasn't created the ErrorContext.
This could happen if MemoryContextInit itself fails (due to malloc
failure), and provides some future-proofing against someone trying to
sneak in new code even earlier in server startup.
Back-patch to all supported branches. Since we've only heard reports of
this type of failure recently, it may be that some recent change has made
it more likely to see a crash of this kind; but it sure looks like it's
broken all the way back.
The standard typanalyze functions skip over values whose detoasted size
exceeds WIDTH_THRESHOLD (1024 bytes), so as to limit memory bloat during
ANALYZE. However, we (I think I, actually :-() failed to consider the
possibility that *every* non-null value in a column is too wide. While
compute_minimal_stats() seems to behave reasonably anyway in such a case,
compute_scalar_stats() just fell through and generated no pg_statistic
entry at all. That's unnecessarily pessimistic: we can still produce
valid stanullfrac and stawidth values in such cases, since we do include
too-wide values in the average-width calculation. Furthermore, since the
general assumption in this code is that too-wide values are probably all
distinct from each other, it seems reasonable to set stadistinct to -1
("all distinct").
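The fallback can be sketched as follows. This is a simplified model, not the ANALYZE code: values are bytes objects or None for NULL, the function name is hypothetical, and only the three statistics named above are computed.

```python
WIDTH_THRESHOLD = 1024  # bytes, as stated above


def scalar_stats(values):
    """Return the statistics that can still be computed for a sample,
    or None if the sample is empty."""
    if not values:
        return None
    nonnull = [v for v in values if v is not None]
    stats = {"stanullfrac": (len(values) - len(nonnull)) / len(values)}
    if nonnull:
        # Too-wide values still count toward the average width.
        stats["stawidth"] = sum(len(v) for v in nonnull) / len(nonnull)
        if all(len(v) > WIDTH_THRESHOLD for v in nonnull):
            stats["stadistinct"] = -1.0  # assume all values are distinct
    return stats
```

For a sample of two NULLs and two values of 2000 and 4000 bytes, this yields a null fraction of 0.5, an average width of 3000 and stadistinct of -1, rather than no entry at all.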
Per complaint from Kadri Raudsepp. This has been like this since roughly
neolithic times, so back-patch to all supported branches.
While the old way worked on most platforms, it sometimes created alignment
problems. This should fix it. Also, the regression tests were updated to test for
the reported case.
Report and fix by MauMau <maumau307@gmail.com>
In commit c1352052ef, I implemented an
optimization that assumed that a function's argument expressions would
either always return a set (ie multiple rows), or always not. This is
wrong however: we allow CASE expressions in which some arms return a set
of some type and others just return a scalar of that type. There may be
other examples as well. To fix, replace the run-time test of whether an
argument returned a set with a static precheck (expression_returns_set).
This adds a little bit of query startup overhead, but it seems barely
measurable.
Per bug #8228 from David Johnston. This has been broken since 8.0,
so patch all supported branches.
When starting WAL replay from an online checkpoint, the last replayed WAL
record variable was initialized using the checkpoint record's location, even
though the records between the REDO location and the checkpoint record had
not been replayed yet. That was noted as "slightly confusing" but harmless
in the comment, but in some cases, it fooled CheckRecoveryConsistency into
incorrectly concluding that we had already reached a consistent state
immediately at the beginning of WAL replay. That caused the system to accept
read-only connections in hot standby mode too early, and also caused PANICs
with the message "WAL contains references to invalid pages".
Fix by initializing the variables to the REDO location instead.
In 9.2 and above, change CheckRecoveryConsistency() to use the
lastReplayedEndRecPtr variable when checking if the backup end location has
been reached. It was inconsistently using EndRecPtr for that check, but
lastReplayedEndRecPtr when checking min recovery point. It made no
difference before this patch, because in all the places where
CheckRecoveryConsistency was called the two variables were the same, but
it was always an accident waiting to happen, and would have been wrong
after this patch anyway.
Report and analysis by Tomonari Katsumata, bug #8686. Backpatch to 9.0,
where hot standby was introduced.
There was an apparent attempt to limit the target database for
pg_restore to version 7.1.0 or later. Due to a leading zero this
was interpreted as an octal number, which allowed targets with
version numbers down to 2.87.36. The lowest actual release above
that was 6.0.0, so that was effectively the limit.
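The arithmetic can be checked directly. The literal 070100 is inferred here from the version numbers given above (it is the only reading consistent with 2.87.36), and decode() is a hypothetical helper that splits a version number encoded as major * 10000 + minor * 100 + revision.

```python
def decode(v):
    """Split an encoded version number into (major, minor, revision)."""
    return (v // 10000, (v // 100) % 100, v % 100)


intended = 70100           # decimal encoding of 7.1.0
parsed = int("070100", 8)  # value of the same digits read as octal
```

Reading the digits as octal gives 28736, which decodes to 2.87.36 instead of 7.1.0.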
Since the success of the restore attempt will depend primarily on
what statements were generated by the dump run, we don't want
pg_restore trying to guess whether a given target should be allowed
based on version number. Allow a connection to any version. Since
it is very unlikely that anyone would be using a recent version of
pg_restore to restore to a pre-6.0 database, this has little to no
practical impact, but it makes the code less confusing to read.
Issue reported and initial patch suggestion from Joel Jacobson
based on an article by Andrey Karpov reporting on issues found by
PVS-Studio static code analyzer. Final patch based on analysis by
Tom Lane. Back-patch to all supported branches.
The bug would only show up if the C sockaddr structure contained
zero in the first byte for a valid address; otherwise it would
fail to fail, which is probably why it went unnoticed for so long.
Patch submitted by Joel Jacobson after seeing an article by Andrey
Karpov in which he reports finding this through static code
analysis using PVS-Studio. While I was at it I moved a definition
of a local variable referenced in the buggy code to a more local
context.
Backpatch to all supported branches.
When locale is "ja_JP.SJIS", nl_langinfo(CODESET) returns "SHIFT_JIS"
on some platforms, at least on RedHat Linux. So the encoding/locale
match table (encoding_match_list) needs the entry. Otherwise client
encoding is set to SQL_ASCII.
Back patch to all supported branches.
This prevents a possible longjmp out of the signal handler if a timeout
or SIGINT occurs while something within the handler has transiently set
ImmediateInterruptOK. For safety we must hold off the timeout or cancel
error until we're back in mainline, or at least till we reach the end of
the signal handler when ImmediateInterruptOK was true at entry. This
syncs these functions with the logic now present in handle_sig_alarm.
AFAICT there is no live bug here in 9.0 and up, because I don't think we
currently can wait for any heavyweight lock inside these functions, and
there is no other code (except read-from-client) that will turn on
ImmediateInterruptOK. However, that was not true pre-9.0: in older
branches ProcessIncomingNotify might block trying to lock pg_listener, and
then a SIGINT could lead to undesirable control flow. It might be all
right anyway given the relatively narrow code ranges in which NOTIFY
interrupts are enabled, but for safety's sake I'm back-patching this.
Make the COPY test, which loads most of the large static tables used in
the tests, also explicitly ANALYZE those tables. This allows us to get
rid of various ad-hoc, and rather redundant, ANALYZE commands that had
gotten stuck into various test scripts over time to ensure we got
consistent plan choices. (We could have done a database-wide ANALYZE,
but that would cause stats to get attached to the small static tables
too, which results in plan changes compared to the historical behavior.
I'm not sure that's a good idea, so not going that far for now.)
Back-patch to 9.0, since 9.0 and 9.1 are currently sometimes failing
regression tests for lack of an "ANALYZE tenk1" in the subselect test.
There's no need for this in 8.4 since we didn't print any plans back
then.
An expression such as WHERE (... x IN (SELECT ...) ...) IN (SELECT ...)
could produce an invalid plan that results in a crash at execution time,
if the planner attempts to flatten the outer IN into a semi-join.
This happens because convert_testexpr() was not expecting any nested
SubLinks and would wrongly replace any PARAM_SUBLINK Params belonging
to the inner SubLink. (I think the comment denying that this case could
happen was wrong when written; it's certainly been wrong for quite a long
time, since very early versions of the semijoin flattening logic.)
Per report from Teodor Sigaev. Back-patch to all supported branches.
Previous commit e5de601267 modified dblink
to ensure client encoding matched the server. However the added
PQsetClientEncoding() call added significant overhead. Restore original
performance in the common case where client encoding already matches
server encoding by doing nothing in that case. Applies to all active
branches.
Issue reported and work sponsored by Zonar Systems.
Current OpenSSL code includes a BIO_clear_retry_flags() step in the
sock_write() function. Either we failed to copy the code correctly, or
they added this since we copied it. In any case, lack of the clear step
appears to be the cause of the server lockup after connection loss reported
in bug #8647 from Valentine Gogichashvili. Assume that this is correct
coding for all OpenSSL versions, and hence back-patch to all supported
branches.
Diagnosis and patch by Alexander Kukushkin.
Insertion to a non-leaf GIN page didn't make a full-page image of the page,
which is wrong. The code used to do it correctly, but was changed (commit
853d1c3103) because the redo-routine didn't
track incomplete splits correctly when the page was restored from a full
page image. Of course, that was not the right way to fix it; the redo routine
should've been fixed instead. The redo-routine was surreptitiously fixed
in 2010 (commit 4016bdef8a), so all we need
to do now is revert the code that creates the record to its original form.
This doesn't change the format of the WAL record.
Backpatch to all supported versions.
The backpatch of a95335b544d9c8377e9dc7a399d8e9a155895f82 to 9.2, 9.1
and 9.0 was incomplete, missing changes to xlog.c, primarily the call
to TrimMultiXact(). Testing presumably didn't show a problem without
these changes because TrimMultiXact() performs defense-in-depth work,
which is not strictly necessary.
It also missed moving StartupMultiXact(), which would have been
problematic if a restartpoint happened at exactly the wrong moment,
causing a transient error.
Andres Freund