Contrary to the comment here, POSIX does not guarantee atomicity of a
read(), if another process calls write() concurrently. Or at least Linux
does not. Add locking to load_relmap_file() to avoid the race condition.
Fixes bug #17064. Thanks to Alexander Lakhin for the report and test case.
Backpatch-through: 9.6, all supported versions.
Discussion: https://www.postgresql.org/message-id/17064-bb0d7904ef72add3@postgresql.org
For no obvious reason, isolationtester has always insisted that
session and step names be written with double quotes. This is
fairly tedious and does little for test readability, especially
since the names that people actually choose almost always look
like normal identifiers. Hence, let's tweak the lexer to allow
SQL-like identifiers not only double-quoted strings.
(They're SQL-like, not exactly SQL, because I didn't add any
case-folding logic. Also there's no provision for U&"..." names,
not that anyone's likely to care.)
There is one incompatibility introduced by this change: if you write
"foo""bar" with no space, that used to be taken as two identifiers,
but now it's just one identifier with an embedded quote mark.
I converted all the src/test/isolation/ specfiles to remove
unnecessary double quotes, but stopped there because my
eyes were glazing over already.
Like 741d7f104, back-patch to all supported branches, so that this
isn't a stumbling block for back-patching isolation test changes.
Discussion: https://postgr.es/m/759113.1623861959@sss.pgh.pa.us
The syntax summaries for CREATE FUNCTION and allied commands
made it look like LEAKPROOF is an alternative to
IMMUTABLE/STABLE/VOLATILE, when of course it is an orthogonal
option. Improve that.
Per gripe from aazamrafeeque0. Thanks to David Johnston for
suggestions.
Discussion: https://postgr.es/m/162444349581.694.5818572718530259025@wrigleys.postgresql.org
Our uses of gss_display_status() and gss_display_name() assumed
that the gss_buffer_desc strings returned by those functions are
null-terminated. It appears that they generally are, given the
lack of field complaints up to now. However, the available
documentation does not promise this, and some man pages
for gss_display_status() show examples that rely on the
gss_buffer_desc.length field instead of expecting null
termination. Also, we now have a report that on some
implementations, clang's address sanitizer is of the opinion
that the byte after the specified length is undefined.
Hence, change the code to rely on the length field instead.
This might well be cosmetic rather than fixing any real bug, but
it's hard to be sure, so back-patch to all supported branches.
While here, also back-patch the v12 changes that made pg_GSS_error
deal honestly with multiple messages available from
gss_display_status.
Per report from Sudheer H R.
Discussion: https://postgr.es/m/5372B6D4-8276-42C0-B8FB-BD0918826FC3@tekenlight.com
Previously, isolationtester displayed SQL query results using some
ad-hoc code that clearly hadn't had much effort expended on it.
Field values longer than 14 characters weren't separated from
the next field, and usually caused misalignment of the columns
too. Also there was no visual separation of a query's result
from subsequent isolationtester output. This made test result
files confusing and hard to read.
To improve matters, let's use libpq's PQprint() function. Although
that's long since unused by psql, it's still plenty good enough
for the purpose here.
Like 741d7f104, back-patch to all supported branches, so that this
isn't a stumbling block for back-patching isolation test changes.
Discussion: https://postgr.es/m/582362.1623798221@sss.pgh.pa.us
We've long contended with isolation test results that aren't entirely
stable. Some test scripts insert long delays to try to force stable
results, which is not terribly desirable; but other erratic failure
modes remain, causing unrepeatable buildfarm failures. I've spent a
fair amount of time trying to solve this by improving the server-side
support code, without much success: that way is fundamentally unable
to cope with diffs that stem from chance ordering of arrival of
messages from different server processes.
We can improve matters on the client side, however, by annotating
the test scripts themselves to show the desired reporting order
of events that might occur in different orders. This patch adds
three types of annotations to deal with (a) test steps that might or
might not complete their waits before the isolationtester can see them
waiting; (b) test steps in different sessions that can legitimately
complete in either order; and (c) NOTIFY messages that might arrive
before or after the completion of a step in another session. We might
need more annotation types later, but this seems to be enough to deal
with the instabilities we've seen in the buildfarm. It also lets us
get rid of all the long delays that were previously used, cutting more
than a minute off the runtime of the isolation tests.
Back-patch to all supported branches, because the buildfarm
instabilities affect all the branches, and because it seems desirable
to keep isolationtester's capabilities the same across all branches
to simplify possible future back-patching of tests.
Discussion: https://postgr.es/m/327948.1623725828@sss.pgh.pa.us
Commits 84f5c2908 et al missed the need to cover plpgsql's "simple
expression" code path. If the first thing we execute after a
COMMIT/ROLLBACK is one of those, rather than a full-fledged SPI command,
we must explicitly do EnsurePortalSnapshotExists() to make sure we have
an outer snapshot. Note that it wouldn't be good enough to just push a
snapshot for the duration of the expression execution: what comes back
might be toasted, so we'd better have a snapshot protecting it.
The test case demonstrating this fact cheats a bit by marking a SQL
function immutable even though it fetches from a table. That's
nothing that users haven't been seen to do, though.
Per report from Jim Nasby. Back-patch to v11, like the previous fix.
Discussion: https://postgr.es/m/378885e4-f85f-fc28-6c91-c4d1c080bf26@amazon.com
If users provide old style pre-standardized Windows locale names in a
CREATE COLLATION command, the OS is unable to provide version
information. Continue without capturing version information, rather
than exposing an OS error.
This was originally done in commit 9f12a3b9 for 14 only, to support
future features that might encounter old style names from initdb's
default. It wasn't done in 13 because I didn't consider that users
might actually want to use the old format explicitly (something we
should consider blocking in a future release with a better error
message, but that's not a policy we've decided on yet).
Back-patch to 13, based on the field complaint in pgsql-bugs #17058.
Reported-by: Yasushi Yamashita <developer@yamashi-ta.jp>
Discussion: https://postgr.es/m/17058-b49f5793c912c5aa%40postgresql.org
Ordinarily, a pg_policy.polroles array wouldn't list the same role
more than once; but CREATE POLICY does not prevent that. If we
perform DROP OWNED BY on a role that is listed more than once,
RemoveRoleFromObjectPolicy either suffered an assertion failure
or encountered a tuple-updated-by-self error. Rewrite it to cope
correctly with duplicate entries, and add a CommandCounterIncrement
call to prevent the other problem.
Per discussion, there's other cleanup that ought to happen here,
but this seems like the minimum essential fix.
Per bug #17062 from Alexander Lakhin. It's been broken all along,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/17062-11f471ae3199ca23@postgresql.org
This works fine in the "simple Query" code path; but if the
statement is in the plan cache then it's corrupted for future
re-execution. Apply copyObject() to protect the original
tree from modification, as we've done elsewhere.
This narrow fix is applied only to the back branches. In HEAD,
the problem was fixed more generally by commit 7c337b6b5; but
that changed ProcessUtility's API, so it's infeasible to
back-patch.
Per bug #17053 from Charles Samborski.
Discussion: https://postgr.es/m/931771.1623893989@sss.pgh.pa.us
Discussion: https://postgr.es/m/17053-3ca3f501bbc212b4@postgresql.org
The fast default code added in Release 11 omitted to check that the
table a fast default was being added to was a plain table. Thus one
could be added to a foreign table, which predicably blows up. Here we
perform that check.
In addition, on the back branches, since some of these might have
escaped into the wild, if we encounter a missing value for
an attribute of something other than a plain table we ignore it.
Fixes bug #17056
Backpatch to release 11,
Reviewed by: Andres Freund, Álvaro Herrera and Tom Lane
We use a tuple conversion map for partitions when replicating using an
ancestor's schema to convert tuples from partition's type to the
ancestor's. Before this map got initialized, we were processing
invalidation messages which access this map.
This issue happens only in version 13 as in HEAD we already have a code
that initializes each relation entry before we can process any
invalidation message. This issue is introduced by commit d250568121 in
version 13.
Reported-by: Tom Lane, as per buildfarm meber skink
Author: Amit Langote
Reviewed-by: Dilip Kumar, Amit Kapila
Discussion: https://www.postgresql.org/message-id/648020.1623854904@sss.pgh.pa.us
One of the error paths left *members uninitialized. That's not a live
bug, because most callers don't look at *members when the function
returns -1, but let's be tidy. One caller, in heap_lock_tuple(), does
"if (members != NULL) pfree(members)", but AFAICS it never passes an
invalid 'multi' value so it should not reach that error case.
The callers are also a bit inconsistent in their expectations.
heap_lock_tuple() pfrees the 'members' array if it's not-NULL, others
pfree() it if "nmembers >= 0", and others if "nmembers > 0". That's
not a live bug either, because the function should never return 0, but
add an Assert for that to make it more clear. I left the callers alone
for now.
I also moved the line where we set *nmembers. It wasn't wrong before,
but I like to do that right next to the 'return' statement, to make it
clear that it's always set on return.
Also remove one unreachable return statement after ereport(ERROR), for
brevity and for consistency with the similar if-block right after it.
Author: Greg Nancarrow with the additional changes by me
Backpatch-through: 9.6, all supported versions
In a synchronous logical setup, locking [user] catalog tables can cause
deadlock. This is because logical decoding of transactions can lock
catalog tables to access them so exclusively locking those in transactions
can lead to deadlock. To avoid this users must refrain from having
exclusive locks on catalog tables.
Author: Takamichi Osumi
Reviewed-by: Vignesh C, Amit Kapila
Backpatch-through: 9.6
Discussion: https://www.postgresql.org/message-id/20210222222847.tpnb6eg3yiykzpky%40alap3.anarazel.de
When stuffing a plan from the plancache into a Portal, one is
not supposed to risk throwing an error between GetCachedPlan and
PortalDefineQuery; if that happens, the plan refcount incremented
by GetCachedPlan will be leaked. I managed to break this rule
while refactoring code in 9dbf2b7d7. There is no visible
consequence other than some memory leakage, and since nobody is
very likely to trigger the relevant error conditions many times
in a row, it's not surprising we haven't noticed. Nonetheless,
it's a bug, so rearrange the order of operations to remove the
hazard.
Noted on the way to looking for a better fix for bug #17053.
This mistake is pretty old, so back-patch to all supported
branches.
Since commit c24dcd0cfd, we have been using pg_pread() to read the WAL
file, which doesn't change the seek position (unless we fall back to
the implementation in src/port/pread.c). Update comment accordingly.
Backpatch-through: 12, where we started to use pg_pread()
TestLib::perl2host can take a file argument as well as a directory
argument, so that code becomes substantially simpler. Also add comments
on why we're using forward slashes, and why we're setting
PERL_BADLANG=0.
Discussion: https://postgr.es/m/e9947bcd-20ee-027c-f0fe-01f736b7e345@dunslane.net
During decoding for speculative inserts, we were relying for cleaning
toast hash on confirmation records or next change records. But that
could lead to multiple problems (a) memory leak if there is neither a
confirmation record nor any other record after toast insertion for a
speculative insert in the transaction, (b) error and assertion failures
if the next operation is not an insert/update on the same table.
The fix is to start queuing spec abort change and clean up toast hash
and change record during its processing. Currently, we are queuing the
spec aborts for both toast and main table even though we perform cleanup
while processing the main table's spec abort record. Later, if we have a
way to distinguish between the spec abort record of toast and the main
table, we can avoid queuing the change for spec aborts of toast tables.
Reported-by: Ashutosh Bapat
Author: Dilip Kumar
Reviewed-by: Amit Kapila
Backpatch-through: 9.6, where it was introduced
Discussion: https://postgr.es/m/CAExHW5sPKF-Oovx_qZe4p5oM6Dvof7_P+XgsNAViug15Fm99jA@mail.gmail.com
This should have been updated in d2d8a229b, but it was overlooked.
According to 31a877f18 which added it, this file is meant to show the
results you get under default_transaction_isolation = serializable.
We've largely lost track of that goal in other isolation tests, but
as long as we've got this one, it should be right.
Noted while fooling about with the isolationtester.
Recent glibc versions have made mktime() fail if tm_isdst is
inconsistent with the prevailing timezone; in particular it fails for
tm_isdst = 1 when the zone is UTC. (This seems wildly inconsistent
with the POSIX-mandated treatment of "incorrect" values for the other
fields of struct tm, so if you ask me it's a bug, but I bet they'll
say it's intentional.) This has been observed to cause cosmetic
problems when pg_restore'ing an archive created in a different
timezone.
To fix, do mktime() using the field values from the archive, and if
that fails try again with tm_isdst = -1. This will give a result
that's off by the UTC-offset difference from the original zone, but
that was true before, too. It's not terribly critical since we don't
do anything with the result except possibly print it. (Someday we
should flush this entire bit of logic and record a standard-format
timestamp in the archive instead. That's not okay for a back-patched
bug fix, though.)
Also, guard our only other use of mktime() by having initdb's
build_time_t() set tm_isdst = -1 not 0. This case could only have
an issue in zones that are DST year-round; but I think some do exist,
or could in future.
Per report from Wells Oliver. Back-patch to all supported
versions, since any of them might need to run with a newer glibc.
Discussion: https://postgr.es/m/CAOC+FBWDhDHO7G-i1_n_hjRzCnUeFO+H-Czi1y10mFhRWpBrew@mail.gmail.com
Translate path slashes on target directory path. This was confusing old
branches, but is applied to all branches for the sake of uniformity.
Perl is perfectly able to understand paths with forward slashes.
Along the way, restore the previous archive_wait query, for the sake of
uniformity with other tests, per gripe from Tom Lane.
This is similar to the work done in 8279f68 for TestLib.pm, where
environment variables set may cause unwanted failures if using a
temporary installation with pg_regress. The list of variables reset is
adjusted in each stable branch depending on what is supported.
Comments are added to remember that the lists in TestLib.pm and
pg_regress.c had better be kept in sync.
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/YMNR9GYDn+fHlMta@paquier.xyz
Backpatch-through: 9.6
Several TAP tests use poll_query_until() to wait for the postmaster
to restart. They were checking to see if a trivial query
(e.g. "SELECT 1") succeeds. However, that's problematic in the wake
of commit 11e9caff8, because now that we feed said query to psql
via stdin, we risk IPC::Run whining about a SIGPIPE failure if psql
quits before reading the query. Hence, we can't use a nonempty
query in cases where we need to wait for connection failures to
stop happening.
Per the precedent of commits c757a3da0 and 6d41dd045, we can pass
"undef" as the query in such cases to ensure that IPC::Run has
nothing to write. However, then we have to say that the expected
output is empty, and this exposes a deficiency in poll_query_until:
if psql fails altogether and returns empty stdout, poll_query_until
will treat that as a success! That's because, contrary to its
documentation, it makes no actual check for psql failure, looking
neither at the exit status nor at stderr.
To fix that, adjust poll_query_until to insist on empty stderr as
well as a stdout match. (I experimented with checking exit status
instead, but it seems that psql often does exit(1) in cases that we
need to consider successes. That might be something to fix someday,
but it would be a non-back-patchable behavior change.)
Back-patch to v10. The test cases needing this exist only as far
back as v11, but it seems wise to keep poll_query_until's behavior
the same in v10, in case we back-patch another such test case in
future. (9.6 does not currently need this change, because in that
branch poll_query_until can't be told to accept empty stdout as
a success case.)
Per assorted buildfarm failures, mostly on hoverfly.
Discussion: https://postgr.es/m/CAA4eK1+zM6L4QSA1XMvXY_qqWwdUmqkOS1+hWvL8QcYEBGA1Uw@mail.gmail.com
Previously, a zero value for the relfilenode resulted in
a confusing error message about "unexpected duplicate".
This function returns NULL for other invalid relfilenode
values, so zero should be treated likewise.
It's been like this all along, so back-patch to all supported
branches.
Justin Pryzby
Discussion: https://postgr.es/m/20210612023324.GT16435@telsasoft.com
Using an Assert to check the validity of incoming messages is an
extremely poor decision. In a debug build, it should not be that easy
for a broken or malicious remote client to crash the logrep worker.
The consequences could be even worse in non-debug builds, which will
fail to make such checks at all, leading to who-knows-what misbehavior.
Hence, promote every Assert that could possibly be triggered by wrong
or out-of-order replication messages to a full test-and-ereport.
To avoid bloating the set of messages the translation team has to cope
with, establish a policy that replication protocol violation error
reports don't need to be translated. Hence, all the new messages here
use errmsg_internal(). A couple of old messages are changed likewise
for consistency.
Along the way, fix some non-idiomatic or outright wrong uses of
hash_search().
Most of these mistakes are new with the "streaming replication"
patch (commit 464824323), but a couple go back a long way.
Back-patch as appropriate.
Discussion: https://postgr.es/m/1719083.1623351052@sss.pgh.pa.us
Commit caba8f0d43 wasn't quite right for msys, as demonstrated by
several buildfarm animals, including jacana and fairywren. We need to
use the msys perl in the archive command, but call it in such a way that
Windows will understand the path. Furthermore, inside the copy script we
need to convert a Windows path to an msys path.
ab55d74 has introduced some tests with rows found as missing in logical
replication subscriptions for partitioned tables, relying on a logic
with a lookup of the logs generated, scanning the whole file. This
commit makes the logic more precise, by scanning the logs only from the
position before the key queries are run to the position where we check
for the logs. This will reduce the risk of issues with log patterns
overlapping with each other if those tests get more complicated in the
future.
Per discussion with Tom Lane.
Discussion: https://postgr.es/m/YMP+Gx2S8meYYHW4@paquier.xyz
Backpatch-through: 13
This variable was present in the list added by 9d660670, but it is not
supported by this branch. Issue noticed while diving into a similar
change for pg_regress.c.
Backpatch-through: 9.6
We were already reporting it, but only after the parallel workers were
finished, which is visibly much later than what happens in a serial
build.
With this change we report it when the leader starts its own sort phase
when participating in the build (the normal case). Now this might
happen a little later than when the workers start their sorting phases,
but a) communicating the actual phase start from workers is likely to be
a hassle, and b) the sort phase start is pretty fuzzy anyway, since
sorting per se is actually initiated by tuplesort.c internally earlier
than tuplesort_performsort() is called.
Backpatch to pg12, where the progress reporting code for CREATE INDEX
went in.
Reported-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Greg Nancarrow <gregn4422@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/1128176d-1eee-55d4-37ca-e63644422adb
apply_handle_tuple_routing(), having detected and reported that
the tuple it needed to update didn't exist, tried to update that
tuple anyway, leading to a null-pointer dereference.
logicalrep_partition_open() failed to ensure that the
LogicalRepPartMapEntry it built for a partition was fully
independent of that for the partition root, leading to
trouble if the root entry was later freed or rebuilt.
Meanwhile, on the publisher's side, pgoutput_change() sometimes
attempted to apply execute_attr_map_tuple() to a NULL tuple.
The first of these was reported by Sergey Bernikov in bug #17055;
I found the other two while developing some test cases for this
sadly under-tested code.
Diagnosis and patch for the first issue by Amit Langote; patches
for the others by me; new test cases by me. Back-patch to v13
where this logic came in.
Discussion: https://postgr.es/m/17055-9ba800ec8522668b@postgresql.org
The code added to mark replication slots invalid in commit c655077639
had the race condition that a slot can be dropped or advanced
concurrently with checkpointer trying to invalidate it. Rewrite the
code to close those races.
The changes to ReplicationSlotAcquire's API added with c655077639 are
not necessary anymore. To avoid an ABI break in released branches, this
commit leaves that unchanged; it'll be changed in a master-only commit
separately.
Backpatch to 13, where this code first appeared.
Reported-by: Andres Freund <andres@anarazel.de>
Author: Andres Freund <andres@anarazel.de>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20210408001037.wfmk6jud36auhfqm@alap3.anarazel.de
It turns out that worker.c's code path for TRUNCATE was also
careless about establishing a snapshot while executing user-defined
code, allowing the checks added by commit 84f5c2908 to fail when
a trigger is fired in that context.
We could just wrap Push/PopActiveSnapshot around the truncate call,
but it seems better to establish a policy of holding a snapshot
throughout execution of a replication step. To help with that and
possible future requirements, replace the previous ensure_transaction
calls with pairs of begin/end_replication_step calls.
Per report from Mark Dilger. Back-patch to v11, like the previous
changes.
Discussion: https://postgr.es/m/B4A3AF82-79ED-4F4C-A4E5-CD2622098972@enterprisedb.com
This only happens if (1) the new standby has no WAL available locally,
(2) the new standby is starting from the old timeline, (3) the promotion
happened in the WAL segment from which the new standby is starting,
(4) the timeline history file for the new timeline is available from
the archive but the WAL files for are not (i.e. this is a race),
(5) the WAL files for the new timeline are available via streaming,
and (6) recovery_target_timeline='latest'.
Commit ee994272ca introduced this
logic and was an improvement over the previous code, but it mishandled
this case. If recovery_target_timeline='latest' and restore_command is
set, validateRecoveryParameters() can change recoveryTargetTLI to be
different from receiveTLI. If streaming is then tried afterward,
expectedTLEs gets initialized with the history of the wrong timeline.
It's supposed to be a list of entries explaining how to get to the
target timeline, but in this case it ends up with a list of entries
explaining how to get to the new standby's original timeline, which
isn't right.
Dilip Kumar and Robert Haas, reviewed by Kyotaro Horiguchi.
Discussion: http://postgr.es/m/CAFiTN-sE-jr=LB8jQuxeqikd-Ux+jHiXyh4YDiZMPedgQKup0g@mail.gmail.com
The set of subcommands supported by \dAp, \do and \dy was described
incorrectly in psql's --help. The documentation was already consistent
with the code.
Reported-by: inoas, from IRC
Author: Matthijs van der Vleuten
Reviewed-by: Neil Chen
Discussion: https://postgr.es/m/6a984e24-2171-4039-9050-92d55e7b23fe@www.fastmail.com
Backpatch-through: 9.6
Further thought about bug #17050 suggests that it's a good idea
to use CURSOR_OPT_NO_SCROLL for the implicit cursor opened by
a plpgsql FOR-over-query loop. This ensures that, if somebody
commits inside the loop, PersistHoldablePortal won't try to
rewind and re-read the cursor. While we'd have selected NO_SCROLL
anyway if FOR UPDATE/SHARE appears in the query, there are other
hazards with volatile functions; and in any case, it's silly to
expend effort storing rows that we know for certain won't be needed.
(While here, improve the comment in exec_run_select, which was a bit
confused about the rationale for when we can use parallel mode.
Cursor operations aren't a hazard for nameless portals.)
This wasn't an issue until v11, which introduced the possibility
of persisting such cursors. Hence, back-patch to v11.
Per bug #17050 from Алексей Булгаков.
Discussion: https://postgr.es/m/17050-f77aa827dc85247c@postgresql.org
PersistHoldablePortal has long assumed that it should store the
entire output of the query-to-be-persisted, which requires rewinding
and re-reading the output. This is problematic if the query is not
stable: we might get different row contents, or even a different
number of rows, which'd confuse the cursor state mightily.
In the case where the cursor is NO SCROLL, this is very easy to
solve: just store the remaining query output, without any rewinding,
and tweak the portal's cursor state to match. Aside from removing
the semantic problem, this could be significantly more efficient
than storing the whole output.
If the cursor is scrollable, there's not much we can do, but it
was already the case that scrolling a volatile query's result was
pretty unsafe. We can just document more clearly that getting
correct results from that is not guaranteed.
There are already prohibitions in place on using SCROLL with
FOR UPDATE/SHARE, which is one way for a SELECT query to have
non-stable results. We could imagine prohibiting SCROLL when
the query contains volatile functions, but that would be
expensive to enforce. Moreover, it could break applications
that work just fine, if they have functions that are in fact
stable but the user neglected to mark them so. So settle for
documenting the hazard.
While this problem has existed in some guise for a long time,
it got a lot worse in v11, which introduced the possibility
of persisting plpgsql cursors (perhaps implicit ones) even
when they violate the rules for what can be marked WITH HOLD.
Hence, I've chosen to back-patch to v11 but not further.
Per bug #17050 from Алексей Булгаков.
Discussion: https://postgr.es/m/17050-f77aa827dc85247c@postgresql.org
Mis-merge introduced by e2f21ff, where pgwin32_setenv() was listed but
not defined in win32env.c. This had no consequences as this routine
does not exist in this branch.
Only REL_12_STABLE and REL_13_STABLE got that wrong.
Backpatch-through: 12
If autovacuum comes along just after we fill table test_seg with
some data, it will update the stats to the point where we prefer
a plain indexscan over a bitmap scan, breaking the expected
output (as well as the point of the test case). To fix, just
force a bitmap scan to be chosen here.
This has evidently been wrong since commit de1d042f5. It's not
clear why we just recently saw any buildfarm failures due to it;
but prairiedog has failed twice on this test in the past week.
Hence, backpatch to v11 where this test case came in.
An incorrectly-encoded multibyte character near the end of a string
could cause various processing loops to run past the string's
terminating NUL, with results ranging from no detectable issue to
a program crash, depending on what happens to be in the following
memory.
This isn't an issue in the server, because we take care to verify
the encoding of strings before doing any interesting processing
on them. However, that lack of care leaked into client-side code
which shouldn't assume that anyone has validated the encoding of
its input.
Although this is certainly a bug worth fixing, the PG security team
elected not to regard it as a security issue, primarily because
any untrusted text should be sanitized by PQescapeLiteral or
the like before being incorporated into a SQL or psql command.
(If an app fails to do so, the same technique can be used to
cause SQL injection, with probably much more dire consequences
than a mere client-program crash.) Those functions were already
made proof against this class of problem, cf CVE-2006-2313.
To fix, invent PQmblenBounded() which is like PQmblen() except it
won't return more than the number of bytes remaining in the string.
In HEAD we can make this a new libpq function, as PQmblen() is.
It seems imprudent to change libpq's API in stable branches though,
so in the back branches define PQmblenBounded as a macro in the files
that need it. (Note that just changing PQmblen's behavior would not
be a good idea; notably, it would completely break the escaping
functions' defense against this exact problem. So we just want a
version for those callers that don't have any better way of handling
this issue.)
Per private report from houjingyi. Back-patch to all supported branches.
The Msys shell mangles certain patterns in its command line, so avoid
handing arbitrary SQL to psql on the command line and instead use
IPC::Run's redirection facility for stdin. This pattern is already
mostly whats used, but query_poll_until() was not doing the right thing.
Problem discovered on the buildfarm when a new TAP test failed on msys.
The internal SQL queries used by REFRESH MATERIALIZED VIEW CONCURRENTLY
include some aliases for its diff and temporary relations with
rather-generic names: diff, newdata, newdata2 and mv. Depending on the
queries used for the materialized view, using CONCURRENTLY could lead to
some internal failures if the query and those internal aliases conflict.
Those names have been chosen in 841c29c8. This commit switches instead
to a naming pattern which is less likely going to cause conflicts, based
on an idea from Thomas Munro, by appending _$ to those aliases. This is
not perfect as those new names could still conflict, but at least it has
the advantage to keep the code readable and simple while reducing the
likelihood of conflicts to be close to zero.
Reported-by: Mathis Rudolf
Author: Bharath Rupireddy
Reviewed-by: Bernd Helmle, Thomas Munro, Michael Paquier
Discussion: https://postgr.es/m/109c267a-10d2-3c53-b60e-720fcf44d9e8@credativ.de
Backpatch-through: 9.6
Various environment variables were not getting reset in the TAP tests,
which would cause failures depending on the tests or the environment
variables involved. For example, PGSSL{MAX,MIN}PROTOCOLVERSION could
cause failures in the SSL tests. Even worse, a junk value of
PGCLIENTENCODING makes a server startup fail. The list of variables
reset is adjusted in each stable branch depending on what is supported.
While on it, simplify a bit the code per a suggestion from Andrew
Dunstan, using a list of variables instead of doing single deletions.
Reviewed-by: Andrew Dunstan, Daniel Gustafsson
Discussion: https://postgr.es/m/YLbjjRpucIeZ78VQ@paquier.xyz
Backpatch-through: 9.6