When decoding the old version of an UPDATE or DELETE change, and if that
tuple was bigger than MaxHeapTupleSize, we either Assert'ed out, or
failed in more subtle ways in non-assert builds. Normally individual
tuples aren't bigger than MaxHeapTupleSize, with big datums toasted.
But that's not the case for the old version of a tuple for logical
decoding; the replica identity is logged as one piece. With the default
replica identity btree limits that to small tuples, but that's not the
case for FULL.
Change the tuple buffer infrastructure to separate allocate over-large
tuples, instead of always going through the slab cache.
This unfortunately requires changing the ReorderBufferTupleBuf
definition, we need to store the allocated size someplace. To avoid
requiring output plugins to recompile, don't store HeapTupleHeaderData
directly after HeapTupleData, but point to it via t_data; that leaves
rooms for the allocated size. As there's no reason for an output plugin
to look at ReorderBufferTupleBuf->t_data.header, remove the field. It
was just a minor convenience having it directly accessible.
Reported-By: Adam Dratwiński
Discussion: CAKg6ypLd7773AOX4DiOGRwQk1TVOQKhNwjYiVjJnpq8Wo+i62Q@mail.gmail.com
Somehow I managed to flip the order of restoring old & new tuples when
de-spooling a change in a large transaction from disk. This happens to
only take effect when a change is spooled to disk which has old/new
versions of the tuple. That only is the case for UPDATEs where he
primary key changed or where replica identity is changed to FULL.
The tests didn't catch this because either spooled updates, or updates
that changed primary keys, were tested; not both at the same time.
Found while adding tests for the following commit.
Backpatch: 9.4, where logical decoding was added
Logical decoding's reorderbuffer keeps transactions in an LSN ordered
list for efficiency. To make that's efficiently possible upper-level
xids are forced to be logged before nested subtransaction xids. That
only works though if these records are all looked at: Unfortunately we
didn't do so for e.g. row level locks, which are otherwise uninteresting
for logical decoding.
This could lead to errors like:
"ERROR: subxact logged without previous toplevel record".
It's not sufficient to just look at row locking records, the xid could
appear first due to a lot of other types of records (which will trigger
the transaction to be marked logged with MarkCurrentTransactionIdLoggedIfAny).
So invent infrastructure to tell reorderbuffer about xids seen, when
they'd otherwise not pass through reorderbuffer.c.
Reported-By: Jarred Ward
Bug: #13844
Discussion: 20160105033249.1087.66040@wrigleys.postgresql.org
Backpatch: 9.4, where logical decoding was added
Add four new SQL accessible functions: pg_control_system(),
pg_control_checkpoint(), pg_control_recovery(), and pg_control_init()
which expose a subset of the control file data.
Along the way move the code to read and validate the control file to
src/common, where it can be shared by the new backend functions
and the original pg_controldata frontend program.
Patch by me, significant input, testing, and review by Michael Paquier.
Previously recovery_min_apply_delay was applied even before recovery
had reached consistency. This could cause us to wait a long time
unexpectedly for read-only connections to be allowed. It's problematic
because the standby was useless during that wait time.
This patch changes recovery_min_apply_delay so that it's applied once
the database has reached the consistent state. That is, even if the delay
is set, the standby tries to replay WAL records as fast as possible until
it has reached consistency.
Author: Michael Paquier
Reviewed-By: Julien Rouhaud
Reported-By: Greg Clough
Backpatch: 9.4, where recovery_min_apply_delay was added
Bug: #13770
Discussion: http://www.postgresql.org/message-id/20151111155006.2644.84564@wrigleys.postgresql.org
Historically, the wait_for_stats() function in this test has simply checked
for a report of an indexscan on tenk2, corresponding to the last command
issued before we expect stats updates to appear. However, with parallel
query that indexscan could be done by a parallel worker that will emit
its stats counters to the collector before the session's main backend does
(a full second before, in fact, thanks to the "pg_sleep(1.0)" added by
commit 957d08c81f). That leaves a sizable window in which an
autovacuum-triggered write of the stats files would present a state in
which the indexscan on tenk2 appears to have been done, but none of the
write updates performed by the test have been. This is evidently the
explanation for intermittent failures seen by me and on buildfarm member
mandrill.
To fix, we should check separately for both the tenk2 seqscan and indexscan
counts, since those might be reported by different processes that could be
delayed arbitrarily on an overloaded test machine. And we need to check
for at least one update-related count. If we ever allow parallel workers
to do writes, this will get even more complicated ... but in view of all
the other hard problems that will entail, I don't feel a need to solve this
one today.
Per research by Rahila Syed and myself; part of this patch is Rahila's.
A simple SELECT is handled by PortalRunSelect, not ProcessQuery. Also,
the previous indentation was unclear: change it so that a deeper level
of indentation indicates that the outer function calls the inner one.
Stas Kelvich
Originally, we didn't have nworkers_launched, so code that used parallel
contexts had to be preprared for the possibility that not all of the
workers requested actually got launched. But now we can count on knowing
the number of workers that were successfully launched, which can shave
off a few cycles and simplify some code slightly.
Amit Kapila, reviewed by Haribabu Kommi, per a suggestion from Peter
Geoghegan.
Now it's possible to load recent version of Hunspell for several languages.
To handle these dictionaries Hunspell patch adds support for:
* FLAG long - sets the double extended ASCII character flag type
* FLAG num - sets the decimal number flag type (from 1 to 65535)
* AF parameter - alias for flag's set
Also it moves test dictionaries into separate directory.
Author: Artur Zakirov with editorization by me
The existing code confuses the byte length of the string (which is
relevant when passing it to pg_strncasecmp) with the character length
of the string (which is relevant when it is used with the SQL substring
function). Separate those two concepts.
Report and patch by Kyotaro Horiguchi, reviewed by Thomas Munro and
reviewed and further revised by me.
Previously, we included NULLS FIRST when appropriate but relied on the
default behavior to be NULLS LAST. This is, however, not true for a
sort in descending order and seems like a fragile assumption anyway.
Report by Rajkumar Raghuwanshi. Patch by Ashutosh Bapat. Review
comments from Michael Paquier and Tom Lane.
This makes the flag more visible for testers using the default file as a
template, increasing the likelyhood that the test suite will be run.
Also have the flag be displayed in the fake "configure" output, if set.
This patch is two new lines only, but perltidy decides to shift things
around which makes it appear a bit bigger.
Author: Michaël Paquier
Reviewed-by: Craig Ringer
Discussion: https://www.postgresql.org/message-id/CAB7nPqRet6UAP2APhZAZw%3DVhJ6w-Q-gGLdZkrOqFgd2vc9-ZDw%40mail.gmail.com
Otherwise running installcheck-force on a server with
synchronous_commit=off will result in the tests failing. All the other
tests already do so...
Backpatch: 9.4, where logical decoding was added
This makes the psql() method much more capable: it captures both stdout
and stderr; it now returns the psql exit code rather than stdout; a
timeout can now be specified, as can ON_ERROR_STOP behavior; it gained a
new "on_error_die" (defaulting to off) parameter to raise an exception
if there's any problem. Finally, additional parameters to psql can be
passed if there's need for further tweaking.
For convenience, a new safe_psql() method retains much of the old
behavior of psql(), except that it uses on_error_die on, so that
problems like syntax errors in SQL commands can be detected more easily.
Many existing TAP test files now use safe_psql, which is what is really
wanted. A couple of ->psql() calls are now added in the commit_ts
tests, which verify that the right thing is happening on certain errors.
Some ->command_fails() calls in recovery tests that were verifying that
psql failed also became ->psql() calls now.
Author: Craig Ringer. Some tweaks by Álvaro Herrera
Reviewed-By: Michaël Paquier
Also, mention in README that Perl files should be perltidy'ed. This
isn't really the best place (since we have Perl files elsewhere in the
tree) and this is already in pgindent's README, but this subdir is
likely to get hacked a whole lot more than the other Perl files, so it
seems okay to spend two lines on this.
Author: Craig Ringer
One test was relying on method remove_tree that isn't implemented in the
oldest Perl we support; fix it by using the older rmtree instead.
Another test had a typo in a SQL command, which isn't noticed because
the PostgresNode->psql() method doesn't check that queries return
correctly. That's undesirable and will also be fixed later on, but for
now let's make the test actually work.
Author: Craig Ringer
606c0123d6 attempted to reduce cost of index scans using > and <
strategies, though got that completely wrong in a few complex cases.
Revert whole patch until we find a safe optimization.
When adding replication origins in 5aa235042, I somehow managed to set
the timestamp of decoded transactions to InvalidXLogRecptr when decoding
one made without a replication origin. Fix that, and the wrong type of
the new commit_time variable.
This didn't trigger a regression test failure because we explicitly
don't show commit timestamps in the regression tests, as they obviously
are variable. Add a test that checks that a decoded commit's timestamp
is within minutes of NOW() from before the commit.
Reported-By: Weiping Qu
Diagnosed-By: Artur Zakirov
Discussion: 56D4197E.9050706@informatik.uni-kl.de,
56D42918.1010108@postgrespro.ru
Backpatch: 9.5, where 5aa235042 originates.
A thinko concerning nesting depth caused json_to_record() to produce bogus
output if a field of its input object contained a sub-object with a field
name matching one of the requested output column names. Per bug #13996
from Johann Visagie.
I added a regression test case based on his example, plus parallel tests
for json_to_recordset, jsonb_to_record, jsonb_to_recordset. The latter
three do not exhibit the same bug (which suggests that we may be missing
some opportunities to share code...) but testing seems like a good idea
in any case.
Back-patch to 9.4 where these functions were introduced.
Commits 9ff60273e3 and dbe2328959 adjusted the declarations
of some core functions referenced by contrib/tsearch2's install script,
forgetting that in a pg_upgrade situation, we'll be trying to restore
operator class definitions that reference the old signatures. We've
hit this problem before; solve it in the same way as before, namely by
installing stub functions that have the expected signature and just
invoke the correct function. Per report from Jeff Janes.
(Someday we ought to stop supporting contrib/tsearch2, but I'm not
sure today is that day.)
This makes it easier to relate the temporary data dirs to each node in
a test script.
Author: Kyotaro Horiguchi
Reviewed-By: Craig Ringer, Alvaro Herrera
PL/Tcl appears to contain logic to convert strings between the database
encoding and UTF8, which is the only encoding modern Tcl will deal with.
However, that code has been disabled since commit 034895125d, which
made it "#if defined(UNICODE_CONVERSION)" and neglected to provide any way
for that symbol to become defined. That might have been all right back
in 2001, but these days we take a dim view of allowing strings with
incorrect encoding into the database.
Remove the conditional compilation, fix warnings about signed/unsigned char
conversions, clean up assorted places that didn't bother with conversions.
(Notably, there were lots of assumptions that database table and field
names didn't need conversion...)
Add a regression test based on plpython_unicode. It's not terribly
thorough, but better than no test at all.
As of commit 2878220682, PL/Tcl will not
compile against pre-8.0 Tcl, whereas it used to work (more or less anyway)
with quite prehistoric versions. As long as we're moving these goalposts,
let's reinstall them at someplace that has some thought behind it. This
commit sets the minimum allowed Tcl version at 8.4, and rips out some bits
of compatibility cruft that are in consequence no longer needed. Reasons
for requiring 8.4 include:
* 8.4 was released in 2002; there seems little reason to believe that
anyone would want to use older versions with Postgres 9.6+.
* We have no buildfarm members testing anything older than 8.4, and
thus no way to know if it's broken.
* We need at least 8.1 to allow enforcement of database encoding
security (8.1 standardized Tcl on using UTF8 internally, before that
it was pretty unpredictable).
* Some versions between 8.1 and 8.4 allowed the backend to become
multithreaded, which is disastrous. We need at least 8.4 to be able
to disable the Tcl notifier subsystem to prevent that.
A small side benefit is that we can make the code more readable by
doing s/CONST84/const/g.
The original implementation of Tcl was all strings, but they improved
performance significantly by introducing typed "objects" (integers,
lists, code, etc). It's past time we made use of that; that happened
in Tcl 8.0 which was released in 1997.
This patch also modernizes some of the error-reporting code, which may
cause small changes in the spelling of complaints about bad calls to
PL/Tcl-provided commands.
Jim Nasby and Karl Lehenbauer, reviewed by Victor Wagner
Commit 7132810c (Retain tempdirs for failed tests) used Test::More's
is_passing method, but that was added in Test::More 0.89_01 which is
sometime later than Perl 5.10.1. Popular platforms such as RHEL6 don't
have that, nevermind some of our older dinosaurs. Do it the hard way.
Michael Paquier, based on research by Craig Ringer
The new bit indicates whether every tuple on the page is already frozen.
It is cleared only when the all-visible bit is cleared, and it can be
set only when we vacuum a page and find that every tuple on that page is
both visible to every transaction and in no need of any future
vacuuming.
A future commit will use this new bit to optimize away full-table scans
that would otherwise be triggered by XID wraparound considerations. A
page which is merely all-visible must still be scanned in that case, but
a page which is all-frozen need not be. This commit does not attempt
that optimization, although that optimization is the goal here. It
seems better to get the basic infrastructure in place first.
Per discussion, it's very desirable for pg_upgrade to automatically
migrate existing VM forks from the old format to the new format. That,
too, will be handled in a follow-on patch.
Masahiko Sawada, reviewed by Kyotaro Horiguchi, Fujii Masao, Amit
Kapila, Simon Riggs, Andres Freund, and others, and substantially
revised by me.
Test composite-type arguments and the argisnull and spi_lastoid Tcl
commmands. This stuff was not covered before, but needs to be exercised
since the upcoming Tcl object-conversion patch changes these code paths
(and broke at least one of them).
These tests verify that 1) WAL replay preserves the stored value,
2) a streaming standby server replays the value obtained from the
master, and 3) the behavior is sensible in the face of repeated
configuration changes.
One annoyance is that tmp_check/ subdir from the TAP tests is clobbered
when the pg_regress test runs in the same subdirectory. This is
bothersome but not too terrible a problem, since the pg_regress test is
not run anyway if the TAP tests fail (unless "make -k" is used).
I had these tests around since commit 69e7235c93e2; add them now that we
have the recovery test framework in place.
Fabien Coelho, reviewed mostly by Michael Paquier and me, but also by
Heikki Linnakangas, BeomYong Lee, Kyotaro Horiguchi, Oleksander
Shulgin, and Álvaro Herrera.
I noticed that the async-notify test results in log messages like these:
LOG: could not send data to client: Broken pipe
FATAL: connection to client lost
This is because it unceremoniously disconnects a client session that is
about to have some NOTIFY messages delivered to it. Such log messages
during a regression test might well cause people to go looking for a
problem that doesn't really exist (it did cause me to waste some time that
way). We can shut it up by adding an UNLISTEN command to session teardown.
Patch HEAD only; this doesn't seem significant enough to back-patch.
This error message was written with only ON SELECT rules in mind, but since
then we also made RETURNING-clause targetlists go through the same logic.
This means that you got a rather off-topic error message if you tried to
add a rule with RETURNING to a table having dropped columns. Ideally we'd
just support that, but some preliminary investigation says that it might be
a significant amount of work. Seeing that Nicklas Avén's complaint is the
first one we've gotten about this in the ten years or so that the code's
been like that, I'm unwilling to put much time into it. Instead, improve
the error report by issuing a different message for RETURNING cases, and
revise the associated comment based on this investigation.
Discussion: 1456176604.17219.9.camel@jordogskog.no