VACUUM FULL and CLUSTER can be used to enforce the use of the existing
compression method of a toastable column if a value currently stored is
compressed with a method that does not match the column's defined
method. The code in charge of decompressing and recompressing toast
values at rewrite left around the detoasted values, causing an
accumulation of memory allocated in TopTransactionContext.
When processing large relations, this could cause the system to run out
of memory. The detoasted values are not needed once their tuple is
rewritten, and this commit ensures that the necessary cleanup happens.
Issue introduced by bbe0a81d. The comments of the area are reordered a
bit while at it.
Reported-by: Andres Freund
Analyzed-by: Andres Freund
Author: Michael Paquier
Reviewed-by: Dilip Kumar
Discussion: https://postgr.es/m/20210521211929.pcehg6f23icwstdb@alap3.anarazel.de
The error messages, docs, and one of the options were using
'parallel degree' to indicate parallelism used by the vacuum command. We
normally use 'parallel workers' in other places, so change it for parallel
vacuum accordingly.
Author: Bharath Rupireddy
Reviewed-by: Dilip Kumar, Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/CALj2ACWz=PYrrFXVsEKb9J1aiX4raA+UBe02hdRp_zqDkrWUiw@mail.gmail.com
SSL renegotiation is already disabled as of 48d23c72; however, this does
not prevent the server from complying with a client willing to use
renegotiation. In the last couple of years, renegotiation has had its share
of security issues and flaws (like the recent CVE-2021-3449), and it
could be possible to crash the backend with a client attempting
renegotiation.
This commit takes one extra step by disabling renegotiation in the
backend in the same way as SSL compression (f9264d15) or tickets
(97d3a0b0). OpenSSL 1.1.0h has added an option named
SSL_OP_NO_RENEGOTIATION able to achieve that. In older versions
there is an option called SSL3_FLAGS_NO_RENEGOTIATE_CIPHERS that
was undocumented, and could be set within the SSL object created when
the TLS connection opens, but I have decided not to use it, as it feels
trickier to rely on, and it is not official. Note that this option is
not usable in OpenSSL < 1.1.0h as the internal contents of the *SSL
object are hidden to applications.
SSL renegotiation concerns protocols up to TLSv1.2.
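
As a rough illustration of the option named above (this is not the backend's
C code), Python's standard ssl module exposes the same OpenSSL flag as
ssl.OP_NO_RENEGOTIATION when it is built against OpenSSL 1.1.0h or newer.
The helper name make_server_context is made up for this sketch.

```python
import ssl


def make_server_context() -> ssl.SSLContext:
    """Build a server-side TLS context that refuses renegotiation if it can."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # OP_NO_RENEGOTIATION is only exposed when the underlying OpenSSL
    # provides SSL_OP_NO_RENEGOTIATION, so look it up defensively
    no_reneg = getattr(ssl, "OP_NO_RENEGOTIATION", None)
    if no_reneg is not None:
        ctx.options |= no_reneg
    return ctx
```

With an older OpenSSL the context is returned unchanged, which mirrors the
decision above not to rely on an unofficial fallback.
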
Per original report from Robert Haas, with a patch based on a suggestion
by Andres Freund.
Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/YKZBXx7RhU74FlTE@paquier.xyz
Backpatch-through: 9.6
Result Cache, added in 9eacee2e6, neglected to properly adjust the plan
references in setrefs.c. This could lead to the following error during
EXPLAIN:
ERROR: cannot decompile join alias var in plan tree
Fix that.
Bug: 17030
Reported-by: Hans Buschmann
Discussion: https://postgr.es/m/17030-5844aecae42fe223@postgresql.org
The wraparound failsafe mechanism added by commit 1e55e7d1 handled the
one-pass strategy case (i.e. the "table has no indexes" case) by adding
a dedicated failsafe check. This made up for the fact that the usual
one-pass checks inside lazy_vacuum_all_indexes() cannot ever be reached
during a one-pass strategy VACUUM.
This approach failed to account for two-pass VACUUMs that opt out of
index vacuuming up-front. The INDEX_CLEANUP off case is the only case
that works like that.
Fix this by performing a failsafe check every 4GB during the first scan
of the heap, regardless of the details of the VACUUM. This eliminates
the special case, and will make the failsafe trigger more reliably.
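
A minimal sketch of the "every 4GB" cadence, written as a Python model rather
than the actual heap scan code. It assumes the default block size of 8192
bytes; the function name and the idea of returning the block numbers are
invented for illustration.

```python
BLCKSZ = 8192                                   # assumed default block size
FAILSAFE_EVERY_PAGES = (4 * 1024 ** 3) // BLCKSZ  # heap pages per 4GB


def failsafe_check_blocks(rel_pages):
    """Return the block numbers at which a failsafe check would run."""
    checks = []
    last_check = 0
    for blkno in range(rel_pages):
        # The check depends only on scan progress, not on whether the
        # table has indexes or whether index vacuuming was disabled
        if blkno - last_check >= FAILSAFE_EVERY_PAGES:
            checks.append(blkno)
            last_check = blkno
    return checks
```

Under these assumptions a table smaller than 4GB never reaches a check during
the first heap scan, while larger tables are checked once per 524288 pages.
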
Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Andres Freund <andres@anarazel.de>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/20210424002921.pb3t7h6frupdqnkp@alap3.anarazel.de
Code added in 9e215378d to disable building of Result Cache paths when
not all join conditions are part of the parameterization of a unique
join failed to first check if the inner path's param_info was set before
checking the param_info's ppi_clauses.
Add a check for NULL values here and just bail on trying to build the
path if param_info is NULL. lateral_vars are not considered when
deciding if the join is unique, so we're not missing out on doing the
optimization when there are lateral_vars and no param_info.
Reported-by: Coverity, via Tom Lane
Discussion: https://postgr.es/m/457998.1621779290@sss.pgh.pa.us
Now that attcompression is just a char, there's a lot of wasted
padding space after it. Move it into the group of char-wide
columns to save a net of 4 bytes per pg_attribute entry. While
we're at it, swap the order of attstorage and attalign to make for
a more logical grouping of these columns.
Also re-order actions in related code to match the new field ordering.
This patch also fixes one outright bug: equalTupleDescs() failed to
compare attcompression. That could, for example, cause relcache
reload to fail to adopt a new value following a change.
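
The padding saving can be seen in a toy layout. The sketch below uses Python's
ctypes, which follows the platform's C alignment rules. The structs are
invented and far smaller than the real pg_attribute, every field name other
than attcompression is hypothetical, and the sizes assume 4-byte alignment for
32-bit integers.

```python
import ctypes


class ToyAttrBefore(ctypes.Structure):
    # Hypothetical layout: attcompression sits alone between two 4-byte fields
    _fields_ = [
        ("char_a", ctypes.c_char),
        ("char_b", ctypes.c_char),
        ("char_c", ctypes.c_char),
        ("int_field", ctypes.c_int32),
        ("attcompression", ctypes.c_char),
        ("oid_field", ctypes.c_uint32),
    ]


class ToyAttrAfter(ctypes.Structure):
    # Same fields, with attcompression moved into the run of char-wide fields
    _fields_ = [
        ("char_a", ctypes.c_char),
        ("char_b", ctypes.c_char),
        ("char_c", ctypes.c_char),
        ("attcompression", ctypes.c_char),
        ("int_field", ctypes.c_int32),
        ("oid_field", ctypes.c_uint32),
    ]


# Bytes saved per entry in this toy layout
saved = ctypes.sizeof(ToyAttrBefore) - ctypes.sizeof(ToyAttrAfter)
```

In this toy case the lone char drags three bytes of padding behind it and the
preceding chars need one more, so regrouping recovers 4 bytes in total.
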
Michael Paquier and Tom Lane, per a gripe from Andres Freund.
Discussion: https://postgr.es/m/20210517204803.iyk5wwvwgtjcmc5w@alap3.anarazel.de
Emit a LOG message when the postmaster stops because of a failure in
the startup process. There already is a similar message if we exit
for that reason during PM_STARTUP phase, so it seems inconsistent
that there was none if the startup process fails later on.
Also emit a LOG message when the postmaster stops after a crash
because restart_after_crash is disabled. This seems potentially
helpful in case DBAs (or developers) forget that that's set.
Also, it was the only remaining place where the postmaster would
do an abnormal exit without any comment as to why.
In passing, remove an unreachable call of ExitPostmaster(0).
Discussion: https://postgr.es/m/194914.1621641288@sss.pgh.pa.us
If we redirected a replicated tuple operation into a partition child
table, and then tried to fire AFTER triggers for that event, the
relation cache entry for the child table was already closed. This has
no visible ill effects as long as the entry is still there and still
valid, but an unluckily-timed cache flush could result in a crash or
other misbehavior.
To fix, postpone the ExecCleanupTupleRouting call (which is what
closes the child table) until after we've fired triggers. This
requires a bit of refactoring so that the cleanup function can
have access to the necessary state.
In HEAD, I took the opportunity to simplify some of worker.c's
function APIs based on use of the new ApplyExecutionData struct.
However, it doesn't seem safe/practical to back-patch that aspect,
at least not without a lot of analysis of possible interactions
with a04daa97a.
In passing, add an Assert to afterTriggerInvokeEvents to catch
such cases. This seems worthwhile because we've grown a number
of fairly unstructured ways of calling AfterTriggerEndQuery.
Back-patch to v13, where worker.c grew the ability to deal with
partitioned target tables.
Discussion: https://postgr.es/m/3382681.1621381328@sss.pgh.pa.us
In the wake of 84f5c2908, it's no longer necessary for plpgsql to
handle SET/RESET specially. The point of that was just to avoid
taking a new transaction snapshot prematurely, which the regular code
path through _SPI_execute_plan() now does just fine (in fact better,
since it now does the right thing for LOCK too). Hence, rip out a
few lines of code, going back to the old way of treating SET/RESET
as a generic SQL command. This essentially reverts all but the
test cases from b981275b6.
Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
When the planner considered using a Result Cache node to cache results
from the inner side of a Nested Loop Join, it failed to consider that the
inner path's parameterization may not be the entire join condition. If
the join was marked as inner_unique then we may accidentally put the cache
in singlerow mode. This meant that entries would be marked as complete
after caching the first row. That was wrong because, if only part of the join
condition was parameterized, the uniqueness of the unique join was not
guaranteed at the Result Cache's level. The uniqueness is only guaranteed
after Nested Loop applies the join filter. If subsequent rows were found,
this would lead to:
ERROR: cache entry already complete
This could have been fixed by only putting the cache in singlerow mode if
the entire join condition was parameterized. However, Nested Loop will
only read its inner side so far as the first matching row when the join is
unique, so that might mean we never get an opportunity to mark cache
entries as complete. Since non-complete cache entries are useless for
subsequent lookups, we just don't bother considering a Result Cache path
in this case.
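
The failure mode can be modelled in a few lines. This is a simplified Python
stand-in, not the executor node; the class and method names are invented, and
only the singlerow flag and the error text come from the description above.

```python
class CacheEntryComplete(Exception):
    """Stands in for 'ERROR: cache entry already complete'."""


class ToyResultCache:
    def __init__(self, singlerow):
        self.singlerow = singlerow
        self.entries = {}  # cache key -> {"rows": [...], "complete": bool}

    def store(self, key, row):
        entry = self.entries.setdefault(key, {"rows": [], "complete": False})
        if entry["complete"]:
            raise CacheEntryComplete("cache entry already complete")
        entry["rows"].append(row)
        # In singlerow mode the first row is assumed to be the only one
        if self.singlerow:
            entry["complete"] = True
```

If the cache key covers only part of the join condition, a second row can
arrive for the same key, and the second store() call raises.
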
In passing, remove the XXX comment that claimed the above ERROR might be
better suited to be an Assert. Now that an actual case has
triggered it, it seems better to keep it an ERROR.
Reported-by: David Christensen
Discussion: https://postgr.es/m/CAOxo6X+dy-V58iEPFgst8ahPKEU+38NZzUuc+a7wDBZd4TrHMQ@mail.gmail.com
This was previously allowed, but I think that was just an oversight.
It's a clear violation of the rule that a generated column cannot
depend on itself or other generated columns. Moreover, because the
code was relying on the assumption that no such cross-references
exist, it was pretty easy to crash ALTER TABLE and perhaps other
places. Even if you managed not to crash, you got quite unstable,
implementation-dependent results.
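
The rule cited above can be sketched as a small check. This is a Python model,
not the parser's actual code; the function name and its arguments are invented
for illustration.

```python
def offending_references(new_col, referenced, generated):
    """Return the referenced columns that break the generated-column rule.

    new_col:    name of the generated column being defined
    referenced: set of column names its generation expression reads
    generated:  set of names of the table's other generated columns
    """
    # A generated column may depend neither on itself nor on any other
    # generated column, so both kinds of reference are reported
    forbidden = set(generated) | {new_col}
    return sorted(set(referenced) & forbidden)
```

An empty result means the expression only reads ordinary columns. Anything
else would have to be rejected before the code that assumes no such
cross-references ever sees it.
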
Per report from Vitaly Ustinov.
Back-patch to v12 where GENERATED came in.
Discussion: https://postgr.es/m/CAM_DEiWR2DPT6U4xb-Ehigozzd3n3G37ZB1+867zbsEVtYoJww@mail.gmail.com
We consider this supported (though I've got my doubts that it's a
good idea, because tableoid is not immutable). However, several
code paths failed to fill the field in soon enough, causing such
a GENERATED expression to see zero or the wrong value. This
occurred when ALTER TABLE adds a new GENERATED column to a table
with existing rows, and during regular INSERT or UPDATE on a
foreign table with GENERATED columns.
Noted during investigation of a report from Vitaly Ustinov.
Back-patch to v12 where GENERATED came in.
Discussion: https://postgr.es/m/CAM_DEiWR2DPT6U4xb-Ehigozzd3n3G37ZB1+867zbsEVtYoJww@mail.gmail.com
COMMIT/ROLLBACK necessarily destroys all snapshots within the session.
The original implementation of intra-procedure transactions just
cavalierly did that, ignoring the fact that this left us executing in
a rather different environment than normal. In particular, it turns
out that handling of toasted datums depends rather critically on there
being an outer ActiveSnapshot: otherwise, when SPI or the core
executor pop whatever snapshot they used and return, it's unsafe to
dereference any toasted datums that may appear in the query result.
It's possible to demonstrate "no known snapshots" and "missing chunk
number N for toast value" errors as a result of this oversight.
Historically this outer snapshot has been held by the Portal code,
and that seems like a good plan to preserve. So add infrastructure
to pquery.c to allow re-establishing the Portal-owned snapshot if it's
not there anymore, and add enough bookkeeping support that we can tell
whether it is or not.
We can't, however, just re-establish the Portal snapshot as part of
COMMIT/ROLLBACK. As in normal transaction start, acquiring the first
snapshot should wait until after SET and LOCK commands. Hence, teach
spi.c about doing this at the right time. (Note that this patch
doesn't fix the problem for any PLs that try to run intra-procedure
transactions without using SPI to execute SQL commands.)
This makes SPI's no_snapshots parameter rather a misnomer, so in HEAD,
rename that to allow_nonatomic.
replication/logical/worker.c also needs some fixes, because it wasn't
careful to hold a snapshot open around AFTER trigger execution.
That code doesn't use a Portal, which I suspect someday we're gonna
have to fix. But for now, just rearrange the order of operations.
This includes back-patching the recent addition of finish_estate()
to centralize the cleanup logic there.
This also back-patches commit 2ecfeda3e into v13, to improve the
test coverage for worker.c (it was that test that exposed that
worker.c's snapshot management is wrong).
Per bug #15990 from Andreas Wicht. Back-patch to v11 where
intra-procedure COMMIT was added.
Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
While applying the truncate change, the logical apply worker acquired
RowExclusiveLock on the relation being truncated. This allowed two apply
workers to truncate the same relation at the same time, which could lead to
a deadlock. After updating the pg_class tuple, one worker tried to acquire
SHARE lock on the relation and had to wait for the second worker, which
held RowExclusiveLock on it. When the second worker then tried to update
the pg_class tuple, it had to wait for the first worker, completing the
deadlock. Fix it by acquiring AccessExclusiveLock on the relation before
applying the truncate change, as we do for a normal truncate operation.
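
The lock interaction can be checked against a cut-down conflict table. This
Python sketch covers only the three lock modes mentioned here (the real table
has eight), and the helper name can_grant is invented.

```python
# Which of these three table-level lock modes conflict with each other
CONFLICTS = {
    "RowExclusiveLock": {"ShareLock", "AccessExclusiveLock"},
    "ShareLock": {"RowExclusiveLock", "AccessExclusiveLock"},
    "AccessExclusiveLock": {
        "RowExclusiveLock",
        "ShareLock",
        "AccessExclusiveLock",
    },
}


def can_grant(requested, held_by_others):
    """True if no lock held by another session conflicts with the request."""
    return not any(held in CONFLICTS[requested] for held in held_by_others)
```

RowExclusiveLock does not conflict with itself, so both workers get in, and
the later SHARE request then blocks. AccessExclusiveLock conflicts with
itself, so the second worker waits before doing any work.
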
Author: Peter Smith, test case by Haiying Tang
Reviewed-by: Dilip Kumar, Amit Kapila
Backpatch-through: 11
Discussion: https://postgr.es/m/CAHut+PsNm43p0jM+idTvWwiGZPcP0hGrHMPK9TOAkc+a4UpUqw@mail.gmail.com
exec_for_query() normally tries to prefetch a few rows at a time
from the query being iterated over, so as to reduce executor
entry/exit overhead. Unfortunately this is unsafe if we have
COMMIT or ROLLBACK within the loop, because there might be
TOAST references in the data that we prefetched but haven't
yet examined. Immediately after the COMMIT/ROLLBACK, we have
no snapshots in the session, meaning that VACUUM is at liberty
to remove recently-deleted TOAST rows.
This was originally reported as a case triggering the "no known
snapshots" error in init_toast_snapshot(), but even if you miss
hitting that, you can get "missing toast chunk", as illustrated
by the added isolation test case.
To fix, just disable prefetching in non-atomic contexts. Maybe
there will be performance complaints prompting us to work harder
later, but it's not clear at the moment that this really costs
much, and I doubt we'd want to back-patch any complicated fix.
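
A rough Python model of the behavior change, not the plpgsql code itself. The
function name and the prefetch size of 10 are assumptions made for the sketch.

```python
def fetch_batches(rows, atomic, prefetch=10):
    """Split rows into the batches a FOR loop would fetch from the executor."""
    # Outside an atomic context the loop body may COMMIT or ROLLBACK, so
    # fetch one row at a time and never hold rows that were fetched but
    # not yet examined
    batch_size = prefetch if atomic else 1
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]
```

The cost is one executor round trip per row in non-atomic contexts, which is
the overhead the paragraph above judges acceptable for now.
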
In passing, adjust that error message in init_toast_snapshot()
to be a little clearer about the likely cause of the problem.
Patch by me, based on earlier investigation by Konstantin Knizhnik.
Per bug #15990 from Andreas Wicht. Back-patch to v11 where
intra-procedure COMMIT was added.
Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
A lamentable oversight on my part meant that when PostgresVersion.pm was
added in commit 4c4eaf3d19, provision to install it was not added to the
Makefile, so it was not installed along with the other perl modules.
"typename" is a C++ keyword, so pg_upgrade.h fails to compile in C++.
Fortunately, there seems no likely reason for somebody to need to
do that. Nonetheless, it's project policy that all .h files should
pass cpluspluscheck, so rename the argument to fix that.
Oversight in 57c081de0; back-patch as that was. (The policy requiring
pg_upgrade.h to pass cpluspluscheck only goes back to v12, but it
seems best to keep this code looking the same in all branches.)
Older versions of perl on Windows don't like the list form of pipe open,
and perlcritic doesn't like the string form of open, so we avoid both
with a simpler formulation using qx{}.
Per complaint from Amit Kapila.
Recently we refactored things so that pg_regress makes the
"testtablespace" subdirectory used by the core regression tests,
instead of doing that in the makefiles. That had the undesirable
side effect of making such a subdirectory in every directory that
has "input" or "output" test files. Since these subdirectories
remain empty, git doesn't complain about them, but nonetheless
they're clutter.
To fix, invent an explicit --make-testtablespace-dir switch,
so that pg_regress only makes the subdirectory when explicitly
told to.
Discussion: https://postgr.es/m/2854388.1621284789@sss.pgh.pa.us
One of the tests for the pgbench permute() function added by
6b258e3d68 fails on some 32-bit platforms, due to variations in the
floating point computations in getrand(). The remaining tests give
sufficient coverage, so just remove the failing test.
Reported by Christoph Berg. Analysis by Thomas Munro and Tom Lane.
Based on patch by Fabien Coelho.
Discussion: https://postgr.es/m/YKQnUoYV63GRJBDD@msg.df7cb.de
If a promotion is triggered while recovery is paused, the paused state ends
and promotion continues. But previously in that case
pg_get_wal_replay_pause_state() returned 'paused' wrongly while a promotion
was ongoing.
This commit changes a standby promotion so that it marks the recovery
pause state as 'not paused' when it's triggered, to fix the issue.
Author: Fujii Masao
Reviewed-by: Dilip Kumar, Kyotaro Horiguchi
Discussion: https://postgr.es/m/f706876c-4894-0ba5-6f4d-79803eeea21b@oss.nttdata.com
We were not waiting for a publisher to catch up with the subscriber after
creating a subscription. As a result, the apply worker could start
replication even after we had disabled the subscription in the test. The
test then expects that there is no active slot whereas there
exists one. Fix this symptom by making the publisher wait for
the subscription to catch up.
It is not a good idea to check whether the slot is still active by looking
for walsender existence, as we release the slot after we clean up the
walsender-related memory. Fix that by checking the slot status in
pg_replication_slots.
Also, it is better to avoid repeated enabling/disabling of the
subscription.
Finally, we turn autovacuum off for this test to avoid any empty
transaction appearing in the test while consuming changes.
Reported-by: as per buildfarm
Author: Vignesh C
Reviewed-by: Amit Kapila, Michael Paquier
Discussion: https://postgr.es/m/CAA4eK1+uW1UGDHDz-HWMHMen76mKP7NJebOTZN4uwbyMjaYVww@mail.gmail.com
1) Previously there were both pgstat_send_wal() and pgstat_report_wal()
in order to send WAL activity to the stats collector, with the former being
used by wal writer and the latter by most other processes. They were a bit
redundant and so this commit merges them into pgstat_send_wal() to
simplify the code.
2) Previously WAL global statistics counters were calculated and then
compared with zero-filled buffer in order to determine whether any WAL
activity has happened since the last submission. This calculation and
comparison were not cheap, and they were regularly exercised even in read-only
workloads. This commit fixes the issue by checking some WAL activity
counters directly to determine whether there are WAL activity stats
to send.
3) Previously pgstat_report_stat() did not check whether there are WAL activity
stats to send as part of the "Don't expend a clock check if nothing to do"
check at the top. It's probably rare to have pending WAL stats without
also passing one of the other conditions, but for safety this commit
changes pgstat_report_stat() so that it also checks some WAL activity
counters at the top.
This commit also adds the comments about the design of WAL stats.
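
The cheaper check described in 2) can be sketched as follows. This is a Python
model under assumptions: which counters are inspected, and all names below,
are chosen for illustration and are not taken from the patch.

```python
from dataclasses import dataclass


@dataclass
class ToyWalCounters:
    wal_records: int = 0  # cumulative count of WAL records generated
    wal_write: int = 0    # writes accumulated since the last report
    wal_sync: int = 0     # syncs accumulated since the last report


def have_wal_stats_to_send(prev_records, cur):
    """Cheap test: inspect a few counters instead of diffing whole structs."""
    return (
        cur.wal_records != prev_records
        or cur.wal_write != 0
        or cur.wal_sync != 0
    )
```

A read-only workload leaves all three counters untouched, so the function
returns False without computing any differences or comparing buffers.
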
Reported-by: Andres Freund
Author: Masahiro Ikeda
Reviewed-by: Kyotaro Horiguchi, Atsushi Torikoshi, Andres Freund, Fujii Masao
Discussion: https://postgr.es/m/20210324232224.vrfiij2rxxwqqjjb@alap3.anarazel.de
This is an oversight from bbe0a81d, where the equivalent option exists
in pg_dump. This is useful to be able to reset the compression methods
cluster-wide when restoring the data based on default_toast_compression.
Reviewed-by: Daniel Gustafsson, Tom Lane
Discussion: https://postgr.es/m/YKHC+qCJvzCRVCpY@paquier.xyz