Add a new TAP test under src/test/recovery to run the standard
regression tests while a streaming replica replays the WAL. This
provides a basic workout for WAL decoding and redo code, and compares
the replicated result.
Optionally, enable (expensive) wal_consistency_checking if listed in
the env variable PG_TEST_EXTRA.
Reviewed-by: 綱川 貴之 (Takayuki Tsunakawa) <tsunakawa.takay@fujitsu.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Anastasia Lubennikova <lubennikovaav@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKGKpRWQ9SxdxxDmTBCJoR0YnFpMBe7kyzY8SUQk%2BHeskxg%40mail.gmail.com
The tablespace tests in the main regression tests have been changed to
use "in-place" tablespaces, so that they work when streamed to a replica
on the same host. Add a new TAP test that exercises tablespaces with
absolute paths, for coverage.
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKGKpRWQ9SxdxxDmTBCJoR0YnFpMBe7kyzY8SUQk%2BHeskxg%40mail.gmail.com
Remove the machinery from pg_regress that manages the testtablespace
directory. Instead, use "in-place" tablespaces, because they work
correctly when there is a streaming replica running on the same host.
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKGKpRWQ9SxdxxDmTBCJoR0YnFpMBe7kyzY8SUQk%2BHeskxg%40mail.gmail.com
Provide a developer-only GUC allow_in_place_tablespaces, disabled by
default. When enabled, tablespaces can be created with an empty
LOCATION string, meaning that they should be created as a directory
directly beneath pg_tblspc. This can be used for new testing scenarios,
in a follow-up patch. Not intended for end-user usage, since it might
confuse backup tools that expect symlinks.
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKGKpRWQ9SxdxxDmTBCJoR0YnFpMBe7kyzY8SUQk%2BHeskxg%40mail.gmail.com
Get rid of the three-valued logic for the Boolean variables to track
whether the value was been specified and what the new value should be.
Instead, we can use the "dfoo" variables to determine whether the
value was specified and should be applied. This was already done in
some cases, so this makes this more uniform and removes one layer of
indirection.
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/8c1a2e37-c68d-703c-5a83-7a6077f4f997@enterprisedb.com
Corruption of redirect item pointers often only becomes visible well after
being corrupted, as e.g. bug #17255 shows: In the original reproducer,
gigabyte of WAL were between the source of the corruption and the corruption
becoming visible.
To make it easier to find / prevent such bugs, verify whether redirect
pointers are sensible at the end of heap_page_prune_execute(). 5cd7eb1f1c
introduced related assertions while modifying the page, but they can't easily
detect marking the target of an existing redirect as unused. Sometimes the
corruption will be detected later, but that's harder to diagnose.
Author: Andres Freund <andres@andres@anarazel.de>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/20211122175914.ayk6gg6nvdwuhrzb@alap3.anarazel.de
Since dc7420c2c9 the horizon used for pruning is determined "lazily". A more
accurate horizon is built on-demand, rather than in GetSnapshotData(). If a
horizon computation is triggered between two HeapTupleSatisfiesVacuum() calls
for the same tuple, the result can change from RECENTLY_DEAD to DEAD.
heap_page_prune() can process the same tid multiple times (once following an
update chain, once "directly"). When the result of HeapTupleSatisfiesVacuum()
of a tuple changes from RECENTLY_DEAD during the first access, to DEAD in the
second, the "tuple is DEAD and doesn't chain to anything else" path in
heap_prune_chain() can end up marking the target of a LP_REDIRECT ItemId
unused.
Initially not easily visible,
Once the target of a LP_REDIRECT ItemId is marked unused, a new tuple version
can reuse it. At that point the corruption may become visible, as index
entries pointing to the "original" redirect item, now point to a unrelated
tuple.
To fix, compute HTSV for all tuples on a page only once. This fixes the entire
class of problems of HTSV changing inside heap_page_prune(). However,
visibility changes can obviously still occur between HTSV checks inside
heap_page_prune() and outside (e.g. in lazy_scan_prune()).
The computation of HTSV is now done in bulk, in heap_page_prune(), rather than
on-demand in heap_prune_chain(). Besides being a bit simpler, it also is
faster: Memory accesses can happen sequentially, rather than in the order of
HOT chains.
There are other causes of HeapTupleSatisfiesVacuum() results changing between
two visibility checks for the same tuple, even before dc7420c2c9. E.g.
HEAPTUPLE_INSERT_IN_PROGRESS can change to HEAPTUPLE_DEAD when a transaction
aborts between the two checks. None of the these other visibility status
changes are known to cause corruption, but heap_page_prune()'s approach makes
it hard to be confident.
A patch implementing a more fundamental redesign of heap_page_prune(), which
fixes this bug and simplifies pruning substantially, has been proposed by
Peter Geoghegan in
https://postgr.es/m/CAH2-WzmNk6V6tqzuuabxoxM8HJRaWU6h12toaS-bqYcLiht16A@mail.gmail.com
However, that redesign is larger change than desirable for backpatching. As
the new design still benefits from the batched visibility determination
introduced in this commit, it makes sense to commit this narrower fix to 14
and master, and then commit Peter's improvement in master.
The precise sequence required to trigger the bug is complicated and hard to do
exercise in an isolation test (until we have wait points). Due to that the
isolation test initially posted at
https://postgr.es/m/20211119003623.d3jusiytzjqwb62p%40alap3.anarazel.de
and updated in
https://postgr.es/m/20211122175914.ayk6gg6nvdwuhrzb%40alap3.anarazel.de
isn't committable.
A followup commit will introduce additional assertions, to detect problems
like this more easily.
Bug: #17255
Reported-By: Alexander Lakhin <exclusion@gmail.com>
Debugged-By: Andres Freund <andres@anarazel.de>
Debugged-By: Peter Geoghegan <pg@bowt.ie>
Author: Andres Freund <andres@andres@anarazel.de>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/20211122175914.ayk6gg6nvdwuhrzb@alap3.anarazel.de
Backpatch: 14-, the oldest branch containing dc7420c2c9
Commit 7745bc352 intended to ensure that whole-row Vars would be
printed with "::type" decoration in all contexts where plain
"var.*" notation would result in star-expansion, notably in
ROW() and VALUES() constructs. However, it missed the case of
INSERT with a single-row VALUES, as reported by Timur Khanjanov.
Nosing around ruleutils.c, I found a second oversight: the
code for RowCompareExpr generates ROW() notation without benefit
of an actual RowExpr, and naturally it wasn't in sync :-(.
(The code for FieldStore also does this, but we don't expect that
to generate strictly parsable SQL anyway, so I left it alone.)
Back-patch to all supported branches.
Discussion: https://postgr.es/m/efaba6f9-4190-56be-8ff2-7a1674f9194f@intrans.baku.az
The build summary was disabled unintentionally by setting the build verbosity
lower.
While at it, also add ForceNoAlign to prevent msbuild from introducing
linebreaks in the middle of filenames etc - they make it harder to copy
output.
Discussion: https://postgr.es/m/20220113175554.u6gw7olrdfzivl3n@alap3.anarazel.de
This is similar to b69aba7, except that this completes the work for
HMAC with a new routine called pg_hmac_error() that would provide more
context about the type of error that happened during a HMAC computation:
- The fallback HMAC implementation in hmac.c relies on cryptohashes, so
in some code paths it is necessary to return back the error generated by
cryptohashes.
- For the OpenSSL implementation (hmac_openssl.c), the logic is very
similar to cryptohash_openssl.c, where the error context comes from
OpenSSL if one of its internal routines failed, with different error
codes if something internal to hmac_openssl.c failed or was incorrect.
Any in-core code paths that use the centralized HMAC interface are
related to SCRAM, for errors that are unlikely going to happen, with
only SHA-256. It would be possible to see errors when computing some
HMACs with MD5 for example and OpenSSL FIPS enabled, and this commit
would help in reporting the correct errors but nothing in core uses
that. So, at the end, no backpatch to v14 is done, at least for now.
Errors in SCRAM related to the computation of the server key, stored
key, etc. need to pass down the potential error context string across
more layers of their respective call stacks for the frontend and the
backend, so each surrounding routine is adapted for this purpose.
Reviewed-by: Sergey Shinderuk
Discussion: https://postgr.es/m/Yd0N9tSAIIkFd+qi@paquier.xyz
This commit changes the term "WAL receiver" to "WAL receiver (process)"
in the glossary, so that users can easily understand WAL receiver is
obviously a process.
Author: Fujii Masao
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/51b76596-ab4f-c9f0-1005-60ce1bb14b6b@oss.nttdata.com
Commit 9dc718bd added a "logically unchanged by UPDATE" hinting
mechanism, which is currently used within nbtree indexes only (see
commit d168b666). This mechanism determined whether or not the incoming
item is a logically unchanged duplicate (a duplicate needed only for
MVCC versioning purposes) once per row updated per non-HOT update. This
approach led to memory leaks which were noticeable with an UPDATE
statement that updated sufficiently many rows, at least on tables that
happen to have an expression index.
On HEAD, fix the issue by adding a cache to the executor's per-index
IndexInfo struct.
Take a different approach on Postgres 14 to avoid an ABI break: simply
pass down the hint to all indexes unconditionally with non-HOT UPDATEs.
This is deemed acceptable because the hint is currently interpreted
within btinsert() as "perform a bottom-up index deletion pass if and
when the only alternative is splitting the leaf page -- prefer to delete
any LP_DEAD-set items first". nbtree must always treat the hint as a
noisy signal about what might work, as a strategy of last resort, with
costs imposed on non-HOT updaters. (The same thing might not be true
within another index AM that applies the hint, which is why the original
behavior is preserved on HEAD.)
Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Klaudie Willis <Klaudie.Willis@protonmail.com>
Diagnosed-By: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/261065.1639497535@sss.pgh.pa.us
Backpatch: 14-, where the hinting mechanism was added.
When building append paths, we've been looking only at startup and total
costs for the paths. When building fractional paths that may eliminate
the cheapest one, because it may be dominated by two separate paths (one
for startup, one for total cost).
This extends generate_orderedappend_paths() to also consider which paths
have lowest fractional cost. Currently we only consider paths matching
pathkeys - in the future this may be improved by also considering paths
that are only partially sorted, with an incremental sort on top.
Original report of an issue by Arne Roland, patch by me (based on a
suggestion by Tom Lane).
Reviewed-by: Arne Roland, Zhihong Yu
Discussion: https://postgr.es/m/e8f9ec90-546d-e948-acce-0525f3e92773%40enterprisedb.com
Discussion: https://postgr.es/m/1581042da8044e71ada2d6e3a51bf7bb%40index.de
This should have been added for the benefit of GetPublicationRelations;
let's add it now.
I couldn't measure a performance difference in the TAP tests, but that
may be because the tests use very few publications.
Discussion: https://postgr.es/m/202201120041.p24wvsfcsope@alvherre.pgsql
The recent refactoring done in ac7c807 makes this move possible and
simple, as this just moves some code around. This reduces the size of
elog.c by 7%.
Author: Michael Paquier, Sehrope Sarkuni
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/CAH7T-aqswBM6JWe4pDehi1uOiufqe06DJWaU5=X7dDLyqUExHg@mail.gmail.com
simply moves the routines related to csvlog into their own file
This refactors the following routines and facilities coming from
elog.c, to ease their use across multiple log destinations:
- Start timestamp, including its reset, to store when a process has been
started.
- The log timestamp, associated to an entry (the same timestamp is used
when logging across multiple destinations).
- Routine deciding if a query can be logged or not.
- The backend type names, depending on the process that logs any
information (postmaster, bgworker name or just GetBackendTypeDesc() with
a regular backend).
- Write of logs using the logging piped protocol, with the log collector
enabled.
- Error severity converted to a string.
These refactored routines will be used for some follow-up changes
to move all the csvlog logic into its own file and to potentially add
JSON as log destination, reducing the overall size of elog.c as the end
result.
Author: Michael Paquier, Sehrope Sarkuni
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/CAH7T-aqswBM6JWe4pDehi1uOiufqe06DJWaU5=X7dDLyqUExHg@mail.gmail.com
Oversight in commit e2f0f8ed. Also add this file to the exclusion lists
in headerscheck and cpluscpluscheck, because Unix systems don't have a
header it includes.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2760528.1641929756%40sss.pgh.pa.us
If we get ENOENT while trying to read an extension control file,
report that as a missing extension (with a HINT to install it)
rather than as a filesystem access problem. The message wording
was extensively bikeshedded in hopes of pointing people to the
idea that they need to do a software installation before they
can install the extension into the current database.
Nathan Bossart, with review/wording suggestions from Daniel
Gustafsson, Chapman Flack, and myself
Discussion: https://postgr.es/m/3950D56A-4E47-48E7-BF9B-F5F22E268BE7@amazon.com
The point of this patch is to reduce inclusion spam by not needing
to #include <netdb.h> or <pwd.h> in port.h (which is read by every
compile in our tree). To do that, we must remove port.h's
declarations of pqGetpwuid and pqGethostbyname.
pqGethostbyname is only used, and is only ever likely to be used,
in src/port/getaddrinfo.c --- which isn't even built on most
platforms, making pqGethostbyname dead code for most people.
Hence, deal with that by just moving it into getaddrinfo.c.
To clean up pqGetpwuid, invent a couple of simple wrapper
functions with less-messy APIs. This allows removing some
duplicate error-handling code, too.
In passing, remove thread.c from the MSVC build, since it
contains nothing we use on Windows.
Noted while working on 376ce3e40.
Discussion: https://postgr.es/m/1634252654444.90107@mit.edu
Previously, invoking pg_terminate_backend() or pg_cancel_backend()
with the postmaster PID produced a "PID XXXX is not a PostgresSQL
server process" warning, which does not make sense. Change to
"backend process" to make the message more exact.
Nathan Bossart, based on an idea from Bharath Rupireddy with
input from Tom Lane and Euler Taveira
Discussion: https://www.postgresql.org/message-id/flat/CALj2ACW7Rr-R7mBcBQiXWPp=JV5chajjTdudLiF5YcpW-BmHhg@mail.gmail.com
Experimenting with FIPS mode enabled, I saw
regression=# \password joe
Enter new password for user "joe":
Enter it again:
could not encrypt password: disabled for FIPS
out of memory
because PQencryptPasswordConn was still of the opinion that "out of
memory" is always appropriate to print.
Minor oversight in b69aba745. Like that one, back-patch to v14.
Previously pg_log_backend_memory_contexts() could request to
log the memory contexts of backends, but not of auxiliary processes
such as checkpointer. This commit enhances the function so that
it can also send the request to auxiliary processes. It's useful to
look at the memory contexts of those processes for debugging purpose
and better understanding of the memory usage pattern of them.
Note that pg_log_backend_memory_contexts() cannot send the request
to logger or statistics collector. Because this logging request
mechanism is based on shared memory but those processes aren't
connected to that.
Author: Bharath Rupireddy
Reviewed-by: Vignesh C, Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/CALj2ACU1nBzpacOK2q=a65S_4+Oaz_rLTsU1Ri0gf7YUmnmhfQ@mail.gmail.com
The existing cryptohash facility was causing problems in some code paths
related to MD5 (frontend and backend) that relied on the fact that the
only type of error that could happen would be an OOM, as the MD5
implementation used in PostgreSQL ~13 (the in-core implementation is
used when compiling with or without OpenSSL in those older versions),
could fail only under this circumstance.
The new cryptohash facilities can fail for reasons other than OOMs, like
attempting MD5 when FIPS is enabled (upstream OpenSSL allows that up to
1.0.2, Fedora and Photon patch OpenSSL 1.1.1 to allow that), so this
would cause incorrect reports to show up.
This commit extends the cryptohash APIs so as callers of those routines
can fetch more context when an error happens, by using a new routine
called pg_cryptohash_error(). The error states are stored within each
implementation's internal context data, so as it is possible to extend
the logic depending on what's suited for an implementation. The default
implementation requires few error states, but OpenSSL could report
various issues depending on its internal state so more is needed in
cryptohash_openssl.c, and the code is shaped so as we are always able to
grab the necessary information.
The core code is changed to adapt to the new error routine, painting
more "const" across the call stack where the static errors are stored,
particularly in authentication code paths on variables that provide
log details. This way, any future changes would warn if attempting to
free these strings. The MD5 authentication code was also a bit blurry
about the handling of "logdetail" (LOG sent to the postmaster), so
improve the comments related that, while on it.
The origin of the problem is 87ae969, that introduced the centralized
cryptohash facility. Extra changes are done for pgcrypto in v14 for the
non-OpenSSL code path to cope with the improvements done by this
commit.
Reported-by: Michael Mühlbeyer
Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/89B7F072-5BBE-4C92-903E-D83E865D9367@trivadis.com
Backpatch-through: 14
I had a brain fade in commit d32899157, and used 2:30AM as the
example timestamp for both spring-forward and fall-back cases.
But it's not actually ambiguous at all in the fall-back case,
because that transition is from 2AM to 1AM under USA rules.
Fix the example to use 1:30AM, which *is* ambiguous.
Noted while answering a question from Aleksander Alekseev.
Back-patch to all supported branches.
Discussion: https://postgr.es/m/2191355.1641828552@sss.pgh.pa.us
Try to disable ASLR when building in EXEC_BACKEND mode, to avoid random
memory mapping failures while testing. For developer use only, no
effect on regular builds.
Suggested-by: Andres Freund <andres@anarazel.de>
Tested-by: Bossart, Nathan <bossartn@amazon.com>
Discussion: https://postgr.es/m/20210806032944.m4tz7j2w47mant26%40alap3.anarazel.de
When we need to identify the home directory on non-Windows, first
consult getenv("HOME"). If that's empty or unset, fall back
on our previous method of checking the <pwd.h> database.
Preferring $HOME allows the user to intentionally point at some
other directory, and it seems to be in line with the behavior of
most other utilities. However, we shouldn't rely on it completely,
as $HOME is likely to be unset when running as a daemon.
Anders Kaseorg
Discussion: https://postgr.es/m/1634252654444.90107@mit.edu
Since this function is defined to accept pg_node_tree values, it could
get applied to any nodetree that can appear in a cataloged pg_node_tree
column. Some such cases can't be supported --- for example, its API
doesn't allow providing referents for more than one relation --- but
we should try to throw a user-facing error rather than an internal
error when encountering such a case.
In support of this, extend expression_tree_walker/mutator to be sure
they'll work on any such node tree (which basically means adding
support for relpartbound node types). That allows us to run pull_varnos
and check for the case of multiple relations before we start processing
the tree. The alternative of changing the low-level error thrown for an
out-of-range varno isn't appealing, because that could mask actual bugs
in other usages of ruleutils.
Per report from Justin Pryzby. This is basically cosmetic, so no
back-patch.
Discussion: https://postgr.es/m/20211219205422.GT17618@telsasoft.com
If contrib/btree_gist is used to make a GIST index on a char(N)
(bpchar) column, and that column is retrieved via an index-only
scan, what came out had all trailing spaces removed. Since
that doesn't happen in any other kind of table scan, this is
clearly a bug. The cause is that gbt_bpchar_compress() strips
trailing spaces (using rtrim1) before a new index entry is made.
That was probably a good idea when this code was first written,
but since we invented index-only scans, it's not so good.
One answer could be to mark this opclass as incapable of index-only
scans. But to do so, we'd need an extension module version bump,
followed by manual action by DBAs to install the updated version
of btree_gist. And it's not really a desirable place to end up,
anyway.
Instead, let's fix the code by removing the unwanted space-stripping
action and adjusting the opclass's comparison logic to ignore
trailing spaces as bpchar normally does. This will not hinder
cases that work today, since index searches with this logic will
act the same whether trailing spaces are stored or not. It will
not by itself fix the problem of getting space-stripped results
from index-only scans, of course. Users who care about that can
REINDEX affected indexes after installing this update, to immediately
replace all improperly-truncated index entries. Otherwise, it can
be expected that the index's behavior will change incrementally as
old entries are replaced by new ones.
Per report from Alexander Lakhin. Back-patch to all supported branches.
Discussion: https://postgr.es/m/696c995b-b37f-5526-f45d-04abe713179f@gmail.com
This addresses some problems in the describe queries used for extended
statistics:
- Two schema qualifications for the text type were missing for \dX.
- The list of extended statistics listed for a table through \d was
ordered based on the object OIDs, but it is more consistent with the
other commands to order by namespace and then by object name.
- A couple of aliases were not used in \d. These are removed.
This is similar to commits 1f092a3 and 07f8a9e.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20220107022235.GA14051@telsasoft.com
Backpatch-through: 14
Prevent logical replication workers from performing insert, update,
delete, truncate, or copy commands on tables unless the subscription
owner has permission to do so.
Prevent subscription owners from circumventing row-level security by
forbidding replication into tables with row-level security policies
which the subscription owner is subject to, without regard to whether
the policy would ordinarily allow the INSERT, UPDATE, DELETE or
TRUNCATE which is being replicated. This seems sufficient for now, as
superusers, roles with bypassrls, and target table owners should still
be able to replicate despite RLS policies. We can revisit the
question of applying row-level security policies on a per-row basis if
this restriction proves too severe in practice.
Author: Mark Dilger
Reviewed-by: Jeff Davis, Andrew Dunstan, Ronan Dunklau
Discussion: https://postgr.es/m/9DFC88D3-1300-4DE8-ACBC-4CEF84399A53%40enterprisedb.com
pg_basebackup.c relies on the compression level to not be 0 to decide if
compression should be used, but 000f3adf missed the fact that the
default compression (Z_DEFAULT_COMPRESSION) is -1, which would be used
if specifying --gzip without --compress.
While on it, add some coverage for --gzip, as this is rather easy to
miss.
Reported-by: Christoph Berg
Discussion: https://postgr.es/m/YdhRDMLjabtXOnhY@msg.df7cb.de
Instead of using a hardcoded or default path to the perl file the .bat
file is a wrapper for, we use a path that means the file is found in
the same directory as the .bat file.
Patch by Anton Voloshin, slightly tweaked by me.
Backpatch to all live branches
Discussion: https://postgr.es/m/2b7a674b-5fb0-d264-75ef-ecc7a31e54f8@postgrespro.ru
Commit db7d1a7b05 missed a couple of places that needed adjustment now
we are not building pgcrypto when openssl is not configured, causing
contrib installcheck to fail in such a case.