49876 Commits

Author SHA1 Message Date
Etsuro Fujita
fba8305733 Update comment in portal.h.
We store tuples into the portal's tuple store for a PORTAL_ONE_MOD_WITH
query as well.

Back-patch to all supported branches.

Reviewed by Andy Fan.

Discussion: https://postgr.es/m/CAPmGK14HVYBZYZtHabjeCd-e31VT%3Dwx6rQNq8QfehywLcpZ2Hw%40mail.gmail.com
2024-08-01 17:45:09 +09:00
Tom Lane
41fb45a482 Revert "Allow parallel workers to cope with a newly-created session user ID."
This reverts commit 68855c03878c0c90227e24533ca40127da3578cd.

Some buildfarm animals are failing with "cannot change
"client_encoding" during a parallel operation".  It looks like
assign_client_encoding is unhappy at being asked to roll back a
client_encoding setting after a parallel worker encounters a
failure.  There must be more to it though: why didn't I see this
during local testing?  In any case, it's clear that moving the
RestoreGUCState() call is not as side-effect-free as I thought.
Given that the bug f5f30c22e intended to fix has gone unreported
for years, it's not something that's urgent to fix; I'm not
willing to risk messing with it further with only days to our
next release wrap.
2024-07-31 20:56:30 -04:00
Tom Lane
68855c0387 Allow parallel workers to cope with a newly-created session user ID.
Parallel workers failed after a sequence like
	BEGIN;
	CREATE USER foo;
	SET SESSION AUTHORIZATION foo;
because check_session_authorization could not see the uncommitted
pg_authid row for "foo".  This is because we ran RestoreGUCState()
in a separate transaction using an ordinary just-created snapshot.
The same disease afflicts any other GUC that requires catalog lookups
and isn't forgiving about the lookups failing.

To fix, postpone RestoreGUCState() into the worker's main transaction
after we've set up a snapshot duplicating the leader's.  This affects
check_transaction_isolation and check_transaction_deferrable, which
think they should only run during transaction start.  Make them
act like check_transaction_read_only, which already knows it should
silently accept the value when InitializingParallelWorker.

Per bug #18545 from Andrey Rachitskiy.  Back-patch to all
supported branches, because this has been wrong for awhile.

Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org
2024-07-31 18:54:10 -04:00
David Rowley
447e25c049 Doc: mention executor memory usage for enable_partitionwise* GUCs
Prior to this commit, the docs for enable_partitionwise_aggregate and
enable_partitionwise_join mentioned the additional overheads enabling
these causes for the query planner, but they mentioned nothing about the
possible surge in work_mem-consuming executor nodes that could end up in
the final plan.  Dimitrios reported the OOM killer intervened on his
query as a result of using enable_partitionwise_aggregate=on.

Here we adjust the docs to mention the possible increase in the number of
work_mem-consuming executor nodes that can appear in the final plan as a
result of enabling these GUCs.

Reported-by: Dimitrios Apostolou
Reviewed-by: Ashutosh Bapat
Discussion: https://postgr.es/m/3603c380-d094-136e-e333-610914fb3e80%40gmx.net
Discussion: https://postgr.es/m/CAApHDvoZ0_yqwPFEpb6h261L76BUpmh5GxBQq0LeRzQ5Jh3zzg@mail.gmail.com
Backpatch-through: 12, oldest supported version
2024-08-01 01:28:48 +12:00
Peter Eisentraut
4070489999 libpq: Use strerror_r instead of strerror
Commit 453c4687377 introduced a use of strerror() into libpq, but that
is not thread-safe.  Fix by using strerror_r() instead.

In passing, update some of the code comments added by 453c4687377, as
we have learned more about the reason for the change in OpenSSL that
started this.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: Discussion: https://postgr.es/m/b6fb018b-f05c-4afd-abd3-318c649faf18@highgo.ca
2024-07-28 09:26:48 +02:00
Daniel Gustafsson
e6dd0b8637 Fix building with MSVC for TLS session disabling
Commit 274bbced85 omitted the required changes for the MSVC build
system in v16 through v12. Per buildfarm animal hamerkop.

Discussion: https://postgr.es/m/7919238F-723C-4113-9742-EBCE7A76A6B4@yesql.se
2024-07-26 19:10:37 +02:00
Daniel Gustafsson
ac77add23b Fix macro placement in pg_config.h.in
Commit 274bbced85383e831dde accidentally placed the pg_config.h.in
for SSL_CTX_set_num_tickets on the wrong line wrt where autoheader
places it.  Fix by re-arranging and backpatch to the same level as
the original commit.

Reported-by: Marina Polyakova <m.polyakova@postgrespro.ru>
Discussion: https://postgr.es/m/48cebe8c3eaf308bae253b1dbf4e4a75@postgrespro.ru
Backpatch-through: v12
2024-07-26 14:16:40 +02:00
Daniel Gustafsson
32121c077d Disable all TLS session tickets
OpenSSL supports two types of session tickets for TLSv1.3, stateless
and stateful. The option we've used only turns off stateless tickets
leaving stateful tickets active. Use the new API introduced in 1.1.1
to disable all types of tickets.

Backpatch to all supported versions.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20240617173803.6alnafnxpiqvlh3g@awork3.anarazel.de
Backpatch-through: v12
2024-07-26 11:09:45 +02:00
Tom Lane
551ea63aaa Doc: fix misleading syntax synopses for targetlists.
In the syntax synopses for SELECT, INSERT, UPDATE, etc,
SELECT ... and RETURNING ... targetlists were missing { ... }
braces around an OR (|) operator.  That allows misinterpretation
which could lead to confusion.

David G. Johnston, per gripe from masondeanm@aol.com.

Discussion: https://postgr.es/m/172193970148.915373.2403176471224676074@wrigleys.postgresql.org
2024-07-25 19:52:08 -04:00
Alvaro Herrera
e29203a60b
Fix a missing article in the documentation
Per complaint from Grant Gryczan.

It's a very old typo; backpatch all the way back.

Author: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/172179789219.915368.16590585529628354757@wrigleys.postgresql.org
2024-07-24 14:13:55 +02:00
Alvaro Herrera
08b6a9ecf9
Reset relhassubclass upon attaching table as a partition
We don't allow inheritance parents as partitions, and have checks to
prevent this; but if a table _was_ in the past an inheritance parents
and all their children are removed, the pg_class.relhassubclass flag
may remain set, which confuses the partition pruning code (most
obviously, it results in an assertion failure; in production builds it
may be worse.)

Fix by resetting relhassubclass on attach.

Backpatch to all supported versions.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18550-d5e047e9a897a889@postgresql.org
2024-07-24 12:38:18 +02:00
Nathan Bossart
878e8c6be7 Detect integer overflow in array_set_slice().
When provided an empty initial array, array_set_slice() fails to
check for overflow when computing the new array's dimensions.
While such overflows are ordinarily caught by ArrayGetNItems(),
commands with the following form are accepted:

	INSERT INTO t (i[-2147483648:2147483647]) VALUES ('{}');

To fix, perform the hazardous computations using overflow-detecting
arithmetic routines.  As with commit 18b585155a, the added test
cases generate errors that include a platform-dependent value, so
we again use psql's VERBOSITY parameter to suppress printing the
message text.

Reported-by: Alexander Lakhin
Author: Joseph Koshakow
Reviewed-by: Jian He
Discussion: https://postgr.es/m/31ad2cd1-db94-bdb3-f91a-65ffdb4bef95%40gmail.com
Backpatch-through: 12
2024-07-23 21:59:02 -05:00
Tom Lane
77b116724c Doc: improve description of plpgsql's FETCH and MOVE commands.
We were not being clear about which variants of the "direction"
clause are permitted in MOVE.  Also, the text seemed to be
written with only the FETCH/MOVE NEXT case in mind, so it
didn't apply very well to other variants.

Also, document that "MOVE count IN cursor" only works if count
is a constant.  This is not the whole truth, because some other
cases such as a parenthesized expression will also work, but
we want to push people to use "MOVE FORWARD count" instead.
The constant case is enough to cover what we allow in plain SQL,
and that seems sufficient to claim support for.

Update a comment in pl_gram.y claiming that we don't document
that point.

Per gripe from Philipp Salvisberg.

Discussion: https://postgr.es/m/172155553388.702.7932496598218792085@wrigleys.postgresql.org
2024-07-22 19:44:03 -04:00
Tom Lane
feca6c688c Correctly check updatability of columns targeted by INSERT...DEFAULT.
If a view has some updatable and some non-updatable columns, we failed
to verify updatability of any columns for which an INSERT or UPDATE
on the view explicitly specifies a DEFAULT item (unless the view has
a declared default for that column, which is rare anyway, and one
would almost certainly not write one for a non-updatable column).
This would lead to an unexpected "attribute number N not found in
view targetlist" error rather than the intended error.

Per bug #18546 from Alexander Lakhin.  This bug is old, so back-patch
to all supported branches.

Discussion: https://postgr.es/m/18546-84a292e759a9361d@postgresql.org
2024-07-20 13:40:15 -04:00
Nathan Bossart
4f96281587 Add overflow checks to money type.
None of the arithmetic functions for the the money type handle
overflow.  This commit introduces several helper functions with
overflow checking and makes use of them in the money type's
arithmetic functions.

Fixes bug #18240.

Reported-by: Alexander Lakhin
Author: Joseph Koshakow
Discussion: https://postgr.es/m/18240-c5da758d7dc1ecf0%40postgresql.org
Discussion: https://postgr.es/m/CAAvxfHdBPOyEGS7s%2Bxf4iaW0-cgiq25jpYdWBqQqvLtLe_t6tw%40mail.gmail.com
Backpatch-through: 12
2024-07-19 11:52:32 -05:00
Andrew Dunstan
b06fe880da Avoid error in recovery test if history file is not yet present
Error was detected when testing use of libpq sessions instead of psql
for polling queries.

Discussion: https://postgr.es/m/e86b6d2d-20d8-4ac9-9a98-165fff7db886@dunslane.net

Backpatch to all live branches
2024-07-17 10:43:07 -04:00
Andres Freund
69bdee12e4 Fix bad indentation introduced in 43cd30bcd1c
Oops.

Reported-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/ZpVZB9rH5tHllO75@nathan
Backpatch: 12-, like 43cd30bcd1c
2024-07-15 15:17:18 -07:00
Andres Freund
92f02c39c0 Fix type confusion in guc_var_compare()
Before this change guc_var_compare() cast the input arguments to
const struct config_generic *.  That's not quite right however, as the input
on one side is often just a char * on one side.

Instead just use char *, the first field in config_generic.

This fixes a -Warray-bounds warning with some versions of gcc. While the
warning is only known to be triggered for <= 15, the issue the warning points
out seems real, so apply the fix everywhere.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reported-by: Erik Rijkers <er@xs4all.nl>
Suggested-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/a74a1a0d-0fd2-3649-5224-4f754e8f91aa%40xs4all.nl
2024-07-15 09:26:03 -07:00
Tom Lane
236b225ed4 Avoid unhelpful internal error for incorrect recursive-WITH queries.
checkWellFormedRecursion would issue "missing recursive reference"
if a WITH RECURSIVE query contained a single self-reference but
that self-reference was inside a top-level WITH, ORDER BY, LIMIT,
etc, rather than inside the second arm of the UNION as expected.
We already intended to throw more-on-point errors for such cases,
but those error checks must be done before examining the UNION arm
in order to have the desired results.  So this patch need only
move some code (and improve the comments).

Per bug #18536 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/18536-0a342ec07901203e@postgresql.org
2024-07-14 13:49:46 -04:00
Andrew Dunstan
53a14ec290 Make sure to run pg_isready on correct port
The current code can have pg_isready unexpectedly succeed if there is a
server running on the default port. To avoid this we delay running the
test until after a node has been created but before it starts, and then
use that node's port, so we are fairly sure there is nothing running on
the port.

Backpatch to all live branches.
2024-07-13 08:11:46 -04:00
Thomas Munro
ba9fcac720 Fix lost Windows socket EOF events.
Winsock only signals an FD_CLOSE event once if the other end of the
socket shuts down gracefully.  Because each WaitLatchOrSocket() call
constructs and destroys a new event handle every time, with unlucky
timing we can lose it and hang.  We get away with this only if the other
end disconnects non-gracefully, because FD_CLOSE is repeatedly signaled
in that case.

To fix this design flaw in our Windows socket support fundamentally,
we'd probably need to rearchitect it so that a single event handle
exists for the lifetime of a socket, or switch to completely different
multiplexing or async I/O APIs.  That's going to be a bigger job
and probably wouldn't be back-patchable.

This brute force kludge closes the race by explicitly polling with
MSG_PEEK before sleeping.

Back-patch to all supported releases.  This should hopefully clear up
some random build farm and CI hang failures reported over the years.  It
might also allow us to try using graceful shutdown in more places again
(reverted in commit 29992a6) to fix instability in the transmission of
FATAL error messages, but that isn't done by this commit.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Tested-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/176008.1715492071%40sss.pgh.pa.us
2024-07-13 15:45:28 +12:00
Alvaro Herrera
067cb6c5da
Add ORDER BY to new test query
Per buildfarm.
2024-07-12 13:44:19 +02:00
Alvaro Herrera
d0054432d4
Fix ALTER TABLE DETACH for inconsistent indexes
When a partitioned table has an index that doesn't support a constraint,
but a partition has an equivalent index that does, then a DETACH
operation would misbehave: a crash in assertion-enabled systems (because
we fail to find the constraint in the parent that we expect to), or a
broken coninhcount value (-1) in production systems (because we blindly
believe that we've successfully detached the parent).

While we should reject an ATTACH of a partition with such an index, we
have failed to do so in existing releases, so adding an error in stable
releases might break the (unlikely) existing applications that rely on
this behavior.  At this point I don't even want to reject them in
master, because it'd break pg_upgrade if such databases exist, and there
would be no easy way to fix existing databases without expensive index
rebuilds.

(Later on we could add ALTER TABLE ... ADD CONSTRAINT USING INDEX to
partitioned tables, which would allow the user to fix such patterns.  At
that point we could add more restrictions to prevent the problem from
its root.)

Also, add a test case that leaves one table in this condition, so that
we can verify that pg_upgrade continues to work if we later decide to
change the policy on the master branch.

Backpatch to all supported branches.

Co-authored-by: Tender Wang <tndrwang@gmail.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18500-62948b6fe5522f56@postgresql.org
2024-07-12 12:54:01 +02:00
Masahiko Sawada
1b37075877 Fix possibility of logical decoding partial transaction changes.
When creating and initializing a logical slot, the restart_lsn is set
to the latest WAL insertion point (or the latest replay point on
standbys). Subsequently, WAL records are decoded from that point to
find the start point for extracting changes in the
DecodingContextFindStartpoint() function. Since the initial
restart_lsn could be in the middle of a transaction, the start point
must be a consistent point where we won't see the data for partial
transactions.

Previously, when not building a full snapshot, serialized snapshots
were restored, and the SnapBuild jumps to the consistent state even
while finding the start point. Consequently, the slot's restart_lsn
and confirmed_flush could be set to the middle of a transaction. This
could lead to various unexpected consequences. Specifically, there
were reports of logical decoding decoding partial transactions, and
assertion failures occurred because only subtransactions were decoded
without decoding their top-level transaction until decoding the commit
record.

To resolve this issue, the changes prevent restoring the serialized
snapshot and jumping to the consistent state while finding the start
point.

On v17 and HEAD, a flag indicating whether snapshot restores should be
skipped has been added to the SnapBuild struct, and SNAPBUILD_VERSION
has been bumpded.

On backbranches, the flag is stored in the LogicalDecodingContext
instead, preserving on-disk compatibility.

Backpatch to all supported versions.

Reported-by: Drew Callahan
Reviewed-by: Amit Kapila, Hayato Kuroda
Discussion: https://postgr.es/m/2444AA15-D21B-4CCE-8052-52C7C2DAFE5C%40amazon.com
Backpatch-through: 12
2024-07-11 22:48:08 +09:00
Tom Lane
a134baea7a Make our back branches compatible with libxml2 2.13.x.
This back-patches HEAD commits 066e8ac6e, 6082b3d5d, e7192486d,
and 896cd266f into supported branches.  Changes:

* Use xmlAddChildList not xmlAddChild in XMLSERIALIZE
(affects v16 and up only).  This was a flat-out coding mistake
that we got away with due to lax checking in previous versions
of xmlAddChild.

* Use xmlParseInNodeContext not xmlParseBalancedChunkMemory.
This is to dodge a bug in xmlParseBalancedChunkMemory in libxm2
releases 2.13.0-2.13.2.  While that bug is now fixed upstream and
will probably never be seen in any production-oriented distro, it is
currently a problem on some more-bleeding-edge-friendly platforms.

* Suppress "chunk is not well balanced" errors from libxml2,
unless it is the only error.  This eliminates an error-reporting
discrepancy between 2.13 and older releases.  This error is
almost always redundant with previous errors, if not flat-out
inappropriate, which is why 2.13 changed the behavior and why
nobody's likely to miss it.

Erik Wienhold and Tom Lane, per report from Frank Streitzig.

Discussion: https://postgr.es/m/trinity-b0161630-d230-4598-9ebc-7a23acdb37cb-1720186432160@3c-app-gmx-bap25
Discussion: https://postgr.es/m/trinity-361ba18b-541a-4fe7-bc63-655ae3a7d599-1720259822452@3c-app-gmx-bs01
2024-07-10 20:15:52 -04:00
Andrew Dunstan
d1c30fd884 Unrevert "Force nodes for SSL tests to start in TCP mode"
This reverts commit 8b03d743a85fed4930645382a219cc75bdf7ab73.

Release 12 required a little extra sauce to make it work, but this has
been tested now.
2024-07-09 09:40:30 -04:00
Andrew Dunstan
8b03d743a8 Revert "Force nodes for SSL tests to start in TCP mode"
This reverts commit 198088dc63de4f89835419969c7b5d1640be3441.

This is causing errors on Release 12, so revert for now to keep the
buildfarm green.
2024-07-08 17:15:10 -04:00
Andrew Dunstan
8398485957 Choose ports for test servers less likely to result in conflicts
If we choose ports in the range typically used for ephemeral ports there
is a danger of encountering a port conflict due to a race condition
between the time we choose the port in a range below that typically used
to allocate ephemeral ports, but higher than the range typically used by
well known services.

Author: Jelte Fenema-Nio, with some editing by me.

Discussion: https://postgr.es/m/d6ee8761-39d1-0033-1afb-d5a57ee056f2@gmail.com

Backpatch to all live branches (12 and up)
2024-07-08 11:40:58 -04:00
Andrew Dunstan
198088dc63 Force nodes for SSL tests to start in TCP mode
Currently they are started in unix socket mode in ost cases, and then
converted to run in TCP mode. This can result in port collisions, and
there is no virtue in startng in unix socket mode, so start as we will
be going on.

Discussion: https://postgr.es/m/d6ee8761-39d1-0033-1afb-d5a57ee056f2@gmail.com

Backpatch to all live branches (12 and up).
2024-07-08 11:40:58 -04:00
Dean Rasheed
8badee7875 Fix scale clamping in numeric round() and trunc().
The numeric round() and trunc() functions clamp the scale argument to
the range between +/- NUMERIC_MAX_RESULT_SCALE (2000), which is much
smaller than the actual allowed range of type numeric. As a result,
they return incorrect results when asked to round/truncate more than
2000 digits before or after the decimal point.

Fix by using the correct upper and lower scale limits based on the
actual allowed (and documented) range of type numeric.

While at it, use the new NUMERIC_WEIGHT_MAX constant instead of
SHRT_MAX in all other overflow checks, and fix a comment thinko in
power_var() introduced by e54a758d24 -- the minimum value of
ln_dweight is -NUMERIC_DSCALE_MAX (-16383), not -SHRT_MAX, though this
doesn't affect the point being made in the comment, that the resulting
local_rscale value may exceed NUMERIC_MAX_DISPLAY_SCALE (1000).

Back-patch to all supported branches.

Dean Rasheed, reviewed by Joel Jacobson.

Discussion: https://postgr.es/m/CAEZATCXB%2BrDTuMjhK5ZxcouufigSc-X4tGJCBTMpZ3n%3DxxQuhg%40mail.gmail.com
2024-07-08 17:58:42 +01:00
Thomas Munro
274a8195d4 Cope with <regex.h> name clashes.
macOS 15's SDK pulls in headers related to <regex.h> when we include
<xlocale.h>.  This causes our own regex_t implementation to clash with
the OS's regex_t implementation.  Luckily our function names already had
pg_ prefixes, but the macros and typenames did not.

Include <regex.h> explicitly on all POSIX systems, and fix everything
that breaks.  Then we can prove that we are capable of fully hiding and
replacing the system regex API with our own.

1.  Deal with standard-clobbering macros by undefining them all first.
POSIX says they are "symbolic constants".  If they are macros, this
allows us to redefine them.  If they are enums or variables, our macros
will hide them.

2.  Deal with standard-clobbering types by giving our types pg_
prefixes, and then using macros to redirect xxx_t -> pg_xxx_t.

After including our "regex/regex.h", the system <regex.h> is hidden,
because we've replaced all the standard names.  The PostgreSQL source
tree and extensions can continue to use standard prefix-less type and
macro names, but reach our implementation, if they included our
"regex/regex.h" header.

Back-patch to all supported branches, so that macOS 15's tool chain can
build them.

Reported-by: Stan Hu <stanhu@gmail.com>
Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Tested-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/CAMBWrQnEwEJtgOv7EUNsXmFw2Ub4p5P%2B5QTBEgYwiyjy7rAsEQ%40mail.gmail.com
2024-07-06 10:30:03 +12:00
Tom Lane
ac2ec6c57b Doc: small improvements in discussion of geometric data types.
State explicitly that the coordinates in our geometric data types are
float8.  Also explain that polygons store their bounding box.

While here, fix the table of geometric data types to show type
"line"'s size correctly: it's 24 bytes not 32.  This has somehow
escaped notice since that table was made in 1998.

Per suggestion from Sebastian Skałacki.  The size error seems
important enough to justify back-patching.

Discussion: https://postgr.es/m/172000045661.706.1822177575291548794@wrigleys.postgresql.org
2024-07-04 13:23:32 -04:00
Daniel Gustafsson
65b51c7541 doc: Specify when ssl_prefer_server_ciphers was added
The ssl_prefer_server_ciphers setting is quite important from a
security point of view, so simply stating that older versions
doesn't have it isn't very helpful.  This adds the version when
the GUC was added to help readers.

Backpatch to all supported versions since this setting has been
around since 9.4.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/5D7E0F5E-E620-4D54-8788-66D421AC76F0@yesql.se
Backpatch-through: v12
2024-07-04 11:38:37 +02:00
Heikki Linnakangas
241ef53b6a Fix missing installation/uninstallation rules for BackgroundPsql.pm
Commit d5fd7865 backported BackgroundPsql perl module module with
helper functions for tests running interactive or background psql
tasks to PG 12 to 15, but did not add installation/uninstallation
rules of the build system, causing problems running TAP tests for the
extensions.

Author: Pavan Deolasee <pavan.deolasee@gmail.com>
Discussion: https://www.postgresql.org/message-id/CABOikdPmRuZrcf_gtgXmQzZ5Tbg9yUJmqXDCAZ2aW%3DWi-PbDyQ%40mail.gmail.com
2024-07-01 19:25:51 +03:00
Tom Lane
8565fb6fbb Preserve CurrentMemoryContext across notify and sinval interrupts.
ProcessIncomingNotify is called from the main processing loop that
normally runs in MessageContext.  That outer-loop code assumes that
whatever it allocates will be cleaned up when we're done processing
the current client message --- but if we service a notify interrupt,
then whatever gets allocated before the next switch into
MessageContext will be permanently leaked in TopMemoryContext,
because CommitTransactionCommand sets CurrentMemoryContext to
TopMemoryContext.  There are observable leaks associated with
(at least) encoding conversion of incoming queries and parameters
attached to Bind messages.

sinval catchup interrupts have a similar problem.  There might be
others, but I've not identified any other clear cases.

To fix, take care to save and restore CurrentMemoryContext across
the Start/CommitTransactionCommand calls in these functions.

Per bug #18512 from wizardbrony.  Commit to back branches only;
in HEAD, this was dealt with by the riskier but more thoroughgoing
approach in commit 1afe31f03.

Discussion: https://postgr.es/m/3478884.1718656625@sss.pgh.pa.us
2024-07-01 12:21:07 -04:00
Noah Misch
79567599cd Remove configuration-dependent output from new inplace-inval test.
Per buildfarm members prion and trilobite.  Back-patch to v12 (all
supported versions), like commit
0844b3968985447ed0a6937cfc8639e379da2fe6.

Strategy reviewed by Tom Lane.

Discussion: https://postgr.es/m/20240628051353.a0.nmisch@google.com
2024-06-28 09:33:45 -07:00
Noah Misch
77d0bc8001 Remove comment about xl_heap_inplace "AT END OF STRUCT".
Commit 2c03216d831160bedd72d45f712601b6f7d03f1c moved the tuple data
from there to the buffer-0 data.  Back-patch to v12 (all supported
versions), the plan for the next change to this struct.

Discussion: https://postgr.es/m/20240523000548.58.nmisch@google.com
2024-06-27 19:21:13 -07:00
Noah Misch
11f3815d6a Cope with inplace update making catcache stale during TOAST fetch.
This extends ad98fb14226ae6456fbaed7990ee7591cbe5efd2 to invals of
inplace updates.  Trouble requires an inplace update of a catalog having
a TOAST table, so only pg_database was at risk.  (The other catalog on
which core code performs inplace updates, pg_class, has no TOAST table.)
Trouble would require something like the inplace-inval.spec test.
Consider GRANT ... ON DATABASE fetching a stale row from cache and
discarding a datfrozenxid update that vac_truncate_clog() has already
relied upon.  Back-patch to v12 (all supported versions).

Reviewed (in an earlier version) by Robert Haas.

Discussion: https://postgr.es/m/20240114201411.d0@rfd.leadboat.com
Discussion: https://postgr.es/m/20240512232923.aa.nmisch@google.com
2024-06-27 19:21:13 -07:00
Noah Misch
7af775bbae AccessExclusiveLock new relations just after assigning the OID.
This has no user-visible, important consequences, since other sessions'
catalog scans can't find the relation until we commit.  However, this
unblocks introducing a rule about locks required to heap_update() a
pg_class row.  CREATE TABLE has been acquiring this lock eventually, but
it can heap_update() pg_class.relchecks earlier.  create_toast_table()
has been acquiring only ShareLock.  Back-patch to v12 (all supported
versions), the plan for the commit relying on the new rule.

Reviewed (in an earlier version) by Robert Haas.

Discussion: https://postgr.es/m/20240611024525.9f.nmisch@google.com
2024-06-27 19:21:13 -07:00
Noah Misch
e2eefb97e3 Lock before setting relhassubclass on RELKIND_PARTITIONED_INDEX.
Commit 5b562644fec696977df4a82790064e8287927891 added a comment that
SetRelationHasSubclass() callers must hold this lock.  When commit
17f206fbc824d2b4b14480199ca9ff7dea417eda extended use of this column to
partitioned indexes, it didn't take the lock.  As the latter commit
message mentioned, we currently never reset a partitioned index to
relhassubclass=f.  That largely avoids harm from the lock omission.  The
cause for fixing this now is to unblock introducing a rule about locks
required to heap_update() a pg_class row.  This might cause more
deadlocks.  It gives minor user-visible benefits:

- If an ALTER INDEX SET TABLESPACE runs concurrently with ALTER TABLE
  ATTACH PARTITION or CREATE PARTITION OF, one transaction blocks
  instead of failing with "tuple concurrently updated".  (Many cases of
  DDL concurrency still fail that way.)

- Match ALTER INDEX ATTACH PARTITION in choosing to lock the index.

While not user-visible today, we'll need this if we ever make something
set the flag to false for a partitioned index, like ANALYZE does today
for tables.  Back-patch to v12 (all supported versions), the plan for
the commit relying on the new rule.  In back branches, add
LockOrStrongerHeldByMe() instead of adding a LockHeldByMe() parameter.

Reviewed (in an earlier version) by Robert Haas.

Discussion: https://postgr.es/m/20240611024525.9f.nmisch@google.com
2024-06-27 19:21:13 -07:00
Noah Misch
fea47de301 Expand comments and add an assertion in nodeModifyTable.c.
Most comments concern RELKIND_VIEW.  One addresses the ExecUpdate()
"tupleid" parameter.  A later commit will rely on these facts, but they
hold already.  Back-patch to v12 (all supported versions), the plan for
that commit.

Reviewed (in an earlier version) by Robert Haas.

Discussion: https://postgr.es/m/20240512232923.aa.nmisch@google.com
2024-06-27 19:21:13 -07:00
Noah Misch
796d191028 Improve test coverage for changes to inplace-updated catalogs.
This covers both regular and inplace changes, since bugs arise at their
intersection.  Where marked, these witness extant bugs.  Back-patch to
v12 (all supported versions).

Reviewed (in an earlier version) by Robert Haas.

Discussion: https://postgr.es/m/20240512232923.aa.nmisch@google.com
2024-06-27 19:21:13 -07:00
Tom Lane
dccda847b7 Avoid crashing when a JIT-inlined backend function throws an error.
errfinish() assumes that the __FUNC__ and __FILE__ arguments it's
passed are compile-time constant strings that can just be pointed
to rather than physically copied.  However, it's possible for LLVM
to generate code in which those pointers point into a dynamically
loaded code segment.  If that segment gets unloaded before we're
done with the ErrorData struct, we have dangling pointers that
will lead to SIGSEGV.  In simple cases that won't happen, because we
won't unload LLVM code before end of transaction.  But it's possible
to happen if the error is thrown within end-of-transaction code run by
_SPI_commit or _SPI_rollback, because since commit 2e517818f those
functions clean up by ending the transaction and starting a new one.

Rather than fixing this by adding pstrdup() overhead to every
elog/ereport sequence, let's fix it by copying the risky pointers
in CopyErrorData().  That solves it for _SPI_commit/_SPI_rollback
because they use that function to preserve the error data across
the transaction end/restart sequence; and it seems likely that
any other code doing something similar would need to do that too.

I'm suspicious that this behavior amounts to an LLVM bug (or a
bug in our use of it?), because it implies that string constant
references that should be pointer-equal according to a naive
understanding of C semantics will sometimes not be equal.
However, even if it is a bug and someday gets fixed, we'll have
to cope with the current behavior for a long time to come.

Report and patch by me.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/1565654.1719425368@sss.pgh.pa.us
2024-06-27 14:44:04 -04:00
Heikki Linnakangas
5dea6628b3 Fix MVCC bug with prepared xact with subxacts on standby
We did not recover the subtransaction IDs of prepared transactions
when starting a hot standby from a shutdown checkpoint. As a result,
such subtransactions were considered as aborted, rather than
in-progress. That would lead to hint bits being set incorrectly, and
the subtransactions suddenly becoming visible to old snapshots when
the prepared transaction was committed.

To fix, update pg_subtrans with prepared transactions's subxids when
starting hot standby from a shutdown checkpoint. The snapshots taken
from that state need to be marked as "suboverflowed", so that we also
check the pg_subtrans.

Backport to all supported versions.

Discussion: https://www.postgresql.org/message-id/6b852e98-2d49-4ca1-9e95-db419a2696e0@iki.fi
2024-06-27 21:09:15 +03:00
Heikki Linnakangas
266454f192 tests: Trim newline from result returned by BackgroundPsql->query
This went unnoticed, because only a few existing callers of
BackgroundPsql->query used the result, and the ones that did were not
bothered by an extra newline. I noticed because I was about to add a
new test that checks the result.

Backport to all supported versions, since I just backported the
BackgroundPsql facility to all supported versions too.
2024-06-27 21:09:06 +03:00
Heikki Linnakangas
4b467a6581 Backport BackgroundPsql perl test module
Backport the new BackgroundPsql modules and the constructor functions,
background_psql() and interactive_psql, to all supported
branches. That makes it easier to backpatch tests that use it.

BackgroundPsql was introduced in version 16. On version 16, this
commit backports just the new timeout argument from master (commit
334f512f45). On older branches, the whole facility. This includes the
change to `use warnings FATAL => 'all'`, which we haven't otherwise
backported, but it seems good to keep the file identical across
branches.

Discussion: https://www.postgresql.org/message-id/b7c64f20-ea01-4f15-9088-0cd6832af149@iki.fi
2024-06-27 19:01:25 +03:00
Andrew Dunstan
ab46e132fa Remove redundant perl version checks
Commit 4c1532763a removed some redundant uses of 'use 5.008001;' in perl
scripts, including in plperl's plc_perlboot.pl. Because it made other
changes it wasn't backpatched. However, now this is causing a failure on
back branches when built with bleeding edge perl. Therefore, backpatch
just that part of it which removed those uses, from 15 all the way down
to 9.2, which is the earliest version currently built in the buildfarm.

per report from Alexander Lakhin

Discussion: https://postgr.es/m/4cc2ee93-e03c-8e13-61ed-412e7e6ff19d@gmail.com
2024-06-26 07:25:10 -04:00
Amit Kapila
b0121801d3 Doc: Generated columns are skipped for logical replication.
Add a note in docs that generated columns are skipped for logical
replication.

Author: Peter Smith
Reviewed-by: Peter Eisentraut
Backpatch-through: 12
Discussion: https://postgr.es/m/CAHut+PuXb1GLQztQkoWzYjSwkAZZ0dgCJaAHyJtZF3kmtcL=kA@mail.gmail.com
2024-06-21 09:16:06 +05:30
Tom Lane
b0037bbefd Don't throw an error if a queued AFTER trigger no longer exists.
afterTriggerInvokeEvents and AfterTriggerExecute have always
treated it as an error if the trigger OID mentioned in a queued
after-trigger event can't be found.  However, that fails to
account for the edge case where the trigger's been dropped in
the current transaction since queueing the event.  There seems
no very good reason to disallow that case, so instead silently
do nothing if the trigger OID can't be found.

This does give up a little bit of bug-detection ability, but I don't
recall that these error messages have ever actually revealed a bug,
so it seems mostly theoretical.  Alternatives such as marking
pending events DONE at the time of dropping a trigger would be
complicated and perhaps introduce bugs of their own.

Per bug #18517 from Alexander Lakhin.  Back-patch to all
supported branches.

Discussion: https://postgr.es/m/18517-af2d19882240902c@postgresql.org
2024-06-20 14:21:36 -04:00
Tom Lane
3e3e2ebea7 Fix insertion of SP-GiST REDIRECT tuples during REINDEX CONCURRENTLY.
Reconstruction of an SP-GiST index by REINDEX CONCURRENTLY may
insert some REDIRECT tuples.  This will typically happen in
a transaction that lacks an XID, which leads either to assertion
failure in spgFormDeadTuple or to insertion of a REDIRECT tuple
with zero xid.  The latter's not good either, since eventually
VACUUM will apply GlobalVisTestIsRemovableXid() to the zero xid,
resulting in either an assertion failure or a garbage answer.

In practice, since REINDEX CONCURRENTLY locks out index scans
till it's done, it doesn't matter whether it inserts REDIRECTs
or PLACEHOLDERs; and likewise it doesn't matter how soon VACUUM
reduces such a REDIRECT to a PLACEHOLDER.  So in non-assert builds
there's no observable problem here, other than perhaps a little
index bloat.  But it's not behaving as intended.

To fix, remove the failing Assert in spgFormDeadTuple, acknowledging
that we might sometimes insert a zero XID; and guard VACUUM's
GlobalVisTestIsRemovableXid() call with a test for valid XID,
ensuring that we'll reduce such a REDIRECT the first time VACUUM
sees it.  (Versions before v14 use TransactionIdPrecedes here,
which won't fail on zero xid, so they really have no bug at all
in non-assert builds.)

Another solution could be to not create REDIRECTs at all during
REINDEX CONCURRENTLY, making the relevant code paths treat that
case like index build (which likewise knows that no concurrent
index scans can be happening).  That would allow restoring the
Assert in spgFormDeadTuple, but we'd still need the VACUUM change
because redirection tuples with zero xid may be out there already.
But there doesn't seem to be a nice way for spginsert() to tell that
it's being called in REINDEX CONCURRENTLY without some API changes,
so we'll leave that as a possible future improvement.

In HEAD, also rename the SpGistState.myXid field to redirectXid,
which seems less misleading (since it might not in fact be our
transaction's XID) and is certainly less uninformatively generic.

Per bug #18499 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/18499-8a519c280f956480@postgresql.org
2024-06-17 14:30:59 -04:00