postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2025-02-17 19:30:00 +08:00

Author	SHA1	Message	Date
Heikki Linnakangas	c809a2b2d6	Fix bugs in MultiXact truncation 1. TruncateMultiXact() performs the SLRU truncations in a critical section. Deleting the SLRU segments calls ForwardSyncRequest(), which will try to compact the request queue if it's full (CompactCheckpointerRequestQueue()). That in turn allocates memory, which is not allowed in a critical section. Backtrace: TRAP: failed Assert("CritSectionCount == 0 \|\| (context)->allowInCritSection"), File: "../src/backend/utils/mmgr/mcxt.c", Line: 1353, PID: 920981 postgres: autovacuum worker template0(ExceptionalCondition+0x6e)[0x560a501e866e] postgres: autovacuum worker template0(+0x5dce3d)[0x560a50217e3d] postgres: autovacuum worker template0(ForwardSyncRequest+0x8e)[0x560a4ffec95e] postgres: autovacuum worker template0(RegisterSyncRequest+0x2b)[0x560a50091eeb] postgres: autovacuum worker template0(+0x187b0a)[0x560a4fdc2b0a] postgres: autovacuum worker template0(SlruDeleteSegment+0x101)[0x560a4fdc2ab1] postgres: autovacuum worker template0(TruncateMultiXact+0x2fb)[0x560a4fdbde1b] postgres: autovacuum worker template0(vac_update_datfrozenxid+0x4b3)[0x560a4febd2f3] postgres: autovacuum worker template0(+0x3adf66)[0x560a4ffe8f66] postgres: autovacuum worker template0(AutoVacWorkerMain+0x3ed)[0x560a4ffe7c2d] postgres: autovacuum worker template0(+0x3b1ead)[0x560a4ffecead] postgres: autovacuum worker template0(+0x3b620e)[0x560a4fff120e] postgres: autovacuum worker template0(+0x3b3fbb)[0x560a4ffeefbb] postgres: autovacuum worker template0(+0x2f724e)[0x560a4ff3224e] /lib/x86_64-linux-gnu/libc.so.6(+0x27c8a)[0x7f62cc642c8a] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85)[0x7f62cc642d45] postgres: autovacuum worker template0(_start+0x21)[0x560a4fd16f31] To fix, bail out in CompactCheckpointerRequestQueue() without doing anything, if it's called in a critical section. That covers the above call path, as well as any other similar cases where RegisterSyncRequest might be called in a critical section. 2. After fixing that, another problem became apparent: Autovacuum process doing that truncation can deadlock with the checkpointer process. TruncateMultiXact() sets "MyProc->delayChkptFlags \|= DELAY_CHKPT_START". If the sync request queue is full and cannot be compacted, the process will repeatedly sleep and retry, until there is room in the queue. However, if the checkpointer is trying to start a checkpoint at the same time, and is waiting for the DELAY_CHKPT_START processes to finish, the queue will never shrink. More concretely, the autovacuum process is stuck here: #0 0x00007fc934926dc3 in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x000056220b24348b in WaitEventSetWaitBlock (set=0x56220c2e4b50, occurred_events=0x7ffe7856d040, nevents=1, cur_timeout=<optimized out>) at ../src/backend/storage/ipc/latch.c:1570 #2 WaitEventSetWait (set=0x56220c2e4b50, timeout=timeout@entry=10, occurred_events=<optimized out>, occurred_events@entry=0x7ffe7856d040, nevents=nevents@entry=1, wait_event_info=wait_event_info@entry=150994949) at ../src/backend/storage/ipc/latch.c:1516 #3 0x000056220b243224 in WaitLatch (latch=<optimized out>, latch@entry=0x0, wakeEvents=wakeEvents@entry=40, timeout=timeout@entry=10, wait_event_info=wait_event_info@entry=150994949) at ../src/backend/storage/ipc/latch.c:538 #4 0x000056220b26cf46 in RegisterSyncRequest (ftag=ftag@entry=0x7ffe7856d0a0, type=type@entry=SYNC_FORGET_REQUEST, retryOnError=true) at ../src/backend/storage/sync/sync.c:614 #5 0x000056220af9db0a in SlruInternalDeleteSegment (ctl=ctl@entry=0x56220b7beb60 <MultiXactMemberCtlData>, segno=segno@entry=11350) at ../src/backend/access/transam/slru.c:1495 #6 0x000056220af9dab1 in SlruDeleteSegment (ctl=ctl@entry=0x56220b7beb60 <MultiXactMemberCtlData>, segno=segno@entry=11350) at ../src/backend/access/transam/slru.c:1566 #7 0x000056220af98e1b in PerformMembersTruncation (oldestOffset=<optimized out>, newOldestOffset=<optimized out>) at ../src/backend/access/transam/multixact.c:3006 #8 TruncateMultiXact (newOldestMulti=newOldestMulti@entry=3221225472, newOldestMultiDB=newOldestMultiDB@entry=4) at ../src/backend/access/transam/multixact.c:3201 #9 0x000056220b098303 in vac_truncate_clog (frozenXID=749, minMulti=<optimized out>, lastSaneFrozenXid=749, lastSaneMinMulti=3221225472) at ../src/backend/commands/vacuum.c:1917 #10 vac_update_datfrozenxid () at ../src/backend/commands/vacuum.c:1760 #11 0x000056220b1c3f76 in do_autovacuum () at ../src/backend/postmaster/autovacuum.c:2550 #12 0x000056220b1c2c3d in AutoVacWorkerMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at ../src/backend/postmaster/autovacuum.c:1569 and the checkpointer is stuck here: #0 0x00007fc9348ebf93 in clock_nanosleep () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x00007fc9348fe353 in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6 #2 0x000056220b40ecb4 in pg_usleep (microsec=microsec@entry=10000) at ../src/port/pgsleep.c:50 #3 0x000056220afb43c3 in CreateCheckPoint (flags=flags@entry=108) at ../src/backend/access/transam/xlog.c:7098 #4 0x000056220b1c6e86 in CheckpointerMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at ../src/backend/postmaster/checkpointer.c:464 To fix, add AbsorbSyncRequests() to the loops where the checkpointer waits for DELAY_CHKPT_START or DELAY_CHKPT_COMPLETE operations to finish. Backpatch to v14. Before that, SLRU deletion didn't call RegisterSyncRequest, which avoided this failure. I'm not sure if there are other similar scenarios on older versions, but we haven't had any such reports. Discussion: https://www.postgresql.org/message-id/ccc66933-31c1-4f6a-bf4b-45fef0d4f22e@iki.fi	2024-06-26 23:05:58 +03:00
Andrew Dunstan	f853e23bf8	Remove redundant perl version checks Commit `4c1532763a` removed some redundant uses of 'use 5.008001;' in perl scripts, including in plperl's plc_perlboot.pl. Because it made other changes it wasn't backpatched. However, now this is causing a failure on back branches when built with bleeding edge perl. Therefore, backpatch just that part of it which removed those uses, from 15 all the way down to 9.2, which is the earliest version currently built in the buildfarm. per report from Alexander Lakhin Discussion: https://postgr.es/m/4cc2ee93-e03c-8e13-61ed-412e7e6ff19d@gmail.com	2024-06-26 07:01:47 -04:00
Alvaro Herrera	fb0fb0740f	Fix partition pruning setup during DETACH CONCURRENTLY When detaching partition in concurrent mode, it's possible for partition descriptors to not match the set that was recently seen when the plan was made, causing an assertion failure or (in production builds) failure to construct a working plan. The case that was reported involves prepared statements, but I think it may be possible to hit this bug without that too. The problem is that CreatePartitionPruneState is constructing a PartitionPruneState under the assumption that new partitions can be added, but never removed, but it turns out that this isn't true: a prepared statement gets replanned when the DETACH CONCURRENTLY session sends out its invalidation message, but if the invalidation message arrives after ExecInitAppend started, we would build a partition descriptor without the partition, and then CreatePartitionPruneState would refuse to work with it. CreatePartitionPruneState already contains code to deal with the new descriptor having more partitions than before (and behaving for the extra partitions as if they had been pruned), but doesn't have code to deal with less partitions than before, and it is naïve about the case where the number of partitions is the same. We could simply add that a new stanza for less partitions than before, and in simple testing it works to do that; but it's possible to press the test scripts even further and hit the case where one partition is added and a partition is removed quickly enough that we see the same number of partitions, but they don't actually match, causing hangs during execution. To cope with both these problems, we now memcmp() the arrays of partition OIDs, and do a more elaborate mapping (relying on the fact that both OID arrays are in partition-bounds order) if they're not identical. Backpatch to 14, where DETACH CONCURRENTLY appeared. Reported-by: yajun Hu <1026592243@qq.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/18377-e0324601cfebdfe5@postgresql.org	2024-06-24 15:56:32 +02:00
Amit Kapila	a99b2ccd56	Doc: Generated columns are skipped for logical replication. Add a note in docs that generated columns are skipped for logical replication. Author: Peter Smith Reviewed-by: Peter Eisentraut Backpatch-through: 12 Discussion: https://postgr.es/m/CAHut+PuXb1GLQztQkoWzYjSwkAZZ0dgCJaAHyJtZF3kmtcL=kA@mail.gmail.com	2024-06-21 09:41:13 +05:30
Tom Lane	1424c7abc4	Don't throw an error if a queued AFTER trigger no longer exists. afterTriggerInvokeEvents and AfterTriggerExecute have always treated it as an error if the trigger OID mentioned in a queued after-trigger event can't be found. However, that fails to account for the edge case where the trigger's been dropped in the current transaction since queueing the event. There seems no very good reason to disallow that case, so instead silently do nothing if the trigger OID can't be found. This does give up a little bit of bug-detection ability, but I don't recall that these error messages have ever actually revealed a bug, so it seems mostly theoretical. Alternatives such as marking pending events DONE at the time of dropping a trigger would be complicated and perhaps introduce bugs of their own. Per bug #18517 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/18517-af2d19882240902c@postgresql.org	2024-06-20 14:21:36 -04:00
David Rowley	27c6242a04	Fix possible Assert failure in cost_memoize_rescan In cost_memoize_rescan(), when calculating the hit_ratio using the calls and ndistinct estimations, if the value that was set in MemoizePath.calls had not been processed through clamp_row_est(), then it was possible that it was set to some non-integer value which could result in ndistinct being 1 higher than calls due to estimate_num_groups() performing clamp_row_est() on its input_rows. This could result in hit_ratio values slightly below 0.0, which would cause an Assert failure. The value of MemoizePath.calls comes from the final parameter in the create_memoize_path() function, of which we only have one true caller of. That caller passes outer_path->rows. All the core code I looked at always seems to call clamp_row_est() on the Path.rows, so there might have been no issues with any core Paths causing troubles here. The bug report was about a CustomPath with a non-clamped row estimated. The misbehavior as a result of this seems to be mostly limited to the Assert() failing. Aside from that, it seems the Memoize costs would just come out slightly higher than they should have, which is likely fairly harmless. Reported-by: Kohei KaiGai <kaigai@heterodb.com> Discussion: https://postgr.es/m/CAOP8fzZnTU+N64UYJYogb1hN-5hFP+PwTb3m_cnGAD7EsQwrKw@mail.gmail.com Reviewed-by: Richard Guo Backpatch-through: 14, where Memoize was introduced	2024-06-19 10:21:26 +12:00
Tom Lane	f550833193	Fix insertion of SP-GiST REDIRECT tuples during REINDEX CONCURRENTLY. Reconstruction of an SP-GiST index by REINDEX CONCURRENTLY may insert some REDIRECT tuples. This will typically happen in a transaction that lacks an XID, which leads either to assertion failure in spgFormDeadTuple or to insertion of a REDIRECT tuple with zero xid. The latter's not good either, since eventually VACUUM will apply GlobalVisTestIsRemovableXid() to the zero xid, resulting in either an assertion failure or a garbage answer. In practice, since REINDEX CONCURRENTLY locks out index scans till it's done, it doesn't matter whether it inserts REDIRECTs or PLACEHOLDERs; and likewise it doesn't matter how soon VACUUM reduces such a REDIRECT to a PLACEHOLDER. So in non-assert builds there's no observable problem here, other than perhaps a little index bloat. But it's not behaving as intended. To fix, remove the failing Assert in spgFormDeadTuple, acknowledging that we might sometimes insert a zero XID; and guard VACUUM's GlobalVisTestIsRemovableXid() call with a test for valid XID, ensuring that we'll reduce such a REDIRECT the first time VACUUM sees it. (Versions before v14 use TransactionIdPrecedes here, which won't fail on zero xid, so they really have no bug at all in non-assert builds.) Another solution could be to not create REDIRECTs at all during REINDEX CONCURRENTLY, making the relevant code paths treat that case like index build (which likewise knows that no concurrent index scans can be happening). That would allow restoring the Assert in spgFormDeadTuple, but we'd still need the VACUUM change because redirection tuples with zero xid may be out there already. But there doesn't seem to be a nice way for spginsert() to tell that it's being called in REINDEX CONCURRENTLY without some API changes, so we'll leave that as a possible future improvement. In HEAD, also rename the SpGistState.myXid field to redirectXid, which seems less misleading (since it might not in fact be our transaction's XID) and is certainly less uninformatively generic. Per bug #18499 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/18499-8a519c280f956480@postgresql.org	2024-06-17 14:30:59 -04:00
Tom Lane	1f1eedd3fa	Clean out column-level pg_init_privs entries when dropping tables. DeleteInitPrivs did not get the memo about how, when dropping a whole object (with subid == 0), you should drop entries relating to its sub-objects too. This is visible in the test_pg_dump test case if one drops the extension at the end: the entry for GRANT SELECT(col1) ON regress_pg_dump_table TO public; was still present in pg_init_privs afterwards, although it was pointing to a dangling table OID. Noted while fooling with a fix for REASSIGN OWNED for pg_init_privs entries. This bug is aboriginal in the pg_init_privs feature though, and there seems no reason not to back-patch the fix.	2024-06-14 16:20:35 -04:00
Tom Lane	df95c1ec0e	Fix parsing of ignored operators in websearch_to_tsquery(). The manual says clearly that punctuation in the input of websearch_to_tsquery() is ignored, except for the special cases of dashes and quotes. However, this failed for cases like "(foo bar) or something", or in general an ISOPERATOR character in front of the "or". We'd switch back to WAITOPERAND state, then ignore the operator character while remaining in that state, and then reach the "or" in WAITOPERAND state which (intentionally) makes us treat it as data. The fix is simple enough: if we see an ISOPERATOR character while in WAITOPERATOR state, we have to skip it while staying in that state. (We don't need to worry about other punctuation characters: those will be consumed as though they were words, but then rejected by lexizing.) In v14 and up (since commit `eb086056f`) we can simplify the code a bit more too, because there is no longer a reason for the WAITOPERAND state to distinguish between quoted and unquoted operands. Per bug #18479 from Manos Emmanouilidis. Back-patch to all supported branches. Discussion: https://postgr.es/m/18479-d9b46e2fc242c33e@postgresql.org	2024-06-13 20:35:03 -04:00
Michael Paquier	37c5e5f4d4	doc: Fix description WAL writer in glossary The WAL writer is an auxiliary process, but its description in the glossary did not match that. This is inexact since `d3014fff4c`. Author: Masahiro Ikeda Discussion: https://postgr.es/m/d3a5a4278fd8d9e7a47c6aa4db9e9a39@oss.nttdata.com Backpatch-through: 15	2024-06-14 09:26:39 +09:00
Tom Lane	bf552b1b2d	When replanning a plpgsql "simple expression", check it's still simple. The previous coding here assumed that we didn't need to recheck any of the querytree tests made in exec_simple_check_plan(). I think we supposed that those properties were fully determined by the syntax of the source text and hence couldn't change. That is true for most of them, but at least hasTargetSRFs and hasAggs can change by dint of forcibly dropping an originally-referenced function and recreating it with new properties. That leads to "unexpected plan node type" or similar failures. These tests are pretty cheap compared to the cost of replanning, so rather than sweat over exactly which properties need to be rechecked, let's just recheck them all. Hence, factor out those tests into a new function exec_is_simple_query(), and rearrange callers as needed. A second problem in the same area was that if we failed during replanning or during exec_save_simple_expr(), we'd potentially leave behind now-dangling pointers to the old simple expression, potentially resulting in crashes later. To fix, clear those pointers before replanning. The v12 code looks quite different in this area but still has the bug about needing to recheck query simplicity. I chose to back-patch all of the plpgsql_simple.sql test script, which formerly didn't exist in this branch. Per bug #18497 from Nikita Kalinin. Back-patch to all supported branches. Discussion: https://postgr.es/m/18497-fe93b6da82ce31d4@postgresql.org	2024-06-13 13:37:50 -04:00
Heikki Linnakangas	7768ac1c17	Clamp result of MultiXactMemberFreezeThreshold The purpose of the function is to reduce the effective autovacuum_multixact_freeze_max_age if the multixact members SLRU is approaching wraparound, to make multixid freezing more aggressive. The returned value should therefore never be greater than plain autovacuum_multixact_freeze_max_age. Reviewed-by: Robert Haas Discussion: https://www.postgresql.org/message-id/85fb354c-f89f-4d47-b3a2-3cbd461c90a3@iki.fi Backpatch-through: 12, all supported versions	2024-06-13 19:02:51 +03:00
Andrew Dunstan	c38518fa97	Skip some permissions checks on Cygwin These are checks that are already skipped on other Windows systems. Backpatch to all live branches, as appropriate.	2024-06-13 07:41:49 -04:00
Tom Lane	1d0399b540	Fix infer_arbiter_indexes() to not assume resultRelation is 1. infer_arbiter_indexes failed to renumber varnos in index expressions or predicates that it got from the catalogs. This escaped detection up to now because the stored varnos in such trees will be 1, and an INSERT's result relation is usually the first rangetable entry, so that that was fine. However, in cases such as inserting through an updatable view, it's not fine, leading to failure to match the expressions to the query with ensuing "there is no unique or exclusion constraint matching the ON CONFLICT specification" errors. Fix by copy-and-paste from get_relation_info(). Per bug #18502 from Michael Wang. Back-patch to all supported versions. Discussion: https://postgr.es/m/18502-545b53f5b81e54e0@postgresql.org	2024-06-11 17:57:46 -04:00
Alvaro Herrera	03c8cdbb7e	Fix creation of partition descriptor during concurrent detach When a partition is being detached in concurrent mode, it is possible for find_inheritance_children_extended() to return that partition in the list, and immediately after that receive an invalidation message that sets its relpartbound to NULL just before we read it. (This can happen because table_open() reads invalidation messages.) Currently we raise an error ERROR: missing relpartbound for relation %u about the situation, but that's bogus because the table is no longer a partition, so we shouldn't be complaining about it. A better reaction is to retry the find_inheritance_children_extended call to get a new list, which will no longer have the partition being detached. Noticed while investigating bug #18377. Backpatch to 14, where DETACH CONCURRENTLY appeared. Discussion: https://postgr.es/m/202405201616.y4ht2qe5ihoy@alvherre.pgsql	2024-06-11 11:38:45 +02:00
Amit Kapila	a8d747771f	Doc: Fix ambuiguity in column lists. The behavior for columns added later to the table for publications with no specified column lists was not clear. Reported-by: Koen De Groote Author: Peter Smith Reviewed-by: Vignesh C, Laurenz Albe Backpatch-through: 15 Discussion: https://postgr.es/m/171621878740.686.11325940592820985181@wrigleys.postgresql.org	2024-06-11 09:17:34 +05:30
Tom Lane	2eba27571e	Tighten test_predtest's input checks, and improve error messages. test_predtest() neglected to consider the possibility that SPI_plan_get_cached_plan would return NULL. This led to a core dump if the input (incorrectly) contains more than one SQL command. While here, let's expend more than zero effort on the error message for this case and nearby ones. Per (half of) bug #18483 from Alexander Kozhemyakin. Back-patch to all supported branches, not because this is very significant (it's merely test scaffolding) but to make our world a bit safer for fuzz testing. Discussion: https://postgr.es/m/18483-30bfff42de238000@postgresql.org	2024-06-07 16:45:56 -04:00
Tom Lane	3c71cb497b	Reject modifying a temp table of another session with ALTER TABLE. Normally this case isn't even reachable by non-superusers, since permissions checks prevent naming such a table. However, it is possible to make it happen by altering a parent table whose child is another session's temp table. We definitely can't support any such ALTER that requires modifying the contents of such a table, since we lack access to the other session's temporary-buffer pool. But there seems no good reason to allow it even if it'd only require changing catalog contents. One reason not to allow it is that we'd rather not expose the implementation-dependent behavior of whether a specific ALTER requires touching the table contents. Another is that there may be (in future, even if not today) optimizations that assume that a session's own temp tables won't be modified by other sessions. Hence, add a RELATION_IS_OTHER_TEMP() check to all the places where ALTER TABLE currently does CheckTableNotInUse(). (I looked through all other callers of CheckTableNotInUse(), and they seem OK already.) Per bug #18492 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/18492-c7a2634bf4968763@postgresql.org	2024-06-07 14:50:09 -04:00
Tom Lane	a160e92779	Fix behavior of stable functions called from a CALL's argument list. If the CALL is within an atomic context (e.g. there's an outer transaction block), _SPI_execute_plan should acquire a fresh snapshot to execute any such functions with. We failed to do that and instead passed them the Portal snapshot, which had been acquired at the start of the current SQL command. This'd lead to seeing stale values of rows modified since the start of the command. This is arguably a bug in `84f5c2908`: I failed to see that "are we in non-atomic mode" needs to be defined the same way as it is further down in _SPI_execute_plan, i.e. check !_SPI_current->atomic not just options->allow_nonatomic. Alternatively the blame could be laid on plpgsql, which is unconditionally passing allow_nonatomic = true for CALL/DO even when it knows it's in an atomic context. However, fixing it in spi.c seems like a better idea since that will also fix the problem for any extensions that may have copied plpgsql's coding pattern. While here, update an obsolete comment about _SPI_execute_plan's snapshot management. Per report from Victor Yegorov. Back-patch to all supported versions. Discussion: https://postgr.es/m/CAGnEboiRe+fG2QxuBO2390F7P8e2MQ6UyBjZSL_w1Cej+E4=Vw@mail.gmail.com	2024-06-07 13:27:26 -04:00
Michael Paquier	2de059de2c	Add more debugging information when dropping twice pgstats entry Floris Van Nee has reported a bug in the pgstats facility where a stats entry already dropped would get again dropped. This case should not happen, still the error generated did not offer any details about the stats entry getting dropped. This commit improves the error message generated to inform about the stats entry kind, database OID, object OID and refcount, which should help to debug more the problem reported. Bertrand Drouvot has been independently able to reach this error path while writing a new feature, and more details about the failure would have been helpful for debugging. Author: Andres Freund, Bertrand Drouvot Discussion: https://postgr.es/m/20240505160915.6boysum4f34siqct@awork3.anarazel.de Discussion: https://postgr.es/m/ZkM30paAD8Cr/Bix@ip-10-97-1-34.eu-west-3.compute.internal Backpatch-through: 15	2024-06-07 18:46:34 +09:00
Etsuro Fujita	b33c141cc5	postgres_fdw: Refuse to send FETCH FIRST WITH TIES to remote servers. Previously, when considering LIMIT pushdown, postgres_fdw failed to check whether the query has this clause, which led to pushing false LIMIT clauses, causing incorrect results. This clause has been supported since v13, so we need to do a remote-version check before deciding that it will be safe to push such a clause, but we do not currently have a way to do the check (without accessing the remote server); disable pushing such a clause for now. Oversight in commit `357889eb1`. Back-patch to v13, where that commit added the support. Per bug #18467 from Onder Kalaci. Patch by Japin Li, per a suggestion from Tom Lane, with some changes to the comments by me. Review by Onder Kalaci, Alvaro Herrera, and me. Discussion: https://postgr.es/m/18467-7bb89084ff03a08d%40postgresql.org	2024-06-07 17:45:04 +09:00
Peter Eisentraut	00071ef04c	doc: Fix copy-and-paste mistake The wording from the "columns" view was copied to the "attributes" view without the required adjustments.	2024-06-07 08:03:20 +02:00
Tom Lane	5fe43d41db	Fix failure with SQL-procedure polymorphic output arguments in v12. Before the v13-era commit `913bbd88d`, check_sql_fn_retval fails to resolve polymorphic output types and then just throws up its hands and assumes the check will be made at runtime. I think that's true for ordinary functions returning RECORD, but it doesn't happen in CALL, potentially resulting in crashes if the actual output of the SQL procedure's SELECT doesn't match the type inferred from polymorphism. With a little bit of rearrangement, we can use get_call_result_type instead of get_func_result_type and thereby infer the correct types. I'm still unwilling to back-patch all of `913bbd88d`, so if the types don't match you'll get an error rather than perhaps silently inserting a cast as v13 and later can. That's consistent with prior behavior though, so it seems fine. Prior to `70ffb27b2`, you'd typically get other errors due to other shortcomings of CALL's management of polymorphism. Nonetheless, this is an independent bug. Although there is no bug in v13 and up, it seems prudent to add the test case for this to the newer branches too. It's clearly an under-tested area. Per report from Andrew Bille. Discussion: https://postgr.es/m/CAJnzarw9EeWHAQRm76dXd=7j+rgw6ERqC=nCay8jeFqTwKwhqQ@mail.gmail.com	2024-06-06 15:16:56 -04:00
Michael Paquier	bfc44da247	Prevent inconsistent use of stats entry for replication slots Concurrent activity around replication slot creation and drop could cause a replication slot to use a stats entry it should not have used when created, triggering an assertion failure when retrieving this inconsistent entry from the dshash table used by the stats facility. The issue is that pgstat_drop_replslot() calls pgstat_drop_entry() without checking the result. If pgstat_drop_entry() cannot free the entry related to the object dropped, pgstat_request_entry_refs_gc() should be called. AtEOXact_PgStat_DroppedStats() and surrounding routines dropping stats entries already do that. This is documented in pgstat_internal.h, but let's add a comment at the top of pgstat_drop_entry() as that can be easy to miss. Reported-by: Alexander Lakhin Author: Floris Van Nee Analyzed-by: Andres Freund Discussion: https://postgr.es/m/17947-b9554521ad963c9c@postgresql.org Backpatch-through: 15	2024-06-06 08:48:21 +09:00
Nathan Bossart	bb8425491c	Fix documentation for POSIX semaphores. The documentation for POSIX semaphores is missing a reference to max_wal_senders. This commit fixes that in the same way that commit `4ebe51a5fb` fixed the same issue in the documentation for System V semaphores. Discussion: https://postgr.es/m/20240517164452.GA1914161%40nathanxps13 Backpatch-through: 12	2024-06-05 15:32:47 -05:00
Tom Lane	89ef2aedae	Fix pl/tcl's handling of errors from Tcl_ListObjGetElements(). In a procedure or function returning tuple, we use that function to parse the Tcl script's result, which is supposed to be a Tcl list. If it isn't, you get an error. Commit `26abb50c4` incautiously supposed that we could use throw_tcl_error() to report such an error. That doesn't actually work, because low-level functions like Tcl_ListObjGetElements() don't fill Tcl's errorInfo variable. The result is either a null-pointer-dereference crash or emission of misleading context information describing the previous Tcl error. Back off to just reporting the interpreter's result string, and improve throw_tcl_error()'s comment to explain when to use it. Also, although the similar code in pltcl_trigger_handler() avoided this mistake, it was using a fairly confusing wording of the error message. Improve that while we're here. Per report from A. Kozhemyakin. Back-patch to all supported branches. Erik Wienhold and Tom Lane Discussion: https://postgr.es/m/6a2a1c40-2b2c-4a33-8b72-243c0766fcda@postgrespro.ru	2024-06-04 18:02:13 -04:00
Andres Freund	6b52e2298d	ci: windows: Use the same image for VS and MinGW tasks The VS and MinGW Windows images have been merged, to reduce the space needed for images. Before `98811323c8` the split helped boot performance, but now that we are using VMs that doesn't appear to be the case anymore. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CAN55FZ2kWYjPd7uUC5QswrB3tfVJDiURqC%2BMGM6a3oeev%3DVgOA%40mail.gmail.com Backpatch: 15-, where CI was added	2024-06-03 19:14:57 -07:00
Nathan Bossart	f1884f5757	Fix documentation for System V semaphores. The formulas for SEMMNI and SEMMNS do not include the archiver process, which was converted to an auxiliary process in v14, and the WAL summarizer process, which was introduced in v17. This commit corrects these formulas and adds a missing reference to max_wal_senders nearby. Since this section of the documentation tends to be incorrect quite often, we should likely give up on documenting the exact formulas in favor of something less fragile, but that is left as a future exercise. Reported-by: Sami Imseih Reviewed-by: Sami Imseih Discussion: https://postgr.es/m/20240517164452.GA1914161%40nathanxps13 Backpatch-through: 12	2024-06-03 12:10:43 -05:00
Michael Paquier	8e16f81f3d	Improve stability of subscription/029_on_error.pl This test was failing when using wal_debug=on and -DWAL_DEBUG because of additional log entries that made the test grab an LSN not mapping with the error expected in the test. Previously the test would look for the first matching line to get the LSN to skip up to. This is improved by having the test scan the logs with a regexp that checks for the expected ERROR string, ensuring that the wanted LSN comes from the correct context. Backpatch down to 15 where this test has been introduced. Author: Ian Ilyasov Discussion: https://postgr.es/m/GV1P251MB100415F17E6B2FDD7188777ECDE32@GV1P251MB1004.EURP251.PROD.OUTLOOK.COM Backpatch-through: 15	2024-05-24 11:21:31 +09:00
Tom Lane	e892e72b3c	Remove race conditions between ECPGdebug() and ecpg_log(). Coverity complains that ECPGdebug is accessing debugstream without holding debug_mutex, which is a fair complaint: we should take debug_mutex while changing the settings ecpg_log looks at. In some branches it also complains about unlocked use of simple_debug. I think it's intentional and safe to have a quick unlocked check of simple_debug at the start of ecpg_log, since that early exit will always be taken in non-debug cases. But we should recheck simple_debug after acquiring the mutex. In the worst case, calling ECPGdebug concurrently with ecpg_log in another thread could result in a null-pointer dereference due to debugstream transiently being NULL while simple_debug isn't 0. This is largely hypothetical, since it's unlikely anybody uses ECPGdebug() at all in the field, and our own regression tests don't seem to be hitting the theoretical race conditions either. Still, if we're going to the trouble of having mutexes here, we ought to be using them in a way that's actually safe not just almost safe. Hence, back-patch to all supported branches.	2024-05-23 15:52:06 -04:00
Michael Paquier	c0df15ac7e	doc: Fix column_name parameter in ALTER MATERIALIZED VIEW Parameter column_name must be an existing column because ALTER MATERIALIZED VIEW cannot add new columns. The old description was likely copied from ALTER TABLE. Author: Erik Wienhold Discussion: https://postgr.es/m/6880ca53-7961-4eeb-86d5-6bd05fc2027e@ewie.name Backpatch-through: 12	2024-05-23 13:03:16 +09:00
Tom Lane	2f3cfcf767	Fix handling of extended expression statistics in CREATE TABLE LIKE. transformTableLikeClause believed that it could process extended statistics immediately because "the representation of CreateStatsStmt doesn't depend on column numbers". That was true when extended stats were first introduced, but it was falsified by the addition of extended stats on expressions: the parsed expression tree is fed forward by the LIKE option, and that will contain Vars. So if the new table doesn't have attnums identical to the old one's (typically because there are some dropped columns in the old one), that doesn't work. The CREATE goes through, but it emits invalid statistics objects that will cause problems later. Fortunately, we already have logic that can adapt expression trees to the possibly-new column numbering. To use it, we have to delay processing of CREATE_TABLE_LIKE_STATISTICS into expandTableLikeClause, just as for other LIKE options that involve expressions. Per bug #18468 from Alexander Lakhin. Back-patch to v14 where extended statistics on expressions were added. Discussion: https://postgr.es/m/18468-f5add190e3fa5902@postgresql.org	2024-05-22 17:54:17 -04:00
Tom Lane	4ac385adc5	Account for optimized MinMax aggregates during SS_finalize_plan. We are capable of optimizing MIN() and MAX() aggregates on indexed columns into subqueries that exploit the index, rather than the normal thing of scanning the whole table. When we do this, we replace the Aggref node(s) with Params referencing subquery outputs. Such Params really ought to be included in the per-plan-node extParam/allParam sets computed by SS_finalize_plan. However, we've never done so up to now because of an ancient implementation choice to perform that substitution during set_plan_references, which runs after SS_finalize_plan, so that SS_finalize_plan never sees these Params. The cleanest fix would be to perform a separate tree walk to do these substitutions before SS_finalize_plan runs. That seems unattractive, first because a whole-tree mutation pass is expensive, and second because we lack infrastructure for visiting expression subtrees in a Plan tree, so that we'd need a new function knowing as much as SS_finalize_plan knows about that. I also considered swapping the order of SS_finalize_plan and set_plan_references, but that fell foul of various assumptions that seem tricky to fix. So the approach adopted here is to teach SS_finalize_plan itself to check for such Aggrefs. I refactored things a bit in setrefs.c to avoid having three copies of the code that does that. Back-patch of v17 commits `d0d44049d` and `779ac2c74`. When `d0d44049d` went in, there was no evidence that it was fixing a reachable bug, so I refrained from back-patching. Now we have such evidence. Per bug #18465 from Hal Takahara. Back-patch to all supported branches. Discussion: https://postgr.es/m/18465-2fae927718976b22@postgresql.org Discussion: https://postgr.es/m/2391880.1689025003@sss.pgh.pa.us	2024-05-18 14:31:35 -04:00
Noah Misch	484b958737	Fix documentation about DROP DATABASE FORCE process termination rights. Specifically, it terminates a background worker even if the caller couldn't terminate the background worker with pg_terminate_backend(). Commit `3a9b18b309` neglected to update this. Back-patch to v13, which introduced DROP DATABASE FORCE. Reviewed by Amit Kapila. Reported by Kirill Reshke. Discussion: https://postgr.es/m/20240429212756.60.nmisch@google.com	2024-05-16 14:11:13 -07:00
Daniel Gustafsson	e6fc3b70df	Fix query result leak during binary upgrade `9a974cbcba` moved the query in binary_upgrade_set_pg_class_oids to the outer level, but left the PQclear and query buffer destruction in the is_index conditional. `353708e1fb` fixed the leak of the query buffer but left the PGresult leak. This moves clearing the result to the outer level ensuring that it will be called. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/374550C1-F4ED-4D9D-9498-0FD029CCF674@yesql.se Backpatch-through: v15	2024-05-15 22:48:51 +02:00
Peter Eisentraut	a826021e56	doc: Remove claims that initdb and pg_ctl use libpq environment variables Erroneously introduced by `571df93cff`. Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/flat/8458c9c5-18f1-46d7-94c4-1c30e4f44908%40eisentraut.org	2024-05-15 13:06:53 +02:00
Tom Lane	c40e78d239	Fix handling of polymorphic output arguments for procedures. Most of the infrastructure for procedure arguments was already okay with polymorphic output arguments, but it turns out that CallStmtResultDesc() was a few bricks shy of a load here. It thought all it needed to do was call build_function_result_tupdesc_t, but that function specifically disclaims responsibility for resolving polymorphic arguments. Failing to handle that doesn't seem to be a problem for CALL in plpgsql, but CALL from plain SQL would get errors like "cannot display a value of type anyelement", or even crash outright. In v14 and later we can simply examine the exposed types of the CallStmt.outargs nodes to get the right type OIDs. But it's a lot more complicated to fix in v12/v13, because those versions don't have CallStmt.outargs, nor do they do expand_function_arguments until ExecuteCallStmt runs. We have to duplicatively run expand_function_arguments, and then re-determine which elements of the args list are output arguments. Per bug #18463 from Drew Kimball. Back-patch to all supported versions, since it's busted in all of them. Discussion: https://postgr.es/m/18463-f8cd77e12564d8a2@postgresql.org	2024-05-14 20:19:20 -04:00
Nathan Bossart	857d280c65	Fix pg_sequence_last_value() for unlogged sequences on standbys. Presently, when this function is called for an unlogged sequence on a standby server, it will error out with a message like ERROR: could not open file "base/5/16388": No such file or directory Since the pg_sequences system view uses pg_sequence_last_value(), it can error similarly. To fix, modify the function to return NULL for unlogged sequences on standby servers. Since this bug is present on all versions since v15, this approach is preferable to making the ERROR nicer because we need to repair the pg_sequences view without modifying its definition on released versions. For consistency, this commit also modifies the function to return NULL for other sessions' temporary sequences. The pg_sequences view already appropriately filters out such sequences, so there's no bug there, but we might as well offer some defense in case someone invokes this function directly. Unlogged sequences were first introduced in v15, but temporary sequences are much older, so while the fix for unlogged sequences is only back-patched to v15, the temporary sequence portion is back-patched to all supported versions. We could also remove the privilege check in the pg_sequences view definition in v18 if we modify this function to return NULL for sequences for which the current user lacks privileges, but that is left as a future exercise for when v18 development begins. Reviewed-by: Tom Lane, Michael Paquier Discussion: https://postgr.es/m/20240501005730.GA594666%40nathanxps13 Backpatch-through: 12	2024-05-13 15:54:10 -05:00
Tom Lane	6e29963edd	Fix recursive RECORD-returning plpython functions. If we recursed to a new call of the same function, with a different coldeflist (AS clause), it would fail because the inner call would overwrite the outer call's idea of what to return. This is vaguely like `1d2fe56e4` and `c5bec5426`, but it's not due to any API decisions: it's just that we computed the actual output rowtype at the start of the call, and saved it in the per-procedure data structure. We can fix it at basically zero cost by doing the computation at the end of each call instead of the start. It's not clear that there's any real-world use-case for such a function, but given that it doesn't cost anything to fix, it'd be silly not to. Per report from Andreas Karlsson. Back-patch to all supported branches. Discussion: https://postgr.es/m/1651a46d-3c15-4028-a8c1-d74937b54e19@proxel.se	2024-05-09 13:16:21 -04:00
Michael Paquier	8c3f30e675	Fix overread in JSON parsing errors for incomplete byte sequences json_lex_string() relies on pg_encoding_mblen_bounded() to point to the end of a JSON string when generating an error message, and the input it uses is not guaranteed to be null-terminated. It was possible to walk off the end of the input buffer by a few bytes when the last bytes consist of an incomplete multi-byte sequence, as token_terminator would point to a location defined by pg_encoding_mblen_bounded() rather than the end of the input. This commit switches token_terminator so as the error uses data up to the end of the JSON input. More work should be done so as this code could rely on an equivalent of report_invalid_encoding() so as incorrect byte sequences can show in error messages in a readable form. This requires work for at least two cases in the JSON parsing API: an incomplete token and an invalid escape sequence. A more complete solution may be too invasive for a backpatch, so this is left as a future improvement, taking care of the overread first. A test is added on HEAD as test_json_parser makes this issue straight-forward to check. Note that pg_encoding_mblen_bounded() no longer has any callers. This will be removed on HEAD with a separate commit, as this is proving to encourage unsafe coding. Author: Jacob Champion Discussion: https://postgr.es/m/CAOYmi+ncM7pwLS3AnKCSmoqqtpjvA8wmCdoBtKA3ZrB2hZG6zA@mail.gmail.com Backpatch-through: 13	2024-05-09 12:45:45 +09:00
Tom Lane	6a458d93ba	Ensure that "pg_restore -l" reports dependent TOC entries correctly. If -l was specified together with selective-restore options such as -n or -N, dependent TOC entries such as comments would be omitted from the listing, even when an actual restore would have selected them. This happened because PrintTOCSummary neglected to update the te->reqs marking of the entry they depended on. Per report from Justin Pryzby. This has been wrong since `0d4e6ed30` taught _tocEntryRequired to sometimes look at the "reqs" marking of other TOC entries, so back-patch to all supported branches. Discussion: https://postgr.es/m/ZjoeirG7yxODdC4P@pryzbyj2023	2024-05-07 18:23:07 -04:00
Tom Lane	363e8c2f98	Don't corrupt plpython's "TD" dictionary in a recursive trigger call. If a plpython-language trigger caused another one to be invoked, the "TD" dictionary created for the inner one would overwrite the outer one's "TD" dictionary. This is more or less the same problem that `1d2fe56e4` fixed for ordinary functions in plpython, so fix it the same way, by saving and restoring "TD" during a recursive invocation. This fix makes an ABI-incompatible change in struct PLySavedArgs. I'm not too worried about that because it seems highly unlikely that any extension is messing with those structs. We could imagine doing something weird to preserve nominal ABI compatibility in the back branches, like keeping the saved TD object in an extra element of namedargs[]. However, that would only be very nominal compatibility: if anything is touching PLySavedArgs, it would likely do the wrong thing due to not knowing about the additional value. So I judge it not worth the ugliness to do something different there. (I also changed struct PLyProcedure, but its added field fits into formerly-padding space, so that should be safe.) Per bug #18456 from Jacques Combrink. This bug is very ancient, so back-patch to all supported branches. Discussion: https://postgr.es/m/3008982.1714853799@sss.pgh.pa.us	2024-05-07 18:15:00 -04:00
Tom Lane	4a53584cf2	Stamp 15.7.	2024-05-06 16:23:18 -04:00
Tom Lane	7b2ac0f603	Last-minute updates for release notes. Security: CVE-2024-4317	2024-05-06 12:27:26 -04:00
Nathan Bossart	9cc2b62894	Fix privilege checks in pg_stats_ext and pg_stats_ext_exprs. The catalog view pg_stats_ext fails to consider privileges for expression statistics. The catalog view pg_stats_ext_exprs fails to consider privileges and row-level security policies. To fix, restrict the data in these views to table owners or roles that inherit privileges of the table owner. It may be possible to apply less restrictive privilege checks in some cases, but that is left as a future exercise. Furthermore, for pg_stats_ext_exprs, do not return data for tables with row-level security enabled, as is already done for pg_stats_ext. On the back-branches, a fix-CVE-2024-4317.sql script is provided that will install into the "share" directory. This file can be used to apply the fix to existing clusters. Bumps catversion on 'master' branch only. Reported-by: Lukas Fittl Reviewed-by: Noah Misch, Tomas Vondra, Tom Lane Security: CVE-2024-4317 Backpatch-through: 14	2024-05-06 09:00:13 -05:00
Peter Eisentraut	3672c6cdfd	Translation updates Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 141c2cc465bc7bd1e2d43243cf81215b0b14abd4	2024-05-06 12:10:46 +02:00
Tom Lane	ac7049dbf3	Release notes for 16.3, 15.7, 14.12, 13.15, 12.19.	2024-05-05 13:31:09 -04:00
Tom Lane	5f4a1a0a77	Throw a more on-point error for publications depending on columns. Same as `42b041243`, except that the trouble case is a publication WHERE clause that depends on a column. Again reported by Alexander Lakhin. Back-patch to v15 where we added publication WHERE clauses. Discussion: https://postgr.es/m/548a47bc-87ae-b3df-c6a2-60b9966f808b@gmail.com	2024-05-02 17:36:31 -04:00
Peter Eisentraut	da55e4cd1f	doc: Fix description of deterministic flag of CREATE COLLATION The documentation said that you need to pick a suitable LC_COLLATE setting in addition to setting the DETERMINISTIC flag. This would have been correct if the libc provider supported nondeterministic collations, but since it doesn't, you actually need to set the LOCALE option. Reviewed-by: Kashif Zeeshan <kashi.zeeshan@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/a71023c2-0ae0-45ad-9688-cf3b93d0d65b%40eisentraut.org	2024-05-02 08:23:11 +02:00
David Rowley	7e5d20bbd1	Disable run condition optimization for some WindowFuncs `94985c210` added code to detect when WindowFuncs were monotonic and allowed additional quals to be "pushed down" into the subquery to be used as WindowClause runConditions in order to short-circuit execution in nodeWindowAgg.c. The Node representation of runConditions wasn't well selected and because we do qual pushdown before planning the subquery, the planning of the subquery could perform subquery pull-up of nested subqueries. For WindowFuncs with args, the arguments could be changed after pushing the qual down to the subquery. This was made more difficult by the fact that the code duplicated the WindowFunc inside an OpExpr to include in the WindowClauses runCondition field. This could result in duplication of subqueries and a pull-up of such a subquery could result in another initplan parameter being issued for the 2nd version of the subplan. This could result in errors such as: ERROR: WindowFunc not found in subplan target lists Here in the backbranches, we don't have the flexibility to improve the Node representation to resolve this, so instead we just disable the runCondition optimization for ntile() unless the argument is a Const, (v16 only) and likewise for count(expr) (both v15 and v16). count(*) is unaffected. All other window functions which support this optimization all take zero arguments and therefore are unaffected. Bug: #18170 Reported-by: Zuming Jiang Discussion: https://postgr.es/m/18170-f1d17bf9a0d58b24@postgresql.org Backpatch-through 15 (master will be fixed independently)	2024-05-01 16:35:37 +12:00

... 2 3 4 5 6 ...

55150 Commits