This should have the same results for all practical purposes.
The advantage of selecting 'GMT' is that it's guaranteed to work
even when the remote system's timezone database is missing
entries, because pg_tzset() hard-wires handling of that,
at least in 9.2 and later.
(It seems like it would be a good idea to similarly hard-wire
correct handling of 'UTC', but that'll be a little more invasive
than I want to consider back-patching. Leave that for another
day when we're not in feature freeze.)
Per trouble report from Adnan Dautovic. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/465248.1712211585@sss.pgh.pa.us
This fixes various typos, duplicated words, and tiny bits of whitespace
mainly in code comments but also in docs.
Author: Daniel Gustafsson <daniel@yesql.se>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Alexander Lakhin <exclusion@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/3F577953-A29E-4722-98AD-2DA9EFF2CBB8@yesql.se
As explained in 4d916dd876, the test instability is caused by delayed
cleanup of deleted rows. This commit removes the DELETE, stabilizing the
test without accidentally disabling parallel builds.
The intent of the delete, however, was to produce empty ranges, and test
that the parallel index build populates those correctly. But there's
another way to create empty ranges - partial indexes, which do not
rely on cleanup of deleted rows.
Idea to use partial indexes by Matthias van de Meent, patch by me.
Discussion: https://postgr.es/m/95d9cd43-5a92-407c-b7e4-54cd303630fe%40enterprisedb.com
The test for parallel create of BRIN indexes added by commit 8225c2fd40
happens to be unstable - a background transaction (e.g. auto-analyze)
may hold back global xmin for the initial VACUUM / CREATE INDEX. If the
cleanup happens before the next CREATE INDEX, the indexes will not be
exactly the same.
This is the same issue as e2933a6e11, so fix it the same way by making
the table TEMPORARY, which uses an up-to-date cutoff xmin that is not
held back by other processes.
Reported by Alexander Lakhin, who also suggested the fix.
Author: Alexander Lakhin
Discussion: https://postgr.es/m/b58901cd-a7cc-29c6-e2b1-e3d7317c3c69@gmail.com
Discussion: https://postgr.es/m/2892135.1668976646@sss.pgh.pa.us
Adds a regression test for parallel CREATE INDEX for BRIN indexes, to
improve coverage for BRIN code, particularly code to allow parallel
index builds introduced by b437571714.
The test is added to pageinspect, as that allows comparing the index to
one built without parallelism. Another option would be to just build the
index with parallelism and then check it produces correct results. But
checking that the index is exactly the same as one built without
parallelism makes these query checks unnecessary.
Discussion: https://postgr.es/m/1df00a66-db5a-4e66-809a-99b386a06d86%40enterprisedb.com
This adjusts various appendStringInfo* function calls to use a more
appropriate and efficient function with the same behavior. For example,
use appendStringInfoChar() rather than appendStringInfo() when appending
a single character, and appendStringInfoString() rather than
appendStringInfo() when no formatting is required.
All adjustments made here are in code that's new to v17, so it makes
sense to fix these now rather than wait a few years and make
backpatching harder.
Discussion: https://postgr.es/m/CAApHDvojY2UvMiO+9_55ArTj10P1LBNJyyoGB+C65BLDNT0GsQ@mail.gmail.com
Reviewed-by: Nathan Bossart, Tom Lane
When testing buffer pool logic, it is useful to be able to evict
arbitrary blocks. This function can be used in SQL queries over the
pg_buffercache view to set up a wide range of buffer pool states. Of
course, buffer mappings might change concurrently so you might evict a
block other than the one you had in mind, and another session might
bring it back in at any time. That's OK for the intended purpose of
setting up developer testing scenarios, and more complicated interlocking
schemes to give stronger guarantees about that would likely be less
flexible for actual testing work anyway. Superuser-only.
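A hedged usage sketch (assuming the function is exposed as
pg_buffercache_evict(), taking a buffer ID and returning whether the
eviction succeeded; the table name is hypothetical):
-- try to evict every buffer currently caching blocks of table t
SELECT bufferid, pg_buffercache_evict(bufferid)
  FROM pg_buffercache
 WHERE relfilenode = pg_relation_filenode('t'::regclass);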
Author: Palak Chaturvedi <chaturvedipalak1911@gmail.com>
Author: Thomas Munro <thomas.munro@gmail.com> (docs, small tweaks)
Reviewed-by: Nitin Jadhav <nitinjadhavpostgres@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Cary Huang <cary.huang@highgo.ca>
Reviewed-by: Cédric Villemain <cedric.villemain+pgsql@abcsql.com>
Reviewed-by: Jim Nasby <jim.nasby@gmail.com>
Reviewed-by: Maxim Orlov <orlovmg@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CALfch19pW48ZwWzUoRSpsaV9hqt0UPyaBPC4bOZ4W+c7FF566A@mail.gmail.com
Presently, the archiver process restarts when an archive callback
ERRORs. To avoid this, archive module authors can use sigsetjmp(),
manage a memory context, etc., but that requires a lot of extra
code that will likely look roughly the same between modules. This
commit adds basic archive callback ERROR handling to pgarch.c so
that module authors won't ordinarily need to worry about this.
While this built-in handler attempts to clean up anything that an
archive module could conceivably have left behind, it is possible
that some modules are doing unexpected things that require
additional cleanup. Module authors should be sure to do any extra
required cleanup in a PG_CATCH block within the archiving callback.
The archiving callback is now called in a short-lived memory
context that the archiver process resets between invocations. If a
module requires longer-lived storage, it must maintain its own
memory context.
Thanks to these changes, the basic_archive module can be greatly
simplified.
Suggested-by: Andres Freund
Reviewed-by: Andres Freund, Yong Li
Discussion: https://postgr.es/m/20230217215624.GA3131134%40nathanxps13
The test fails when RESET statement_timeout takes longer than 10ms.
Avoid the problem by using SET LOCAL instead.
Overall, this test is not ideal: 10ms could be shorter than the time needed
to send the query to the "remote" server, so it's possible that on
some machines this test doesn't actually witness a remote query being
cancelled. We may want to improve on this someday by using some other
testing technique, but for now it's better than nothing. I verified
manually that one round of remote cancellation occurs when this runs on
my machine.
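For illustration, a hedged sketch of the adjusted pattern (foreign table
name ft1 hypothetical):
BEGIN;
SET LOCAL statement_timeout = '10ms';
SELECT count(*) FROM ft1;  -- expected to be cancelled on the remote side
COMMIT;  -- the SET LOCAL setting reverts here, so no slow RESET is needed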
Discussion: https://postgr.es/m/CAGECzQRsdWnj=YaaPCnA8d7E1AdbxRPBYmyBQRMPUijR2MpM_w@mail.gmail.com
Commit 61461a300c introduced new functions to libpq for cancelling
queries. This commit introduces a helper function that backend-side
libraries and extensions can use to invoke those. This function takes a
timeout and can itself be interrupted while it is waiting for a cancel
request to be sent and processed, instead of being blocked.
This replaces the usage of the old functions in postgres_fdw and dblink.
Finally, it also adds some test coverage for the cancel support in
postgres_fdw.
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/CAGECzQT_VgOWWENUqvUV9xQmbaCyXjtRRAYO8W07oqashk_N+g@mail.gmail.com
Since 82a4edabd2 we can bulk extend relations. The bulk relation extension
logic has a heuristic component. Normally the heuristic does not trigger in the
occasionally-failing test case, as the relation is only extended once. But
with very small shared_buffers, the limits on the number of buffers pinned at
once prevent the extension from happening in a single step. With the second "bulk"
extension, the heuristic kicks in, and the relation ends up one block bigger.
That's ok from a correctness perspective, but changes the results of the test
query due to one additional block.
We discussed a few more expansive fixes, but for now have decided to avoid
this by making the table a bit smaller.
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/29c74104-210b-ef39-2522-27a6aa7a704f@iki.fi
Discussion: https://postgr.es/m/20230916000011.2ugpkkkp7bpp4cfh@awork3.anarazel.de
Backpatch: 16-, where the new relation extension logic was added
Until now, UNION queries have often been suboptimal as the planner has
only ever considered using an Append node and making the results unique
by either using a Hash Aggregate, or by Sorting the entire Append result
and running it through the Unique operator. Both of these methods
always require reading all rows from the union subqueries.
Here we adjust the union planner so that it can request that each subquery
produce results in target list order so that these can be Merge Appended
together and made unique with a Unique node. This can improve performance
significantly as the union child can make use of the likes of btree
indexes and/or Merge Joins to provide the top-level UNION with presorted
input. This is especially good if the top-level UNION contains a LIMIT
node that limits the output rows to a small subset of the unioned rows as
cheap startup plans can be used.
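As a hedged illustration (hypothetical tables t1 and t2, each with a btree
index on column a), a query of this shape can now be executed as a Merge
Append plus Unique over index scans, rather than sorting or hashing the
entire Append result:
SELECT a FROM t1
UNION
SELECT a FROM t2
ORDER BY a
LIMIT 10;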
Author: David Rowley
Reviewed-by: Richard Guo, Andy Fan
Discussion: https://postgr.es/m/CAApHDvpb_63XQodmxKUF8vb9M7CxyUyT4sWvEgqeQU-GB7QFoQ@mail.gmail.com
Add PERIOD clause to foreign key constraint definitions. This is
supported for range and multirange types. Temporal foreign keys check
for range containment instead of equality.
This feature matches the behavior of the SQL standard temporal foreign
keys, but it works on PostgreSQL's native ranges instead of SQL's
"periods", which don't exist in PostgreSQL (yet).
Reference actions ON {UPDATE,DELETE} {CASCADE,SET NULL,SET DEFAULT}
are not supported yet.
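A minimal sketch of the new syntax (hypothetical tables; the scalar key
part of the referenced key relies on the btree_gist extension):
CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE TABLE booking (
    room int,
    during daterange,
    PRIMARY KEY (room, during WITHOUT OVERLAPS)
);
CREATE TABLE guest_stay (
    guest int,
    room int,
    during daterange,
    FOREIGN KEY (room, PERIOD during)
        REFERENCES booking (room, PERIOD during)
);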
Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
It might happen that the varlena value wasn't compressed by index_form_tuple()
due to current storage parameters. If compression is currently enabled, we
need to compress such values to match the index tuple coming from the heap.
Backpatch to all supported versions.
Discussion: https://postgr.es/m/flat/7bdbe559-d61a-4ae4-a6e1-48abdf3024cc%40postgrespro.ru
Author: Andrey Borodin
Reviewed-by: Alexander Lakhin, Michael Zhilin, Jian He, Alexander Korotkov
Backpatch-through: 12
In the heap, tuples may contain short varlena datums with either 1B or 4B
headers. But the corresponding index tuples should always have such varlenas
with 1B headers. So, for fingerprinting, we need to convert.
Backpatch to all supported versions.
Discussion: https://postgr.es/m/flat/7bdbe559-d61a-4ae4-a6e1-48abdf3024cc%40postgrespro.ru
Author: Michael Zhilin
Reviewed-by: Alexander Lakhin, Andrey Borodin, Jian He, Alexander Korotkov
Backpatch-through: 12
This adds the X509 attributes notBefore and notAfter to sslinfo
as well as pg_stat_ssl to allow verifying and identifying the
validity period of the current client certificate. OpenSSL has
APIs for extracting notAfter and notBefore, but they are only
supported in recent versions so we have to calculate the dates
by hand in order to make this work for the older versions of
OpenSSL that we still support.
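For example (a hedged sketch; the new column names are assumed to follow
the view's existing naming style):
SELECT ssl, not_before, not_after
  FROM pg_stat_ssl
 WHERE pid = pg_backend_pid();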
Original patch by Cary Huang with additional hacking by Jacob
and myself.
Author: Cary Huang <cary.huang@highgo.ca>
Co-author: Jacob Champion <jacob.champion@enterprisedb.com>
Co-author: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/182b8565486.10af1a86f158715.2387262617218380588@highgo.ca
Historically we've printed SubPlan expression nodes as "(SubPlan N)",
which is pretty uninformative. Trying to reproduce the original SQL
for the subquery is still as impractical as before, and would be
mighty verbose as well. However, we can still do better than that.
Displaying the "testexpr" when present, and adding a keyword to
indicate the SubLinkType, goes a long way toward showing what's
really going on.
In addition, this patch gets rid of EXPLAIN's use of "$n" to represent
subplan and initplan output Params. Instead we now print "(SubPlan
N).colX" or "(InitPlan N).colX" to represent the X'th output column
of that subplan. This eliminates confusion with the use of "$n" to
represent PARAM_EXTERN Params, and it's useful for the first part of
this change because it eliminates needing some other indication of
which subplan is referenced by a SubPlan that has a testexpr.
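A hedged sketch of the display change (hypothetical tables; output
abbreviated):
EXPLAIN (VERBOSE, COSTS OFF)
SELECT * FROM t1 WHERE t1.a = (SELECT max(b) FROM t2);
-- the filter formerly shown as (t1.a = $0)
-- now appears as (t1.a = (InitPlan 1).col1)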
In passing, this adds simple regression test coverage of the
ROWCOMPARE_SUBLINK code paths, which were entirely unburdened
by testing before.
Tom Lane and Dean Rasheed, reviewed by Aleksander Alekseev.
Thanks to Chantal Keller for raising the question of whether
this area couldn't be improved.
Discussion: https://postgr.es/m/2838538.1705692747@sss.pgh.pa.us
Commit 61461a300c introduced new functions to libpq for cancelling
queries. This replaces the usage of the old ones in parts of the
codebase with these newer ones. This specifically leaves out changes to
psql and pgbench, as those would need a much larger refactor to be able
to call them due to the new functions not being signal-safe; and also
postgres_fdw, because the original code there is not clear to me
(Álvaro) and not fully tested.
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/CAGECzQT_VgOWWENUqvUV9xQmbaCyXjtRRAYO8W07oqashk_N+g@mail.gmail.com
Currently, pg_visibility computes its xid horizon using
GetOldestNonRemovableTransactionId(). The problem is that this horizon can
sometimes go backward. That can lead to reporting false errors.
In order to fix that, this commit implements a new function
GetStrictOldestNonRemovableTransactionId(). This function computes an xid
horizon that is guaranteed to be newer than or equal to any xid horizon
computed before.
We have to do the following to achieve this.
1. Ignore process xmins, because they take into account connections to other
databases that were ignored before.
2. Ignore KnownAssignedXids, because they are not database-aware, while the
primary could have computed its horizons in a database-aware way.
3. Ignore walsender xmin, because it could go backward if some replication
connections don't use replication slots.
As a result, we use only currently running xids to compute the horizon. This
surely sacrifices accuracy, but we have to do so to avoid reporting false
errors.
Inspired by an earlier patch by Daniel Shelepanov and the following discussion
with Robert Haas and Tom Lane.
Discussion: https://postgr.es/m/1649062270.289865713%40f403.i.mail.ru
Reviewed-by: Alexander Lakhin, Dmitry Koval
For UNION ALL queries where a union child query contained a foreign
table, if the targetlist of that query contained a constant, and the
top-level query performed an ORDER BY which contained the column for the
constant value, then postgres_fdw would find the EquivalenceMember with
the Const and then try to produce an ORDER BY containing that Const.
This caused problems with INT typed Consts as these could appear to be
requests to order by an ordinal column position rather than the constant
value. This could lead to either an error such as:
ERROR: ORDER BY position <int const> is not in select list
or worse, if the constant value is a valid column position, then we could just
sort by the wrong column altogether.
Here we fix this issue by just not including these Consts in the ORDER
BY clause.
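A hedged sketch of the problematic query shape (foreign tables
hypothetical):
SELECT 1 AS tag, a FROM ft1
UNION ALL
SELECT 2 AS tag, a FROM ft2
ORDER BY tag, a;
-- deparsing the Const as "ORDER BY 1" on the remote side reads as an
-- ordinal column reference; such Consts are now omitted from the
-- remote ORDER BY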
In passing, add a new section for testing ORDER BY in the postgres_fdw
tests and move two existing tests which were misplaced in the WHERE
clause testing section into it.
Reported-by: Michał Kłeczek
Reviewed-by: Ashutosh Bapat, Richard Guo
Bug: #18381
Discussion: https://postgr.es/m/0714C8B8-8D82-4ABB-9F8D-A0C3657E7B6E%40kleczek.org
Discussion: https://postgr.es/m/18381-137456acd168bf93%40postgresql.org
Backpatch-through: 12, oldest supported version
contrib/tablefunc connectby() checks both type OID and typmod for
its output columns while crosstab() only checks type OID. Fix that
by making the crosstab() check look more like the connectby() check.
Reported-by: Tom Lane
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/flat/18937.1709676295%40sss.pgh.pa.us
These messages were fairly confusing, and didn't match the
column names used in the SGML docs. Try to improve that.
Also use error codes more specific than ERRCODE_SYNTAX_ERROR.
Patch by me, reviewed by Joe Conway
Discussion: https://postgr.es/m/18937.1709676295@sss.pgh.pa.us
Use GUC_ACTION_SAVE rather than GUC_ACTION_SET, necessary for working
with parallel query.
Now that the call requires more arguments, wrap the call in a new
function to avoid code duplication and offer a place for a comment.
Discussion: https://postgr.es/m/E1rhJpO-0027Wf-9L@gemulon.postgresql.org
Presently, if an archive module's check_configured_cb callback
returns false, a generic WARNING message is emitted, which
unfortunately provides no actionable details about the reason why
the module is not configured. This commit introduces a macro that
archive module authors can use to add a DETAIL line to this WARNING
message.
Co-authored-by: Tung Nguyen
Reviewed-by: Daniel Gustafsson, Álvaro Herrera
Discussion: https://postgr.es/m/4109578306242a7cd5661171647e11b2%40oss.nttdata.com
The adminpack extension was only used to support pgAdmin III, which
in turn was declared EOL many years ago. Removing the extension also
allows us to remove functions from core which were only used to support
old versions of adminpack.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/CALj2ACUmL5TraYBUBqDZBi1C+Re8_=SekqGYqYprj_W8wygQ8w@mail.gmail.com
The new facility makes it easier to optimize bulk loading, as the
logic for buffering, WAL-logging, and syncing the relation only needs
to be implemented once. It's also less error-prone: We have had a
number of bugs in how a relation is fsync'd - or not - at the end of a
bulk loading operation. By centralizing that logic to one place, we
only need to write it correctly once.
The new facility is faster for small relations: Instead of calling
smgrimmedsync(), we register the fsync to happen at next checkpoint,
which avoids the fsync latency. That can make a big difference if you
are e.g. restoring a schema-only dump with lots of relations.
It is also slightly more efficient with large relations, as the WAL
logging is performed multiple pages at a time. That avoids some WAL
header overhead. The sorted GiST index build already did that; this commit
moves the buffering to the new facility.
The changes to the pageinspect GiST test need an explanation: Before this
patch, the sorted GiST index build set the LSN on every page to the
special GistBuildLSN value, not the LSN of the WAL record, even though
they were WAL-logged. There was no particular need for it, it just
happened naturally when we wrote out the pages before WAL-logging
them. Now we WAL-log the pages first, like in B-tree build, so the
pages are stamped with the record's real LSN. When the build is not
WAL-logged, we still use GistBuildLSN. To make the test output
predictable, use an unlogged index.
Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/30e8f366-58b3-b239-c521-422122dd5150%40iki.fi
Commit 6b80394781 introduced integer comparison functions designed
to be as efficient as possible while avoiding overflow. This
commit makes use of these functions in many of the in-tree qsort()
comparators to help ensure transitivity. Many of these comparator
functions should also see a small performance boost.
Author: Mats Kindahl
Reviewed-by: Andres Freund, Fabrízio de Royes Mello
Discussion: https://postgr.es/m/CA%2B14426g2Wa9QuUpmakwPxXFWG_1FaY0AsApkvcTBy-YfS6uaw%40mail.gmail.com
For ANY-SUBLINK, we adopted a two-stage pull-up approach to handle
different types of scenarios. In the first stage, the sublink is pulled up
as a subquery. Because of this, when writing this code, we did not have
the ability to perform lateral joins, and therefore, we were unable to
pull up Vars with varlevelsup = 1. Now that we have the ability to use
lateral joins, we can eliminate this limitation.
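For example (hypothetical tables), a correlated ANY sublink whose subquery
references the outer query level can now be pulled up:
SELECT *
FROM t1
WHERE t1.a IN (SELECT t2.a + t1.b FROM t2);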
Author: Andy Fan <zhihui.fan1213@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru>
Reviewed-by: Andrey Lepikhov <a.lepikhov@postgrespro.ru>
This commit introduces a new SQL function pg_sync_replication_slots()
which is used to synchronize the logical replication slots from the
primary server to the physical standby so that logical replication can be
resumed after a failover or planned switchover.
A new 'synced' flag is introduced in pg_replication_slots view, indicating
whether the slot has been synchronized from the primary server. On a
standby, synced slots cannot be dropped or consumed, and any attempt to
perform logical decoding on them will result in an error.
The logical replication slots on the primary can be synchronized to the
hot standby by using the 'failover' parameter of
pg_create_logical_replication_slot(), or by using the 'failover' option of
CREATE SUBSCRIPTION during slot creation, and then calling
pg_sync_replication_slots() on standby. For the synchronization to work,
it is mandatory to have a physical replication slot between the primary
and the standby, i.e. 'primary_slot_name' should be configured on the
standby, and 'hot_standby_feedback' must be enabled on the standby. It is
also necessary to specify a valid 'dbname' in the 'primary_conninfo'.
If a logical slot is invalidated on the primary, then that slot on the
standby is also invalidated.
If a logical slot on the primary is valid but is invalidated on the
standby, then that slot is dropped but will be recreated on the standby in
the next pg_sync_replication_slots() call provided the slot still exists
on the primary server. It is okay to recreate such slots as long as these
are not consumable on standby (which is the case currently). This
situation may occur due to the following reasons:
- The 'max_slot_wal_keep_size' on the standby is insufficient to retain
WAL records from the restart_lsn of the slot.
- 'primary_slot_name' is temporarily reset to null and the physical slot
is removed.
The slot synchronization status on the standby can be monitored using the
'synced' column of pg_replication_slots view.
The functionality to automatically synchronize slots by a background worker
and to allow logical walsenders to wait for the physical standby will be
added in subsequent commits.
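A hedged sketch of the standby-side usage:
-- on the standby, with primary_slot_name, hot_standby_feedback and a
-- valid dbname in primary_conninfo configured:
SELECT pg_sync_replication_slots();
SELECT slot_name, synced FROM pg_replication_slots;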
Author: Hou Zhijie, Shveta Malik, Ajin Cherian based on an earlier version by Peter Eisentraut
Reviewed-by: Masahiko Sawada, Bertrand Drouvot, Peter Smith, Dilip Kumar, Nisha Moond, Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
The following functions use a mix of bytea and text arguments, but their
C internals always used PG_GETARG_BYTEA_PP(), creating an incorrect mix
with the argument types expected by encrypt_internal():
- pgp_sym_encrypt_bytea(bytea,text[,text])
- pgp_sym_encrypt(text,text[,text])
- pgp_sym_decrypt_bytea(bytea,text[,text])
- pgp_sym_decrypt(bytea,text[,text])
- pgp_pub_encrypt_bytea(bytea,bytea[,text])
- pgp_pub_encrypt(text,bytea[,text])
- pgp_pub_decrypt_bytea(bytea, bytea[,text[,text]])
- pgp_pub_decrypt(bytea,bytea[,text[,text]])
This commit fixes the inconsistencies between the PG_GETARG*() macros
and the argument types of each function.
Both BYTEA_PP() and TEXT_PP() rely on PG_DETOAST_DATUM_PACKED(), which
returns an unaligned pointer, so this was not leading to an actual
problem as far as I know, but let's be consistent.
Author: Shihao Zhong
Discussion: https://postgr.es/m/CAGRkXqRfiWT--DzVPx_UGpNHTt0YT5Jo8eV2CtT56jNP=QpXSQ@mail.gmail.com
The code copying the PGP block into the temp buffer failed to
account for the extra 2 bytes in the buffer which are needed
for the prefix. If the block was oversized, subsequent checks
of the prefix would have exceeded the buffer size. Since the
block sizes are hardcoded in the list of supported ciphers it
can be verified that there is no live bug here. Backpatch all
the way for consistency though, as this bug is old.
Author: Mikhail Gribkov <youzhick@gmail.com>
Discussion: https://postgr.es/m/CAMEv5_uWvcMCMdRFDsJLz2Q8g16HEa9xWyfrkr+FYMMFJhawOw@mail.gmail.com
Backpatch-through: v12
libxml2 changed the required signature of error handler callbacks
to make the passed xmlError struct "const". This is causing build
failures on buildfarm member caiman, and no doubt will start showing
up in the field quite soon. Add a version check to adjust the
declaration of xml_errorHandler() according to LIBXML_VERSION.
2.12.x also produces deprecation warnings for contrib/xml2/xpath.c's
assignment to xmlLoadExtDtdDefaultValue. I see no good reason for
that to still be there, seeing that we disabled external DTDs (at a
lower level) years ago for security reasons. Let's just remove it.
Back-patch to all supported branches, since they might all get built
with newer libxml2 once it gets a bit more popular. (The back
branches produce another deprecation warning about xpath.c's use of
xmlSubstituteEntitiesDefault(). We ought to consider whether to
back-patch all or part of commit 65c5864d7 to silence that. It's
less urgent though, since it won't break the buildfarm.)
Discussion: https://postgr.es/m/1389505.1706382262@sss.pgh.pa.us
This adds a new "Memory:" line under the "Planning:" group (which
currently only has "Buffers:") when the MEMORY option is specified.
In order to make the reporting reasonably accurate, we create a separate
memory context for planner activities, to be used only when this option
is given. The total amount of memory allocated by that context is
reported as "allocated"; we subtract memory in the context's freelists
from that and report that result as "used". We use
MemoryContextStatsInternal() to obtain the quantities.
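A hedged usage sketch (output shape abbreviated):
EXPLAIN (MEMORY) SELECT * FROM pg_class;
-- Planning:
--   Memory: used=<n> bytes  allocated=<n> bytes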
The code structure to show buffer usage during planning was not in
amazing shape, so I (Álvaro) modified the patch a bit to clean that up
in passing.
Author: Ashutosh Bapat
Reviewed-by: David Rowley, Andrey Lepikhov, Jian He, Andy Fan
Discussion: https://www.postgresql.org/message-id/CAExHW5sZA=5LJ_ZPpRO-w09ck8z9p7eaYAqq3Ks9GDfhrxeWBw@mail.gmail.com
Since commit a4ccc1cef, the 'node' and 'alloc_tuple_size' fields of
the ReorderBufferTupleBuf structure are no longer used. This leaves
only the 'tuple' field in the structure. Since keeping a single-field
structure makes little sense, the ReorderBufferTupleBuf is removed
entirely. The code is refactored accordingly.
No back-patching since these are ABI changes in an exposed structure
and functions, and there would be some risk of breaking extensions.
Author: Aleksander Alekseev
Reviewed-by: Amit Kapila, Masahiko Sawada, Reid Thompson
Discussion: https://postgr.es/m/CAD21AoCvnuxiXXfRecp7g9+CeC35POQfhuQeJFr7_9u_Q5jc_Q@mail.gmail.com
This reverts commit 2197d06224, following a discussion over a Coverity
report where issues like the "billion laughs" attack could cause the
backend to waste CPU and memory even if a client applied checks on the
size of the data given in input, and libxml2 does not offer guarantees
that input limits are respected under XML_PARSE_HUGE.
Discussion: https://postgr.es/m/ZbHlgrPLtBZyr_QW@paquier.xyz
This commit adds the failover property to the replication slot. The
failover property indicates whether the slot will be synced to the standby
servers, enabling the resumption of corresponding logical replication
after failover. But note that this commit does not yet include the
capability to sync the replication slot; the subsequent commits will add
that capability.
A new optional parameter 'failover' is added to the
pg_create_logical_replication_slot() function. We will also enable setting
the 'failover' option for slots via the subscription commands in
subsequent commits.
The value of the 'failover' flag is displayed as part of
the pg_replication_slots view.
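A hedged usage sketch (slot and plugin names hypothetical):
SELECT pg_create_logical_replication_slot('myslot', 'pgoutput',
                                          failover => true);
SELECT slot_name, failover FROM pg_replication_slots;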
Author: Hou Zhijie, Shveta Malik, Ajin Cherian
Reviewed-by: Peter Smith, Bertrand Drouvot, Dilip Kumar, Masahiko Sawada, Nisha Moond, Kuroda, Hayato, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
Add WITHOUT OVERLAPS clause to PRIMARY KEY and UNIQUE constraints.
These are backed by GiST indexes instead of B-tree indexes, since they
are essentially exclusion constraints with = for the scalar parts of
the key and && for the temporal part.
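A minimal sketch of the new constraint syntax (hypothetical table; the
scalar key part relies on the btree_gist extension):
CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE TABLE reservation (
    room int,
    during tsrange,
    UNIQUE (room, during WITHOUT OVERLAPS)
);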
Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
Commit abb0b4fc03 moved the shared state for autoprewarm to a
dynamic shared memory (DSM) segment, but it left apw_detach_shmem()
in the on_shmem_exit callback list for the autoprewarm leader
process. This is a problem because shmem_exit() detaches all the
DSM segments prior to calling the on_shmem_exit callbacks, thus
producing segfaults in the exit path for the autoprewarm leader
process.
To fix, move apw_detach_shmem() to the before_shmem_exit callback
list. This commit also adds a check to pg_prewarm's test that the
server shut down normally. It might be worth making this a common
check for all shutdowns in TAP tests, but that is left as a future
exercise.
Reported-by: Andres Freund
Reviewed-by: Andres Freund, Álvaro Herrera
Discussion: https://postgr.es/m/20240122204117.swton324xcoodnyi%40awork3.anarazel.de
Until now PostgreSQL has not been very smart about optimizing away IS
NOT NULL base quals on columns defined as NOT NULL. The evaluation of
these needless quals adds overhead. Ordinarily, anyone who came
complaining about that would likely just have been told to not include
the qual in their query if it's not required. However, a recent bug
report indicates this might not always be possible.
Bug 17540 highlighted that when we optimize Min/Max aggregates the IS NOT
NULL qual that the planner adds to make the rewritten plan ignore NULLs
can cause issues with poor index choice. That particular case
demonstrated that other quals, especially ones for which no statistics are
available to let the planner estimate even an approximate selectivity, can
result in poor index choice due to cheap startup paths being preferred
with LIMIT 1.
Here we take a generic approach to fixing this by having the planner check
for NOT NULL columns and remove these quals (when they're not needed) for
all queries, not just when optimizing Min/Max aggregates.
Additionally, here we also detect IS NULL quals on a NOT NULL column and
transform that into a gating qual so that we don't have to perform the
scan at all. This also works for join relations when the Var is not
nullable by any outer join.
This also helps with the self-join removal work as it must replace
strict join quals with IS NOT NULL quals to ensure equivalence with the
original query.
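A hedged illustration (hypothetical table):
CREATE TABLE t (a int NOT NULL, b text);
EXPLAIN SELECT * FROM t WHERE a IS NOT NULL;  -- redundant qual is removed
EXPLAIN SELECT * FROM t WHERE a IS NULL;      -- reduced to a gating qual,
                                              -- so the scan is skipped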
Author: David Rowley, Richard Guo, Andy Fan
Reviewed-by: Richard Guo, David Rowley
Discussion: https://postgr.es/m/CAApHDvqg6XZDhYRPz0zgOcevSMo0d3vxA9DvHrZtKfqO30WTnw@mail.gmail.com
Discussion: https://postgr.es/m/17540-7aa1855ad5ec18b4%40postgresql.org
Besides showcasing the DSM registry, this prevents pg_prewarm from
stealing from the main shared memory segment's extra buffer space
when autoprewarm_start_worker() and autoprewarm_dump_now() are used
without loading the module via shared_preload_libraries.
Suggested-by: Michael Paquier
Reviewed-by: Bharath Rupireddy
Discussion: https://postgr.es/m/20231205034647.GA2705267%40nathanxps13
This is support function 12 for the GiST AM and translates
"well-known" RT*StrategyNumber values into whatever strategy number is
used by the opclass (since no particular numbers are actually
required). We will use this to support temporal PRIMARY
KEY/UNIQUE/FOREIGN KEY/FOR PORTION OF functionality.
This commit adds two implementations, one for internal GiST opclasses
(just an identity function) and another for btree_gist opclasses. It
updates btree_gist from 1.7 to 1.8, adding the support function for
all its opclasses.
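For example, an existing installation can pick up the new support function
with:
ALTER EXTENSION btree_gist UPDATE TO '1.8';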
Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
This commit adds XML_PARSE_HUGE to the libxml2 functions used in core
for the parsing of XML objects, raising the original limit of 10MB
supported by libxml2.
In most code paths of upstream, XML_MAX_TEXT_LENGTH (10^7) is the
historical limit that gets upgraded to XML_MAX_HUGE_LENGTH (10^9) once
XML_PARSE_HUGE is given to the parser calls. These are still limited by
any palloc() calls for text, up to 1GB.
This makes it possible for the backend to handle XML objects larger than
10MB in general, along with a higher depth limit. This
change affects the contrib module xml2, the xml data type and SQL/XML.
Author: Dmitry Koval
Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/18274-98d16bc03520665f@postgresql.org
Some functions are used in the tree and are currently marked as
deprecated by upstream. This commit refreshes the code to use the
recommended functions, leading to the following changes:
- xmlSubstituteEntitiesDefault() is gone, and needs to be replaced with
XML_PARSE_NOENT for the paths doing the parsing.
- xmlParseMemory() -> xmlReadMemory().
These functions, as well as more functions setting global states, have
been officially marked as deprecated by upstream in August 2022. Their
replacements have existed since the 2001-ish era, as far as I have checked,
so that should be safe.
Author: Dmitry Koval
Discussion: https://postgr.es/m/18274-98d16bc03520665f@postgresql.org
This replaces dblink's blocking libpq calls, allowing cancellation and
allowing DROP DATABASE (of a database not involved in the query). Apart
from explicit dblink_cancel_query() calls, dblink still doesn't cancel
the remote side. The replacement for the blocking calls consists of
new, general-purpose query execution wrappers in the libpqsrv facility.
Out-of-tree extensions should adopt these. Use them in postgres_fdw,
replacing a local implementation from which the libpqsrv implementation
derives. This is a bug fix for dblink. Code inspection identified the
bug at least thirteen years ago, but user complaints have not appeared.
Hence, no back-patch for now.
Discussion: https://postgr.es/m/20231122012945.74@rfd.leadboat.com
An array element equal to INT_MAX gave this code indigestion,
causing an infinite loop that surely ended in SIGSEGV. We fixed
some nearby problems awhile ago (cf 757c5182f) but missed this.
Report and diagnosis by Alexander Lakhin (bug #18273); patch by me
Discussion: https://postgr.es/m/18273-9a832d1da122600c@postgresql.org
New TAP tests added by commits 9a17be1e and 4710b67d neglected to convert
warnings to FATAL. This commit fixes that.
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
There are a lot of Perl scripts in the tree, mostly code generation
and TAP tests. Occasionally, these scripts produce warnings. These
are probably always mistakes on the developer side (true positives).
Typical examples are warnings from genbki.pl or related when you make
a mess in the catalog files during development, or warnings from tests
when they massage a config file that looks different on different
hosts, or mistakes during merges (e.g., duplicate subroutine
definitions), or just mistakes that weren't noticed because there is a
lot of output in a verbose build.
This changes all warnings into fatal errors, by replacing
use warnings;
by
use warnings FATAL => 'all';
in all Perl files.
Discussion: https://www.postgresql.org/message-id/flat/06f899fd-1826-05ab-42d6-adeb1fd5e200%40eisentraut.org
This function reads a page directly from a relation, relying on
index_open() to open the index to read from. Unfortunately, this would
crash when using partitioned indexes, as these can be opened with
index_open() but they have no physical pages.
Alexander has fixed the module, while I have written the test.
Author: Alexander Lakhin, Michael Paquier
Discussion: https://postgr.es/m/18246-f4d9ff7cb3af77e6@postgresql.org
Backpatch-through: 12
As coded, the function relied on index_open() when opening an index
relation, allowing partitioned indexes to be processed by
pgstathashindex(). This was leading to a "could not open file" error
because partitioned indexes have no physical files, or to a crash with
an assertion failure (like on HEAD).
This issue is fixed by applying the same checks as the other stat
functions for indexes, looking for both RELKIND_INDEX and the expected
index AM.
Author: Alexander Lakhin
Discussion: https://postgr.es/m/18246-f4d9ff7cb3af77e6@postgresql.org
Backpatch-through: 12
Teach _bt_binsrch (and related helper routines like _bt_search and
_bt_compare) about the initial positioning requirements of backward
scans. Routines like _bt_binsrch already know all about "nextkey"
searches, so it seems natural to teach them about "goback"/backward
searches, too. These concepts are closely related, and are much easier
to understand when discussed together.
Now that certain implementation details are hidden from _bt_first, it's
straightforward to add a new optimization: backward scans using the <
strategy now avoid extra leaf page accesses in certain "boundary cases".
Consider the following example, which uses the tenk1 table (and its
tenk1_hundred index) from the standard regression tests:
SELECT * FROM tenk1 WHERE hundred < 12 ORDER BY hundred DESC LIMIT 1;
Before this commit, nbtree would scan two leaf pages, even though it was
only really necessary to scan one leaf page. We'll now descend straight
to the leaf page containing a (12, -inf) high key instead. The scan
will locate matching non-pivot tuples with "hundred" values starting
from the value 11. The scan won't waste a page access on the right
sibling leaf page, which cannot possibly contain any matching tuples.
You can think of the optimization added by this commit as disabling an
optimization (the _bt_compare "!pivotsearch" behavior that was added to
Postgres 12 in commit dd299df8) for a small subset of cases where it was
always counterproductive.
Equivalently, you can think of the new optimization as extending the
"pivotsearch" behavior that page deletion by VACUUM has long required
(since the aforementioned Postgres 12 commit went in) to other, similar
cases. Obviously, this isn't strictly necessary for these new cases
(unlike VACUUM, _bt_first is prepared to move the scan to the left once
on the leaf level), but the underlying principle is the same.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=XPzM8HzaLPq278Vms420mVSHfgs9wi5tjFKHcapZCEw@mail.gmail.com
Allow using multiple worker processes to build a BRIN index, which until
now was supported only for BTREE indexes. For large tables this often
results in significant speedup when the build is CPU-bound.
The work is split in a simple way - each worker builds BRIN summaries on
a subset of the table, determined by the regular parallel scan used to
read the data, and feeds them into a shared tuplesort which sorts them
by blkno (start of the range). The leader then reads this sorted stream
of ranges, merges duplicates (which may happen if the parallel scan does
not align with BRIN pages_per_range), and adds the resulting ranges into
the index.
The number of duplicate results produced by workers (requiring merging
in the leader process) should be fairly small, thanks to how parallel
scans assign chunks to workers. The likelihood of duplicate results may
increase for higher pages_per_range values, but then there are fewer
page ranges in total. In any case, we expect the merging to be much
cheaper than summarization, so this should be a win.
Most of the parallelism infrastructure is a simplified copy of the code
used by BTREE indexes, omitting the parts irrelevant for BRIN indexes
(e.g. uniqueness checks).
This also introduces a new index AM flag amcanbuildparallel, determining
whether to attempt to start parallel workers for the index build.
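A hedged usage sketch (hypothetical table; the planner decides the actual
number of workers, subject to the usual maintenance settings):
SET max_parallel_maintenance_workers = 4;
CREATE INDEX t_brin_idx ON t USING brin (a) WITH (pages_per_range = 64);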
Original patch by me, with reviews and substantial reworks by Matthias
van de Meent, certainly enough to make him a co-author.
Author: Tomas Vondra, Matthias van de Meent
Reviewed-by: Matthias van de Meent
Discussion: https://postgr.es/m/c2ee7d69-ce17-43f2-d1a0-9811edbda6e6%40enterprisedb.com
When building BRIN indexes, the brinbuildCallback only advances to the
next page range when seeing a tuple that doesn't belong to the current
one. This means that the index may end up missing ranges at the end of
the table, if those pages do not contain any indexable tuples.
We tend not to have completely empty pages at the end of a relation, but
this also applies to partial indexes, where the tuples may simply not
match the index predicate. This results in inefficient scans using the
affected BRIN index - without the summaries, the page ranges have to be
read and processed, which consumes I/O and possibly also CPU time.
The existing code already added empty ranges for earlier parts of the
table; this commit makes sure we add them for the ranges at the end of
the table too.
Patch by Matthias van de Meent, with review/improvements by me.
Author: Matthias van de Meent
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/CAEze2WiMsPZg%3DxkvSF_jt4%3D69k6K7gz5B8V2wY3gCGZ%2B1BzCbQ%40mail.gmail.com
A semi-join is deparsed as an EXISTS subquery. It references both outer and
inner relations, so it should be evaluated as a condition in the upper-level
WHERE clause. The signatures of deparseFromExprForRel() and deparseRangeTblRef() are
revised so that they can add conditions to the upper level.
PgFdwRelationInfo now has a hidden_subquery_rels field, referencing the relids
used in the inner parts of a semi-join. They can't be referred to from upper
relations and should be used internally for equivalence member searches.
The planner can create a semi-join that refers to inner rel vars in its target
list. However, we deparse a semi-join as an EXISTS() subquery, so we skip the
case where the target list references the inner rel of the semi-join.
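A hedged sketch of a query shape this allows to run as a single remote
query (foreign tables hypothetical):
EXPLAIN (VERBOSE, COSTS OFF)
SELECT * FROM ft1
WHERE EXISTS (SELECT 1 FROM ft2 WHERE ft2.a = ft1.a);
-- deparsed roughly as: SELECT ... FROM t1 r1 WHERE EXISTS
--   (SELECT NULL FROM t2 r2 WHERE r2.a = r1.a)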
Author: Alexander Pyhalov
Reviewed-by: Ashutosh Bapat, Ian Lawrence Barwick, Yuuki Fujii, Tomas Vondra
Discussion: https://postgr.es/m/c9e2a757cf3ac2333714eaf83a9cc184@postgrespro.ru
It's not necessary to fill the key field in most cases, since
hash_search has already done that. Some existing call sites have an
assert or comment that this contract has been fulfilled, but those
are quite old and that practice seems unnecessary here.
While at it, remove a nearby redundant assignment that a smart compiler
will elide anyway.
Zhao Junwang, with some adjustments by me
Reviewed by Nathan Bossart, with additional feedback from Tom Lane
Discussion: http://postgr.es/m/CAEG8a3%2BUPF%3DR2QGPgJMF2mKh8xPd1H2TmfH77zPuVUFdBpiGUA%40mail.gmail.com
Quotes are applied to GUCs in a very inconsistent way across the code
base, with a mix of double quotes or no quotes used. This commit
removes double quotes around all the GUC names that are obviously
referred to as parameters with non-English words (use of underscore,
mixed case, etc).
This is the result of a discussion with Álvaro Herrera, Nathan Bossart,
Laurenz Albe, Peter Eisentraut, Tom Lane and Daniel Gustafsson.
Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
This patch adds 'stats_since' and 'minmax_stats_since' columns to the
pg_stat_statements view and pg_stat_statements() function. The new min/max
reset mode for the pg_stat_statements_reset() function is controlled by the
parameter minmax_only.
The 'stats_since' column is populated with the current timestamp when a new
statement is added to the pg_stat_statements hashtable. It provides clean
information about statistics collection time intervals for each statement.
Besides, it can be used by sampling solutions to detect situations when a
statement was evicted and stored again between samples.
Such a sampling solution could derive any pg_stat_statements statistic values
for an interval between two samples with the exception of all min/max
statistics. To address this issue this patch adds the ability to reset
min/max statistics independently of the statement reset using the new
minmax_only parameter of the pg_stat_statements_reset(userid oid, dbid oid,
queryid bigint, minmax_only boolean) function. The timestamp of such reset
is stored in the minmax_stats_since field for each statement.
The pg_stat_statements_reset() function now returns the timestamp of the
reset as its result.
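A hedged usage sketch:
-- reset only min/max statistics, keeping everything else; the function
-- returns the timestamp of the reset
SELECT pg_stat_statements_reset(0, 0, 0, minmax_only => true);
SELECT query, stats_since, minmax_stats_since FROM pg_stat_statements;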
Discussion: https://postgr.es/m/flat/72e80e7b160a6eb189df9ef6f068cce3765d37f8.camel%40moonset.ru
Author: Andrei Zubkov
Reviewed-by: Julien Rouhaud, Hayato Kuroda, Yuki Seino, Chengxi Sun
Reviewed-by: Anton Melnikov, Darren Rush, Michael Paquier, Sergei Kornilov
Reviewed-by: Alena Rybakina, Andrei Lepikhov
This is a preliminary patch. It adds NOT NULL checking for the result of
pg_stat_statements_reset() function. It is needed for upcoming patch
"Track statement entry timestamp" that will change the result type of
this function to the timestamp of a reset performed.
Discussion: https://postgr.es/m/flat/72e80e7b160a6eb189df9ef6f068cce3765d37f8.camel%40moonset.ru
Author: Andrei Zubkov
Reviewed-by: Julien Rouhaud, Hayato Kuroda, Yuki Seino, Chengxi Sun
Reviewed-by: Anton Melnikov, Darren Rush, Michael Paquier, Sergei Kornilov
Reviewed-by: Alena Rybakina, Andrei Lepikhov
The brininsert code used to initialize (and destroy) BrinDesc and
BrinRevmap for each tuple, which is not free. This patch initializes
these structures only once, and reuses them for all inserts in the same
command. The data is passed through indexInfo->ii_AmCache.
This also introduces an optional AM callback "aminsertcleanup" that
allows performing custom cleanup in case simply pfree-ing ii_AmCache is
not sufficient (which is the case when the cache contains TupleDesc,
Buffers, and so on).
Author: Soumyadeep Chakraborty
Reviewed-by: Alvaro Herrera, Matthias van de Meent, Tomas Vondra
Discussion: https://postgr.es/m/CAE-ML%2B9r2%3DaO1wwji1sBN9gvPz2xRAtFUGfnffpd0ZqyuzjamA%40mail.gmail.com
A WaitEventSet holds file descriptors or event handles (on Windows).
If FreeWaitEventSet is not called, those fds or handles are leaked.
Use ResourceOwners to track WaitEventSets, to clean those up
automatically on error.
This was a live bug in async Append nodes, if an FDW's
ForeignAsyncRequest function failed. (In back branches, I will apply a
more localized fix for that based on PG_TRY-PG_FINALLY.)
The added test doesn't check for leaking resources, so it passed even
before this commit. But at least it covers the code path.
In passing, fix a misleading comment on what the 'nevents' argument
to WaitEventSetWait means.
Report by Alexander Lakhin, analysis and suggestion for the fix by
Tom Lane. Fixes bug #17828.
Reviewed-by: Alexander Lakhin, Thomas Munro
Discussion: https://www.postgresql.org/message-id/472235.1678387869@sss.pgh.pa.us
This adds alternative expected files for various tests.
In src/test/regress/sql/password.sql, we make a small change to the
test so that the CREATE ROLE still succeeds even if the ALTER ROLE
that attempts to set a password might fail. That way, the roles are
available for the rest of the test file in either case.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/dbbd927f-ef1f-c9a1-4ec6-c759778ac852%40enterprisedb.com
This adds several alternative expected files for when MD5 and 3DES are
not available. This is similar to the alternative expected files for
when the legacy provider is disabled. In fact, running the pgcrypto
tests in FIPS mode makes use of some of these existing alternative
expected files as well (e.g., for blowfish).
These new expected files currently cover the FIPS mode provided by
OpenSSL 3.x as well as the modified OpenSSL 3.x from Red Hat (e.g.,
Fedora 38), but not the modified OpenSSL 1.x from Red Hat (e.g.,
Fedora 35). (The latter will have some error message wording
differences.)
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/flat/dbbd927f-ef1f-c9a1-4ec6-c759778ac852%40enterprisedb.com
In FIPS mode, these tests will fail. By having them in a separate
file, it would make it easier to have an alternative output file or
selectively disable these tests. This isn't done here; this is just
some preparation.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/2766054.1700080156@sss.pgh.pa.us
This adds support for infinity to the interval data type, using the
same input/output representation as the other date/time data types
that support infinity. This allows various arithmetic operations on
infinite dates, timestamps and intervals.
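For example:
SELECT interval 'infinity', interval '-infinity';
SELECT timestamp '2024-01-01' + interval 'infinity';    -- infinity
SELECT interval 'infinity' > interval '1000000 years';  -- true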
The new values are represented by setting all fields of the interval
to INT32/64_MIN for -infinity, and INT32/64_MAX for +infinity. This
ensures that they compare as less/greater than all other interval
values, without the need for any special-case comparison code.
Note that, since those 2 values were formerly accepted as legal finite
intervals, pg_upgrade and dump/restore from an old database will turn
them from finite to infinite intervals. That seems OK, since those
exact values should be extremely rare in practice, and they are
outside the documented range supported by the interval type, which
gives us a certain amount of leeway.
Bump catalog version.
Joseph Koshakow, Jian He, and Ashutosh Bapat, reviewed by me.
Discussion: https://postgr.es/m/CAAvxfHea4%2BsPybKK7agDYOMo9N-Z3J6ZXf3BOM79pFsFNcRjwA%40mail.gmail.com
When we decide that we don't want to track execution time of a
specific planner or ProcessUtility call, we still have to increment
the nesting depth, or we'll make the wrong determination of whether
we are at top level when considering nested statements. (PREPARE
and EXECUTE are exceptions, for reasons explained in the code.)
Counting planner nesting depth separately from executor nesting depth
was a mistake: it causes us to make the wrong determination of whether
we are at top level when considering nested statements that get
executed during planning (as a result of constant-folding of
functions, for example). Merge those counters into one.
In passing, get rid of the PGSS_HANDLED_UTILITY macro in favor of
explicitly listing statement types. It seems somewhat coincidental
that PREPARE and EXECUTE are handled alike in each of the places where
that was used: the reasoning tends to be different for each one.
Thus, the macro seems as likely to encourage future bugs as prevent
them, since it's quite unclear whether any future statement type that
might need special-casing here would also need the same choices at
each spot.
Sergei Kornilov, Julien Rouhaud, and Tom Lane, per bug #17552 from
Maxim Boguk. This is pretty clearly a bug fix, but it's also a
behavioral change that might surprise somebody, so no back-patch.
Discussion: https://postgr.es/m/17552-213b534c56ab5d02@postgresql.org
A PostgreSQL release tarball contains a number of prebuilt files, in
particular files produced by bison, flex, and perl, as well as html and
man documentation. We have done this consistent with established
practice at the time to not require these tools for building from a
tarball. Some of these tools were hard to get, or get the right
version of, from time to time, and shipping the prebuilt output was a
convenience to users.
Now this has at least two problems:
One, we have to make the build system(s) work in two modes: Building
from a git checkout and building from a tarball. This is pretty
complicated, but it works so far for autoconf/make. It does not
currently work for meson; you can currently only build with meson from
a git checkout. Making meson builds work from a tarball seems very
difficult or impossible. One particular problem is that since meson
requires a separate build directory, we cannot make the build update
files like gram.h in the source tree. So if you were to build from a
tarball and update gram.y, you will have a gram.h in the source tree
and one in the build tree, but the way things work is that the
compiler will always use the one in the source tree. So you cannot,
for example, make any gram.y changes when building from a tarball.
This seems impossible to fix in a non-horrible way.
Second, there is increased interest nowadays in precisely tracking the
origin of software. We can reasonably track contributions into the
git tree, and users can reasonably track the path from a tarball to
packages and downloads and installs. But what happens between the git
tree and the tarball is obscure and in some cases non-reproducible.
The solution for both of these issues is to get rid of the step that
adds prebuilt files to the tarball. The tarball now only contains
what is in the git tree (*). Getting the additional build
dependencies is no longer a problem nowadays, and the complications to
keep these dual build modes working are significant. And of course we
want to get the meson build system working universally.
This commit removes the make distprep target altogether. The make
dist target continues to do its job, it just doesn't call distprep
anymore.
(*) - The tarball also contains the INSTALL file that is built at make
dist time, but not by distprep. This is unchanged for now.
The make maintainer-clean target, whose job it is to remove the
prebuilt files in addition to what make distclean does, is now just an
alias to make distclean. (In practice, it is probably obsolete given
that git clean is available.)
The following programs are now hard build requirements in configure
(they were already required by meson.build):
- bison
- flex
- perl
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e07408d9-e5f2-d9fd-5672-f53354e9305e@eisentraut.org
cac169d68 adjusted DEFAULT_FDW_TUPLE_COST and that seems to have caused
a test to become unstable on 32-bit machines.
4b14e1871 tried to fix this as originally the plan was flipping between
a Nested Loop and Hash Join. That commit forced the Nested Loop, but
there's still flexibility to push the sort to the remote server or not;
32-bit machines seem to prefer pushing it, while on 64-bit the costs
prefer not to.
Here let's just turn off enable_sort to significantly encourage the sort
to take place on the remote server.
Reported-by: Michael Paquier, Richard Guo
Discussion: https://postgr.es/m/ZUM2IhA8X2lrG50K@paquier.xyz
cac169d68 adjusted DEFAULT_FDW_TUPLE_COST and that seems to have caused
a test to become unstable on 32-bit machines. Try to make it stable
again.
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZUM2IhA8X2lrG50K@paquier.xyz
0.01 was unrealistically low as it's the same as the default
cpu_tuple_cost and 10x cheaper than the default parallel_tuple_cost.
It's hard to imagine a situation where fetching a tuple from a foreign
server would be cheaper than fetching one from a parallel worker.
After some experimentation on a loopback server, somewhere between 0.15
and 0.3 seems more realistic. Here we split the difference and set it
to 0.2.
This will cause operations that reduce the number of tuples (e.g.
aggregation) to be more likely to take place on the foreign server.
Adjusting this causes some plan changes in the postgres_fdw regression
tests. This is because penalizing each Path with the additional tuple
costs causes some dilution of the costs of the other operations being
charged for and results in various paths appearing to be closer to the
same costs such that add_path's STD_FUZZ_FACTOR is more likely to see two
paths as costing (fuzzily) the same. This isn't ideal, but it shouldn't
be reason enough to use artificially low costs.
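Users who prefer the previous behavior for a given server can override the
default, e.g. (server name hypothetical; use ADD if the option was not set
before):
ALTER SERVER myserver OPTIONS (SET fdw_tuple_cost '0.01');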
Discussion: https://postgr.es/m/CAApHDvopVjjfh5c1Ed2HRvDdfom2dEpMwwiu5-f1AnmYprJngA@mail.gmail.com