A couple of code paths use the special area on the page passed by the
function caller, expecting to find some data in it. However, feeding
an incorrect page can lead to out-of-bounds reads when trying to access
the page special area (like a heap page that has no special area,
leading PageGetSpecialPointer() to grab a pointer outside the allocated
page).
The functions used for hash and btree indexes already have some
protection against that, and other functions that take a relation OID
as argument can make sure that the access method involved is correct,
but functions taking a raw page as input, without knowing the relation
the page is attached to, would run into problems.
This commit improves the set of checks used in the code paths of BRIN,
btree (including one check for a leaf page found with a non-zero
level), GIN and GiST to verify that the page given as input has a
special area size that fits with each access method, which is done
through PageGetSpecialSize() before calling PageGetSpecialPointer().
The scope of the checks is limited to pages that one would pass after
getting a block with get_raw_page(), as it is possible to craft byteas
that would bypass the existing code paths. Having too many checks would
also impact the usability of pageinspect, as the existing code is very
useful for looking at the details of a corrupted page, so the focus is
really on avoiding out-of-bounds reads, as these are never a good thing
even with functions whose execution is limited to superusers.
The safest approach could be to rework the functions so that they fetch
a block using a relation OID and a block number, but there are also
cases where using a raw page is useful.
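As an illustration, here is a sketch of the kind of misuse the new
checks are meant to reject (the table name is hypothetical; both
functions come from pageinspect):
SELECT brin_page_type(get_raw_page('some_heap_table', 0));
A heap page has no special area, so this call could previously read
past the allocated page; with this commit it fails cleanly instead.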
Tests are added to cover all the code paths that needed such checks, and
an error message for hash indexes is reworded to fit better with what
this commit adds.
Reported-By: Alexander Lakhin
Author: Julien Rouhaud, Michael Paquier
Discussion: https://postgr.es/m/16527-ef7606186f0610a1@postgresql.org
Discussion: https://postgr.es/m/561e187b-3549-c8d5-03f5-525c14e65bd0@postgrespro.ru
Backpatch-through: 10
One of the TAP tests added in 923def9a53 did not wait after creating a
subscription, and wait_for_catchup is not sufficient for this. So if the
tablesync workers happen not to complete quickly enough, the test won't
see the expected results.
This probably explains intermittent failures on a couple of buildfarm
animals (komodoensis, petalura and snapper).
Reported-by: Tom Lane
Discussion: https://postgr.es/m/170549.1648330634@sss.pgh.pa.us
The SSL TAP tests were tightly coupled to the OpenSSL implementation,
making it hard to add support for additional SSL/TLS backends. This
refactoring makes the tests avoid depending on any specific
implementation.
The SSLServer Perl module is renamed to SSL::Server, which in turn uses
SSL::Backend::X, where X is the backend pointed to by with_ssl. Each
backend implements its own module responsible for setting up keys and
certs and for resolving sslkey values to their implementation-specific
values (file paths, vault nicknames, etc). Further, switch_server_cert
now takes a set of named parameters rather than a fixed set that used
defaults. The modules also come with POD documentation.
There are a few testcases which still use OpenSSL specifics, but it's
not entirely clear how to abstract those until we have another library
implemented.
Original patch by me, with lots of rework by Andrew Dunstan to turn it
into better Perl.
Discussion: https://postgr.es/m/AA18A362-CA65-4F9A-AF61-76AE318FE97C@yesql.se
clang 13 with -Wextra warns that "performing pointer subtraction with
a null pointer has undefined behavior" in the places where freepage.c
tries to set a relptr variable to constant NULL. This appears to be
a compiler bug, but it's unlikely to get fixed instantly. Fortunately,
we can work around it by introducing an inline support function, which
seems like a good change anyway because it removes the macro's existing
double-evaluation hazard.
Backpatch to v10 where this code was introduced.
Patch by me, based on an idea of Andres Freund's.
Discussion: https://postgr.es/m/48826.1648310694@sss.pgh.pa.us
A fair percentage of the buildfarm doesn't recognize that oldcxt
won't be used uninitialized; silence those warnings by initializing it.
While here, upgrade the function's thoroughly inadequate header comment.
Oversight in 923def9a5, per buildfarm.
This allows specifying an optional column list when adding a table to
logical replication. The column list may be specified after the table
name, enclosed in parentheses. Columns not included in this list are not
sent to the subscriber, allowing the schema on the subscriber to be a
subset of the publisher schema.
For UPDATE/DELETE publications, the column list needs to cover all
REPLICA IDENTITY columns. For INSERT publications, the column list is
arbitrary and may omit some REPLICA IDENTITY columns. Furthermore, if
the table uses REPLICA IDENTITY FULL, a column list is not allowed.
The column list can contain only simple column references. Complex
expressions, function calls etc. are not allowed. This restriction could
be relaxed in the future.
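As a hypothetical example of the syntax (table and publication names
are made up):
CREATE TABLE t (id int PRIMARY KEY, a int, b int, payload text);
CREATE PUBLICATION p FOR TABLE t (id, a, b);
Here the payload column is never sent to the subscriber, and the column
list covers the REPLICA IDENTITY (the primary key), so UPDATE and
DELETE can be published as well.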
During the initial table synchronization, only columns included in the
column list are copied to the subscriber. If the subscription has
several publications containing the same table with different column
lists, the columns specified in any of the lists will be copied.
This means all columns are replicated if the table has no column list
at all (which is treated as a column list with all columns), or when
one of the publications is defined as FOR ALL TABLES (possibly IN
SCHEMA that matches the schema of the table).
For partitioned tables, publish_via_partition_root determines whether
the column list for the root or the leaf relation will be used. If the
parameter is 'false' (the default), the list defined for the leaf
relation is used. Otherwise, the column list for the root partition
will be used.
Psql commands \dRp+ and \d <table-name> now display any column lists.
Author: Tomas Vondra, Alvaro Herrera, Rahila Syed
Reviewed-by: Peter Eisentraut, Alvaro Herrera, Vignesh C, Ibrar Ahmed,
Amit Kapila, Hou zj, Peter Smith, Wang wei, Tang, Shi yu
Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com
The previous method for doing that was to write zeroes into a
predetermined set of page locations. However, there's a roughly
1-in-64K chance that the existing checksum will match by chance,
and yesterday several buildfarm animals started to reproducibly
see that, resulting in test failures because no checksum mismatch
was reported.
Since the checksum includes the page LSN, test success depends on
the length of the installation's WAL history, which is affected by
(at least) the initial catalog contents, the set of locales installed
on the system, and the length of the pathname of the test directory.
Sooner or later we were going to hit a chance match, and today is
that day.
Harden these tests by specifically inverting the checksum field and
leaving all else alone, thereby guaranteeing that the checksum is
incorrect.
In passing, fix places that were using seek() to set up for syswrite(),
a combination that the Perl docs very explicitly warn against. We've
probably escaped problems because no regular buffered I/O is done on
these filehandles; but if it ever breaks, we wouldn't deserve or get
much sympathy.
Although we've only seen problems in HEAD, now that we recognize the
environmental dependencies it seems like it might be just a matter
of time until someone manages to hit this in back-branch testing.
Hence, back-patch to v11 where we started doing this kind of test.
Discussion: https://postgr.es/m/3192026.1648185780@sss.pgh.pa.us
Commit 75b1521dae added support for logical replication of sequences,
including grammar changes, but it did not update preprocess_pubobj_list
accordingly. This can cause segfaults with "continuations", i.e. when
a command specifies a list of objects:
CREATE PUBLICATION p FOR SEQUENCE s1, s2;
Reported by Amit Kapila, patch by me.
Reported-by: Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1JxDNKGBSNTyN-Xj2JRjzFo+ziSqJbjH==vuO0YF_CQrg@mail.gmail.com
Crash recovery on standby may encounter missing directories when
replaying create database WAL records. Prior to this patch, the standby
would fail to recover in such a case. However, the directories could be
legitimately missing. Consider a sequence of WAL records as follows:
CREATE DATABASE
DROP DATABASE
DROP TABLESPACE
If, after replaying the last WAL record and removing the tablespace
directory, the standby crashes and has to replay the create database
record again, the crash recovery must be able to move on.
This patch adds a mechanism similar to invalid-page tracking, to keep a
tally of missing directories during crash recovery. If all the missing
directory references are matched with corresponding drop records at the
end of crash recovery, the standby can safely continue following the
primary.
Backpatch to 13, at least for now. The bug is older, but fixing it in
older branches requires more careful study of the interactions with
commit e6d8069522, which appeared in 13.
A new TAP test file is added to verify the condition. However, because
it depends on commit d6d317dbf6, it can only be added to branch
master. I (Álvaro) manually verified that the code behaves as expected
in branch 14. It's a bit nervous-making to leave the code uncovered by
tests in older branches, but leaving the bug unfixed is even worse.
Also, the main reason this fix took so long is precisely that we
couldn't agree on a good strategy to approach testing for the bug, so
perhaps this is the best we can do.
Diagnosed-by: Paul Guo <paulguo@gmail.com>
Author: Paul Guo <paulguo@gmail.com>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Asim R Praveen <apraveen@pivotal.io>
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
Move DLSUFFIX from makefiles into header files for all platforms.
Move the DLSUFFIX assignment from src/makefiles/ to src/templates/,
have configure read it, and then substitute it into Makefile.global
and pg_config.h. This avoids the need for all makefile rules that
need it to locally set CPPFLAGS. It also resolves an inconsistent
setup between the two Windows build systems.
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/2f9861fb-8969-9005-7518-b8e60f2bead9@enterprisedb.com
Commit 85c696112 introduced this function to deduplicate code in the
transaction callback functions, but the SQL command passed to it as an
argument was constructed for nothing whenever the function returned
before aborting a remote transaction with it. Modify
pgfdw_abort_cleanup() so that it constructs the command only when/if
necessary, as before, and remove the argument. Also update comments in
pgfdw_abort_cleanup() and in one of the calling functions.
Etsuro Fujita, reviewed by David Zhang.
Discussion: https://postgr.es/m/CAPmGK158hrd%3DZfXmgkmNFHivgh18e4oE2Gz151C2Q4OBDjZ08A%40mail.gmail.com
These were introduced in recent commit 52e4f0cd47. The code was trying
to free some transient memory, and even that was neither entirely
correct nor complete. We don't need this partial freeing of memory, as
it will be allocated just once per query and freed at the end of the
query.
Author: Zhihong Yu
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CALNJ-vQORfQ=vicbKA_RmeGZGzm1y3WsEcZqXWi7qjN43Cz_vg@mail.gmail.com
Buildfarm member prairiedog reported a pgbench TAP test failure after
commit 4a39f87acd. This is the second attempt to fix it. It seems
older versions of Perl do not accept "\gN". Replace it with plain old
"\N", since "\gN" is not actually necessary here.
Author: Tatsuo Ishii
Reported-by: Tom Lane
Reviewed-by: Tom Lane, Yugo Nagata
Discussion: https://postgr.es/m/2775989.1648060014%40sss.pgh.pa.us
Follow-up improvements for commit 127aea2a based on discussion:
* use fork name for --fork, not number
* use -R, -B as short switches for --relation, --block
* re-alphabetize the list of switches (code, --help and docs)
Suggested-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> (fork name part)
Reviewed-by: David Christensen <david.christensen@crunchydata.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Discussion: https://postgr.es/m/3a4c2e93-7976-2320-fc0a-32097fe148a7%40enterprisedb.com
Tom noticed evidence in the buildfarm suggesting that the failures
might just be really slow process exits. To investigate further,
instead of giving up after seeing multiple walsender pids once, retry.
For now, continue to report a test failure even if a retry succeeds.
See also commit afdeff1052 and fe0972ee5e.
Per suggestion from Tom Lane.
Discussion: https://postgr.es/m/3042597.1648148740@sss.pgh.pa.us
The check for datallowconn being properly set on all databases in the
old cluster errored out on the first instance found, rather than
reporting the full set of problematic databases. This adds reporting to
a text file, the way many other checks do it. While there, also add a
comment to the function, in the style used for the other checks.
This check won't catch the case where template1 doesn't allow
connections, since that's what is used for connecting in the first
place. That error remains as it is today:
connection to server on socket ".." failed: FATAL: database "template1" is not currently accepting connections
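For reference, a sketch of a query to spot such databases ahead of time
(template0 is expected to refuse connections, so it is left out):
SELECT datname FROM pg_database
  WHERE NOT datallowconn AND datname <> 'template0';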
Author: Jeevan Ladhe <jeevan.ladhe@enterprisedb.com>
Reviewed-by: Suraj Kharage <suraj.kharage@enterprisedb.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/CAOgcT0McABqF_iFFQuuRf9is+gmYMsmUu_SWNikyr=2VGyP9Jw@mail.gmail.com
A TOAST table can normally have only one index, but there are corner
cases where it has more; for example, transiently during REINDEX
CONCURRENTLY. In such a case, the pg_statio_all_tables view produced
multiple rows for the owning table, one per TOAST index. Refactor the
view to avoid that, instead summing the stats across all the indexes,
as we do for regular table indexes.
While this has been wrong for a long time, back-patching seems unwise
due to the difficulty of putting a system view change into back
branches.
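For reference, a sketch of a query that would have exposed the
misbehavior while a transient TOAST index existed (the view should
never return more than one row per relation):
SELECT relid, count(*) FROM pg_statio_all_tables
  GROUP BY relid HAVING count(*) > 1;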
Andrei Zubkov, tweaked a bit by me
Discussion: https://postgr.es/m/acefef4189706971fc475f912c1afdab1c48d627.camel@moonset.ru
The Config and Cwd modules were no longer used, but remained imported,
in a number of tests. Remove them to keep the imports limited to the
modules actually used.
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/A5A074CD-3198-492B-BE5E-7961EFC3733F@yesql.se
If TRUNCATE causes some buffers to be invalidated and thus the
checkpoint does not flush them, TRUNCATE must also ensure that the
corresponding files are truncated on disk. Otherwise, a replay
from the checkpoint might find that the buffers exist but have
the wrong contents, which may cause replay to fail.
Report by Teja Mupparti. Patch by Kyotaro Horiguchi, per a design
suggestion from Heikki Linnakangas, with some changes to the
comments by me. Review of this and a prior patch that approached
the issue differently by Heikki Linnakangas, Andres Freund, Álvaro
Herrera, Masahiko Sawada, and Tom Lane.
Discussion: http://postgr.es/m/BYAPR06MB6373BF50B469CA393C614257ABF00@BYAPR06MB6373.namprd06.prod.outlook.com
This commit adds support for decoding of sequences to the built-in
replication (the infrastructure was added by commit 0da92dc530).
The syntax and behavior mostly mimics handling of tables, i.e. a
publication may be defined as FOR ALL SEQUENCES (replicating all
sequences in a database), FOR ALL SEQUENCES IN SCHEMA (replicating
all sequences in a particular schema) or individual sequences.
To publish sequence modifications, the publication has to include the
'sequence' action. The protocol is extended with a new message,
describing sequence increments.
A new system view pg_publication_sequences lists all the sequences
added to a publication, both directly and indirectly. Various psql
commands (\d and \dRp) are improved to also display publications
including a given sequence, or sequences included in a publication.
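As a hypothetical example (sequence and publication names are made up):
CREATE SEQUENCE s;
CREATE PUBLICATION seq_pub FOR ALL SEQUENCES;
SELECT * FROM pg_publication_sequences;
The last query lists s as being published through seq_pub.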
Author: Tomas Vondra, Cary Huang
Reviewed-by: Peter Eisentraut, Amit Kapila, Hannu Krosing, Andres
Freund, Petr Jelinek
Discussion: https://postgr.es/m/d045f3c2-6cfb-06d3-5540-e63c320df8bc@enterprisedb.com
Discussion: https://postgr.es/m/1710ed7e13b.cd7177461430746.3372264562543607781@highgo.ca
They were macros previously, but recent callsite additions made Coverity
complain about one of the assertions being always true. This change
could have been made a long time ago, but the Coverity complaint broke
the inertia.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Japin Li <japinli@hotmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/202203241021.uts52sczx3al@alvherre.pgsql
Up to now, the planner estimated the size of a recursive query's
worktable as 10 times the size of the non-recursive term. It's hard
to see how to do significantly better than that automatically, but
we can give users control over the multiplier to allow tuning for
specific use-cases. The default behavior remains the same.
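A usage sketch of the new setting, recursive_worktable_factor (the
query itself is a made-up example):
SET recursive_worktable_factor = 1.0;
WITH RECURSIVE t(n) AS (
    SELECT 1
  UNION ALL
    SELECT n + 1 FROM t WHERE n < 100
)
SELECT sum(n) FROM t;
A value below the default of 10 tells the planner the recursion is
expected to converge quickly, so it assumes a smaller worktable.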
Simon Riggs
Discussion: https://postgr.es/m/CANbhV-EuaLm4H3g0+BSTYHEGxJj3Kht0R+rJ8vT57Dejnh=_nA@mail.gmail.com
The command used in the documentation to retrieve the value of the
runtime-computed GUC shared_memory_size_in_huge_pages would also show
the user all the log messages generated by the postmaster before and
after printing the wanted value. This can be confusing, as the wanted
result could be drowned out by a lot of noise.
One way to avoid those log messages is to use something like "-c
log_min_messages=fatal" in the command (my idea, but that's not common
knowledge). Rather than mentioning this option, suffix the command with
a redirection of stderr to /dev/null, which is the stream where the
logs show up. This is enough to show only the GUC value to the user
when copy-pasting the command.
Reported-by: Magnus Hagander
Author: Nathan Bossart
Discussion: https://postgr.es/m/20220314173417.GA1020555@nathanxps13
Buildfarm member prairiedog reported a pgbench TAP test failure after
commit 4a39f87acd. This commit attempts to fix some copy-and-paste
errors introduced by the previous commit.
Author: Yugo Nagata
Reported-by: Tom Lane
Discussion: https://postgr.es/m/2775989.1648060014%40sss.pgh.pa.us
hba.c is growing big, and more contents are planned for it. In order to
prepare for this future work, this commit moves all the code related to
the system function processing the contents of pg_hba.conf,
pg_hba_file_rules(), to a new file called hbafuncs.c, which will be
used as the location for the SQL portion of the authentication file
parsing.
While at it, HbaToken, the structure holding a string token lexed from a
configuration file related to authentication, is renamed to a more
generic AuthToken, as it gets used not only for pg_hba.conf, but also
for pg_ident.conf. TokenizedLine is now named TokenizedAuthLine.
The size of hba.c is reduced by ~12%.
Author: Julien Rouhaud
Reviewed-by: Aleksander Alekseev, Michael Paquier
Discussion: https://postgr.es/m/20220223045959.35ipdsvbxcstrhya@jrouhaud
The number of relation pages is tricky to get right in a portable way,
particularly across 32-bit and 64-bit builds, but checking for the
existence of the FSM and whether any space is available should be
stable enough with a minimal number of tuples involved. This commit
introduces checks on a table with some btree, BRIN and hash indexes, as
a first attempt.
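For reference, a sketch of the kind of query such a test can rely on
(the table name is hypothetical; pg_freespace() is provided by the
pg_freespacemap extension):
CREATE EXTENSION pg_freespacemap;
VACUUM some_table;
SELECT count(*) > 0 AS has_free_space
  FROM pg_freespace('some_table') WHERE avail > 0;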
Author: Dong Wook Lee, Fabrízio de Royes Mello, Michael Paquier
Discussion: https://postgr.es/m/CAAcByaJ5KW3bd7fJr=jPEyK8M_UzXJFHHBVuOcBe+JHD8txRyQ@mail.gmail.com
When building with sanitizers, the sanitizer library provides dlopen(),
but not dlsym(), making configure think that -ldl isn't needed. Just
checking for dlsym() ought to suffice; it's hard to see dlsym() being
provided without dlopen() also being provided.
Backpatch to all branches, for the same reasons as 46ab07ffda.
Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de
Backpatch: 10-
It seems possible for the condition being tested to be true in
production, and nobody would ever know (except when some data
eventually becomes corrupt?).
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m//202109040001.zky3wgv2qeqg@alvherre.pgsql
Commit ffd53659c4 messed up the
mechanism that was being used to pass parameters to LogStreamerMain()
on Windows. It worked on Linux because only Windows was using threads.
Repair by moving the additional parameters added by that commit into
the 'logstreamer_param' struct.
Along the way, fix a compiler warning on builds without HAVE_LIBZ.
Discussion: http://postgr.es/m/CA+TgmoY5=AmWOtMj3v+cySP2rR=Bt6EGyF_joAq4CfczMddKtw@mail.gmail.com
Invalidate abortedRecPtr and missingContrecPtr after a missing
continuation record is successfully skipped on a standby. This fixes a
PANIC caused when a recently promoted standby attempts to write an
OVERWRITE_RECORD with an LSN of the previously read aborted record.
Backpatch to 10 (all stable versions).
Author: Sami Imseih <simseih@amazon.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/44D259DE-7542-49C4-8A52-2AB01534DCA9@amazon.com
There are more compression parameters that can be specified than just
an integer compression level, so rename the new COMPRESSION_LEVEL
option to COMPRESSION_DETAIL before it gets released. Introduce a
flexible syntax for that option to allow arbitrary options to be
specified without needing to adjust the main replication grammar,
and common code to parse it that is shared between the client and
the server.
This commit doesn't actually add any new compression parameters,
so the only user-visible change is that you can now type something
like pg_basebackup --compress gzip:level=5 instead of writing just
pg_basebackup --compress gzip:5. However, it should make it easy to
add new options. If for example gzip starts offering fries, we can
support pg_basebackup --compress gzip:level=5,fries=true for the
benefit of users who want fries with that.
Along the way, this fixes a few things in pg_basebackup so that
pg_basebackup can be used with a server-side compression algorithm
that pg_basebackup itself does not understand. For example,
pg_basebackup --compress server-lz4 could still succeed even if
only the server and not the client has LZ4 support, provided that
the other options to pg_basebackup don't require the client to
decompress the archive.
Patch by me. Reviewed by Justin Pryzby and Dagfinn Ilmari Mannsåker.
Discussion: http://postgr.es/m/CA+TgmoYvpetyRAbbg1M8b3-iHsaN4nsgmWPjOENu5-doHuJ7fA@mail.gmail.com
When serialization or deadlock errors are reported by the backend,
allow the client to retry and continue the benchmark. For this purpose
the new options "--max-tries", "--failures-detailed" and
"--verbose-errors" are added.
Transactions with serialization errors or deadlock errors will be
repeated after rollbacks until they complete successfully or reach the
maximum number of tries (specified by the --max-tries option), or the
maximum time of tries (specified by the --latency-limit option).
These options can be specified at the same time. It is not possible to
use an unlimited number of tries (--max-tries=0) without the
--latency-limit option or the --time option. By default the option
--max-tries is set to 1, which means transactions with
serialization/deadlock errors are not retried. If the last try fails,
this transaction will be reported as failed, and the client variables
will be set as they were before the first run of this transaction.
Statistics on retries and failures are printed in the progress output,
in the transaction and aggregation logs, and at the end together with
the other results (overall and for each script). Retries and failures
are also printed per command, with average latency, when using the
option --report-per-command (-r).
The option --failures-detailed groups the reported failures by basic
type (serialization failures / deadlock failures).
The option --verbose-errors prints distinct reports on errors and
failures (errors without retrying) by type, with detailed information
such as which retry limit was violated and by how much it was exceeded
for the serialization/deadlock failures.
The patch was originally written by Marina Polyakova; Yugo Nagata then
inherited the discussion and heavily modified the patch to make it
committable.
Authors: Yugo Nagata, Marina Polyakova
Reviewed-by: Fabien Coelho, Tatsuo Ishii, Alvaro Herrera, Kevin Grittner, Andres Freund, Arthur Zakirov, Alexander Korotkov, Teodor Sigaev, Ildus Kurbangaliev
Discussion: https://postgr.es/m/flat/72a0d590d6ba06f242d75c2e641820ec%40postgrespro.ru