Commit Graph

4755 Commits

Author SHA1 Message Date
Tom Lane
cf590c568d Update contrib/sepgsql regression tests for commit 89e51abcb.
Oversight revealed by buildfarm.
2024-11-01 12:50:01 -04:00
Peter Eisentraut
9be4e5d293 Remove unused #include's from contrib, pl, test .c files
as determined by IWYU

Similar to commit dbbca2cf29, but for contrib, pl, and src/test/.

Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
2024-10-28 08:02:17 +01:00
Michael Paquier
6b652e6ce8 Set query ID for inner queries of CREATE TABLE AS and DECLARE
Some utility statements contain queries that can be planned and
executed: CREATE TABLE AS and DECLARE CURSOR.  This commit adds query ID
computation for the inner queries executed by these two utility
commands, with and without EXPLAIN.  This change leads to four new
callers of JumbleQuery() and post_parse_analyze_hook() so as extensions
can decide what to do with this new data.

Previously, extensions relying on the query ID, like pg_stat_statements,
were not able to track these nested queries as the query_id was 0.

For pg_stat_statements, this commit leads to additions under !toplevel
when pg_stat_statements.track is set to "all", as shown in its
regression tests.  The output of EXPLAIN for these two utilities gains a
"Query Identifier" if compute_query_id is enabled.

Author: Anthonin Bonnefoy
Reviewed-by: Michael Paquier, Jian He
Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com
2024-10-28 09:03:20 +09:00
Michael Paquier
499edb0974 Track more precisely query locations for nested statements
Previously, a Query generated through the transform phase would have
unset stmt_location, tracking the starting point of a query string.

Extensions relying on the statement location to extract its relevant
parts in the source text string would fallback to use the whole
statement instead, leading to confusing results like in
pg_stat_statements for queries relying on nested queries, like:
- EXPLAIN, with top-level and nested query using the same query string,
and a query ID coming from the nested query when the non-top-level
entry.
- Multi-statements, with only partial portions of queries being
normalized.
- COPY TO with a query, SELECT or DMLs.

This patch improves things by keeping track of the statement locations
and propagate it to Query during transform, allowing PGSS to only show
the relevant part of the query for nested query.  This leads to less
bloat in entries for non-top-level entries, as queries can now be
grouped within the same (toplevel, queryid) duos in pg_stat_statements.
The result gives a stricter one-one mapping between query IDs and its
query strings.

The regression tests introduced in 45e0ba30fc produce differences
reflecting the new logic.

Author: Anthonin Bonnefoy
Reviewed-by: Michael Paquier, Jian He
Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com
2024-10-24 09:29:54 +09:00
Tom Lane
14e5680eee Improve parser's reporting of statement start locations.
Up to now, the parser's reporting of a statement's stmt_location
included any preceding whitespace or comments.  This isn't really
desirable but was done to avoid accounting honestly for nonterminals
that reduce to empty.  It causes problems for pg_stat_statements,
which partially compensates by manually stripping whitespace, but
is not bright enough to strip /*-style comments.  There will be
more problems with an upcoming patch to improve reporting of errors
in extension scripts, so it's time to do something about this.

The thing we have to do to make it work right is to adjust
YYLLOC_DEFAULT to scan the inputs of each production to find the
first one that has a valid location (i.e., did not reduce to
empty).  In theory this adds a little bit of per-reduction overhead,
but in practice it's negligible.  I checked by measuring the time
to run raw_parser() on the contents of information_schema.sql, and
there was basically no change.

Having done that, we can rely on any nonterminal that didn't reduce
to completely empty to have a correct starting location, and we don't
need the kluges the stmtmulti production formerly used.

This should have a side benefit of allowing parse error reports to
include an error position in some cases where they formerly failed to
do so, due to trying to report the position of an empty nonterminal.
I did not go looking for an example though.  The one previously known
case where that could happen (OptSchemaEltList) no longer needs the
kluge it had; but I rather doubt that that was the only case.

Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de
2024-10-22 11:26:05 -04:00
Michael Paquier
45e0ba30fc pg_stat_statements: Add tests for nested queries with level tracking
There have never been any regression tests in PGSS for various query
patterns for nested queries combined with level tracking, like:
- Multi-statements.
- CREATE TABLE AS
- CREATE/REFRESH MATERIALIZED VIEW
- DECLARE CURSOR
- EXPLAIN, with a subset of the above supported.
- COPY.

All the tests added here track historical, sometimes confusing, existing
behaviors.  For example, EXPLAIN stores two PGSS entries with the same
top-level query string but two different query IDs as one is calculated
for the top-level EXPLAIN (this part is right) and a second one for the
inner query in the EXPLAIN (this part is not right).

A couple of patches are under discussion to improve the situation, and
all the tests added here will prove useful to evaluate the changes
discussed.

Author: Anthonin Bonnefoy
Reviewed-by: Michael Paquier, Jian He
Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com
2024-10-22 13:05:51 +09:00
Masahiko Sawada
7cdfeee320 Add contrib/pg_logicalinspect.
This module provides SQL functions that allow to inspect logical
decoding components.

It currently allows to inspect the contents of serialized logical
snapshots of a running database cluster, which is useful for debugging
or educational purposes.

Author: Bertrand Drouvot
Reviewed-by: Amit Kapila, Shveta Malik, Peter Smith, Peter Eisentraut
Reviewed-by: David G. Johnston
Discussion: https://postgr.es/m/ZscuZ92uGh3wm4tW%40ip-10-97-1-34.eu-west-3.compute.internal
2024-10-14 17:22:02 -07:00
Peter Eisentraut
e7d0cf42b1 Allow TAP tests to force checksums off when calling init()
TAP tests can write

    $node->init(no_data_checksums => 1);

to initialize a cluster explicitly without checksums.  Currently, this
is the default, but this change allows running all tests with
checksums enabled, like

    PG_TEST_INITDB_EXTRA_OPTS=--data-checksums meson test ...

And this also prepares the tests for when we switch the default to
checksums enabled.

The pg_checksums tests need to disable checksums so it can test its
own functionality of enabling checksums.  The amcheck/pg_amcheck tests
need to disable checksums because they manually introduce corruption
that they want to detect, but with checksums enabled, the checksum
verification will fail before they even get to their work.

Author: Greg Sabino Mullane <greg@turnstep.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/CAKAnmmKwiMHik5AHmBEdf5vqzbOBbcwEPHo4-PioWeAbzwcTOQ@mail.gmail.com
2024-10-14 11:25:03 +02:00
Michael Paquier
cf54a2c002 pg_stat_statements: Add columns to track parallel worker activity
The view pg_stat_statements gains two columns:
- parallel_workers_to_launch, the number of parallel workers planned to
be launched.
- parallel_workers_launched, the number of parallel workers actually
launched.

The ratio of both columns offers hints that parallel workers are lacking
on a per-statement basis, requiring some tuning, in coordination with
"calls", the number of times a query is executed.

As of now, these numbers are tracked within Gather and GatherMerge
nodes.  They could be extended to utilities that make use of parallel
workers (parallel btree and brin, VACUUM).

The module is bumped to 1.12.

Author: Guillaume Lelarge
Discussion: https://postgr.es/m/CAECtzeWtTGOK0UgKXdDGpfTVSa5bd_VbUt6K6xn8P7X+_dZqKw@mail.gmail.com
2024-10-09 08:30:45 +09:00
Amit Kapila
d759c1a0b8 Stabilize the test added by commit 022564f60c.
The test was unstable in branches 14 and 15 as we were relying on the
number of changes in the table having a toast column to start streaming.
On branches >= 16, we have a GUC debug_logical_replication_streaming which
can stream each change, so the test was stable in those branches.

Change the test to use PREPARE TRANSACTION as that should make the result
consistent and test the code changed in 022564f60c.

Reported-by: Daniel Gustafsson as per buildfarm
Author: Hou Zhijie, Amit Kapila
Backpatch-through: 14
Discussion: https://postgr.es/m/8C2F86AA-981E-4803-B14D-E264C0255330@yesql.se
2024-10-08 12:25:52 +05:30
Nathan Bossart
5d6187d2a2 Fix Y2038 issues with MyStartTime.
Several places treat MyStartTime as a "long", which is only 32 bits
wide on some platforms.  In reality, MyStartTime is a pg_time_t,
i.e., a signed 64-bit integer.  This will lead to interesting bugs
on the aforementioned systems in 2038 when signed 32-bit integers
are no longer sufficient to store Unix time (e.g., "pg_ctl start"
hanging).  To fix, ensure that MyStartTime is handled as a 64-bit
value everywhere.  (Of course, users will need to ensure that
time_t is 64 bits wide on their system, too.)

Co-authored-by: Max Johnson
Discussion: https://postgr.es/m/CO1PR07MB905262E8AC270FAAACED66008D682%40CO1PR07MB9052.namprd07.prod.outlook.com
Backpatch-through: 12
2024-10-07 13:51:03 -05:00
Amit Kapila
022564f60c Fix fetching default toast value during decoding of in-progress transactions.
During logical decoding of in-progress transactions, we perform the toast
table scan while fetching the default toast value for an attribute. We
forgot to initialize the flag during this scan to indicate that the system
table scan is in progress. We need this flag to ensure that during logical
decoding we never directly access the tableam or heap APIs because we check
for concurrent aborts only in systable_* APIs.

Reported-by: Alexander Lakhin
Author: Takeshi Ideriha, Hou Zhijie
Reviewed-by: Amit Kapila, Hou Zhijie
Backpatch-through: 14
Discussion: https://postgr.es/m/18641-6687273b7f15269d@postgresql.org
2024-10-07 15:38:45 +05:30
Tom Lane
68dfecbef2 Use generateClonedIndexStmt to propagate CREATE INDEX to partitions.
When instantiating an existing partitioned index for a new child
partition, we use generateClonedIndexStmt to build a suitable
IndexStmt to pass to DefineIndex.  However, when DefineIndex needs
to recurse to instantiate a newly created partitioned index on an
existing child partition, it was doing copyObject on the given
IndexStmt and then applying a bunch of ad-hoc fixups.  This has
a number of problems, primarily that it implies fresh lookups of
referenced objects such as opclasses and collations.  Since commit
2af07e2f7 caused DefineIndex to restrict search_path internally, those
lookups could fail or deliver different results than the original one.
We can avoid those problems and save a few dozen lines of code by
using generateClonedIndexStmt in this code path too.

Another thing this fixes is incorrect propagation of parent-index
comments to child indexes (because the copyObject approach copies
the idxcomment field while generateClonedIndexStmt doesn't).  I had
noticed this in connection with commit c01eb619a, but not run the
problem to ground.

I'm tempted to back-patch this further than v17, but the only thing
it's known to fix in older branches is the comment issue, which is
pretty minor and doesn't seem worth the risk of introducing new
issues in stable branches.  (If anyone does care about that,
clearing idxcomment in the copied IndexStmt would be a safer fix.)

Per bug #18637 from usamoi.  Back-patch to v17 where the search_path
change came in.

Discussion: https://postgr.es/m/18637-f51e314546e3ba2a@postgresql.org
2024-10-05 14:46:44 -04:00
Fujii Masao
a1c4c8a9e1 file_fdw: Add on_error and log_verbosity options to file_fdw.
In v17, the on_error and log_verbosity options were introduced for
the COPY command. This commit extends support for these options
to file_fdw.

Setting on_error = 'ignore' for a file_fdw foreign table allows users
to query it without errors, even when the input file contains
malformed rows, by skipping the problematic rows.

Both on_error and log_verbosity options apply to SELECT and ANALYZE
operations on file_fdw foreign tables.

Author: Atsushi Torikoshi
Reviewed-by: Masahiko Sawada, Fujii Masao
Discussion: https://postgr.es/m/ab59dad10490ea3734cf022b16c24cfd@oss.nttdata.com
2024-10-03 15:57:32 +09:00
Peter Eisentraut
10b721821d Use macro to define the number of enum values
Refactoring in the interest of code consistency, a follow-up to 2e068db56e.

The argument against inserting a special enum value at the end of the enum
definition is that a switch statement might generate a compiler warning unless
it has a default clause.

Aleksander Alekseev, reviewed by Michael Paquier, Dean Rasheed, Peter Eisentraut

Discussion: https://postgr.es/m/CAJ7c6TMsiaV5urU_Pq6zJ2tXPDwk69-NKVh4AMN5XrRiM7N%2BGA%40mail.gmail.com
2024-10-01 09:30:24 -04:00
Michael Paquier
dc68515968 Show values of SET statements as constants in pg_stat_statements
This is a continuation of work like 11c34b342b, done to reduce the
bloat of pg_stat_statements by applying more normalization to query
entries.  This commit is able to detect and normalize values in
VariableSetStmt, resulting in:
SET conf_param = $1

Compared to other parse nodes, VariableSetStmt is embedded in much more
places in the parser, impacting many query patterns in
pg_stat_statements.  A custom jumble function is used, with an extra
field in the node to decide if arguments should be included in the
jumbling or not, a location field being not enough for this purpose.
This approach allows for a finer tuning.

Clauses relying on one or more keywords are not normalized, for example:
* DEFAULT
* FROM CURRENT
* List of keywords.  SET SESSION CHARACTERISTICS AS TRANSACTION,
where it is critical to differentiate different sets of options, is a
good example of why normalization should not happen.

Some queries use VariableSetStmt for some subclauses with SET, that also
have their values normalized:
- ALTER DATABASE
- ALTER ROLE
- ALTER SYSTEM
- CREATE/ALTER FUNCTION

ba90eac7a9 has added test coverage for most of the existing SET
patterns.  The expected output of these tests shows the difference this
commit creates.  Normalization could be perhaps applied to more portions
of the grammar but what is done here is conservative, and good enough as
a starting point.

Author: Greg Sabino Mullane, Michael Paquier
Discussion: https://postgr.es/m/36e5bffe-e989-194f-85c8-06e7bc88e6f7@amazon.com
Discussion: https://postgr.es/m/B44FA29D-EBD0-4DD9-ABC2-16F1CB087074@amazon.com
Discussion: https://postgr.es/m/CAKAnmmJtJY2jzQN91=2QAD2eAJAA-Per61eyO48-TyxEg-q0Rg@mail.gmail.com
2024-09-30 14:02:00 +09:00
Michael Paquier
ba90eac7a9 pg_stat_statements: Expand tests for SET statements
There are many grammar flavors that depend on the parse node
VariableSetStmt.  This closes the gap in pg_stat_statements by providing
test coverage for what should be a large majority of them, improving more
the work begun in de2aca2885.  This will be used to ease the
evaluation of a path towards more normalization of SET queries with
query jumbling.

Note that SET NAMES (grammar from the standard, synonym of SET
client_encoding) is omitted on purpose, this could use UTF8 with a
conditional script where UTF8 is supported, but that does not seem worth
the maintenance cost for the sake of these tests.

The author has submitted most of these in a TAP test (filled in any
holes I could spot), still queries in a SQL file of pg_stat_statements
is able to achieve the same goal while being easier to look at when
testing normalization patterns.

Author: Greg Sabino Mullane, Michael Paquier
Discussion: https://postgr.es/m/CAKAnmmJtJY2jzQN91=2QAD2eAJAA-Per61eyO48-TyxEg-q0Rg@mail.gmail.com
2024-09-25 10:04:44 +09:00
Tom Lane
cd838e2008 Neaten up our choices of SQLSTATEs for XML-related errors.
When our XML-handling modules were first written, the SQL standard
lacked any error codes that were particularly intended for XML
error conditions.  Unsurprisingly, this led to some rather random
choices of errcodes in those modules.  Now the standard has a whole
SQLSTATE class, "Class 10 - XQuery Error", with a reasonably large
selection of relevant-looking errcodes.

In this patch I've chosen one fairly generic code defined by the
standard, 10608 = invalid_argument_for_xquery, and used it where
it seemed appropriate.  I've also made an effort to replace
ERRCODE_INTERNAL_ERROR everywhere it was not clearly reporting
a coding problem; in particular, many of the existing uses look
like they can fairly be reported as ERRCODE_OUT_OF_MEMORY.

It might be interesting to try to map libxml2's error codes into
the standard's new collection, but I've not undertaken that here.

Discussion: https://postgr.es/m/417250.1726341268@sss.pgh.pa.us
2024-09-24 12:59:56 -04:00
Fujii Masao
4f08ab5545 postgres_fdw: Extend postgres_fdw_get_connections to return user name.
This commit adds a "user_name" output column to
the postgres_fdw_get_connections function, returning the name
of the local user mapped to the foreign server for each connection.
If a public mapping is used, it returns "public."

This helps identify postgres_fdw connections more easily,
such as determining which connections are invalid, closed,
or used within the current transaction.

No extension version bump is needed, as commit c297a47c5f
already handled it for v18~.

Author: Hayato Kuroda
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/b492a935-6c7e-8c08-e485-3c1d64d7d10f@oss.nttdata.com
2024-09-18 12:51:48 +09:00
Michael Paquier
933848d16d Add missing query ID reporting in extended query protocol
This commit adds query ID reports for two code paths when processing
extended query protocol messages:
- When receiving a bind message, setting it to the first Query retrieved
from a cached cache.
- When receiving an execute message, setting it to the first PlannedStmt
stored in a portal.

An advantage of this method is that this is able to cover all the types
of portals handled in the extended query protocol, particularly these
two when the report done in ExecutorStart() is not enough (neither is an
addition in ExecutorRun(), actually, for the second point):
- Multiple execute messages, with multiple ExecutorRun().
- Portal with execute/fetch messages, like a query with a RETURNING
clause and a fetch size that stores the tuples in a first execute
message going though ExecutorStart() and ExecuteRun(), followed by one
or more execute messages doing only fetches from the tuplestore created
in the first message.  This corresponds to the case where
execute_is_fetch is set, for example.

Note that the query ID reporting done in ExecutorStart() is still
necessary, as an EXECUTE requires it.  Query ID reporting is optimistic
and more calls to pgstat_report_query_id() don't matter as the first
report takes priority except if the report is forced.  The comment in
ExecutorStart() is adjusted to reflect better the reality with the
extended query protocol.

The test added in pg_stat_statements is a courtesy of Robert Haas.  This
uses psql's \bind metacommand, hence this part is backpatched down to
v16.

Reported-by:  Kaido Vaikla, Erik Wienhold
Author: Sami Imseih
Reviewed-by: Jian He, Andrei Lepikhov, Michael Paquier
Discussion: https://postgr.es/m/CA+427g8DiW3aZ6pOpVgkPbqK97ouBdf18VLiHFesea2jUk3XoQ@mail.gmail.com
Discussion: https://postgr.es/m/CA+TgmoZxtnf_jZ=VqBSyaU8hfUkkwoJCJ6ufy4LGpXaunKrjrg@mail.gmail.com
Discussion: https://postgr.es/m/1391613709.939460.1684777418070@office.mailbox.org
Backpatch-through: 14
2024-09-18 09:59:09 +09:00
Peter Eisentraut
89f908a6d0 Add temporal FOREIGN KEY contraints
Add PERIOD clause to foreign key constraint definitions.  This is
supported for range and multirange types.  Temporal foreign keys check
for range containment instead of equality.

This feature matches the behavior of the SQL standard temporal foreign
keys, but it works on PostgreSQL's native ranges instead of SQL's
"periods", which don't exist in PostgreSQL (yet).

Reference actions ON {UPDATE,DELETE} {CASCADE,SET NULL,SET DEFAULT}
are not supported yet.

(previously committed as 34768ee361, reverted by 8aee330af55; this is
essentially unchanged from those)

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-09-17 11:29:30 +02:00
Peter Eisentraut
fc0438b4e8 Add temporal PRIMARY KEY and UNIQUE constraints
Add WITHOUT OVERLAPS clause to PRIMARY KEY and UNIQUE constraints.
These are backed by GiST indexes instead of B-tree indexes, since they
are essentially exclusion constraints with = for the scalar parts of
the key and && for the temporal part.

(previously committed as 46a0cd4cef, reverted by 46a0cd4cefb; the new
part is this:)

Because 'empty' && 'empty' is false, the temporal PK/UQ constraint
allowed duplicates, which is confusing to users and breaks internal
expectations.  For instance, when GROUP BY checks functional
dependencies on the PK, it allows selecting other columns from the
table, but in the presence of duplicate keys you could get the value
from any of their rows.  So we need to forbid empties.

This all means that at the moment we can only support ranges and
multiranges for temporal PK/UQs, unlike the original patch (above).
Documentation and tests for this are added.  But this could
conceivably be extended by introducing some more general support for
the notion of "empty" for other types.

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-09-17 11:29:30 +02:00
Peter Eisentraut
7406ab623f Add stratnum GiST support function
This is support function 12 for the GiST AM and translates
"well-known" RT*StrategyNumber values into whatever strategy number is
used by the opclass (since no particular numbers are actually
required).  We will use this to support temporal PRIMARY
KEY/UNIQUE/FOREIGN KEY/FOR PORTION OF functionality.

This commit adds two implementations, one for internal GiST opclasses
(just an identity function) and another for btree_gist opclasses.  It
updates btree_gist from 1.7 to 1.8, adding the support function for
all its opclasses.

(previously committed as 6db4598fcb, reverted by 8aee330af55; this is
essentially unchanged from those)

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-09-17 11:29:29 +02:00
Tom Lane
d5622acb32 Replace usages of xmlXPathCompile() with xmlXPathCtxtCompile().
In existing releases of libxml2, xmlXPathCompile can be driven
to stack overflow because it fails to protect itself against
too-deeply-nested input.  While there is an upstream fix as of
yesterday, it will take years for that to propagate into all
shipping versions.  In the meantime, we can protect our own
usages basically for free by calling xmlXPathCtxtCompile instead.

(The actual bug is that libxml2 keeps its nesting counter in the
xmlXPathContext, and its parsing code was willing to just skip
counting nesting levels if it didn't have a context.  So if we supply
a context, all is well.  It seems odd actually that it works at all
to not supply a context, because this means that XPath parsing does
not have access to XML namespace info.  Apparently libxml2 never
checks namespaces until runtime?  Anyway, this seems like good
future-proofing even if its only immediate effect is to dodge a bug.)

Sadly, this hack only offers protection with libxml2 2.9.11 and newer.
Before that there are multiple similar problems, so if you are
processing untrusted XML it behooves you to get a newer version.
But we have some pretty old libxml2 in the buildfarm, so it seems
impractical to add a regression test to verify this fix.

Per bug #18617 from Jingzhou Fu.  Back-patch to all supported
versions.

Discussion: https://postgr.es/m/18617-1cee4d2ed1f4e7ae@postgresql.org
Discussion: https://gitlab.gnome.org/GNOME/libxml2/-/issues/799
2024-09-15 13:33:09 -04:00
Tom Lane
94537982ec Remove obsolete comment in pg_stat_statements.
Commit 76db9cb63 removed the use of multiple nesting counters,
but missed one comment describing that arrangement.

Back-patch to v17 where 76db9cb63 came in, just to avoid confusion.

Julien Rouhaud

Discussion: https://postgr.es/m/gfcwh3zjxc2vygltapgo7g6yacdor5s4ynr234b6v2ohhuvt7m@gr42joxalenw
2024-09-14 11:42:31 -04:00
Nathan Bossart
70d1c664f4 Fix contrib/pageinspect's test for sequences.
I managed to break this test in two different ways in commit
05036a3155.

First, the output of the new call to tuple_data_split() on the test
sequence is dependent on endianness.  This is fixed by setting a
special start value for the test sequence that produces the same
output regardless of the endianness of the machine.

Second, on versions older than v15, the new test case fails under
"force_parallel_mode = regress" with the following error:

	ERROR:  cannot access temporary tables during a parallel operation

This is because pageinspect's disk-accessing functions are
incorrectly marked PARALLEL SAFE on versions older than v15 (see
commit aeaaf520f4 for details).  This one is fixed by changing the
test sequence to be permanent.  The only reason it was previously
marked temporary was to avoid needing a DROP SEQUENCE command at
the end of the test.  Unlike some other tests in this file, the use
of a permanent sequence here shouldn't result in any test
instability like what was fixed by commit e2933a6e11.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/ZuOKOut5hhDlf_bP%40nathan
Backpatch-through: 12
2024-09-13 10:16:40 -05:00
Michael Paquier
7b1ddbae36 pg_stat_statements: Add tests with extended query protocol
There are currently no tests in the tree checking that queries using the
extended query protocol are able to map with their query ID.

This can be achieved for some paths of the extended query protocol with
the psql meta-commands \bind or \bind_named, so let's add some tests
based on both.

I have found that to be a useful addition while working on a different
issue.

Discussion: https://postgr.es/m/ZuEt6MOEBSlifBfn@paquier.xyz
2024-09-13 09:41:06 +09:00
Nathan Bossart
05036a3155 Reintroduce support for sequences in pgstattuple and pageinspect.
Commit 4b82664156 restricted a number of functions provided by
contrib modules to only relations that use the "heap" table access
method.  Sequences always use this table access method, but they do
not advertise as such in the pg_class system catalog, so the
aforementioned commit also (presumably unintentionally) removed
support for sequences from some of these functions.  This commit
reintroduces said support for sequences to these functions and adds
a couple of relevant tests.

Co-authored-by: Ayush Vatsa
Reviewed-by: Robert Haas, Michael Paquier, Matthias van de Meent
Discussion: https://postgr.es/m/CACX%2BKaP3i%2Bi9tdPLjF5JCHVv93xobEdcd_eB%2B638VDvZ3i%3DcQA%40mail.gmail.com
Backpatch-through: 12
2024-09-12 16:31:29 -05:00
Peter Eisentraut
8b5c6a54c4 Replace gratuitous memmove() with memcpy()
The index access methods all had similar code that copied the
passed-in scan keys to local storage.  They all used memmove() for
that, which is not wrong, but it seems confusing not to use memcpy()
when that would work.  Presumably, this was all once copied from
ancient code and never adjusted.

Discussion: https://www.postgresql.org/message-id/flat/f8c739d9-f48d-4187-b214-df3391ba41ab@eisentraut.org
2024-09-11 15:21:36 +02:00
Noah Misch
65c310b310 Optimize pg_visibility with read streams.
We've measured 5% performance improvement, and this arranges to benefit
automatically from future optimizations to the read_stream subsystem.
The area lacked test coverage, so close that gap.

Nazir Bilal Yavuz

Discussion: https://postgr.es/m/CAN55FZ1_Ru3XpMgTwsU67FTH2fs_FrRROmb7x6zs+F44QBEiww@mail.gmail.com
Discussion: https://postgr.es/m/CAEudQAozv3wTY5TV2t29JcwPydbmKbiWQkZD42S2OgzdixPMDQ@mail.gmail.com
2024-09-10 15:21:33 -07:00
Peter Eisentraut
56fead44dc Add amgettreeheight index AM API routine
The only current implementation is for btree where it calls
_bt_getrootheight().  Other index types can now also use this to pass
information to their amcostestimate routine.  Previously, btree was
hardcoded and other index types could not hook into the optimizer at
this point.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2024-09-10 10:03:23 +02:00
Tom Lane
de239d01e7 Consistently use PageGetExactFreeSpace() in pgstattuple.
Previously this code used PageGetHeapFreeSpace on heap pages,
and usually used PageGetFreeSpace on index pages (though for some
reason GetHashPageStats used PageGetExactFreeSpace instead).
The difference is that those functions subtract off the size of
a line pointer, and PageGetHeapFreeSpace has some additional
rules about returning zero if adding another line pointer would
require exceeding MaxHeapTuplesPerPage.  Those things make sense
when testing to see if a new tuple can be put on the page, but
they seem pretty strange for pure statistics collection.

Additionally, statapprox_heap had a special rule about counting
a "new" page as being fully available space.  This also seems
strange, because it's not actually usable until VACUUM or some
such process initializes the page.  Moreover, it's inconsistent
with what pgstat_heap does, which is to count such a page as
having zero free space.  So make it work like pgstat_heap, which
as of this patch unconditionally calls PageGetExactFreeSpace.

This is more of a definitional change than a bug fix, so no
back-patch.  The module's documentation doesn't define exactly
what "free space" means either, so we left that as-is.

Frédéric Yhuel, reviewed by Rafia Sabih and Andreas Karlsson.

Discussion: https://postgr.es/m/3a18f843-76f6-4a84-8cca-49537fefa15d@dalibo.com
2024-09-09 14:34:10 -04:00
Tom Lane
218527d014 Don't bother checking the result of SPI_connect[_ext] anymore.
SPI_connect/SPI_connect_ext have not returned any value other than
SPI_OK_CONNECT since commit 1833f1a1c in v10; any errors are thrown
via ereport.  (The most likely failure is out-of-memory, which has
always been thrown that way, so callers had better be prepared for
such errors.)  This makes it somewhat pointless to check these
functions' result, and some callers within our code haven't been
bothering; indeed, the only usage example within spi.sgml doesn't
bother.  So it's likely that the omission has propagated into
extensions too.

Hence, let's standardize on not checking, and document the return
value as historical, while not actually changing these functions'
behavior.  (The original proposal was to change their return type
to "void", but that would needlessly break extensions that are
conforming to the old practice.)  This saves a small amount of
boilerplate code in a lot of places.

Stepan Neretin

Discussion: https://postgr.es/m/CAMaYL5Z9Uk8cD9qGz9QaZ2UBJFOu7jFx5Mwbznz-1tBbPDQZow@mail.gmail.com
2024-09-09 12:18:34 -04:00
Noah Misch
ddfc556a64 Revert "Optimize pg_visibility with read streams."
This reverts commit ed1b1ee59f and its
followup 1c61fd8b52.  They rendered
collect_corrupt_items() unable to detect corruption.

Discussion: https://postgr.es/m/CAN55FZ1_Ru3XpMgTwsU67FTH2fs_FrRROmb7x6zs+F44QBEiww@mail.gmail.com
Discussion: https://postgr.es/m/CAEudQAozv3wTY5TV2t29JcwPydbmKbiWQkZD42S2OgzdixPMDQ@mail.gmail.com
2024-09-04 11:36:40 -07:00
Michael Paquier
b4db64270e Apply more quoting to GUC names in messages
This is a continuation of 17974ec259.  More quotes are applied to
GUC names in error messages and hints, taking care of what seems to be
all the remaining holes currently in the tree for the GUCs.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
2024-09-04 13:50:44 +09:00
Noah Misch
1c61fd8b52 Fix stack variable scope from previous commit.
The defect came from me, not from that commit's credited author.  Per
buildfarm members olingo and grassquit.

Discussion: https://postgr.es/m/20240903192030.1e@rfd.leadboat.com
2024-09-03 12:44:54 -07:00
Noah Misch
ed1b1ee59f Optimize pg_visibility with read streams.
We've measured 5% performance improvement, and this arranges to benefit
automatically from future optimizations to the read_stream subsystem.

Nazir Bilal Yavuz

Discussion: https://postgr.es/m/CAN55FZ1_Ru3XpMgTwsU67FTH2fs_FrRROmb7x6zs+F44QBEiww@mail.gmail.com
2024-09-03 10:46:20 -07:00
Noah Misch
c582b75851 Add block_range_read_stream_cb(), to deduplicate code.
This replaces two functions for iterating over all blocks in a range.  A
pending patch will use this instead of adding a third.

Nazir Bilal Yavuz

Discussion: https://postgr.es/m/20240820184742.f2.nmisch@google.com
2024-09-03 10:46:20 -07:00
Michael Paquier
4236825197 Fix typos and grammar in code comments and docs
Author: Alexander Lakhin
Discussion: https://postgr.es/m/f7e514cf-2446-21f1-a5d2-8c089a6e2168@gmail.com
2024-09-03 14:49:04 +09:00
Daniel Gustafsson
a70e01d430 Remove support for OpenSSL older than 1.1.0
OpenSSL 1.0.2 has been EOL from the upstream OpenSSL project for
some time, and is no longer the default OpenSSL version with any
vendor which package PostgreSQL. By retiring support for OpenSSL
1.0.2 we can remove a lot of no longer required complexity for
managing state within libcrypto which is now handled by OpenSSL.

Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/ZG3JNursG69dz1lr@paquier.xyz
Discussion: https://postgr.es/m/CA+hUKGKh7QrYzu=8yWEUJvXtMVm_CNWH1L_TLWCbZMwbi1XP2Q@mail.gmail.com
2024-09-02 13:51:48 +02:00
Tom Lane
0e5c823806 Make postgres_fdw's query_cancel test less flaky.
This test occasionally shows

+WARNING:  could not get result of cancel request due to timeout

which appears to be because the cancel request is sometimes unluckily
sent to the remote session between queries, and then it's ignored.

This patch tries to make that less probable in three ways:

1. Use a test query that does not involve remote estimates, so that
no EXPLAINs are sent.
2. Make sure that the remote session is ready-to-go (transaction
started, SET commands sent) before we start the timer.
3. Increase the statement_timeout to 100ms, to give the local
session enough time to plan and issue the query.

We might have to go higher than 100ms to make this adequately
stable in the buildfarm, but let's see how it goes.

Back-patch to v17 where this test was introduced.

Jelte Fennema-Nio and Tom Lane

Discussion: https://postgr.es/m/578934.1725045685@sss.pgh.pa.us
2024-08-30 16:47:39 -04:00
Masahiko Sawada
52f1d6730b Fix memory counter update in ReorderBuffer.
Commit 5bec1d6bc5 changed the memory usage updates of the
ReorderBufferTXN to zero all at once by subtracting txn->size, rather
than updating it for each change. However, if TOAST reconstruction
data remained in the transaction when freeing it, there were cases
where it further subtracted the memory counter from zero, resulting in
an assertion failure.

This change calculates the memory size for each change and updates the
memory usage to precisely the amount that has been freed.

Backpatch to v17, where this was introducd.

Reviewed-by: Amit Kapila, Shlok Kyal
Discussion: https://postgr.es/m/CAD21AoAqkNUvicgKPT_dXzNoOwpPkVTg0QPPxEcWmzT0moCJ1g%40mail.gmail.com
Backpatch-through: 17
2024-08-26 11:00:07 -07:00
Robert Haas
e222534679 Treat number of disabled nodes in a path as a separate cost metric.
Previously, when a path type was disabled by e.g. enable_seqscan=false,
we either avoided generating that path type in the first place, or
more commonly, we added a large constant, called disable_cost, to the
estimated startup cost of that path. This latter approach can distort
planning. For instance, an extremely expensive non-disabled path
could seem to be worse than a disabled path, especially if the full
cost of that path node need not be paid (e.g. due to a Limit).
Or, as in the regression test whose expected output changes with this
commit, the addition of disable_cost can make two paths that would
normally be distinguishible in cost seem to have fuzzily the same cost.

To fix that, we now count the number of disabled path nodes and
consider that a high-order component of both the startup cost and the
total cost. Hence, the path list is now sorted by disabled_nodes and
then by total_cost, instead of just by the latter, and likewise for
the partial path list.  It is important that this number is a count
and not simply a Boolean; else, as soon as we're unable to respect
disabled path types in all portions of the path, we stop trying to
avoid them where we can.

Because the path list is now sorted by the number of disabled nodes,
the join prechecks must compute the count of disabled nodes during
the initial cost phase instead of postponing it to final cost time.

Counts of disabled nodes do not cross subquery levels; at present,
there is no reason for them to do so, since the we do not postpone
path selection across subquery boundaries (see make_subplan).

Reviewed by Andres Freund, Heikki Linnakangas, and David Rowley.

Discussion: http://postgr.es/m/CA+TgmoZ_+MS+o6NeGK2xyBv-xM+w1AfFVuHE4f_aq6ekHv7YSQ@mail.gmail.com
2024-08-21 10:12:30 -04:00
Peter Eisentraut
95b856de23 Remove incidental md5() function use from test
To allow test to pass in OpenSSL FIPS mode, similar to 657f5f223e, for
a new test that has been added since.

Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://www.postgresql.org/message-id/86763810-70a1-4872-8ba7-1676f788e5a2@eisentraut.org
2024-08-16 17:14:32 +02:00
Alexander Korotkov
e3ec9dc1bf Add missing wait_for_catchup() to pg_visibility tap test
e2ed7e3227 introduced check of pg_visibility on standby.  This commit adds
missing wait_for_catchup() to synchronize standby before querying it.
2024-08-16 00:58:32 +03:00
Alexander Korotkov
e2ed7e3227 Fix GetStrictOldestNonRemovableTransactionId() on standby
e85662df44 implemented GetStrictOldestNonRemovableTransactionId() function
for computation of xid horizon that avoid reporting of false errors.
However, GetStrictOldestNonRemovableTransactionId() uses
GetRunningTransactionData() even on standby leading to an assertion failure.

Given that we decided to ignore KnownAssignedXids and standby can't have
own running xids, we switch to use TransamVariables->nextXid as a xid horizon.

Also, revise the comment regarding ignoring KnownAssignedXids with more
detailed reasoning provided by Heikki.

Reported-by: Heikki Linnakangas
Discussion: https://postgr.es/m/42218c4f-2c8d-40a3-8743-4d34dd0e4cce%40iki.fi
Reviewed-by: Heikki Linnakangas
2024-08-16 00:18:55 +03:00
Heikki Linnakangas
fe8dd65bf2 Mark misc static global variables as const
Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/54c29fb0-edf2-48ea-9814-44e918bbd6e8@iki.fi
2024-08-06 23:04:48 +03:00
Nathan Bossart
8928817769 Remove volatile qualifiers from pg_stat_statements.c.
Prior to commit 0709b7ee72, which changed the spinlock primitives
to function as compiler barriers, access to variables within a
spinlock-protected section required using a volatile pointer, but
that is no longer necessary.

Reviewed-by: Bertrand Drouvot, Michael Paquier
Discussion: https://postgr.es/m/Zqkv9iK7MkNS0KaN%40nathan
2024-08-06 10:56:37 -05:00
Masahiko Sawada
66e94448ab Restrict accesses to non-system views and foreign tables during pg_dump.
When pg_dump retrieves the list of database objects and performs the
data dump, there was possibility that objects are replaced with others
of the same name, such as views, and access them. This vulnerability
could result in code execution with superuser privileges during the
pg_dump process.

This issue can arise when dumping data of sequences, foreign
tables (only 13 or later), or tables registered with a WHERE clause in
the extension configuration table.

To address this, pg_dump now utilizes the newly introduced
restrict_nonsystem_relation_kind GUC parameter to restrict the
accesses to non-system views and foreign tables during the dump
process. This new GUC parameter is added to back branches too, but
these changes do not require cluster recreation.

Back-patch to all supported branches.

Reviewed-by: Noah Misch
Security: CVE-2024-7348
Backpatch-through: 12
2024-08-05 06:05:33 -07:00
Peter Eisentraut
9fb855fe1a Include bison header files into implementation files
Before Bison 3.4, the generated parser implementation files run afoul
of -Wmissing-variable-declarations (in spite of commit ab61c40bfa)
because declarations for yylval and possibly yylloc are missing.  The
generated header files contain an extern declaration, but the
implementation files don't include the header files.  Since Bison 3.4,
the generated implementation files automatically include the generated
header files, so then it works.

To make this work with older Bison versions as well, include the
generated header file from the .y file.

(With older Bison versions, the generated implementation file contains
effectively a copy of the header file pasted in, so including the
header file is redundant.  But we know this works anyway because the
core grammar uses this arrangement already.)

Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-08-02 10:25:11 +02:00