261 Commits

Author SHA1 Message Date
Álvaro Herrera
62d712ecfd
Introduce squashing of constant lists in query jumbling
pg_stat_statements produces multiple entries for queries like
    SELECT something FROM table WHERE col IN (1, 2, 3, ...)

depending on the number of parameters, because every element of
ArrayExpr is individually jumbled.  Most of the time that's undesirable,
especially if the list becomes too large.

Fix this by introducing a new GUC query_id_squash_values which modifies
the node jumbling code to only consider the first and last element of a
list of constants, rather than each list element individually.  This
affects both the query_id generated by query jumbling, as well as
pg_stat_statements query normalization so that it suppresses printing of
the individual elements of such a list.

The default value is off, meaning the previous behavior is maintained.

Author: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Sergey Dudoladov (mysterious, off-list)
Reviewed-by: David Geier <geidav.pg@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Sutou Kouhei <kou@clear-code.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Marcos Pegoraro <marcos@f10.com.br>
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Tested-by: Yasuo Honda <yasuo.honda@gmail.com>
Tested-by: Sergei Kornilov <sk@zsrv.org>
Tested-by: Maciek Sakrejda <m.sakrejda@gmail.com>
Tested-by: Chengxi Sun <sunchengxi@highgo.com>
Tested-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Discussion: https://postgr.es/m/CA+q6zcWtUbT_Sxj0V6HY6EZ89uv5wuG5aefpe_9n0Jr3VwntFg@mail.gmail.com
2025-03-18 18:56:11 +01:00
Robert Haas
95dbd827f2 EXPLAIN: Always use two fractional digits for row counts.
Commit ddb17e387aa28d61521227377b00f997756b8a27 attempted to avoid
confusing users by displaying digits after the decimal point only when
nloops > 1, since it's impossible to have a fraction row count after a
single iteration. However, this made the regression tests unstable since
parallal queries will have nloops>1 for all nodes below the Gather or
Gather Merge in normal cases, but if the workers don't start in time and
the leader finishes all the work, they will suddenly have nloops==1,
making it unpredictable whether the digits after the decimal point would
be displayed or not. Although 44cbba9a7f51a3888d5087fc94b23614ba2b81f2
seemed to fix the immediate failures, it may still be the case that there
are lower-probability failures elsewhere in the regression tests.

Various fixes are possible here. For example, it has previously been
proposed that we should try to display the digits after the decimal
point only if rows/nloops is an integer, but currently rows is storead
as a float so it's not theoretically an exact quantity -- precision
could be lost in extreme cases. It has also been proposed that we
should try to display the digits after the decimal point only if we're
under some sort of construct that could potentially cause looping
regardless of whether it actually does. While such ideas are not
without merit, this patch adopts the much simpler solution of always
display two decimal digits. If that approach stands up to scrutiny
from the buildfarm and human users, it spares us the trouble of doing
anything more complex; if not, we can reassess.

This commit incidentally reverts 44cbba9a7f51a3888d5087fc94b23614ba2b81f2,
which should no longer be needed.

Author: Robert Haas <robertmhaas@gmail.com>
Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Discussion: http://postgr.es/m/CA+TgmoazzVHn8sFOMFAEwoqBTDxKT45D7mvkyeHgqtoD2cn58Q@mail.gmail.com
2025-02-27 11:27:16 -05:00
Amit Langote
525392d572 Don't lock partitions pruned by initial pruning
Before executing a cached generic plan, AcquireExecutorLocks() in
plancache.c locks all relations in a plan's range table to ensure the
plan is safe for execution. However, this locks runtime-prunable
relations that will later be pruned during "initial" runtime pruning,
introducing unnecessary overhead.

This commit defers locking for such relations to executor startup and
ensures that if the CachedPlan is invalidated due to concurrent DDL
during this window, replanning is triggered. Deferring these locks
avoids unnecessary locking overhead for pruned partitions, resulting
in significant speedup, particularly when many partitions are pruned
during initial runtime pruning.

* Changes to locking when executing generic plans:

AcquireExecutorLocks() now locks only unprunable relations, that is,
those found in PlannedStmt.unprunableRelids (introduced in commit
cbc127917e), to avoid locking runtime-prunable partitions
unnecessarily.  The remaining locks are taken by
ExecDoInitialPruning(), which acquires them only for partitions that
survive pruning.

This deferral does not affect the locks required for permission
checking in InitPlan(), which takes place before initial pruning.
ExecCheckPermissions() now includes an Assert to verify that all
relations undergoing permission checks, none of which can be in the
set of runtime-prunable relations, are properly locked.

* Plan invalidation handling:

Deferring locks introduces a window where prunable relations may be
altered by concurrent DDL, invalidating the plan. A new function,
ExecutorStartCachedPlan(), wraps ExecutorStart() to detect and handle
invalidation caused by deferred locking. If invalidation occurs,
ExecutorStartCachedPlan() updates CachedPlan using the new
UpdateCachedPlan() function and retries execution with the updated
plan. To ensure all code paths that may be affected by this handle
invalidation properly, all callers of ExecutorStart that may execute a
PlannedStmt from a CachedPlan have been updated to use
ExecutorStartCachedPlan() instead.

UpdateCachedPlan() replaces stale plans in CachedPlan.stmt_list. A new
CachedPlan.stmt_context, created as a child of CachedPlan.context,
allows freeing old PlannedStmts while preserving the CachedPlan
structure and its statement list. This ensures that loops over
statements in upstream callers of ExecutorStartCachedPlan() remain
intact.

ExecutorStart() and ExecutorStart_hook implementations now return a
boolean value indicating whether plan initialization succeeded with a
valid PlanState tree in QueryDesc.planstate, or false otherwise, in
which case QueryDesc.planstate is NULL. Hook implementations are
required to call standard_ExecutorStart() at the beginning, and if it
returns false, they should do the same without proceeding.

* Testing:

To verify these changes, the delay_execution module tests scenarios
where cached plans become invalid due to changes in prunable relations
after deferred locks.

* Note to extension authors:

ExecutorStart_hook implementations must verify plan validity after
calling standard_ExecutorStart(), as explained earlier. For example:

    if (prev_ExecutorStart)
        plan_valid = prev_ExecutorStart(queryDesc, eflags);
    else
        plan_valid = standard_ExecutorStart(queryDesc, eflags);

    if (!plan_valid)
        return false;

    <extension-code>

    return true;

Extensions accessing child relations, especially prunable partitions,
via ExecGetRangeTableRelation() must now ensure their RT indexes are
present in es_unpruned_relids (introduced in commit cbc127917e), or
they will encounter an error. This is a strict requirement after this
change, as only relations in that set are locked.

The idea of deferring some locks to executor startup, allowing locks
for prunable partitions to be skipped, was first proposed by Tom Lane.

Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions)
Reviewed-by: David Rowley <dgrowleyml@gmail.com> (earlier versions)
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (earlier versions)
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
2025-02-20 17:09:48 +09:00
Michael Paquier
ce5bcc4a9f pg_stat_statements: Add wal_buffers_full
wal_buffers_full tracks the number of times WAL buffers become full,
giving hints to be able to tune the GUC wal_buffers.

Up to now, this information was only available in pg_stat_wal.  With
this field available in WalUsage since eaf502747bac, exposing it in
pg_stat_statements is straight-forward, and it offers more granularity
at query level.

pg_stat_statements does not need a version bump as one has been done in
commit cf54a2c00254 for this development cycle.

Author: Bertrand Drouvot
Reviewed-by: Ilia Evdokimov
Discussion: https://postgr.es/m/Z6SOha5YFFgvpwQY@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-17 13:55:17 +09:00
Bruce Momjian
50e6eb731d Update copyright for 2025
Backpatch-through: 13
2025-01-01 11:21:55 -05:00
Michael Paquier
7f97b4734f Fix some comments related to library unloading
Library unloading has never been supported with its code removed in
ab02d702ef08, and there were some comments still mentioning that it was
a possible operation.

ChangAo has noticed the incorrect references in dfmgr.c, while I have
noticed the other ones while scanning the rest of the tree for similar
mistakes.

Author: ChangAo Chen, Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/tencent_1D09840A1632D406A610C8C4E2491D74DB0A@qq.com
2024-12-23 14:46:49 +09:00
David Rowley
89988ac589 Fix further fallout from EXPLAIN ANALYZE BUFFERS change
c2a4078eb adjusted EXPLAIN ANALYZE to default the BUFFERS to ON.  This
(hopefully) fixes the last remaining issue with regression test failures
with -D RELCACHE_FORCE_RELEASE -D CATCACHE_FORCE_RELEASE builds, where
the planner accesses more buffers due to the cold caches.

Discussion: https://postgr.es/m/CAApHDvqLdzgz77JsE-yTki3w9UiKQ-uTMLRctazcu+99-ips3g@mail.gmail.com
2024-12-12 09:50:00 +13:00
Tom Lane
3eea7a0c97 Simplify executor's determination of whether to use parallelism.
Our parallel-mode code only works when we are executing a query
in full, so ExecutePlan must disable parallel mode when it is
asked to do partial execution.  The previous logic for this
involved passing down a flag (variously named execute_once or
run_once) from callers of ExecutorRun or PortalRun.  This is
overcomplicated, and unsurprisingly some of the callers didn't
get it right, since it requires keeping state that not all of
them have handy; not to mention that the requirements for it were
undocumented.  That led to assertion failures in some corner
cases.  The only state we really need for this is the existing
QueryDesc.already_executed flag, so let's just put all the
responsibility in ExecutePlan.  (It could have been done in
ExecutorRun too, leading to a slightly shorter patch -- but if
there's ever more than one caller of ExecutePlan, it seems better
to have this logic in the subroutine than the callers.)

This makes those ExecutorRun/PortalRun parameters unnecessary.
In master it seems okay to just remove them, returning the
API for those functions to what it was before parallelism.
Such an API break is clearly not okay in stable branches,
but for them we can just leave the parameters in place after
documenting that they do nothing.

Per report from Yugo Nagata, who also reviewed and tested
this patch.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/20241206062549.710dc01cf91224809dd6c0e1@sraoss.co.jp
2024-12-09 14:38:19 -05:00
Michael Paquier
5d4298e75f pg_stat_statements: Avoid some locking during PGSS entry scans
A single PGSS entry's spinlock is used to be able to modify "counters"
without holding pgss->lock exclusively, as mentioned at the top of
pg_stat_statements.c and within pgssEntry.

Within a single pgssEntry, stats_since and minmax_stats_since are never
modified without holding pgss->lock exclusively, so there is no need to
hold an entry's spinlock when reading stats_since and
minmax_stats_since, as done when scanning all the PGSS entries for
function calls of pg_stat_statements().

This also restores the consistency between the code and the comments
about the entry's spinlock usage.  This change is a performance
improvement (it can be argued that this is a logic bug), so there is no
need for a backpatch.  This saves two instructions from being read while
holding an entry's spinlock.

Author: Karina Litskevich
Reviewed-by: Michael Paquier, wenhui qiu
Discussion: https://postgr.es/m/CACiT8ibhCmzbcOxM0v4pRLH3abk-95LPkt7_uC2JMP+miPjxsg@mail.gmail.com
2024-11-11 09:02:30 +09:00
Peter Eisentraut
9be4e5d293 Remove unused #include's from contrib, pl, test .c files
as determined by IWYU

Similar to commit dbbca2cf299, but for contrib, pl, and src/test/.

Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
2024-10-28 08:02:17 +01:00
Michael Paquier
6b652e6ce8 Set query ID for inner queries of CREATE TABLE AS and DECLARE
Some utility statements contain queries that can be planned and
executed: CREATE TABLE AS and DECLARE CURSOR.  This commit adds query ID
computation for the inner queries executed by these two utility
commands, with and without EXPLAIN.  This change leads to four new
callers of JumbleQuery() and post_parse_analyze_hook() so as extensions
can decide what to do with this new data.

Previously, extensions relying on the query ID, like pg_stat_statements,
were not able to track these nested queries as the query_id was 0.

For pg_stat_statements, this commit leads to additions under !toplevel
when pg_stat_statements.track is set to "all", as shown in its
regression tests.  The output of EXPLAIN for these two utilities gains a
"Query Identifier" if compute_query_id is enabled.

Author: Anthonin Bonnefoy
Reviewed-by: Michael Paquier, Jian He
Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com
2024-10-28 09:03:20 +09:00
Michael Paquier
499edb0974 Track more precisely query locations for nested statements
Previously, a Query generated through the transform phase would have
unset stmt_location, tracking the starting point of a query string.

Extensions relying on the statement location to extract its relevant
parts in the source text string would fallback to use the whole
statement instead, leading to confusing results like in
pg_stat_statements for queries relying on nested queries, like:
- EXPLAIN, with top-level and nested query using the same query string,
and a query ID coming from the nested query when the non-top-level
entry.
- Multi-statements, with only partial portions of queries being
normalized.
- COPY TO with a query, SELECT or DMLs.

This patch improves things by keeping track of the statement locations
and propagate it to Query during transform, allowing PGSS to only show
the relevant part of the query for nested query.  This leads to less
bloat in entries for non-top-level entries, as queries can now be
grouped within the same (toplevel, queryid) duos in pg_stat_statements.
The result gives a stricter one-one mapping between query IDs and its
query strings.

The regression tests introduced in 45e0ba30fc40 produce differences
reflecting the new logic.

Author: Anthonin Bonnefoy
Reviewed-by: Michael Paquier, Jian He
Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com
2024-10-24 09:29:54 +09:00
Tom Lane
14e5680eee Improve parser's reporting of statement start locations.
Up to now, the parser's reporting of a statement's stmt_location
included any preceding whitespace or comments.  This isn't really
desirable but was done to avoid accounting honestly for nonterminals
that reduce to empty.  It causes problems for pg_stat_statements,
which partially compensates by manually stripping whitespace, but
is not bright enough to strip /*-style comments.  There will be
more problems with an upcoming patch to improve reporting of errors
in extension scripts, so it's time to do something about this.

The thing we have to do to make it work right is to adjust
YYLLOC_DEFAULT to scan the inputs of each production to find the
first one that has a valid location (i.e., did not reduce to
empty).  In theory this adds a little bit of per-reduction overhead,
but in practice it's negligible.  I checked by measuring the time
to run raw_parser() on the contents of information_schema.sql, and
there was basically no change.

Having done that, we can rely on any nonterminal that didn't reduce
to completely empty to have a correct starting location, and we don't
need the kluges the stmtmulti production formerly used.

This should have a side benefit of allowing parse error reports to
include an error position in some cases where they formerly failed to
do so, due to trying to report the position of an empty nonterminal.
I did not go looking for an example though.  The one previously known
case where that could happen (OptSchemaEltList) no longer needs the
kluge it had; but I rather doubt that that was the only case.

Discussion: https://postgr.es/m/ZvV1ClhnbJLCz7Sm@msg.df7cb.de
2024-10-22 11:26:05 -04:00
Michael Paquier
45e0ba30fc pg_stat_statements: Add tests for nested queries with level tracking
There have never been any regression tests in PGSS for various query
patterns for nested queries combined with level tracking, like:
- Multi-statements.
- CREATE TABLE AS
- CREATE/REFRESH MATERIALIZED VIEW
- DECLARE CURSOR
- EXPLAIN, with a subset of the above supported.
- COPY.

All the tests added here track historical, sometimes confusing, existing
behaviors.  For example, EXPLAIN stores two PGSS entries with the same
top-level query string but two different query IDs as one is calculated
for the top-level EXPLAIN (this part is right) and a second one for the
inner query in the EXPLAIN (this part is not right).

A couple of patches are under discussion to improve the situation, and
all the tests added here will prove useful to evaluate the changes
discussed.

Author: Anthonin Bonnefoy
Reviewed-by: Michael Paquier, Jian He
Discussion: https://postgr.es/m/CAO6_XqqM6S9bQ2qd=75W+yKATwoazxSNhv5sjW06fjGAtHbTUA@mail.gmail.com
2024-10-22 13:05:51 +09:00
Michael Paquier
cf54a2c002 pg_stat_statements: Add columns to track parallel worker activity
The view pg_stat_statements gains two columns:
- parallel_workers_to_launch, the number of parallel workers planned to
be launched.
- parallel_workers_launched, the number of parallel workers actually
launched.

The ratio of both columns offers hints that parallel workers are lacking
on a per-statement basis, requiring some tuning, in coordination with
"calls", the number of times a query is executed.

As of now, these numbers are tracked within Gather and GatherMerge
nodes.  They could be extended to utilities that make use of parallel
workers (parallel btree and brin, VACUUM).

The module is bumped to 1.12.

Author: Guillaume Lelarge
Discussion: https://postgr.es/m/CAECtzeWtTGOK0UgKXdDGpfTVSa5bd_VbUt6K6xn8P7X+_dZqKw@mail.gmail.com
2024-10-09 08:30:45 +09:00
Peter Eisentraut
10b721821d Use macro to define the number of enum values
Refactoring in the interest of code consistency, a follow-up to 2e068db56e31.

The argument against inserting a special enum value at the end of the enum
definition is that a switch statement might generate a compiler warning unless
it has a default clause.

Aleksander Alekseev, reviewed by Michael Paquier, Dean Rasheed, Peter Eisentraut

Discussion: https://postgr.es/m/CAJ7c6TMsiaV5urU_Pq6zJ2tXPDwk69-NKVh4AMN5XrRiM7N%2BGA%40mail.gmail.com
2024-10-01 09:30:24 -04:00
Michael Paquier
dc68515968 Show values of SET statements as constants in pg_stat_statements
This is a continuation of work like 11c34b342bd7, done to reduce the
bloat of pg_stat_statements by applying more normalization to query
entries.  This commit is able to detect and normalize values in
VariableSetStmt, resulting in:
SET conf_param = $1

Compared to other parse nodes, VariableSetStmt is embedded in much more
places in the parser, impacting many query patterns in
pg_stat_statements.  A custom jumble function is used, with an extra
field in the node to decide if arguments should be included in the
jumbling or not, a location field being not enough for this purpose.
This approach allows for a finer tuning.

Clauses relying on one or more keywords are not normalized, for example:
* DEFAULT
* FROM CURRENT
* List of keywords.  SET SESSION CHARACTERISTICS AS TRANSACTION,
where it is critical to differentiate different sets of options, is a
good example of why normalization should not happen.

Some queries use VariableSetStmt for some subclauses with SET, that also
have their values normalized:
- ALTER DATABASE
- ALTER ROLE
- ALTER SYSTEM
- CREATE/ALTER FUNCTION

ba90eac7a995 has added test coverage for most of the existing SET
patterns.  The expected output of these tests shows the difference this
commit creates.  Normalization could be perhaps applied to more portions
of the grammar but what is done here is conservative, and good enough as
a starting point.

Author: Greg Sabino Mullane, Michael Paquier
Discussion: https://postgr.es/m/36e5bffe-e989-194f-85c8-06e7bc88e6f7@amazon.com
Discussion: https://postgr.es/m/B44FA29D-EBD0-4DD9-ABC2-16F1CB087074@amazon.com
Discussion: https://postgr.es/m/CAKAnmmJtJY2jzQN91=2QAD2eAJAA-Per61eyO48-TyxEg-q0Rg@mail.gmail.com
2024-09-30 14:02:00 +09:00
Michael Paquier
ba90eac7a9 pg_stat_statements: Expand tests for SET statements
There are many grammar flavors that depend on the parse node
VariableSetStmt.  This closes the gap in pg_stat_statements by providing
test coverage for what should be a large majority of them, improving more
the work begun in de2aca288569.  This will be used to ease the
evaluation of a path towards more normalization of SET queries with
query jumbling.

Note that SET NAMES (grammar from the standard, synonym of SET
client_encoding) is omitted on purpose, this could use UTF8 with a
conditional script where UTF8 is supported, but that does not seem worth
the maintenance cost for the sake of these tests.

The author has submitted most of these in a TAP test (filled in any
holes I could spot), still queries in a SQL file of pg_stat_statements
is able to achieve the same goal while being easier to look at when
testing normalization patterns.

Author: Greg Sabino Mullane, Michael Paquier
Discussion: https://postgr.es/m/CAKAnmmJtJY2jzQN91=2QAD2eAJAA-Per61eyO48-TyxEg-q0Rg@mail.gmail.com
2024-09-25 10:04:44 +09:00
Michael Paquier
933848d16d Add missing query ID reporting in extended query protocol
This commit adds query ID reports for two code paths when processing
extended query protocol messages:
- When receiving a bind message, setting it to the first Query retrieved
from a cached cache.
- When receiving an execute message, setting it to the first PlannedStmt
stored in a portal.

An advantage of this method is that this is able to cover all the types
of portals handled in the extended query protocol, particularly these
two when the report done in ExecutorStart() is not enough (neither is an
addition in ExecutorRun(), actually, for the second point):
- Multiple execute messages, with multiple ExecutorRun().
- Portal with execute/fetch messages, like a query with a RETURNING
clause and a fetch size that stores the tuples in a first execute
message going though ExecutorStart() and ExecuteRun(), followed by one
or more execute messages doing only fetches from the tuplestore created
in the first message.  This corresponds to the case where
execute_is_fetch is set, for example.

Note that the query ID reporting done in ExecutorStart() is still
necessary, as an EXECUTE requires it.  Query ID reporting is optimistic
and more calls to pgstat_report_query_id() don't matter as the first
report takes priority except if the report is forced.  The comment in
ExecutorStart() is adjusted to reflect better the reality with the
extended query protocol.

The test added in pg_stat_statements is a courtesy of Robert Haas.  This
uses psql's \bind metacommand, hence this part is backpatched down to
v16.

Reported-by:  Kaido Vaikla, Erik Wienhold
Author: Sami Imseih
Reviewed-by: Jian He, Andrei Lepikhov, Michael Paquier
Discussion: https://postgr.es/m/CA+427g8DiW3aZ6pOpVgkPbqK97ouBdf18VLiHFesea2jUk3XoQ@mail.gmail.com
Discussion: https://postgr.es/m/CA+TgmoZxtnf_jZ=VqBSyaU8hfUkkwoJCJ6ufy4LGpXaunKrjrg@mail.gmail.com
Discussion: https://postgr.es/m/1391613709.939460.1684777418070@office.mailbox.org
Backpatch-through: 14
2024-09-18 09:59:09 +09:00
Tom Lane
94537982ec Remove obsolete comment in pg_stat_statements.
Commit 76db9cb63 removed the use of multiple nesting counters,
but missed one comment describing that arrangement.

Back-patch to v17 where 76db9cb63 came in, just to avoid confusion.

Julien Rouhaud

Discussion: https://postgr.es/m/gfcwh3zjxc2vygltapgo7g6yacdor5s4ynr234b6v2ohhuvt7m@gr42joxalenw
2024-09-14 11:42:31 -04:00
Michael Paquier
7b1ddbae36 pg_stat_statements: Add tests with extended query protocol
There are currently no tests in the tree checking that queries using the
extended query protocol are able to map with their query ID.

This can be achieved for some paths of the extended query protocol with
the psql meta-commands \bind or \bind_named, so let's add some tests
based on both.

I have found that to be a useful addition while working on a different
issue.

Discussion: https://postgr.es/m/ZuEt6MOEBSlifBfn@paquier.xyz
2024-09-13 09:41:06 +09:00
Nathan Bossart
8928817769 Remove volatile qualifiers from pg_stat_statements.c.
Prior to commit 0709b7ee72, which changed the spinlock primitives
to function as compiler barriers, access to variables within a
spinlock-protected section required using a volatile pointer, but
that is no longer necessary.

Reviewed-by: Bertrand Drouvot, Michael Paquier
Discussion: https://postgr.es/m/Zqkv9iK7MkNS0KaN%40nathan
2024-08-06 10:56:37 -05:00
Fujii Masao
97f2bc5aa5 pg_stat_statements: Add regression test for privilege handling.
This commit adds a regression test to verify that pg_stat_statements
correctly handles privileges, improving its test coverage.

Author: Keisuke Kuroda
Reviewed-by: Michael Paquier, Fujii Masao
Discussion: https://postgr.es/m/2224ccf2e12c41ccb81702ef3303d5ac@nttcom.co.jp
2024-07-24 20:54:51 +09:00
Michael Paquier
c145f321b6 Propagate query IDs of utility statements in functions
For utility statements defined within a function, the query tree is
copied to a PlannedStmt as utility commands do not require planning.
However, the query ID was missing from the information passed down.

This leads to plugins relying on the query ID like pg_stat_statements to
not be able to track utility statements within function calls.  Tests
are added to check this behavior, depending on pg_stat_statements.track.

This is an old bug.  Now, query IDs for utilities are compiled using
their parsed trees rather than the query string since v16
(3db72ebcbe20), leading to less bloat with utilities, so backpatch down
only to this version.

Author: Anthonin Bonnefoy
Discussion: https://postgr.es/m/CAO6_XqrGp-uwBqi3vBPLuRULKkddjC7R5QZCgsFren=8E+m2Sg@mail.gmail.com
Backpatch-through: 16
2024-07-19 10:21:01 +09:00
Peter Eisentraut
17974ec259 Revise GUC names quoting in messages again
After further review, we want to move in the direction of always
quoting GUC names in error messages, rather than the previous (PG16)
wildly mixed practice or the intermittent (mid-PG17) idea of doing
this depending on how possibly confusing the GUC name is.

This commit applies appropriate quotes to (almost?) all mentions of
GUC names in error messages.  It partially supersedes a243569bf65 and
8d9978a7176, which had moved things a bit in the opposite direction
but which then were abandoned in a partial state.

Author: Peter Smith <smithpb2250@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAHut%2BPv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w%40mail.gmail.com
2024-05-17 11:44:26 +02:00
Nathan Bossart
3b42bdb471 Use new overflow-safe integer comparison functions.
Commit 6b80394781 introduced integer comparison functions designed
to be as efficient as possible while avoiding overflow.  This
commit makes use of these functions in many of the in-tree qsort()
comparators to help ensure transitivity.  Many of these comparator
functions should also see a small performance boost.

Author: Mats Kindahl
Reviewed-by: Andres Freund, Fabrízio de Royes Mello
Discussion: https://postgr.es/m/CA%2B14426g2Wa9QuUpmakwPxXFWG_1FaY0AsApkvcTBy-YfS6uaw%40mail.gmail.com
2024-02-16 14:05:36 -06:00
Bruce Momjian
29275b1d17 Update copyright for 2024
Reported-by: Michael Paquier

Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz

Backpatch-through: 12
2024-01-03 20:49:05 -05:00
Peter Eisentraut
8ce5aef2e6 Make Perl warnings fatal in newly added TAP tests
New TAP tests added by commits 9a17be1e and 4710b67d missed to convert
warnings to FATAL.  This commit fixes that.

Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
2024-01-03 13:16:55 +01:00
Peter Eisentraut
1141e29b61 Revert "pg_stat_statements: Add coverage for entry_dealloc()"
This reverts commit 742f6b3e6df980d7dafa4a18a165d285483d5f0e.

The new test failed on big-endian platforms.

Discussion: https://www.postgresql.org/message-id/flat/40d1e4f2-835f-448f-a541-8ff5db75bf3d@eisentraut.org
2023-12-31 15:30:57 +01:00
Peter Eisentraut
4710b67d4d pg_stat_statements: Add TAP test for testing restarts
This tests that pg_stat_statement contents are successfully kept
across restart.  (This similar to
src/test/recovery/t/029_stats_restart.pl for the stats collector.)

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/40d1e4f2-835f-448f-a541-8ff5db75bf3d@eisentraut.org
2023-12-30 20:18:23 +01:00
Peter Eisentraut
742f6b3e6d pg_stat_statements: Add coverage for entry_dealloc()
This involves creating more than pg_stat_statements.max entries and
checking that the limit is kept and the least used entries are kicked
out.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/40d1e4f2-835f-448f-a541-8ff5db75bf3d@eisentraut.org
2023-12-30 20:18:23 +01:00
Peter Eisentraut
3e527aeeed pg_stat_statements: Add test coverage for pg_stat_statements_reset_1_7
Run pg_stat_statements_reset() once while the appropriate extension
version is installed.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/40d1e4f2-835f-448f-a541-8ff5db75bf3d@eisentraut.org
2023-12-27 10:48:01 +01:00
Peter Eisentraut
3727b8d0e3 pg_stat_statements: Add test coverage for pg_stat_statements_1_8()
This requires reading pg_stat_statements at least once while the 1.8
version of the extension is installed.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/40d1e4f2-835f-448f-a541-8ff5db75bf3d@eisentraut.org
2023-12-27 10:48:01 +01:00
Alexander Korotkov
dc9f8a7983 Track statement entry timestamp in contrib/pg_stat_statements
This patch adds 'stats_since' and 'minmax_stats_since' columns to the
pg_stat_statements view and pg_stat_statements() function.  The new min/max
reset mode for the pg_stat_stetments_reset() function is controlled by the
parameter minmax_only.

'stat_since' column is populated with the current timestamp when a new
statement is added to the pg_stat_statements hashtable.  It provides clean
information about statistics collection time intervals for each statement.
Besides it can be used by sampling solutions to detect situations when a
statement was evicted and stored again between samples.

Such a sampling solution could derive any pg_stat_statements statistic values
for an interval between two samples with the exception of all min/max
statistics. To address this issue this patch adds the ability to reset
min/max statistics independently of the statement reset using the new
minmax_only parameter of the pg_stat_statements_reset(userid oid, dbid oid,
queryid bigint, minmax_only boolean) function. The timestamp of such reset
is stored in the minmax_stats_since field for each statement.
pg_stat_statements_reset() function now returns the timestamp of a reset as the
result.

Discussion: https://postgr.es/m/flat/72e80e7b160a6eb189df9ef6f068cce3765d37f8.camel%40moonset.ru
Author: Andrei Zubkov
Reviewed-by: Julien Rouhaud, Hayato Kuroda, Yuki Seino, Chengxi Sun
Reviewed-by: Anton Melnikov, Darren Rush, Michael Paquier, Sergei Kornilov
Reviewed-by: Alena Rybakina, Andrei Lepikhov
2023-11-27 02:52:17 +02:00
Alexander Korotkov
6ab1dbd26b Add NOT NULL checking of pg_stat_statements_reset() in tests
This is preliminary patch.  It adds NOT NULL checking for the result of
pg_stat_statements_reset() function. It is needed for upcoming patch
"Track statement entry timestamp" that will change the result type of
this function to the timestamp of a reset performed.

Discussion: https://postgr.es/m/flat/72e80e7b160a6eb189df9ef6f068cce3765d37f8.camel%40moonset.ru
Author: Andrei Zubkov
Reviewed-by: Julien Rouhaud, Hayato Kuroda, Yuki Seino, Chengxi Sun
Reviewed-by: Anton Melnikov, Darren Rush, Michael Paquier, Sergei Kornilov
Reviewed-by: Alena Rybakina, Andrei Lepikhov
2023-11-27 02:52:17 +02:00
Michael Paquier
108161bcb9 pg_stat_statements: Remove duplicated tests for SET statements
This looks like a copy-paste mistake introduced in de2aca288569, that
has added checks for more patterns of SET statements while ignoring the
original test block that existed.

Backpatch down to where this has been introduced, as this shaves some
cycles.

Author: Sergei Kornilov
Discussion: https://postgr.es/m/5689421699428803@mail-sendbernar-production-main-46.myt.yp-c.yandex.net
Backpatch-through: 16
2023-11-09 10:04:31 +09:00
Tom Lane
76db9cb636 Fix some issues with tracking nesting level in pg_stat_statements.
When we decide that we don't want to track execution time of a
specific planner or ProcessUtility call, we still have to increment
the nesting depth, or we'll make the wrong determination of whether
we are at top level when considering nested statements.  (PREPARE
and EXECUTE are exceptions, for reasons explained in the code.)

Counting planner nesting depth separately from executor nesting depth
was a mistake: it causes us to make the wrong determination of whether
we are at top level when considering nested statements that get
executed during planning (as a result of constant-folding of
functions, for example).  Merge those counters into one.

In passing, get rid of the PGSS_HANDLED_UTILITY macro in favor of
explicitly listing statement types.  It seems somewhat coincidental
that PREPARE and EXECUTE are handled alike in each of the places where
that was used: the reasoning tends to be different for each one.
Thus, the macro seems as likely to encourage future bugs as prevent
them, since it's quite unclear whether any future statement type that
might need special-casing here would also need the same choices at
each spot.

Sergei Kornilov, Julien Rouhaud, and Tom Lane, per bug #17552 from
Maxim Boguk.  This is pretty clearly a bug fix, but it's also a
behavioral change that might surprise somebody, so no back-patch.

Discussion: https://postgr.es/m/17552-213b534c56ab5d02@postgresql.org
2023-11-08 12:01:28 -05:00
Peter Eisentraut
611806cd72 Add trailing commas to enum definitions
Since C99, there can be a trailing comma after the last value in an
enum definition.  A lot of new code has been introducing this style on
the fly.  Some new patches are now taking an inconsistent approach to
this.  Some add the last comma on the fly if they add a new last
value, some are trying to preserve the existing style in each place,
some are even dropping the last comma if there was one.  We could
nudge this all in a consistent direction if we just add the trailing
commas everywhere once.

I omitted a few places where there was a fixed "last" value that will
always stay last.  I also skipped the header files of libpq and ecpg,
in case people want to use those with older compilers.  There were
also a small number of cases where the enum type wasn't used anywhere
(but the enum values were), which ended up confusing pgindent a bit,
so I left those alone.

Discussion: https://www.postgresql.org/message-id/flat/386f8c45-c8ac-4681-8add-e3b0852c1620%40eisentraut.org
2023-10-26 09:20:54 +02:00
Michael Paquier
5147ab1dd3 pg_stat_statements: Add local_blk_{read|write}_time
This commit adds to pg_stat_statements the two new fields for local
buffers introduced by 295c36c0c1fa, adding the time spent to read and
write these blocks.  These are similar to what is done for temp and
shared blocks.  This information available only if track_io_timing is
enabled.

Like for 5a3423ad8ee17, no version bump is required in the module.

Author: Nazir Bilal Yavuz
Reviewed-by: Robert Haas, Melanie Plageman
Discussion: https://postgr.es/m/CAN55FZ19Ss279mZuqGbuUNxka0iPbLgYuOQXqAKewrjNrp27VA@mail.gmail.com
2023-10-19 14:03:31 +09:00
Michael Paquier
13d00729d4 Rename I/O timing statistics columns to shared_blk_{read|write}_time
These two counters, defined in BufferUsage to track respectively the
time spent while reading and writing blocks have historically only
tracked data related to shared buffers, when track_io_timing is enabled.

An upcoming patch to add specific counters for local buffers will take
advantage of this rename as it has come up that no data is currently
tracked for local buffers, and tracking local and shared buffers using
the same fields would be inconsistent with the treatment done for temp
buffers.  Renaming the existing fields clarifies what the block type of
each stats field is.

pg_stat_statement is updated to reflect the rename.  No extension
version bump is required as 5a3423ad8ee17 has done one, affecting v17~.

Author: Nazir Bilal Yavuz
Reviewed-by: Robert Haas, Melanie Plageman
Discussion: https://postgr.es/m/CAN55FZ19Ss279mZuqGbuUNxka0iPbLgYuOQXqAKewrjNrp27VA@mail.gmail.com
2023-10-19 11:26:40 +09:00
Daniel Gustafsson
c5a032e518 Update oldextversions test for pg_stat_statements 1.11
Commit 5a3423ad8e updated pg_stat_statements to 1.11 due to added
columns in the view for jit deform counters. Fixing oldextversion
was however missed in that commit.  This adds a test for the 1.11
version view definition.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ2Y7c2VEg4S1KDXFcaHjyF8=KZa_rMV6WOe+KwE-KE97g@mail.gmail.com
2023-10-13 13:50:18 +02:00
Michael Paquier
11c34b342b Show parameters of CALL as constants in pg_stat_statements
This commit changes the query jumbling of CallStmt so as its IN/OUT
parameters are able to show up as constants with a parameter symbol in
pg_stat_statements, like:
CALL proc1($1, $2);
CALL proc2($1, $2, $3);

The transformed FuncExpr is used in the query ID computation instead of
the FuncCall generated by the parser, so as it is sensitive to the OID
of the procedure and its list of input arguments.  The output arguments
are handled in a separate list in CallStmt, which is also included in
the computation.

Tests are added to pg_stat_statements to show how this affects CALL with
IN/OUT parameters as well as overloaded functions.

Like 638d42a3c520 or 31de7e60da34, this improves the monitoring of
workloads with a lot of CALL statements, preventing unnecessary bloat
when these use different input (or event output) values.

Author: Sami Imseih
Discussion: https://postgr.es/m/B44FA29D-EBD0-4DD9-ABC2-16F1CB087074@amazon.com
2023-09-28 15:17:55 +09:00
Andres Freund
7369798a83 Fix tracking of temp table relation extensions as writes
Karina figured out that I (Andres) confused BufferUsage.temp_blks_written with
BufferUsage.local_blks_written in fcdda1e4b5.

Tests in core PG can't easily test this, as BufferUsage is just used for
EXPLAIN (ANALYZE, BUFFERS) and pg_stat_statements. Thus this commit adds tests
for this to pg_stat_statements.

Reported-by: Karina Litskevich <litskevichkarina@gmail.com>
Author: Karina Litskevich <litskevichkarina@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CACiT8ibxXA6+0amGikbeFhm8B84XdQVo6D0Qfd1pQ1s8zpsnxQ@mail.gmail.com
Backpatch: 16-, where fcdda1e4b5 was merged
2023-09-13 19:14:09 -07:00
Daniel Gustafsson
5a3423ad8e Add JIT deform_counter
generation_counter includes time spent on both JIT:ing expressions
and tuple deforming which are configured independently via options
jit_expressions and jit_tuple_deforming.  As they are  combined in
the same counter it's not apparent what fraction of time the tuple
deforming takes.

This adds deform_counter dedicated to tuple deforming, which allows
seeing more directly the influence jit_tuple_deforming is having on
the query. The counter is exposed in EXPLAIN and pg_stat_statements
bumpin pg_stat_statements to 1.11.

Author: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/20220612091253.eegstkufdsu4kfls@erthalion.local
2023-09-08 15:05:12 +02:00
Michael Paquier
bb45156f34 Show names of DEALLOCATE as constants in pg_stat_statements
This commit switches query jumbling so as prepared statement names are
treated as constants in DeallocateStmt.  A boolean field is added to
DeallocateStmt to make a distinction between ALL and named prepared
statements, as "name" was used to make this difference before, NULL
meaning DEALLOCATE ALL.

Prior to this commit, DEALLOCATE was not tracked in pg_stat_statements,
for the reason that it was not possible to treat its name parameter as a
constant.  Now that query jumbling applies to all the utility nodes,
this reason does not apply anymore.

Like 638d42a3c520, this can be a huge advantage for monitoring where
prepared statement names are randomly generated, preventing bloat in
pg_stat_statements.  A couple of tests are added to track the new
behavior.

Author: Dagfinn Ilmari Mannsåker, Michael Paquier
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/ZMhT9kNtJJsHw6jK@paquier.xyz
2023-08-27 17:27:44 +09:00
Michael Paquier
638d42a3c5 Show GIDs of two-phase commit commands as constants in pg_stat_statements
This relies on the "location" field added to TransactionStmt in 31de7e6,
now applied to the "gid" field used by 2PC commands.  These commands are
now reported like:
COMMIT PREPARED $1
PREPARE TRANSACTION $1
ROLLBACK PREPARED $1

Applying constants for these commands is a huge advantage for workloads
that rely a lot on 2PC commands with different GIDs.  Some tests are
added to track the new behavior.

Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/ZMhT9kNtJJsHw6jK@paquier.xyz
2023-08-12 10:44:15 +09:00
Michael Paquier
31de7e60da Show savepoint names as constants in pg_stat_statements
In pg_stat_statements, savepoint names now show up as constants with a
parameter symbol, using as base query string the one added as a new
entry to the PGSS hash table, leading to:
RELEASE $1
ROLLBACK TO $1
SAVEPOINT $1

Applying constants to these query parts is a huge advantage for
workloads that generate randomly savepoint points, like ORMs (Django is
at the origin of this patch).  The ODBC driver is a second layer that
likes a lot savepoints, though it does not use a random naming pattern.

A "location" field is added to TransactionStmt, now set only for
savepoints.  The savepoint name is ignored by the query jumbling.  The
location can be extended to other query patterns, if required, like 2PC
commands.  Some tests are added to pg_stat_statements for all the query
patterns supported by the parser.

ROLLBACK, ROLLBACK TO SAVEPOINT and ROLLBACK TRANSACTION TO SAVEPOINT
have the same Node representation, so all these are equivalents.  The
same happens for RELEASE and RELEASE SAVEPOINT.

Author: Greg Sabino Mullane
Discussion: https://postgr.es/m/CAKAnmm+2s9PA4OaumwMJReWHk8qvJ_-g1WqxDRDAN1BSUfxyTw@mail.gmail.com
2023-07-27 09:42:33 +09:00
Michael Paquier
bc8e9a6a25 pg_stat_statements: Fix second comment related to entry resets
This should have been part of dc73db6, but it got lost in the mix.
Oversight in 6b4d23f.

Author: Japin Li
Discussion: https://postgr.es/m/MEYP282MB1669FC91C764E277821936D3B624A@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
Backpatch-through: 14
2023-06-29 09:17:26 +09:00
Michael Paquier
dc73db6743 pg_stat_statements: Fix incorrect comment with entry resets
Oversight in 6b4d23f.

Author: Japin Li, Richard Guo
Discussion: https://postgr.es/m/MEYP282MB1669FC91C764E277821936D3B624A@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
Backpatch-through: 14
2023-06-29 08:04:47 +09:00
Heikki Linnakangas
876d17d62f Fix comment on clearing padding.
Author: Japin Li
Discussion: https://www.postgresql.org/message-id/MEYP282MB16696317B5DA7D0D92306149B627A@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
2023-06-27 10:12:25 +03:00