Add a NodeTag field to struct Bitmapset. This is free because of
alignment considerations on 64-bit hardware. While it adds some
space on 32-bit machines, we aren't optimizing for that case anymore.
The advantage is that data structures such as Lists of Bitmapsets
are now first-class objects to the Node infrastructure, and don't
require special-case code to handle.
This patch includes removal of one such special case, in indxpath.c:
bms_equal_any() can now be replaced by list_member(). There may be
more existing code that could be simplified, but I didn't look very
hard. We also get to drop the read_write_ignore annotations on a
couple of RelOptInfo fields.
The outfuncs/readfuncs support is arranged so that nothing changes
in the string representation of a Bitmapset field; therefore, this
doesn't need a catversion bump.
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/109089.1668197158@sss.pgh.pa.us
The current code looks for the sample file in the source directory, but
it seems better to test against the installed sample file.
Backpatch to release 15 where the test was introduced.
Discussion: https://postgr.es/m/73eea68e-3b6f-5f63-6024-25ed26b52016@dunslane.net
Reviewed by Tom Lane, Alvaro Herrera, Michael Paquier.
Currently this only allows for one argument, which must be present, and
always returns a single string. With this change the following now all
work:
$all_config = $node->config_data;
%config_map = ($node->config_data);
$incdir = $node->config_data('--include-dir');
($incdir, $sharedir) = $node->config_data(
qw(--include-dir --share-dir));
Backpatch to release 15 where this was introduced.
Discussion: https://postgr.es/m/73eea68e-3b6f-5f63-6024-25ed26b52016@dunslane.net
Reviewed by Tom Lane, Alvaro Herrera, Michael Paquier.
Instead of dozens of mostly-duplicate pg_foo_aclcheck() functions,
write one common function object_aclcheck() that can handle almost all
of them. We already have all the information we need, such as which
system catalog corresponds to which catalog table and which column is
the ACL column.
There are a few pg_foo_aclcheck() that don't work via the generic
function and have special APIs, so those stay as is.
I also changed most pg_foo_aclmask() functions to static functions,
since they are not used outside of aclchk.c.
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Discussion: https://www.postgresql.org/message-id/flat/95c30f96-4060-2f48-98b5-a4392d3b6066@enterprisedb.com
Instead of dozens of mostly-duplicate pg_foo_ownercheck() functions,
write one common function object_ownercheck() that can handle almost
all of them. We already have all the information we need, such as
which system catalog corresponds to which catalog table and which
column is the owner column.
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Discussion: https://www.postgresql.org/message-id/flat/95c30f96-4060-2f48-98b5-a4392d3b6066@enterprisedb.com
Test files should now ignore has_wal_read_bug() so long as
wait_for_catchup() is their only known way of reaching the bug. That's
at least five files today, a number expected to grow over time. This
commit removes skip logic from three. By doing so, systems having the
bug regain the ability to catch other kinds of defects via those three
tests. The other two, 002_databases.pl and 031_recovery_conflict.pl,
have been unprotected. Back-patch to v15, where done_testing() first
became our standard.
Discussion: https://postgr.es/m/20221030031639.GA3082137@rfd.leadboat.com
It's safe to mark this as immutable, because it does not depend
on the timezone GUC setting. Oversight in commit 600b04d6b.
(There's an argument that timezone definitions do change from
time to time, but we have not worried about that in marking
other timestamp-related functions; for example AT TIME ZONE
has always been considered immutable. The situation is no
worse than our problems with time-varying locales, surely.)
Przemysław Sztoch
Discussion: https://postgr.es/m/eaa3fabe-50fc-bbe8-b096-ce62ddadab85@sztoch.pl
The original report was concerned with a possible inconsistency
between the heap and the visibility map, which I was unable to
confirm. The concern has been retracted.
However, there did seem to be a torn page hazard when using
checksums. By not setting the heap page LSN during redo, the
protections of minRecoveryPoint were bypassed. Fixed, along with a
misleading comment.
It may have been impossible to hit this problem in practice, because
it would require a page tear between the checksum and the flags, so I
am marking this as a theoretical risk. But, as discussed, it did
violate expectations about the page LSN, so it may have other
consequences.
Backpatch to all supported versions.
Reported-by: Konstantin Knizhnik
Reviewed-by: Konstantin Knizhnik
Discussion: https://postgr.es/m/fed17dac-8cb8-4f5b-d462-1bb4908c029e@garret.ru
Backpatch-through: 11
The stanza "SET STORAGE may need to add a TOAST table" does not
test what it's supposed to, and hasn't done so since we added
the ability to store constant column default values as metadata.
We need to use a non-constant default to get the expected table
rewrite to actually happen.
Fix that, and add the missing checks that would have exposed the
problem to begin with.
Noted while reviewing a patch that made changes in this test case.
Back-patch to v11 where the problem came in.
Replace the stopgap fix I made in 0e758ae89 with a cleaner one.
The real problem with 4ab5dae94 is that it contorted this function's
logic substantially, by introducing a third code path that required
different behavior in the function's main loop. That seems quite
unnecessary on closer inspection: the new IsBinaryUpgrade case can
just share the behavior of the other immediate-unlink cases. Hence,
revert 4ab5dae94 and most of 0e758ae89 (keeping the latter's
save/restore errno fix), and add IsBinaryUpgrade to the set of
conditions tested to choose immediate unlink.
Also fix some additional places with sloppy handling of errno,
to ensure we have an invariant that we always continue processing
after any non-ENOENT failure of do_truncate. I doubt that that's
fixing any bug of field importance, so I don't feel it necessary to
back-patch; but we might as well get it right while we're here.
Also improve the comments, which had drifted a bit from what the
code actually does, and neglected to mention some important
considerations.
Back-patch to v15, not because this is fixing any bug but because
it doesn't seem like a good idea for v15's mdunlinkfork logic to be
significantly different from both v14 and v16.
Discussion: https://postgr.es/m/3797575.1667924888@sss.pgh.pa.us
Previously, trying to set storage parameters on a partitioned table
always led to "unrecognized parameter foo", because the code expected
there might be some valid parameters; but there aren't any. The docs
make clear that it's intended that there never will be any, so let's
replace this useless search with a more to-the-point message.
Simon Riggs and Karina Litskevich
Discussion: https://postgr.es/m/CANbhV-H=eZ9kTR9mUgKGK0Qv9uXP=U+dQg3rinQHfTdFMhBA2A@mail.gmail.com
Add a little to the header comments for these functions to make it
clearer what guarantees about commit behavior are provided to callers.
(See commit f92944137 for context.)
Although this is only a comment change, it's really documentation
aimed at authors of extensions, so it seems appropriate to back-patch.
Yugo Nagata and Tom Lane, per further discussion of bug #17434.
Discussion: https://postgr.es/m/17434-d9f7a064ce2a88a3@postgresql.org
The code building an absolute path to a file included, as prefixed by
'@' in authentication files, for user and database lists uses the same
logic as for GUCs, except that it has no need to know about DataDir as
there is always a calling file to rely to build the base directory path.
The refactoring done in a1a7bb8 makes this move straight-forward, and
unifies the code used for GUCs and authentication files, and the
intention is to rely also on that for the upcoming patch to be able to
include full files from HBA or ident files.
Note that this gets rid of an inconsistency introduced in 370f909, that
copied the logic coming from GUCs but applied it for files included in
authentication files, where the result buffer given to
join_path_components() must have a size of MAXPGPATH. Based on a
double-check of the existing code, all the other callers of
join_path_components() already do that, except the code path changed
here.
Discussion: https://postgr.es/m/Y2igk7q8OMpg+Yta@paquier.xyz
Commit fede15417 introduced FILTER by jamming it into the existing
example introducing HAVING, which seems pedagogically poor to me;
and it added no information about what the keyword actually does.
Not to mention that the claimed output didn't match the sample
data being used in this running example.
Revert that and instead make an independent example using FILTER.
To help drive home the point that it's a per-aggregate filter,
we need to use two aggregates not just one; for consistency
expand all the examples in this segment to do that.
Also adjust the example using WHERE ... LIKE so that it'd produce
nonempty output with this sample data, and show that output.
Back-patch, as the previous patch was. (Sadly, v10 is now out
of scope.)
Discussion: https://postgr.es/m/166794307526.652.9073408178177444190@wrigleys.postgresql.org
The planner simplifies boolean comparisons such as "x = true" and
"x = false" down to "x" and "NOT x" respectively, to have a canonical
form to ease comparisons. However, if we want to use an index on x,
the index AM APIs require us to reconstitute the comparison-operator
form of the indexqual. While that works, in bitmap indexscans the
canonical form of the qual was emitted as a "filter" condition
although it really only needs to be a "recheck" condition, because
create_bitmap_scan_plan didn't recognize the equivalence of that
form with the generated indexqual. booleq() is pretty cheap so that
likely doesn't make very much difference, but it's unsightly so
let's clean it up.
To fix, add a case to predicate_implied_by() to recognize the
equivalence of such clauses. This is a relatively low-cost place to
add a check, and perhaps it will have additional use cases in future.
Richard Guo and Tom Lane, per discussion of bug #17618 from Sindy
Senorita.
Discussion: https://postgr.es/m/17618-7a2240bfaa7e84ae@postgresql.org
Instead of waking up 10 times per second to check for various timeout
conditions, keep track of when we next have periodic work to do.
Author: Thomas Munro <thomas.munro@gmail.com>
Author: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CA%2BhUKGJGhX4r2LPUE3Oy9BX71Eum6PBcS8L3sJpScR9oKaTVaA%40mail.gmail.com
\d+ is already able to show if a partition or a child table is
"PARTITIONED" via its relkind, hence the addition of a keyword for
"FOREIGN" in the relation description is basically free.
Author: Ian Lawrence Barwick
Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/CAB8KJ=iwzbEz2HR9EhNxQLVhMk2G_OYtQPJ9V=jWLadseggrOA@mail.gmail.com
This change impacts pg_receivewal and pg_basebackup, for the pre-padding
with zeros of all the new non-compressed WAL segments, so as the code is
more robust on partial writes. This makes the code consistent with the
backend (XLogFileInitInternal) when wal_init_zeros is enabled for the
WAL segment initialization.
Author: Bharath Rupireddy
Reviewed-by: Nathan Bossart, Andres Freund, Thomas Munro, Michael
Paquier
Discussion: https://postgr.es/m/CALj2ACUq7nAb7=bJNbK3yYmp-SZhJcXFR_pLk8un6XgDzDF3OA@mail.gmail.com
This routine is designed to write zeros to a file using vectored I/O,
for a size given by its caller, being useful when it comes to
initializing a file with a final size already known.
XLogFileInitInternal() in xlog.c is changed to use this new routine when
initializing WAL segments with zeros (wal_init_zero enabled). Note that
the aligned buffers used for the vectored I/O writes have a size of
XLOG_BLCKSZ, and not BLCKSZ anymore, as pg_pwrite_zeros() relies on
PGAlignedBlock while xlog.c originally used PGAlignedXLogBlock.
This routine will be used in a follow-up patch to do the pre-padding of
WAL segments for pg_receivewal and pg_basebackup when these are not
compressed.
Author: Bharath Rupireddy
Reviewed-by: Nathan Bossart, Andres Freund, Thomas Munro, Michael
Paquier
Discussion: https://www.postgresql.org/message-id/CALj2ACUq7nAb7%3DbJNbK3yYmp-SZhJcXFR_pLk8un6XgDzDF3OA%40mail.gmail.com
A NULL result should be reported when a stats timestamp is set to 0, but
c037471 missed that, leading to a confusing timestamp value after for
example a DML on a freshly-created relation with no scans done on it
yet.
This impacted the following attributes for two system views:
- pg_stat_all_tables.last_idx_scan
- pg_stat_all_tables.last_seq_scan
- pg_stat_all_indexes.last_idx_scan
Reported-by: Robert Treat
Analyzed-by: Peter Eisentraut
Author: Dave Page
Discussion: https://postgr.es/m/CABV9wwPzMfSaz3EfKXXDxKmMprbxwF5r6WPuxqA=5mzRUqfTGg@mail.gmail.com
MSVC does not understand that ereport(ERROR) does not return, so just
return the first enum PartitionStrategy value to keep the compiler from
complaining about the missing return.
Discussion: https://postgr.es/m/20221104161934.GB16921@telsasoft.com
Commit 4ab5dae94 broke mdunlinkfork's logic for removing additional
segments of a multi-gigabyte table, because it neglected to advance
"segno" after unlinking the first segment, in the code path where it
chooses to unlink that one immediately. Then the main remove loop
gets ENOENT at segment zero and figures it's done, so we never remove
whatever additional segments might exist.
The main problem here is with large temporary tables, but WAL replay
of a drop of a large regular table would also fail to remove extra
segments. The third case where this path is taken is for non-main
forks; but I doubt it matters for those since they probably never
exceed 1GB.
The simplest fix is just to increment segno after that unlink().
(Probably this logic could do with a more thorough rethink, but not
with mere hours to go before 15.1 wraps.)
While here, also fix an incautious assumption that
register_forget_request cannot change errno. I don't think that
that has any really bad consequences, as we'd end up trying to unlink
the zero'th segment either way, but it greatly complicates reasoning
about what could happen here. Also make a couple of other cosmetic
fixes.
Per bug #17679 from Balazs Szilfai. Back-patch into v15, as the
faulty patch was.
Discussion: https://postgr.es/m/17679-1095d04450cf6a6e@postgresql.org
The code in charge of listing and classifying a set of configuration
files in a directory was located in guc-file.l, being used currently for
GUCs under "include_dir". This code is planned to be used for an
upcoming feature able to include configuration files for ident and HBA
files from a directory, similarly to GUCs. In both cases, the file
names, suffixed by ".conf", have to be ordered alphabetically. This
logic is moved to a new file, called conffiles.c, so as it is easier to
share this facility between GUCs and the HBA/ident parsing logic.
Author: Julien Rouhaud, Michael Paquier
Discussion: https://postgr.es/m/Y2IgaH5YzIq2b+iR@paquier.xyz
We weren't actually using the passed-down list for anything, other
than computing the new value to be passed down further. I (tgl)
probably had the idea that we'd need this data eventually; but
no use-case has emerged in a good long while, so let's just stop
expending useless cycles here.
Richard Guo
Discussion: https://postgr.es/m/CAMbWs48KLy9aBb=sZ5MoNmnqAcGHaW_JTGWLCgoE_uMW7S6C-A@mail.gmail.com
Commit aa0105141 repeated one of the oldest mistakes in our book:
thinking that OID is the same as int32. It isn't of course, and
unsurprisingly the first person who came along with a database
OID above 2 billion broke it. Repair.
Per bug #17677 from Sergey Pankov. Back-patch to v15.
Discussion: https://postgr.es/m/17677-a99fa067d7ed71c9@postgresql.org
"Triggers on partitioned tables cannot have transition tables." is
incorrect as we allow statement-level triggers on partitioned tables to
have transition tables.
This has been wrong since commit 86f575948; back-patch to v11 where that
commit came in.
Reviewed by Tom Lane.
Discussion: https://postgr.es/m/CAPmGK17gk4vXLzz2iG%2BG4LWRWCoVyam70nZ3OuGm1hMJwDrhcg%40mail.gmail.com
Commit f56f8f8da6 added some code in CloneFkReferencing that's way too
lax about a Constraint node it manufactures, not initializing enough
struct members -- initially_valid in particular was forgotten. This
causes some FKs in partitions added by ALTER TABLE ATTACH PARTITION to
be marked as not validated. Set initially_valid true, which fixes the
bug.
While at it, make the struct initialization more complete. Very similar
code was added in two other places by the same commit; make them all
follow the same pattern for consistency, though no bugs are apparent
there.
This bug has never been reported: I only happened to notice while
working on commit 614a406b4f. The test case that was added there with
the improper result is repaired.
Backpatch to 12.
Discussion: https://postgr.es/m/20221005105523.bhuhkdx4olajboof@alvherre.pgsql
If a syntax error occurred in a SQL-language or PL/pgSQL-language
CREATE FUNCTION or DO command executed in a logical replication worker,
we'd suffer a null pointer dereference or assertion failure. That
seems like a rather contrived case, but nonetheless worth fixing.
The cause is that function_parse_error_transpose assumes it must be
executing within the context of a Portal, but logical/worker.c
doesn't create a Portal since it's not running the standard executor.
We can just back off the hard Assert check and make it fail gracefully
if there's not an ActivePortal. (I have a feeling that the aggressive
check here was my fault originally, probably because I wasn't sure if
the case would always hold and wanted to find out. Well, now we know.)
The hazard seems to exist in all branches that have logical replication,
so back-patch to v10.
Maxim Orlov, Anton Melnikov, Masahiko Sawada, Tom Lane
Discussion: https://postgr.es/m/b570c367-ba38-95f3-f62d-5f59b9808226@inbox.ru
Discussion: https://postgr.es/m/adf0452f-8c6b-7def-d35e-ab516c80088e@inbox.ru
Casting the result of palloc etc. to the intended type is more per
project style anyway.
(The fact that cpluspluscheck doesn't notice these problems is
because it doesn't expand any macros, which seems like a troubling
shortcoming. Don't have a good idea about improving that.)
Back-patch to v13, which is as far as the patch applies cleanly;
doesn't seem worth working harder.
David Geier
Discussion: https://postgr.es/m/aa5d88a3-71f4-3455-11cf-82de0372c941@gmail.com
If we have no special-case code in s_lock.h for the current platform,
but the compiler has __sync_lock_test_and_set, use that instead of
failing. It's unlikely that anybody's __sync_lock_test_and_set
would be so awful as to be worse than our semaphore-based fallback,
but if it is, they can (continue to) use --disable-spinlocks.
This allows removal of the RISC-V special case installed by commit
c32fcac56, which generated exactly the same code but only on that
platform. Usefully, the RISC-V buildfarm animals should now test
at least the int variant of this patch.
I've manually tested both variants on ARM by dint of removing the
ARM-specific stanza. We don't want to drop that, because it already
has some special knowledge and is likely to grow more over time.
Likewise, this is not meant to preclude installing special cases
for other arches if that proves worthwhile.
Per discussion of a request to install the same code for loongarch64.
Like the previous patch, we might as well back-patch to supported
branches.
Discussion: https://postgr.es/m/761ac43d44b84d679ba803c2bd947cc0@HSMAILSVR04.hs.handsome.com.cn