The original report was concerned with a possible inconsistency
between the heap and the visibility map, which I was unable to
confirm. The concern has been retracted.
However, there did seem to be a torn page hazard when using
checksums. By not setting the heap page LSN during redo, the
protections of minRecoveryPoint were bypassed. Fixed, along with a
misleading comment.
It may have been impossible to hit this problem in practice, because
it would require a page tear between the checksum and the flags, so I
am marking this as a theoretical risk. But, as discussed, it did
violate expectations about the page LSN, so it may have other
consequences.
Backpatch to all supported versions.
Reported-by: Konstantin Knizhnik
Reviewed-by: Konstantin Knizhnik
Discussion: https://postgr.es/m/fed17dac-8cb8-4f5b-d462-1bb4908c029e@garret.ru
Backpatch-through: 11
The stanza "SET STORAGE may need to add a TOAST table" does not
test what it's supposed to, and hasn't done so since we added
the ability to store constant column default values as metadata.
We need to use a non-constant default to get the expected table
rewrite to actually happen.
Fix that, and add the missing checks that would have exposed the
problem to begin with.
Noted while reviewing a patch that made changes in this test case.
Back-patch to v11 where the problem came in.
Replace the stopgap fix I made in 0e758ae89 with a cleaner one.
The real problem with 4ab5dae94 is that it contorted this function's
logic substantially, by introducing a third code path that required
different behavior in the function's main loop. That seems quite
unnecessary on closer inspection: the new IsBinaryUpgrade case can
just share the behavior of the other immediate-unlink cases. Hence,
revert 4ab5dae94 and most of 0e758ae89 (keeping the latter's
save/restore errno fix), and add IsBinaryUpgrade to the set of
conditions tested to choose immediate unlink.
Also fix some additional places with sloppy handling of errno,
to ensure we have an invariant that we always continue processing
after any non-ENOENT failure of do_truncate. I doubt that that's
fixing any bug of field importance, so I don't feel it necessary to
back-patch; but we might as well get it right while we're here.
Also improve the comments, which had drifted a bit from what the
code actually does, and neglected to mention some important
considerations.
Back-patch to v15, not because this is fixing any bug but because
it doesn't seem like a good idea for v15's mdunlinkfork logic to be
significantly different from both v14 and v16.
Discussion: https://postgr.es/m/3797575.1667924888@sss.pgh.pa.us
Previously, trying to set storage parameters on a partitioned table
always led to "unrecognized parameter foo", because the code expected
there might be some valid parameters; but there aren't any. The docs
make clear that it's intended that there never will be any, so let's
replace this useless search with a more to-the-point message.
Simon Riggs and Karina Litskevich
Discussion: https://postgr.es/m/CANbhV-H=eZ9kTR9mUgKGK0Qv9uXP=U+dQg3rinQHfTdFMhBA2A@mail.gmail.com
Add a little to the header comments for these functions to make it
clearer what guarantees about commit behavior are provided to callers.
(See commit f92944137 for context.)
Although this is only a comment change, it's really documentation
aimed at authors of extensions, so it seems appropriate to back-patch.
Yugo Nagata and Tom Lane, per further discussion of bug #17434.
Discussion: https://postgr.es/m/17434-d9f7a064ce2a88a3@postgresql.org
The code building an absolute path to a file included, as prefixed by
'@' in authentication files, for user and database lists uses the same
logic as for GUCs, except that it has no need to know about DataDir as
there is always a calling file to rely to build the base directory path.
The refactoring done in a1a7bb8 makes this move straight-forward, and
unifies the code used for GUCs and authentication files, and the
intention is to rely also on that for the upcoming patch to be able to
include full files from HBA or ident files.
Note that this gets rid of an inconsistency introduced in 370f909, that
copied the logic coming from GUCs but applied it for files included in
authentication files, where the result buffer given to
join_path_components() must have a size of MAXPGPATH. Based on a
double-check of the existing code, all the other callers of
join_path_components() already do that, except the code path changed
here.
Discussion: https://postgr.es/m/Y2igk7q8OMpg+Yta@paquier.xyz
Commit fede15417 introduced FILTER by jamming it into the existing
example introducing HAVING, which seems pedagogically poor to me;
and it added no information about what the keyword actually does.
Not to mention that the claimed output didn't match the sample
data being used in this running example.
Revert that and instead make an independent example using FILTER.
To help drive home the point that it's a per-aggregate filter,
we need to use two aggregates not just one; for consistency
expand all the examples in this segment to do that.
Also adjust the example using WHERE ... LIKE so that it'd produce
nonempty output with this sample data, and show that output.
Back-patch, as the previous patch was. (Sadly, v10 is now out
of scope.)
Discussion: https://postgr.es/m/166794307526.652.9073408178177444190@wrigleys.postgresql.org
The planner simplifies boolean comparisons such as "x = true" and
"x = false" down to "x" and "NOT x" respectively, to have a canonical
form to ease comparisons. However, if we want to use an index on x,
the index AM APIs require us to reconstitute the comparison-operator
form of the indexqual. While that works, in bitmap indexscans the
canonical form of the qual was emitted as a "filter" condition
although it really only needs to be a "recheck" condition, because
create_bitmap_scan_plan didn't recognize the equivalence of that
form with the generated indexqual. booleq() is pretty cheap so that
likely doesn't make very much difference, but it's unsightly so
let's clean it up.
To fix, add a case to predicate_implied_by() to recognize the
equivalence of such clauses. This is a relatively low-cost place to
add a check, and perhaps it will have additional use cases in future.
Richard Guo and Tom Lane, per discussion of bug #17618 from Sindy
Senorita.
Discussion: https://postgr.es/m/17618-7a2240bfaa7e84ae@postgresql.org
Instead of waking up 10 times per second to check for various timeout
conditions, keep track of when we next have periodic work to do.
Author: Thomas Munro <thomas.munro@gmail.com>
Author: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/CA%2BhUKGJGhX4r2LPUE3Oy9BX71Eum6PBcS8L3sJpScR9oKaTVaA%40mail.gmail.com
\d+ is already able to show if a partition or a child table is
"PARTITIONED" via its relkind, hence the addition of a keyword for
"FOREIGN" in the relation description is basically free.
Author: Ian Lawrence Barwick
Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/CAB8KJ=iwzbEz2HR9EhNxQLVhMk2G_OYtQPJ9V=jWLadseggrOA@mail.gmail.com
This change impacts pg_receivewal and pg_basebackup, for the pre-padding
with zeros of all the new non-compressed WAL segments, so as the code is
more robust on partial writes. This makes the code consistent with the
backend (XLogFileInitInternal) when wal_init_zeros is enabled for the
WAL segment initialization.
Author: Bharath Rupireddy
Reviewed-by: Nathan Bossart, Andres Freund, Thomas Munro, Michael
Paquier
Discussion: https://postgr.es/m/CALj2ACUq7nAb7=bJNbK3yYmp-SZhJcXFR_pLk8un6XgDzDF3OA@mail.gmail.com
This routine is designed to write zeros to a file using vectored I/O,
for a size given by its caller, being useful when it comes to
initializing a file with a final size already known.
XLogFileInitInternal() in xlog.c is changed to use this new routine when
initializing WAL segments with zeros (wal_init_zero enabled). Note that
the aligned buffers used for the vectored I/O writes have a size of
XLOG_BLCKSZ, and not BLCKSZ anymore, as pg_pwrite_zeros() relies on
PGAlignedBlock while xlog.c originally used PGAlignedXLogBlock.
This routine will be used in a follow-up patch to do the pre-padding of
WAL segments for pg_receivewal and pg_basebackup when these are not
compressed.
Author: Bharath Rupireddy
Reviewed-by: Nathan Bossart, Andres Freund, Thomas Munro, Michael
Paquier
Discussion: https://www.postgresql.org/message-id/CALj2ACUq7nAb7%3DbJNbK3yYmp-SZhJcXFR_pLk8un6XgDzDF3OA%40mail.gmail.com
A NULL result should be reported when a stats timestamp is set to 0, but
c037471 missed that, leading to a confusing timestamp value after for
example a DML on a freshly-created relation with no scans done on it
yet.
This impacted the following attributes for two system views:
- pg_stat_all_tables.last_idx_scan
- pg_stat_all_tables.last_seq_scan
- pg_stat_all_indexes.last_idx_scan
Reported-by: Robert Treat
Analyzed-by: Peter Eisentraut
Author: Dave Page
Discussion: https://postgr.es/m/CABV9wwPzMfSaz3EfKXXDxKmMprbxwF5r6WPuxqA=5mzRUqfTGg@mail.gmail.com
MSVC does not understand that ereport(ERROR) does not return, so just
return the first enum PartitionStrategy value to keep the compiler from
complaining about the missing return.
Discussion: https://postgr.es/m/20221104161934.GB16921@telsasoft.com
Commit 4ab5dae94 broke mdunlinkfork's logic for removing additional
segments of a multi-gigabyte table, because it neglected to advance
"segno" after unlinking the first segment, in the code path where it
chooses to unlink that one immediately. Then the main remove loop
gets ENOENT at segment zero and figures it's done, so we never remove
whatever additional segments might exist.
The main problem here is with large temporary tables, but WAL replay
of a drop of a large regular table would also fail to remove extra
segments. The third case where this path is taken is for non-main
forks; but I doubt it matters for those since they probably never
exceed 1GB.
The simplest fix is just to increment segno after that unlink().
(Probably this logic could do with a more thorough rethink, but not
with mere hours to go before 15.1 wraps.)
While here, also fix an incautious assumption that
register_forget_request cannot change errno. I don't think that
that has any really bad consequences, as we'd end up trying to unlink
the zero'th segment either way, but it greatly complicates reasoning
about what could happen here. Also make a couple of other cosmetic
fixes.
Per bug #17679 from Balazs Szilfai. Back-patch into v15, as the
faulty patch was.
Discussion: https://postgr.es/m/17679-1095d04450cf6a6e@postgresql.org
The code in charge of listing and classifying a set of configuration
files in a directory was located in guc-file.l, being used currently for
GUCs under "include_dir". This code is planned to be used for an
upcoming feature able to include configuration files for ident and HBA
files from a directory, similarly to GUCs. In both cases, the file
names, suffixed by ".conf", have to be ordered alphabetically. This
logic is moved to a new file, called conffiles.c, so as it is easier to
share this facility between GUCs and the HBA/ident parsing logic.
Author: Julien Rouhaud, Michael Paquier
Discussion: https://postgr.es/m/Y2IgaH5YzIq2b+iR@paquier.xyz
We weren't actually using the passed-down list for anything, other
than computing the new value to be passed down further. I (tgl)
probably had the idea that we'd need this data eventually; but
no use-case has emerged in a good long while, so let's just stop
expending useless cycles here.
Richard Guo
Discussion: https://postgr.es/m/CAMbWs48KLy9aBb=sZ5MoNmnqAcGHaW_JTGWLCgoE_uMW7S6C-A@mail.gmail.com
Commit aa0105141 repeated one of the oldest mistakes in our book:
thinking that OID is the same as int32. It isn't of course, and
unsurprisingly the first person who came along with a database
OID above 2 billion broke it. Repair.
Per bug #17677 from Sergey Pankov. Back-patch to v15.
Discussion: https://postgr.es/m/17677-a99fa067d7ed71c9@postgresql.org
"Triggers on partitioned tables cannot have transition tables." is
incorrect as we allow statement-level triggers on partitioned tables to
have transition tables.
This has been wrong since commit 86f575948; back-patch to v11 where that
commit came in.
Reviewed by Tom Lane.
Discussion: https://postgr.es/m/CAPmGK17gk4vXLzz2iG%2BG4LWRWCoVyam70nZ3OuGm1hMJwDrhcg%40mail.gmail.com
Commit f56f8f8da6 added some code in CloneFkReferencing that's way too
lax about a Constraint node it manufactures, not initializing enough
struct members -- initially_valid in particular was forgotten. This
causes some FKs in partitions added by ALTER TABLE ATTACH PARTITION to
be marked as not validated. Set initially_valid true, which fixes the
bug.
While at it, make the struct initialization more complete. Very similar
code was added in two other places by the same commit; make them all
follow the same pattern for consistency, though no bugs are apparent
there.
This bug has never been reported: I only happened to notice while
working on commit 614a406b4f. The test case that was added there with
the improper result is repaired.
Backpatch to 12.
Discussion: https://postgr.es/m/20221005105523.bhuhkdx4olajboof@alvherre.pgsql
If a syntax error occurred in a SQL-language or PL/pgSQL-language
CREATE FUNCTION or DO command executed in a logical replication worker,
we'd suffer a null pointer dereference or assertion failure. That
seems like a rather contrived case, but nonetheless worth fixing.
The cause is that function_parse_error_transpose assumes it must be
executing within the context of a Portal, but logical/worker.c
doesn't create a Portal since it's not running the standard executor.
We can just back off the hard Assert check and make it fail gracefully
if there's not an ActivePortal. (I have a feeling that the aggressive
check here was my fault originally, probably because I wasn't sure if
the case would always hold and wanted to find out. Well, now we know.)
The hazard seems to exist in all branches that have logical replication,
so back-patch to v10.
Maxim Orlov, Anton Melnikov, Masahiko Sawada, Tom Lane
Discussion: https://postgr.es/m/b570c367-ba38-95f3-f62d-5f59b9808226@inbox.ru
Discussion: https://postgr.es/m/adf0452f-8c6b-7def-d35e-ab516c80088e@inbox.ru
Casting the result of palloc etc. to the intended type is more per
project style anyway.
(The fact that cpluspluscheck doesn't notice these problems is
because it doesn't expand any macros, which seems like a troubling
shortcoming. Don't have a good idea about improving that.)
Back-patch to v13, which is as far as the patch applies cleanly;
doesn't seem worth working harder.
David Geier
Discussion: https://postgr.es/m/aa5d88a3-71f4-3455-11cf-82de0372c941@gmail.com
If we have no special-case code in s_lock.h for the current platform,
but the compiler has __sync_lock_test_and_set, use that instead of
failing. It's unlikely that anybody's __sync_lock_test_and_set
would be so awful as to be worse than our semaphore-based fallback,
but if it is, they can (continue to) use --disable-spinlocks.
This allows removal of the RISC-V special case installed by commit
c32fcac56, which generated exactly the same code but only on that
platform. Usefully, the RISC-V buildfarm animals should now test
at least the int variant of this patch.
I've manually tested both variants on ARM by dint of removing the
ARM-specific stanza. We don't want to drop that, because it already
has some special knowledge and is likely to grow more over time.
Likewise, this is not meant to preclude installing special cases
for other arches if that proves worthwhile.
Per discussion of a request to install the same code for loongarch64.
Like the previous patch, we might as well back-patch to supported
branches.
Discussion: https://postgr.es/m/761ac43d44b84d679ba803c2bd947cc0@HSMAILSVR04.hs.handsome.com.cn
Avoid having to list all the possible object types twice. Instead,
only _getObjectDescription() needs to know about specific object
types. It communicates back to _printTocEntry() whether an owner is
to be set.
In passing, remove the logic to use ALTER TABLE to set the owner of
views and sequences. This is no longer necessary. Furthermore, if
pg_dump doesn't recognize the object type, this is now a fatal error,
not a warning.
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/0a00f923-599a-381b-923f-0d802a727715@enterprisedb.com
Since partitions can be foreign tables not only plain tables, but
logical replication only supports plain tables, we'd better check the
relkind of a partition after we find it. (There was some discussion
of checking this when adding a partitioned table to a subscription;
but that would be inadequate since the troublesome partition could be
added later.) Without this, the situation leads to a segfault or
assertion failure.
In passing, add a separate variable for the target Relation of
a cross-partition UPDATE; reusing partrel seemed mighty confusing
and error-prone.
Shi Yu and Tom Lane, per report from Ilya Gladyshev. Back-patch
to v13 where logical replication into partitioned tables became
a thing.
Discussion: https://postgr.es/m/6b93e3748ba43298694f376ca8797279d7945e29.camel@gmail.com
There are some cases (e.g. when the subscription is created using the
connect = false option) where the remote replication slot was not created
automatically and the user must create it manually before the subscription
can be activated. There was not enough information in the docs for users
to do this easily.
Author: Peter Smith
Reviewd by: Shi yu, Amit Kapila
Discussion: https://postgr.es/m/CAHut+PvqdqOanheWSHDyhQiF+Z-7w=-+k4U+bwbT=b6YQ_hrXQ@mail.gmail.com
Previously, the description of XLOG_RUNNING_XACTS showed only
top-transaction XIDs and whether subtransactions overflowed. This commit
improves it to show individual subtransaction XIDs. This also improves the
description of overflowed subtransactions.
This additional information can be helpful for testing and debugging
purposes.
Author: Masahiko Sawada
Reviewd by: Fujii Masao, Kyotaro Horiguchi, Ashutosh Bapat, Bharath Rupireddy
Discussion: https://postgr.es/m/CAD21AoAqvaE+XEeXHHPdAGQPcCoGXxuoeutq_nWhUSQvTt5+tA@mail.gmail.com
pg_ident_file_mappings.line_number was described as a line number in
pg_ident.conf for a "rule" number, but this should refer to a "map".
The same inconsistent term was used in the main paragraph describing the
view.
Extracted from a patch by the same author. Issue introduced by
a2c8499 where this view has been added.
Author: Julien Rouhaud
Discussion: https://postgr.es/m/20221026031948.cbrnzgy5e7glsq2d@jrouhaud
Backpatch-through: 15
These two options are only available with COPY FROM, so the extra logic
in charge of checking the validity of the attributes given has no
purpose.
Author: Zhang Mingli
Reviewed-by: Richard Guo, Kyotaro Horiguchi
Discussion: https://postgr.es/m/F28F0B5A-766F-4D33-BF44-43B3A052D833@gmail.com