OpenSSL 3 introduced the concept of providers to support modularization,
and moved the outdated ciphers to the new legacy provider. In case it's
not loaded in the users openssl.cnf file there will be a lot of regress
test failures, so add alternative outputs covering those.
Also document the need to load the legacy provider in order to use older
ciphers with OpenSSL-enabled pgcrypto.
This will be backpatched to all supported version once there is sufficient
testing in the buildfarm of OpenSSL 3.
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/FEF81714-D479-4512-839B-C769D2605F8A@yesql.se
The PX layer in pgcrypto is handling digest padding on its own uniformly
for all backend implementations. Starting with OpenSSL 3.0.0, DecryptUpdate
doesn't flush the last block in case padding is enabled so explicitly
disable it as we don't use it.
This will be backpatched to all supported version once there is sufficient
testing in the buildfarm of OpenSSL 3.
Reviewed-by: Peter Eisentraut, Michael Paquier
Discussion: https://postgr.es/m/FEF81714-D479-4512-839B-C769D2605F8A@yesql.se
ALTER TABLE .. SET {LOGGED,UNLOGGED,ACCESS METHOD} would never do a
table-level object access hook, which was inconsistent with SET
TABLESPACE. Note that contrary to SET TABLESPACE, the no-op case is
left off for those commands as this requires tracking if commands have
been called, but they may not execute a physical rewrite. Another thing
worth noting is that the physical file swap at the end of a rewrite
does a couple of access calls for internal objects created for the swap
operation (internal objects are for example skipped by the tests of
sepgsql), but this does not trigger the hook for the table on which the
operation is done.
f41872d, that added support for SET LOGGED/UNLOGGED in ALTER TABLE,
visibly forgot to consider that.
Based on what I checked, two regression tests of sepgsql in ddl.sql are
going to log more information with this test, something that buildfarm
member rhinoceros will tell soon enough. I am not completely sure of
their format though, so these are not refreshed yet.
This is arguably a bug, but no backpatch is done as this could cause a
behavior change for anybody using object access hooks.
Reported-by: Jeff Davis
Discussion: https://postgr.es/m/YQJKV29/1a60uG68@paquier.xyz
If the replacement string doesn't contain \1...\9, then we don't
need sub-match locations, so we can use the REG_NOSUB optimization
here too. There's already a pre-scan of the replacement string
to look for backslashes, so extend that to check for digits, and
refactor to allow that to happen before we compile the regexp.
While at it, try to speed up the pre-scan by using memchr() instead
of a handwritten loop. It's likely that this is lost in the noise
compared to the regexp processing proper, but maybe not. In any
case, this coding is shorter.
Also, add some test cases to improve the poor coverage of
appendStringInfoRegexpSubstr().
Discussion: https://postgr.es/m/3534632.1628536485@sss.pgh.pa.us
Identifying the precise match locations for parenthesized subexpressions
is a fairly expensive task given the way our regexp engine works, both
at regexp compile time (where we must create an optimized NFA for each
parenthesized subexpression) and at runtime (where determining exact
match locations requires laborious search).
Up to now we've made little attempt to optimize this situation. This
patch identifies cases where we know at compile time that we won't
need to know subexpression match locations, and teaches the regexp
compiler to not bother creating per-subexpression regexps for
parenthesis pairs that are not referenced by backrefs elsewhere in
the regexp. (To preserve semantics, we obviously still have to
pin down the match locations of backref references.) Users could
have obtained the same results before this by being careful to
write "non capturing" parentheses wherever possible, but few people
bother with that.
Discussion: https://postgr.es/m/2219936.1628115334@sss.pgh.pa.us
Here we add additional parsing of Makefiles to determine when to add
references to libpgport and libpgcommon. We also remove the need for
adding the current contrib_extrasource by adding sine very basic logic to
implement the Makefile rules which add .l and .y files when they exist for
a given .o file in the Makefile.
This is just some very basic additional parsing of Makefiles to try to
keep things more consistent between builds using make and MSVC builds.
This happens to work with how our current Makefiles are laid out, but it
could easily be broken in the future if someone chooses do something in
the Makefile that we don't have parsing support for. We will cross that
bridge when we come to it.
Author: David Rowley
Discussion: https://postgr.es/m/CAApHDvoPULi5JW3933NxgwxOmu9Ncvpcyt87UhEHAUX16QqmpA@mail.gmail.com
In ec34040af I added a mention that there was no point in setting
maintenance_work_limit to anything higher than 1GB for vacuum, but that
was incorrect as ginInsertCleanup() also looks at what
maintenance_work_mem is set to during VACUUM and that's not limited to
1GB.
Here I attempt to make it more clear that the limitation is only around
the number of dead tuple identifiers that we can collect during VACUUM.
I've also added a note to autovacuum_work_mem to mention this limitation.
I didn't do that in ec34040af as I'd had some wrong-headed ideas about
just limiting the maximum value for that GUC to 1GB.
Author: David Rowley
Discussion: https://postgr.es/m/CAApHDvpGwOAvunp-E-bN_rbAs3hmxMoasm5pzkYDbf36h73s7w@mail.gmail.com
Backpatch-through: 9.6, same as ec34040af
This saves a few lines of code. Also add a comment to mention why we use
ExplainPropertyInteger instead of ExplainPropertyUInteger given that
queryid is a uint64 type.
Author: David Rowley
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/CAApHDvqhSLYpSU_EqUdN39w9Uvb8ogmHV7_3YhJ0S3aScGBjsg@mail.gmail.com
Backpatch-through: 14, where this code was originally added
02a6a54ec added code to make use of the POPCNT instruction when available
for many of our common platforms. Here we do the same for MSVC for x86_64
machines.
MSVC's intrinsic functions for popcnt seem to differ from GCCs in that
they always appear to emit the popcnt instructions. In GCC the behavior
will depend on if the source file was compiled with -mpopcnt or not. For
this reason, the MSVC intrinsic function has been lumped into the
pg_popcount*_asm function, however doing that sort of invalidates the name
of that function, so let's rename it to pg_popcount*_fast().
Author: David Rowley
Reviewed-by: John Naylor
Discussion: https://postgr.es/m/CAApHDvqL3cbbK%3DGzNcwzsNR9Gi%2BaUvTudKkC4XgnQfXirJ_oRQ%40mail.gmail.com
Since commit e462856a7a, pg_upgrade automatically creates a script to
update extensions, so mention that instead of ALTER EXTENSION.
Backpatch-through: 9.6
Copy-and-pasteo in 665c5855e, evidently. The 9.6 docs toolchain
whined about duplicate index entries, though our modern toolchain
doesn't. In any case, these GUCs surely are not about the
default settings of these values.
I had committer's remorse almost immediately after pushing cb76fbd7e,
upon finding that removing capturing subexpressions' subREs from the
data structure broke my proposed patch for REG_NOSUB optimization.
Revert that data structure change. Instead, address the concern
about not changing capturing subREs' endpoints by not changing the
endpoints. We don't need to, because the point of that bit was just
to ensure that the atom has endpoints distinct from the outer state
pair that we're stringing the branch between. We already made
suitable states in the parenthesized-subexpression case, so the
additional ones were just useless overhead. This seems more
understandable than Spencer's original coding, and it ought to be
a shade faster too by saving a few state creations and arc changes.
(I actually see a couple percent improvement on Jacobson's web
corpus, though that's barely above the noise floor so I wouldn't
put much stock in that result.)
Also, fix the logic added by ea1268f63 to ensure that the subRE
recorded in v->subs[subno] is exactly the one with capno == subno.
Spencer's original coding recorded the child subRE of the capture
node, which is okay so far as having the right endpoint states is
concerned, but as of cb76fbd7e the capturing subRE itself always
has those endpoints too. I think the inconsistency is confusing
for the REG_NOSUB optimization.
As before, backpatch to v14.
Discussion: https://postgr.es/m/0203588E-E609-43AF-9F4F-902854231EE7@enterprisedb.com
It appears that check_track_commit_timestamp was declared but has never
been defined in our code base. Likely this is just leftover cruft from
a development version of the original patch to add commit timestamps.
Let's just remove the useless declaration. The inclusion of guc.h also
seems surplus to requirements.
Author: Andrey Lepikhov
Discussion: https://postgr.es/m/f49aefb5-edbb-633a-af07-3e777023a94d@postgrespro.ru
Up to now, we remembered the definition of a capturing parenthesis
subexpression by storing a pointer to the associated subRE node.
That was okay before, because that subRE didn't get modified anymore
while parsing the rest of the regexp. However, in the wake of
commit ea1268f63, that's no longer true: the outer invocation of
parseqatom() feels free to scribble on that subRE. This seems to
work anyway, because the states we jam into the child atom in the
"prepare a general-purpose state skeleton" stanza aren't really
semantically different from the original endpoints of the child atom.
But that would be mighty easy to break, and it's definitely not how
things worked before.
Between this and the issue fixed in the prior commit, it seems best
to get rid of this dependence on subRE nodes entirely. We don't need
the whole child subRE for future backrefs, only its starting and ending
NFA states; so let's just store pointers to those.
Also, in the corner case where we make an extra subRE to handle
immediately-nested capturing parentheses, it seems like it'd be smart
to have the extra subRE have the same begin/end states as the original
child subRE does (s/s2 not lp/rp). I think that linking it from lp to
rp might actually be semantically wrong, though since Spencer's original
code did it that way, I'm not totally certain. Using s/s2 is certainly
not wrong, in any case.
Per report from Mark Dilger. Back-patch to v14 where the problematic
patches came in.
Discussion: https://postgr.es/m/0203588E-E609-43AF-9F4F-902854231EE7@enterprisedb.com
Commit cebc1d34e taught parseqatom() to optimize cases where a branch
contains only one, "messy", atom by getting rid of excess subRE nodes.
The way we really should do that is to keep the subRE built for the
"messy" child atom; but to avoid changing parseqatom's nominal API,
I made it delete that node after copying its fields to the outer subRE
made by parsebranch(). It seems that that actually worked at the time;
but it became dangerous after ea1268f63, because that later commit
allowed the lower invocation of parse() to return a subRE that was also
pointed to by some v->subs[] entry. This meant we could wind up with a
dangling pointer in v->subs[], allowing a later backref to misbehave,
but only if that subRE struct had been reused in between. So the damage
seems confined to cases like '((...))...(...\2'.
To fix, do what I should have done before and modify parseqatom's API
to make it possible for it to remove the caller's subRE instead of the
callee's. That's safer because we know that subRE isn't complete yet,
so noplace else will have a pointer to it.
Per report from Mark Dilger. Back-patch to v14 where the problematic
patches came in.
Discussion: https://postgr.es/m/0203588E-E609-43AF-9F4F-902854231EE7@enterprisedb.com
As reported by a few OSX buildfarm animals there exist at least one path where
temporary files exist during AtProcExit_Files() processing. As temporary file
cleanup causes pgstat reporting, the assertions added in ee3f8d3d3a caused
failures.
This is not an OSX specific issue, we were just lucky that timing on OSX
reliably triggered the problem. The known way to cause this is a FATAL error
during perform_base_backup() with a MANIFEST used - adding an elog(FATAL)
after InitializeBackupManifest() reliably reproduces the problem in isolation.
The problem is that the temporary file created in InitializeBackupManifest()
is not cleaned up via resource owner cleanup as WalSndResourceCleanup()
currently is only used for non-FATAL errors. That then allows to reach
AtProcExit_Files() with existing temporary files, causing the assertion
failure.
To fix this problem, move temporary file cleanup to a before_shmem_exit() hook
and add assertions ensuring that no temporary files are created before / after
temporary file management has been initialized / shut down. The cleanest way
to do so seems to be to split fd.c initialization into two, one for plain file
access and one for temporary file access.
Right now there's no need to perform further fd.c cleanup during process exit,
so I just renamed AtProcExit_Files() to BeforeShmemExit_Files(). Alternatively
we could perform another pass through the files to check that no temporary
files exist, but the added assertions seem to provide enough protection
against that.
It might turn out that the assertions added in ee3f8d3d3a will cause too much
noise - in that case we'll have to downgrade them to a WARNING, at least
temporarily.
This commit is not necessarily the best approach to address this issue, but it
should resolve the buildfarm failures. We can revise later.
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20210807190131.2bm24acbebl4wl6i@alap3.anarazel.de
Rather than trying to pick table aliases that won't conflict with
any possible user-defined matview column name, adjust the queries'
syntax so that the aliases are only used in places where they can't be
mistaken for column names. Mostly this consists of writing "alias.*"
not just "alias", which adds clarity for humans as well as machines.
We do have the issue that "SELECT alias.*" acts differently from
"SELECT alias", but we can use the same hack ruleutils.c uses for
whole-row variables in SELECT lists: write "alias.*::compositetype".
We might as well revert to the original aliases after doing this;
they're a bit easier to read.
Like 75d66d10e, back-patch to all supported branches.
Discussion: https://postgr.es/m/2488325.1628261320@sss.pgh.pa.us
This is a step toward storing stats in dynamic shared memory. As dynamic
shared memory segments are detached from just after before_shmem_exit()
callbacks are processed, but before on_shmem_exit() callbacks are, no stats
can be collected after before_shmem_exit() callbacks have been processed.
Parallel worker shutdown can cause stats to be emitted during DSM detach
callbacks, e.g. for SharedFileSet (which closes its files, which can causes
fd.c to emit stats about temporary files). Therefore parallel worker shutdown
needs to complete during the processing of before_shmem_exit callbacks.
One might think this problem could instead be solved by carefully ordering the
attaching to DSM segments, so that the pgstats segments get detached from
later than the parallel query ones. That turns out to not work because the
stats hash might need to grow which can cause new segments to be
allocated, which then will be detached from earlier.
There are two code changes:
First, call ParallelWorkerShutdown() via before_shmem_exit. That's a good idea
on its own, because other shutdown callbacks like ShutdownPostgres and
ShutdownAuxiliaryProcess are called via before_*.
Second, explicitly detach from the parallel query DSM segment, thereby
ensuring all stats are emitted during ParallelWorkerShutdown().
There are nicer solutions to these problems, but it's not obvious which of
those solutions is the correct one. As the shared memory stats work already is
a huge amount of work...
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de
Discussion: https://postgr.es/m/20210803023612.iziacxk5syn2r4ut@alap3.anarazel.de
Previously pgstat_initialize() was called in InitPostgres() and
AuxiliaryProcessMain(). As it turns out there was at least one case where we
reported stats before pgstat_initialize() was called, see
AutoVacWorkerMain()'s intentionally early call to pgstat_report_autovac().
This turns out to not be a problem with the current pgstat implementation as
pgstat_initialize() only registers a shutdown callback. But in the shared
memory based stats implementation we are working towards pgstat_initialize()
has to do more work.
After b406478b87 BaseInit() is a central place where initialization shared by
normal backends and auxiliary backends can be put. Obviously BaseInit() is
called before InitPostgres() registers ShutdownPostgres. Previously
ShutdownPostgres was the first before_shmem_exit callback, now that's commonly
pgstats. That should be fine.
Previously pgstat_initialize() was not called in bootstrap mode, but there
does not appear to be a need for that. It's now done unconditionally.
To detect future issues like this, assertions are added to a few places
verifying that the pgstat subsystem is initialized and not yet shut down.
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de
Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de
Casting a value that's already of a type with a specific typmod
to an unspecified typmod doesn't do anything so far as run-time
behavior is concerned. However, it really ought to change the
exposed type of the expression to match. Up to now,
coerce_type_typmod hasn't bothered with that, which creates gotchas
in contexts such as recursive unions. If for example one side of
the union is numeric(18,3), but it needs to be plain numeric to
match the other side, there's no direct way to express that.
This is easy enough to fix, by inserting a RelabelType to update the
exposed type of the expression. However, it's a bit nervous-making
to change this behavior, because it's stood for a really long time.
(I strongly suspect that it's like this in part because the logic
pre-dates the introduction of RelabelType in 7.0. The commit log
message for 57b30e8e2 is interesting reading here.) As a compromise,
we'll sneak the change into 14beta3, and consider back-patching to
stable branches if no complaints emerge in the next three months.
Discussion: https://postgr.es/m/CABNQVagu3bZGqiTjb31a8D5Od3fUMs7Oh3gmZMQZVHZ=uWWWfQ@mail.gmail.com
Formerly, the numeric code tested whether an integer value of a larger
type would fit in a smaller type by casting it to the smaller type and
then testing if the reverse conversion produced the original value.
That's perfectly fine, except that it caused a test failure on
buildfarm animal castoroides, most likely due to a compiler bug.
Instead, do these tests by comparing against PG_INT16/32_MIN/MAX. That
matches existing code in other places, such as int84(), which is more
widely tested, and so is less likely to go wrong.
While at it, add regression tests covering the numeric-to-int8/4/2
conversions, and adjust the recently added tests to the style of
434ddfb79a (on the v11 branch) to make failures easier to diagnose.
Per buildfarm via Tom Lane, reviewed by Tom Lane.
Discussion: https://postgr.es/m/2394813.1628179479%40sss.pgh.pa.us
For EXEC_BACKEND InitProcess()/InitAuxiliaryProcess() needs to have been
called well before we call BaseInit(), as SubPostmasterMain() needs LWLocks to
work. Having the order of initialization differ between platforms makes it
unnecessarily hard to understand the system and to add initialization points
for new subsystems without a lot of duplication.
To be able to change the order, BaseInit() cannot trigger
CreateSharedMemoryAndSemaphores() anymore - obviously that needs to have
happened before we can call InitProcess(). It seems cleaner to create shared
memory explicitly in single user/bootstrap mode anyway.
After this change the separation of bufmgr initialization into
InitBufferPoolAccess() / InitBufferPoolBackend() is not meaningful anymore so
the latter is removed.
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de
This was an oversight in 07bf378509. While it does reduce the benefit of the
simplification due to that commit, it still seems like a win to me.
It seems like it might be a good idea to have a function mirroring
InitPostmasterChild() / InitStandaloneProcess() for postmaster in miscinit.c
to make it easier to keep initialization between the three possible
environment in sync.
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20210805214109.lzfk3r3ae37bahmv@alap3.anarazel.de
For one, the existing location lead to somewhat awkward code in main(). For
another, the new location is easier to understand anyway.
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de
Neither is actually initialized as an auxiliary process, so it does not really
make sense to reserve a PGPROC etc for them.
This keeps checker mode implemented by exiting partway through bootstrap
mode. That might be worth changing at some point, perhaps if we ever extend
checker mode to be a more general tool.
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de
After the preceding commits the auxprocess code is independent from
bootstrap.c - so a dedicated file seems less confusing.
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de
Kept separate for ease of review, particularly because pgindent insists on
reflowing a few comments.
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de
There practically was no shared code between the two, once all the ifs are
removed. And it was quite confusing that aux processes weren't actually
started by the call to AuxiliaryProcessMain() in main().
There's more to do, AuxiliaryProcessMain() should move out of bootstrap.c, and
BootstrapModeMain() shouldn't use/be part of AuxProcType.
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de
It is confusing that aux processes are started with --forkboot, given that
bootstrap mode cannot run below postmaster.
Author: Andres Freund <andres@anarazel.de>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de
postgres_fdw imported generated columns from the remote tables as plain
columns, and caused failures like "ERROR: cannot insert a non-DEFAULT
value into column "foo"" when inserting into the foreign tables, as it
tried to insert values into the generated columns. To fix, we do the
following under the assumption that generated columns in a postgres_fdw
foreign table are defined so that they represent generated columns in
the underlying remote table:
* Send DEFAULT for the generated columns to the foreign server on insert
or update, not generated column values computed on the local server.
* Add to postgresImportForeignSchema() an option "import_generated" to
include column generated expressions in the definitions of foreign
tables imported from a foreign server. The option is true by default.
The assumption seems reasonable, because that would make a query of the
postgres_fdw foreign table return values for the generated columns that
are consistent with the generated expression.
While here, fix another issue in postgresImportForeignSchema(): it tried
to include column generated expressions as column default expressions in
the foreign table definitions when the import_default option was enabled.
Per bug #16631 from Daniel Cherniy. Back-patch to v12 where generated
columns were added.
Discussion: https://postgr.es/m/16631-e929fe9db0ffc7cf%40postgresql.org
This fixes a long-standing bug when using to_char() to format a
numeric value in scientific notation -- if the value's exponent is
less than -NUMERIC_MAX_DISPLAY_SCALE-1 (-1001), it produced a
division-by-zero error.
The reason for this error was that get_str_from_var_sci() divides its
input by 10^exp, which it produced using power_var_int(). However, the
underflow test in power_var_int() causes it to return zero if the
result scale is too small. That's not a problem for power_var_int()'s
only other caller, power_var(), since that limits the rscale to 1000,
but in get_str_from_var_sci() the exponent can be much smaller,
requiring a much larger rscale. Fix by introducing a new function to
compute 10^exp directly, with no rscale limit. This also allows 10^exp
to be computed more efficiently, without any numeric multiplication,
division or rounding.
Discussion: https://postgr.es/m/CAEZATCWhojfH4whaqgUKBe8D5jNHB8ytzemL-PnRx+KCTyMXmg@mail.gmail.com
Up to now we did a PQconsumeInput() for each pipelined query, asking the OS
for more input - which it often won't have, as all results might already have
been sent. That turns out to have a noticeable performance impact.
Alvaro Herrera reviewed the idea to add the PQisBusy() check, but not this
concrete patch.
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20210720180039.23rivhdft3l4mayn@alap3.anarazel.de
Backpatch: 14, where libpq/pgbench pipelining was introduced.