This mainly de-duplicates code. As evaluating a system variable isn't
the hottest path and the current inline implementation ends up calling
out to an external function anyway, this is OK from a performance POV.
The main motivation for de-duplicating is the upcoming slot
abstraction work, after which there's not guaranteed to be a HeapTuple
backing the slot.
Author: Andres Freund, Amit Khandekar
Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de
Add another transfer mode --clone to pg_upgrade (besides the existing
--link and the default copy), using special file cloning calls. This
makes the file transfer faster and more space efficient, achieving
speed similar to --link mode without the associated drawbacks.
On Linux, file cloning is supported on Btrfs and XFS (if formatted with
reflink support). On macOS, file cloning is supported on APFS.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Previously the expressions were built with the HashJoinState as a
parent. That's incorrect.
Currently this does not appear to be harmful, but for the upcoming
'slot abstraction' work this proves to be problematic, as the
underlying slot types can differ between Hash and HashJoin. It's
possible that this already causes a problem, but I've not been able to
come up with a scenario. Therefore don't backpatch at this point.
Author: Andres Freund
Discussion: https://postgr.es/m/20180220224318.gw4oe5jadhpmcdnm@alap3.anarazel.de
The installcheck run takes a sizable fraction of test.sh to run. Using
a parallel schedule reduces that noticably.
It's possible that we want to backpatch this at some point, to reduce
buildfarm times, but for now lets just see if this upsets the
buildfarm.
Author: Andres Freund
Discussion: https://postgr.es/m/20181105210039.hh4vvi4vwoq5ba2q@alap3.anarazel.de
The planner doesn't make any use of root->total_table_pages until it
estimates costs of indexscans, so we don't need to compute it as
early as that's currently done. By doing the calculation between
set_base_rel_sizes and set_base_rel_pathlists, we can omit relations
that get removed from the query by partition pruning or constraint
exclusion, which seems like a more accurate basis for costing.
(Historical note: I think at the time this code was written, there
was not a separation between the "set sizes" and "set pathlists"
steps, so that this approach would have been impossible at the time.
But now that we have that separation, this is clearly the better way
to do things.)
David Rowley, reviewed by Edmund Horner
Discussion: https://postgr.es/m/CAKJS1f-NG1mRM0VOtkAG7=ZLQWihoqees9R4ki3CKBB0-fRfCA@mail.gmail.com
The code added by commit c203d6cf8 causes a crash in at least one case,
where a potentially-optimizable expression index has a storage type
different from the input data type. A cursory code review turned up
numerous other problems that seem impractical to fix on short notice.
Andres argued for revert of that patch some time ago, and if additional
senior committers had been paying attention, that's likely what would
have happened, but we were not :-(
At this point we can't just revert, at least not in v11, because that would
mean an ABI break for code touching relcache entries. And we should not
remove the (also buggy) support for the recheck_on_update index reloption,
since it might already be used in some databases in the field. So this
patch just does the as-little-invasive-as-possible measure of disabling
the feature as though recheck_on_update were forced off for all indexes.
I also removed the related regression tests (which would otherwise fail)
and the user-facing documentation of the reloption.
We should undertake a more thorough code cleanup if the patch can't be
fixed, but not under the extreme time pressure of being already overdue
for 11.1 release.
Per report from Ondřej Bouda and subsequent private discussion among
pgsql-release.
Discussion: https://postgr.es/m/20181106185255.776mstcyehnc63ty@alvherre.pgsql
A ConvertRowtypeExpr is used to translate a whole-row reference of a
child to that of a parent. The planner produces nested
ConvertRowtypeExpr while translating whole-row reference of a leaf
partition in a multi-level partition hierarchy. Executor then
translates the whole-row reference from the leaf partition into all
the intermediate parent's whole-row references before arriving at the
final whole-row reference. It could instead translate the whole-row
reference from the leaf partition directly to the top-most parent's
whole-row reference skipping any intermediate translations.
Ashutosh Bapat, with tests by Kyotaro Horiguchi and some
editorialization by me. Reviewed by Andres Freund, Pavel Stehule,
Kyotaro Horiguchi, Dmitry Dolgov, Tom Lane.
Forward to POSIX pread() and pwrite(), or emulate them if unavailable.
The emulation is not perfect as the file position is changed, so
we'll put pg_ prefixes on the names to minimize the risk of confusion
in future patches that might inadvertently try to mix pread() and read()
on the same file descriptor.
Author: Thomas Munro
Reviewed-by: Tom Lane, Jesper Pedersen
Discussion: https://postgr.es/m/CAEepm=02rapCpPR3ZGF2vW=SBHSdFYO_bz_f-wwWJonmA3APgw@mail.gmail.com
The "rb" prefix is used by Ruby, so that our existing code results
in name collisions that break plruby. We discussed ways to prevent
that by adjusting dynamic linker options, but it seems that at best
we'd move the pain to other cases. Renaming to avoid the collision
is the only portable fix anyway. Fortunately, our rbtree code is
not (yet?) widely used --- in core, there's only a single usage
in GIN --- so it seems likely that we can get away with a rename.
I chose to do this basically as s/rb/rbt/g, except for places where
there already was a "t" after "rb". The patch could have been made
smaller by only touching linker-visible symbols, but it would have
resulted in oddly inconsistent-looking code. Better to make it look
like "rbt" was the plan all along.
Back-patch to v10. The rbtree.c code exists back to 9.5, but
rb_iterate() which is the actual immediate source of pain was added
in v10, so it seems like changing the names before that would have
more risk than benefit.
Per report from Pavel Raiskup.
Discussion: https://postgr.es/m/4738198.8KVIIDhgEB@nb.usersys.redhat.com
I added HAVE_IPV6 to Makefile.global way back in commit 7703e55c3
so that we could transmit its value to the shell-script version of
initdb. Since initdb was rewritten in C, it's been finding that
out from pg_config.h instead, so this is useless. Keeping it here
just wastes configure and make cycles, plus it's a potential
two-sources-of-truth problem.
pg_promote uses nothing relying on a global state, so it is fine to mark
it as parallel-safe, conclusion based on a detailed analysis from Robert
Haas. This also fixes an inconsistency where pg_proc.dat missed to mark
the function with its previous value for proparallel, update which does
not matter now as the default is used.
Based on a discussion between multiple folks: Laurenz Albe, Robert Haas,
Amit Kapila, Tom Lane and myself.
Discussion: https://postgr.es/m/20181029082530.GL14242@paquier.xyz
I removed the item about the pg_stat_statements change from
release-11.sgml, as part of a sweep to delete items already committed
in 11.0; but actually we'd best keep it to ensure that people who've
pg_upgraded their databases will take the requisite action. Also make
said action more visible by making it into its own para. Noted by
Jonathan Katz.
The entry with OID 4035, for GIST jsonb_ops, is unused; apparently
it was added in preparation for index support that never materialized.
Remove it, and add a regression test case to detect future mistakes
of the same kind.
Discussion: https://postgr.es/m/17188.1541379745@sss.pgh.pa.us
When a partition is created as part of a trigger processing, it is
possible that the partition which just gets created changes the
properties of the table the executor of the ongoing command relies on,
causing a subsequent crash. This has been found possible when for
example using a BEFORE INSERT which creates a new partition for a
partitioned table being inserted to.
Any attempt to do so is blocked when working on a partition, with
regression tests added for both CREATE TABLE PARTITION OF and ALTER
TABLE ATTACH PARTITION.
Reported-by: Dmitry Shalashov
Author: Amit Langote
Reviewed-by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/15437-3fe01ee66bd1bae1@postgresql.org
Backpatch-through: 10
Those tables have no physical storage, making this option unusable with
partition trees as at commit time an actual truncation was attempted.
There are still issues with the way ON COMMIT actions are done when
mixing several action types, however this impacts as well inheritance
trees, so this issue will be dealt with later.
Reported-by: Rajkumar Raghuwanshi
Author: Amit Langote
Reviewed-by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/CAKcux6mhgcjSiB_egqEAEFgX462QZtncU8QCAJ2HZwM-wWGVew@mail.gmail.com
Modifying the parse tree at execution time is, or at least ought to be,
verboten. It seems quite difficult to actually cause a crash this way
in v11 (although you can exhibit it pretty easily in HEAD by messing
with plan_cache_mode). Nonetheless, it's risky, so fix and back-patch.
Discussion: https://postgr.es/m/13789.1541359611@sss.pgh.pa.us
exec_stmt_call() tried to extract information out of a CALL statement's
argument list without using expand_function_arguments(), apparently in
the hope of saving a few nanoseconds by not processing defaulted
arguments. It got that quite wrong though, leading to crashes with
named arguments, as well as failure to enforce writability of the
argument for a defaulted INOUT parameter. Fix and simplify the logic
by using expand_function_arguments() before examining the list.
Also, move the argument-examination to just after producing the CALL
command's plan, before invoking the called procedure. This ensures
that we'll track possible changes in the procedure's argument list
correctly, and avoids a hazard of the plan cache being flushed while
the procedure executes.
Also fix assorted falsehoods and omissions in associated documentation.
Per bug #15477 from Alexey Stepanov.
Patch by me, with some help from Pavel Stehule. Back-patch to v11.
Discussion: https://postgr.es/m/15477-86075b1d1d319e0a@postgresql.org
Discussion: https://postgr.es/m/CAFj8pRA6UsujpTs9Sdwmk-R6yQykPx46wgjj+YZ7zxm4onrDyw@mail.gmail.com
This only became a problem with 4c640f4f38, which didn't synchronize
the value agg_strict_input_check.nargs is set to, with the guard
condition for emitting the operation.
Besides such instructions being unnecessary overhead, currently the
LLVM JIT provider doesn't support them. It seems more sensible to
avoid generating such instruction than supporting them. Add assertions
to make it easier to debug a potential further occurance.
Discussion: https://postgr.es/m/2a505161-2727-2473-7c46-591ed108ac52@email.cz
Backpatch: 11-, like 4c640f4f38.
I (Andres) broke this unintentionally in 69c3936a14, by checking
strictness for all input expressions computed for an aggregate, rather
than just the input for the aggregate transition function.
Reported-By: Ondřej Bouda
Bisected-By: Tom Lane
Diagnosed-By: Andrew Gierth
Discussion: https://postgr.es/m/2a505161-2727-2473-7c46-591ed108ac52@email.cz
Backpatch: 11-, like 69c3936a14
On Windows, in UTF8 database encoding, what char2wchar() produces is
UTF16 not UTF32, ie, characters above U+FFFF will be represented by
surrogate pairs. t_isdigit() and siblings did not account for this
and failed to provide a large enough result buffer. That in turn
led to bogus "invalid multibyte character for locale" errors, because
contrary to what you might think from char2wchar()'s documentation,
its Windows code path doesn't cope sanely with buffer overflow.
The solution for t_isdigit() and siblings is pretty clear: provide
a 3-wchar_t result buffer not 2.
char2wchar() also needs some work to provide more consistent, and more
accurately documented, buffer overrun behavior. But that's a bigger job
and it doesn't actually have any immediate payoff, so leave it for later.
Per bug #15476 from Kenji Uno, who deserves credit for identifying the
cause of the problem. Back-patch to all active branches.
Discussion: https://postgr.es/m/15476-4314f480acf0f114@postgresql.org
When creating partitioned indexes, the tablespace was not being saved
for the parent index. This meant that subsequently created partitions
would not use the right tablespace for their indexes.
ALTER INDEX SET TABLESPACE and ALTER INDEX ALL IN TABLESPACE raised
errors when tried; fix them too. This requires bespoke code for
ATExecCmd() that applies to the special case when the tablespace move is
just a catalog change.
Discussion: https://postgr.es/m/20181102003138.uxpaca6qfxzskepi@alvherre.pgsql
As usual, the release notes for other branches will be made by cutting
these down, but put them up for community review first. Note that a
fair percentage of the entries apply only to prior branches because
their issue was already fixed in 11.0.
The solution arrived at in commit e74dd00f5 presumes that the compiler
has a suitable default -isysroot setting ... but further experience
shows that in many combinations of macOS version, XCode version, Xcode
command line tools version, and phase of the moon, Apple's compiler
will *not* supply a default -isysroot value.
We could potentially go back to the approach used in commit 68fc227dd,
but I don't have a lot of faith in the reliability or life expectancy of
that either. Let's just revert to the approach already shipped in 11.0,
namely specifying an -isysroot switch globally. As a partial response to
the concerns raised by Jakob Egger, adjust the contents of Makefile.global
to look like
CPPFLAGS = -isysroot $(PG_SYSROOT) ...
PG_SYSROOT = /path/to/sysroot
This allows overriding the sysroot path at build time in a relatively
painless way.
Add documentation to installation.sgml about how to use the PG_SYSROOT
option. I also took the opportunity to document how to work around
macOS's "System Integrity Protection" feature.
As before, back-patch to all supported versions.
Discussion: https://postgr.es/m/20840.1537850987@sss.pgh.pa.us
NULL keys in left joins were skipped when building batch files.
Repair, by making the keep_nulls argument to ExecHashGetHashValue()
depend on whether this is a left outer join, as we do in other
paths.
Bug #15475. Thinko in 1804284042. Back-patch to 11.
Reported-by: Paul Schaap
Diagnosed-by: Andrew Gierth
Dicussion: https://postgr.es/m/15475-11a7a783fed72a36%40postgresql.org
When restoring slot information from disk at startup and filling in
shared memory information, the startup process would issue a PANIC
message if more slots are found than what max_replication_slots allows,
and then Postgres generates a core dump, recommending to increase
max_replication_slots. This gives users a switch to crash Postgres at
will by creating slots, lower the configuration to not support it, and
then restart it.
Making Postgres crash hard in this case is overdoing it just to give a
recommendation to users. So instead use a FATAL, which makes Postgres
fail to start without crashing, still giving the recommendation. This
is more consistent with what happens for prepared transactions for
example.
Author: Michael Paquier
Reviewed-by: Andres Freund
Discussion: https://postgr.es/m/20181030025109.GD1644@paquier.xyz
This has been deprecated and effectively unused for a long time.
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>