The decision to reuse values of parameters from a previous connection
has been based on whether the new target is a conninfo string. Add this
means of overriding that default. This feature arose as one component
of a fix for security vulnerabilities in pg_dump, pg_dumpall, and
pg_upgrade, so back-patch to 9.1 (all supported versions). In 9.3 and
later, comment paragraphs that required update had already-incorrect
claims about behavior when no connection is open; fix those problems.
Security: CVE-2016-5424
If ANALYZE found no repeated non-null entries in its sample, it set the
column's stadistinct value to -1.0, intending to indicate that the entries
are all distinct. But what this value actually means is that the number
of distinct values is 100% of the table's rowcount, and thus it was
overestimating the number of distinct values by however many nulls there
are. This could lead to very poor selectivity estimates, as for example
in a recent report from Andreas Joseph Krogh. We should discount the
stadistinct value by whatever we've estimated the nulls fraction to be.
(That is what will happen if we choose to use a negative stadistinct for
a column that does have repeated entries, so this code path was just
inconsistent.)
In addition to fixing the stadistinct entries stored by several different
ANALYZE code paths, adjust the logic where get_variable_numdistinct()
forces an "all distinct" estimate on the basis of finding a relevant unique
index. Unique indexes don't reject nulls, so there's no reason to assume
that the null fraction doesn't apply.
Back-patch to all supported branches. Back-patching is a bit of a judgment
call, but this problem seems to affect only a few users (else we'd have
identified it long ago), and it's bad enough when it does happen that
destabilizing plan choices in a worse direction seems unlikely.
Patch by me, with documentation wording suggested by Dean Rasheed
Report: <VisenaEmail.26.df42f82acae38a58.156463942b8@tc7-visena>
Discussion: <16143.1470350371@sss.pgh.pa.us>
This is required for the result to be a legal tsvector value.
Noted while fooling with Andreas Seltenreich's ts_delete() crash.
Discussion: <87invhoj6e.fsf@credativ.de>
The help message for pg_basebackup specifies that the numbers 0 through 9
are accepted as valid values of -Z option. But, previously -Z 0 was rejected
as an invalid compression level.
Per discussion, it's better to make pg_basebackup treat 0 as valid
compression level meaning no compression, like pg_dump.
Back-patch to all supported versions.
Reported-By: Jeff Janes
Reviewed-By: Amit Kapila
Discussion: CAMkU=1x+GwjSayc57v6w87ij6iRGFWt=hVfM0B64b1_bPVKRqg@mail.gmail.com
This text was added by commit ff213239c, and not long thereafter obsoleted
by commit 4adc2f72a (which made the test depend on NBuffers instead); but
nobody noticed the need for an update. Commit 9563d5b5e adds some further
dependency on maintenance_work_mem, but the existing verbiage seems to
cover that with about as much precision as we really want here. Let's
just take it all out rather than leaving ourselves open to more errors of
omission in future. (That solution makes this change back-patchable, too.)
Noted by Peter Geoghegan.
Discussion: <CAM3SWZRVANbj9GA9j40fAwheQCZQtSwqTN1GBTVwRrRbmSf7cg@mail.gmail.com>
The docs failed to explain that LIKE INCLUDING INDEXES would not preserve
the names of indexes and associated constraints. Also, it wasn't mentioned
that EXCLUDE constraints would be copied by this option. The latter
oversight seems enough of a documentation bug to justify back-patching.
In passing, do some minor copy-editing in the same area, and add an entry
for LIKE under "Compatibility", since it's not exactly a faithful
implementation of the standard's feature.
Discussion: <20160728151154.AABE64016B@smtp.hushmail.com>
The description of udt_privileges view contained an incorrect copy-pasted word.
Back-patch to 9.2 where udt_privileges view was added.
Author: Alexander Law
The SQL standard appears to specify that IS [NOT] NULL's tests of field
nullness are non-recursive, ie, we shouldn't consider that a composite
field with value ROW(NULL,NULL) is null for this purpose.
ExecEvalNullTest got this right, but eval_const_expressions did not,
leading to weird inconsistencies depending on whether the expression
was such that the planner could apply constant folding.
Also, adjust the docs to mention that IS [NOT] DISTINCT FROM NULL can be
used as a substitute test if a simple null check is wanted for a rowtype
argument. That motivated reordering things so that IS [NOT] DISTINCT FROM
is described before IS [NOT] NULL. In HEAD, I went a bit further and added
a table showing all the comparison-related predicates.
Per bug #14235. Back-patch to all supported branches, since it's certainly
undesirable that constant-folding should change the semantics.
Report and patch by Andrew Gierth; assorted wordsmithing and revised
regression test cases by me.
Report: <20160708024746.1410.57282@wrigleys.postgresql.org>
These functions were added in commits fbe5a3fb7 and a104a017f,
but commit 45639a052 removed their only callers. Put the related
code in foreign.c back to the way it was in 9.5, to avoid pointless
cross-version diffs.
Etsuro Fujita
Patch: <d674a3f1-6b63-519c-ef3f-f3188ed6a178@lab.ntt.co.jp>
9.4 added a second description of GET DIAGNOSTICS that was totally
independent of the existing one, resulting in each description lying to the
extent that it claimed the set of status items it described was complete.
Fix that, and do some minor markup improvement.
Also some other small fixes per bug #14258 from Dilian Palauzov.
Discussion: <20160718181437.1414.40802@wrigleys.postgresql.org>
brin-extensibility-inclusion-table was confused in places about the
difference between strategy 4 (RTOverRight) and strategy 5 (RTRight).
Alexander Law
To ensure that "make installcheck" can be used safely against an existing
installation, we need to be careful about what global object names
(database, role, and tablespace names) we use; otherwise we might
accidentally clobber important objects. There's been a weak consensus that
test databases should have names including "regression", and that test role
names should start with "regress_", but we didn't have any particular rule
about tablespace names; and neither of the other rules was followed with
any consistency either.
This commit moves us a long way towards having a hard-and-fast rule that
regression test databases must have names including "regression", and that
test role and tablespace names must start with "regress_". It's not
completely there because I did not touch some test cases in rolenames.sql
that test creation of special role names like "session_user". That will
require some rethinking of exactly what we want to test, whereas the intent
of this patch is just to hit all the cases in which the needed renamings
are cosmetic.
There is no enforcement mechanism in this patch either, but if we don't
add one we can expect that the tests will soon be violating the convention
again. Again, that's not such a cosmetic change and it will require
discussion. (But I did use a quick-hack enforcement patch to find these
cases.)
Discussion: <16638.1468620817@sss.pgh.pa.us>
For some reason this option wasn't discussed at all in client-auth.sgml.
Document it there, and be more explicit about its relationship to the
"cert" authentication method. Per gripe from Srikanth Venkatesh.
I failed to resist the temptation to do some minor wordsmithing in the
same area, too.
Discussion: <20160713110357.1410.30407@wrigleys.postgresql.org>
Clarify that the reason for recommending that pg_temp be put last is to
prevent temporary tables from capturing unqualified table names. Per
discussion with Albe Laurenz.
Discussion: <A737B7A37273E048B164557ADEF4A58B5386C6E1@ntex2010i.host.magwien.gv.at>
Add display of proparallel (parallel-safety) when the server is >= 9.6,
and display of proacl (access privileges) for all server versions.
Minor tweak of column ordering to keep related columns together.
Michael Paquier
Discussion: <CAB7nPqTR3Vu3xKOZOYqSm-+bSZV0kqgeGAXD6w5GLbkbfd5Q6w@mail.gmail.com>
temp_file_limit is a per-process limit, not a per-session limit across
all cooperating parallel processes; change wording accordingly, per a
suggestion from Tom Lane.
Also, document under max_parallel_workers_per_gather the fact that each
process involved in a parallel query may use as many resources as a
separate session. Caveat emptor.
Per a complaint from Peter Geoghegan.
Per discussion on pgsql-hackers, conninfo is better as the column name
because it's more commonly used in PostgreSQL.
Catalog version bumped due to the change of pg_proc.
Author: Michael Paquier
Document that index storage is dependent on the operating system's
collation library ordering, and any change in that ordering can create
invalid indexes.
Discussion: 20160617154311.GB19359@momjian.us
Backpatch-through: 9.1
In the previous design, the GetForeignUpperPaths FDW callback hook was
called before we got around to labeling upper relations with the proper
consider_parallel flag; this meant that any upper paths created by an FDW
would be marked not-parallel-safe. While that's probably just as well
right now, we aren't going to want it to be true forever. Hence, abandon
the idea that FDWs should be allowed to inject upper paths before the core
code has gotten around to creating the relevant upper relation. (Well,
actually they still can, but it's on their own heads how well it works.)
Instead, adopt the same API already designed for create_upper_paths_hook:
we call GetForeignUpperPaths after each upperrel has been created and
populated with the paths the core planner knows how to make.
Commit b1a9bad9e7 introduced a stats view to provide insight into the
running WAL receiver, but neglected to include the connection string in
it, as reported by Michaël Paquier. This commit fixes that omission.
(Any security-sensitive information is not disclosed).
While at it, close the mild security hole that we were exposing the
password in the connection string in shared memory. This isn't
user-accessible, but it still looks like a good idea to avoid having the
cleartext password in memory.
Author: Michaël Paquier, Álvaro Herrera
Review by: Vik Fearing
Discussion: https://www.postgresql.org/message-id/CAB7nPqStg4M561obo7ryZ5G+fUydG4v1Ajs1xZT1ujtu+woRag@mail.gmail.com
Fix some now-obsolete statements that were overlooked in commits
6734a1cac, 3dbbd0f02, 028350f61. Document the behavior of <0>.
Also do a little bit of rearranging and copy-editing for clarity.
Add a section to xaggr.sgml, as we have done in the past for other
extensions to the aggregation functionality. Assorted wordsmithing
and other minor improvements.
David Rowley and Tom Lane
The original specification for this called for the deserialization function
to have signature "deserialize(serialtype) returns transtype", which is a
security violation if transtype is INTERNAL (which it always would be in
practice) and serialtype is not (which ditto). The patch blithely overrode
the opr_sanity check for that, which was sloppy-enough work in itself,
but the indisputable reason this cannot be allowed to stand is that CREATE
FUNCTION will reject such a signature and thus it'd be impossible for
extensions to create parallelizable aggregates.
The minimum fix to make the signature type-safe is to add a second, dummy
argument of type INTERNAL. But to lock it down a bit more and make misuse
of INTERNAL-accepting functions less likely, let's get rid of the ability
to specify a "serialtype" for an aggregate and just say that the only
useful serialtype is BYTEA --- which, in practice, is the only interesting
value anyway, due to the usefulness of the send/recv infrastructure for
this purpose. That means we only have to allow "serialize(internal)
returns bytea" and "deserialize(bytea, internal) returns internal" as
the signatures for these support functions.
In passing fix bogus signature of int4_avg_combine, which I found thanks
to adding an opr_sanity check on combinefunc signatures.
catversion bump due to removing pg_aggregate.aggserialtype and adjusting
signatures of assorted built-in functions.
David Rowley and Tom Lane
Discussion: <27247.1466185504@sss.pgh.pa.us>
If there's anyplace in our SGML docs that explains this behavior, I can't
find it right at the moment. Add an explanation in "Dependency Tracking"
which seems like the authoritative place for such a discussion. Per
gripe from Michelle Schwan.
While at it, update this section's example of a dependency-related
error message: they last looked like that in 8.3. And remove the
explanation of dependency updates from pre-7.3 installations, which
is probably no longer worth anybody's brain cells to read.
The bogus error message example seems like an actual documentation bug,
so back-patch to all supported branches.
Discussion: <20160620160047.5792.49827@wrigleys.postgresql.org>
Dilian Palauzov pointed out in bug #14201 that the docs failed to mention
the possibility of %R producing '(' due to an unmatched parenthesis.
He proposed just adding that in the same style as the other options were
listed; but it seemed to me that the sentence was already nearly
unintelligible, so I rewrote it a bit more extensively.
Report: <20160619121113.5789.68274@wrigleys.postgresql.org>
This requires some core changes as well so that we can properly
WAL-log the truncation. Specifically, it changes the format of the
XLOG_SMGR_TRUNCATE WAL record, so bump XLOG_PAGE_MAGIC.
Patch by me, reviewed but not fully endorsed by Andres Freund.
If you really want to vacuum every single page in the relation,
regardless of apparent visibility status or anything else, you can use
this option. In previous releases, this behavior could be achieved
using VACUUM (FREEZE), but because we can now recognize all-frozen
pages as not needing to be frozen again, that no longer works. There
should be no need for routine use of this option, but maybe bugs or
disaster recovery will necessitate its use.
Patch by me, reviewed by Andres Freund.
The main point of doing this is to allow the cutoff to be set very small,
even zero, to allow parallel-query behavior to be tested on relatively
small tables such as we typically use in the regression tests. But it
might be of use to users too. The number-of-workers scaling behavior in
create_plain_partial_paths() is pretty ad-hoc and subject to change, so
we won't expose anything about that, but the notion of not considering
parallel query at all for tables below size X seems reasonably stable.
Amit Kapila, per a suggestion from me
Discussion: <17170.1465830165@sss.pgh.pa.us>
The new pg_check_visible() and pg_check_frozen() functions can be used to
verify that the visibility map bits for a relation's data pages match the
actual state of the tuples on those pages.
Amit Kapila and Robert Haas, reviewed (in earlier versions) by Andres
Freund. Additional testing help by Thomas Munro.
While beneficial, both for throughput and average/worst case latency, in
a significant number of workloads, there are other workloads in which
backend_flush_after can cause significant performance regressions in
comparison to < 9.6 releases. The regression is most likely when the hot
data set is bigger than shared buffers, but significantly smaller than
the operating system's page cache.
I personally think that the benefit of enabling backend flush control is
considerably bigger than the potential downsides, but a fair argument
can be made that not regressing is more important than improving
performance/latency. As the latter is the consensus, change the default
to 0.
The other settings introduced in 428b1d6b2 do not have the same
potential for regressions, so leave them enabled.
Benchmarks leading up to changing the default have been performed by
Mithun Cy, Ashutosh Sharma and Robert Haas.
Discussion: CAD__OuhPmc6XH=wYRm_+Q657yQE88DakN4=Ybh2oveFasHkoeA@mail.gmail.com
Document these as "nearest integer >= argument" and "nearest integer <=
argument", which will hopefully be less confusing than the old formulation.
New wording is from Matlab via Dean Rasheed.
I changed the pg_description entries as well as the SGML docs. In the
back branches, this will only affect installations initdb'd in the future,
but it should be harmless otherwise.
Discussion: <CAEZATCW3yzJo-NMSiQs5jXNFbTsCEftZS-Og8=FvFdiU+kYuSA@mail.gmail.com>
This terminology provoked widespread complaints. So, instead, rename
the GUC max_parallel_degree to max_parallel_workers_per_gather
(leaving room for a possible future GUC max_parallel_workers that acts
as a system-wide limit), and rename the parallel_degree reloption to
parallel_workers. Rename structure members to match.
These changes create a dump/restore hazard for users of PostgreSQL
9.6beta1 who have set the reloption (or applied the GUC using ALTER
USER or ALTER DATABASE).
Fix grammar, improve examples, etc.
I did not attempt to document the current behavior concerning distance-zero
matches, because I think that's broken and needs to change, so I'm not
going to use up brain cells figuring out how to explain how it works now.
One way or the other, there's still more to write here.
Commit 6820094d1 mixed up types of parent object (table) with type of
sub-object being commented on. Noticed while fixing docs for
COMMENT ON ACCESS METHOD.
Backpatch to 9.5, like that commit.
It was previously suggested that "esoteric" operations such as creating
a new access method would require direct manipulation of the system
catalogs, but that example has gone away, and I can't think of a new one
to replace it, so just put in some weasel wording.
Mostly these are just comments but there are a few in documentation
and a handful in code and tests. Hopefully this doesn't cause too much
unnecessary pain for backpatching. I relented from some of the most
common like "thru" for that reason. The rest don't seem numerous
enough to cause problems.
Thanks to Kevin Lyda's tool https://pypi.python.org/pypi/misspellings
Per discussion, this is a more understandable and future-proof way of
exposing the setting to users. On-disk, we can still store it in words,
so as to not break on-disk compatibility with beta1.
Along the way, clean up the code associated with Bloom reloptions.
Provide explicit macros for default and maximum lengths rather than
having magic numbers buried in multiple places in the code. Drop
the adjustBloomOptions() code altogether: it was useless in view of
the fact that reloptions.c already performed default-substitution and
range checking for the options. Rename a couple of macros and types
for more clarity.
Discussion: <23767.1464926580@sss.pgh.pa.us>
Oracle recommends using VARCHAR2 not VARCHAR, allegedly because they might
someday change VARCHAR to be spec-compliant about distinguishing null from
empty string. (I'm not holding my breath, though.) Our examples of PL/SQL
code were using VARCHAR, which while not wrong is missing the pedagogical
opportunity to talk about converting Oracle type names to Postgres. So
switch the examples to use VARCHAR2, and add some text about what to do
with common Oracle type names like VARCHAR2 and NUMBER. (There is probably
more to be said here, but those are the ones I'm sure about offhand.)
Per suggestion from rapg12@gmail.com.
Discussion: <20160521140046.22591.24672@wrigleys.postgresql.org>
Correct obsolete install instructions, as noted by Daniel Gustafsson.
Clarify the test code's prerequisites.
Discussion: <88E617F2-7721-4C4E-84F4-886A2041C1D0@yesql.se>
David Johnston pointed out that the original text here had been obsoleted
by SQL:2008, which allowed ORDER BY in subqueries. We could weaken the
text to describe ORDER-BY-in-subqueries as an optional SQL feature that's
possibly unportable; but then the exact same statements would apply to
the alternative it's being compared to (ORDER-BY-in-aggregate-calls).
So really that would be pretty useless; let's just take out the sentence
entirely. Instead, point out the hazard that any extra processing in the
upper query might cause the subquery output order to be destroyed.
Discussion: <CAKFQuwbAX=iO9QbpN7_jr+BnUWm9FYX8WbEPUvG0p+nZhp6TZg@mail.gmail.com>
We didn't have any real user documentation about how index-only scans
work or how to design indexes to exploit them. Remedy that.
Per gripe from David Johnston.
In the documentation for nextval(), point out explicitly that INSERT ...
ON CONFLICT will call nextval() if needed for the insertion case, whether
or not it ends up following the ON CONFLICT path. This seems to be a
matter of some confusion, cf bug #14126, so let's be clear about it.
Also mention the issue in the CREATE SEQUENCE reference page, since that
is another place where people might expect such things to be covered.
Minor wording improvements nearby, as well.
Back-patch to 9.5 where ON CONFLICT was introduced.
Describe compat_realm = 0 as "disabled" not "enabled", per discussion
with Christian Ullrich. I failed to resist the temptation to do some
other minor copy-editing in the same area.
Hash indexes are not WAL-logged, and so do not maintain the LSN of
index pages. Since the "snapshot too old" feature counts on
detecting error conditions using the LSN of a table and all indexes
on it, this makes it impossible to safely do early vacuuming on any
table with a hash index, so add this to the tests for whether the
xid used to vacuum a table can be adjusted based on
old_snapshot_threshold.
While at it, add a paragraph to the docs for old_snapshot_threshold
which specifically mentions this and other aspects of the feature
which may otherwise surprise users.
Problem reported and patch reviewed by Amit Kapila
Call out the major enhancements in this release as identified by
pgsql-advocacy discussion, and rearrange some of the entries to
make those items more prominent. Other minor improvements per
advice from Vitaly Burovoy, Masahiko Sawada, Peter Geoghegan,
and Andres Freund.
The similarity of the original names to SQL keywords seems like a bad
idea. Rename them before we're stuck with 'em forever.
In passing, minor code and docs cleanup.
Discussion: <4875.1462210058@sss.pgh.pa.us>
These functions behave like the backend's least/greatest functions,
not like min/max, so the originally-chosen names invite confusion.
Per discussion, rename to least/greatest.
I also took it upon myself to make them return double if any input is
double. The previous behavior of silently coercing all inputs to int
surely does not meet the principle of least astonishment.
Copy-edit some of the other new functions' documentation, too.
Somebody added pg_replication_origin, pg_replication_origin_status and
pg_replication_slots to catalogs.sgml without a whole lot of concern for
either alphabetical order or the difference between a table and a view.
Clean up the mess.
Back-patch to 9.5, not so much because this is critical as because if
I don't it will result in a cross-branch divergence in release-9.5.sgml,
which would be a maintenance hazard.
If we're not going to reject such setups entirely, throwing a WARNING in
check_synchronous_standby_names() is unhelpful, because it will cause the
warning to be logged again every time the postmaster receives SIGHUP.
Per discussion, just remove the warning.
In passing, improve the documentation for synchronous_commit, which had not
gotten the word that now there can be more than one synchronous standby.
Adjust the way we detect the locale. As a result the minumum Windows
version supported by VS2015 and later is Windows Vista. Add some tweaks
to remove new compiler warnings. Remove documentation references to the
now obsolete msysGit.
Michael Paquier, somewhat edited by me, reviewed by Christian Ullrich.
Backpatch to 9.5
Commit 989be0810d added a flex/bison lexer/parser to interpret
synchronous_standby_names. It was done in a pretty crufty way, though,
making assorted end-use sites responsible for calling the parser at the
right times. That was not only vulnerable to errors of omission, but made
it possible that lexer/parser errors occur at very undesirable times,
and created memory leakages even if there was no error.
Instead, perform the parsing once during check_synchronous_standby_names
and let guc.c manage the resulting data. To do that, we have to flatten
the parsed representation into a single hunk of malloc'd memory, but that
is not very hard.
While at it, work a little harder on making useful error reports for
parsing problems; the previous code felt that "synchronous_standby_names
parser returned 1" was an appropriate user-facing error message. (To
be fair, it did also log a syntax error message, but separately from the
GUC problem report, which is at best confusing.) It had some outright
bugs in the face of invalid input, too.
I (tgl) also concluded that we need to restrict unquoted names in
synchronous_standby_names to be just SQL identifiers. The previous coding
would accept darn near anything, which (1) makes the quoting convention
both nearly-unnecessary and formally ambiguous, (2) makes it very hard to
understand what is a syntax error and what is a creative interpretation of
the input as a standby name, and (3) makes it impossible to further extend
the syntax in future without a compatibility break. I presume that we're
intending future extensions of the syntax, else this parsing infrastructure
is massive overkill, so (3) is an important objection. Since we've taken
a compatibility hit for non-identifier names with this change anyway, we
might as well lock things down now and insist that users use double quotes
for standby names that aren't identifiers.
Kyotaro Horiguchi and Tom Lane
Change max_parallel_degree default from 0 to 2. It is possible that
this is not a good idea, or that we should go with 1 worker rather
than 2, but we won't find out without trying it. Along the way,
reword the documentation for max_parallel_degree a little bit to
hopefully make it more clear.
Discussion: 20160420174631.3qjjhpwsvvx5bau5@alap3.anarazel.de
Several issues:
1) checkpoint_flush_after doc and code disagreed about the default
2) new GUCs were missing from postgresql.conf.sample
3) Outdated source-code comment about bgwriter_flush_after's default
4) Sub-optimal categories assigned to new GUCs
5) Docs suggested backend_flush_after is PGC_SIGHUP, but it's PGC_USERSET.
6) Spell out int as integer in the docs, as done elsewhere
Reported-By: Magnus Hagander, Fujii Masao
Discussion: CAHGQGwETyTG5VYQQ5C_srwxWX7RXvFcD3dKROhvAWWhoSBdmZw@mail.gmail.com
This includes the rest of the documentation that was not included
in 7117685. A larger restructure would still be wanted, but with
this commit the documentation of the new features is complete.