Turns out the only reason initdb -X worked is that pg_mkdir_p won't
whine if you point it at something that's a symlink to a directory.
Otherwise, the attempt to create pg_xlog/, just like all the other
subdirectories, would have failed. Let's be a little more explicit
about what's happening. Oversight in my patch for bug #13853
(mea culpa for not testing -X ...)
When we're creating subdirectories of PGDATA during initdb, we know darn
well that the parent directory exists (or should exist) and that the new
subdirectory doesn't (or shouldn't). There is therefore no need to use
anything more complicated than mkdir(). Using pg_mkdir_p() just opens us
up to unexpected failure modes, such as the one exhibited in bug #13853
from Nuri Boardman. It's not very clear why pg_mkdir_p() went wrong there,
but it is clear that we didn't need to be trying to create parent
directories in the first place. We're not even saving any code, as proven
by the fact that this patch nets out at minus five lines.
Since this is a response to a field bug report, back-patch to all branches.
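
As a minimal standalone illustration of the distinction (not the actual
initdb code; "pgdata/pg_xlog" is just a placeholder path): plain mkdir()
reports EEXIST when the target already exists, even as a symlink to a
directory, whereas a mkdir-p-style helper that treats any existing
directory as success silently accepts that case.

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <sys/types.h>

    /*
     * Create one subdirectory whose parent is known to exist, failing
     * hard on any error -- the behavior initdb is switched to here.
     */
    static int
    make_subdir(const char *path)
    {
        if (mkdir(path, S_IRWXU) < 0)
        {
            fprintf(stderr, "could not create directory \"%s\": %s\n",
                    path, strerror(errno));
            return -1;
        }
        return 0;
    }

    int
    main(void)
    {
        return make_subdir("pgdata/pg_xlog") == 0 ? 0 : 1;
    }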
This new view provides insight into the state of a running WAL receiver
on a hot standby node.
The information returned includes the PID of the WAL receiver process,
its status (stopped, starting, streaming, etc.), start LSN and TLI, last
received LSN and TLI, timestamps of last message send and receipt, latest
end-of-WAL LSN and time, and the name of the slot (if any).
Access to the detailed data is only granted to superusers; others only
get the PID.
Author: Michael Paquier
Reviewer: Haribabu Kommi
Commit e710b65c inserted code in md5_crypt_verify to disable and later
re-enable interrupts, with a CHECK_FOR_INTERRUPTS call as part of the
second step, to process any interrupts that had been held off. Commit
6647248e removed the interrupt disable/re-enable code, but left behind
the CHECK_FOR_INTERRUPTS, even though this is now an entirely random,
pointless place for one. md5_crypt_verify doesn't run long enough to
need such a check, and if it did, this would still be the wrong place
to put one.
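
For context, the idiom the removed code once followed looks roughly like
this (a sketch of the general backend pattern, compilable only within
the backend; not the md5_crypt_verify code itself):

    #include "postgres.h"
    #include "miscadmin.h"

    /*
     * Hold off cancel/die interrupts around a critical section, then
     * service anything that arrived while they were held.  Once the
     * HOLD/RESUME pair is gone, the trailing CHECK_FOR_INTERRUPTS has
     * no held-off interrupts to service and is just noise.
     */
    static void
    do_uninterruptible_work(void)
    {
        HOLD_INTERRUPTS();

        /* ... work that must not be interrupted ... */

        RESUME_INTERRUPTS();
        CHECK_FOR_INTERRUPTS();    /* process anything held off above */
    }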
We tell people to examine the postmaster log if they're unsure why they are
getting auth failures, but actually only a few relatively uncommon failure
cases were given their own log detail messages in commit 64e43c59b8.
Expand on that so that every failure case detected within md5_crypt_verify
gets a specific log detail message. This should cover pretty much every
ordinary password auth failure cause.
So far I've not noticed user demand for a similar level of auth detail
for the other auth methods, but sooner or later somebody might want to
work on them. This is not that patch, though.
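
For reference, the general shape of the added messages is sketched below
(backend idiom; the function name and the errdetail_log wording here are
invented for the example, not the exact strings added by this commit):

    #include "postgres.h"

    /*
     * Report a deliberately generic failure to the client while
     * recording the specific cause in the server log only.
     */
    static void
    report_auth_failure(const char *role_name)
    {
        ereport(FATAL,
                (errcode(ERRCODE_INVALID_PASSWORD),
                 errmsg("password authentication failed for user \"%s\"",
                        role_name),
                 errdetail_log("User \"%s\" has an expired password.",
                               role_name)));
    }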
pg_ctl is using isatty() to verify whether the process is running in a
terminal, and if not, it sends its output to Windows' Event Log ... which
does the wrong thing when the output has been redirected to a pipe, as
reported in bug #13592.
To fix, make pg_ctl use the code we already have to detect service-ness:
in the master branch, move src/backend/port/win32/security.c to src/port
(with suitable tweaks so that it runs properly in backend and frontend
environments); pg_ctl already has access to pgport so it Just Works. In
older branches, that's likely to cause trouble, so instead duplicate the
required code in pg_ctl.c.
Author: Michael Paquier
Bug report and diagnosis: Egon Kocjan
Backpatch: all supported branches
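
In rough outline, the decision pg_ctl ends up making looks like this
(heavily simplified; write_eventlog is elided and the surrounding logic
is only a sketch):

    #include <stdio.h>

    #ifdef WIN32
    extern int  pgwin32_is_service(void);   /* from src/port after the move */
    #endif

    /*
     * Route output to the event log only when actually running as a
     * Windows service; output redirected to a pipe now goes to stderr
     * like anything else, instead of being misdirected by isatty().
     */
    static void
    write_message(const char *msg)
    {
    #ifdef WIN32
        if (pgwin32_is_service())
        {
            /* write_eventlog(EVENTLOG_ERROR_TYPE, msg) in the real code */
            return;
        }
    #endif
        fprintf(stderr, "%s", msg);
    }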
Although these temp tables will get removed from template1 at the end of
the standalone-backend run, that's too late to keep them from getting
copied into the template0 and postgres databases, now that we use only a
single backend run for the whole sequence. While no real harm is done
by the extra copies (since they'd be deleted on first use of the temp
schema), it's still unsightly, and it would mean some wasted cycles for
every database creation for the life of the installation.
Oversight in commit c4a8812cf6. Noticed by Amit Langote.
In commit 921191912c I claimed that we weren't testing encoding
conversion functions, but further poking around reveals that we did
have an equivalent though hard-wired set of tests in conversion.sql.
AFAICS there is no advantage to doing it like that as compared to letting
the catalog contents drive the test, so let the opr_sanity addition stand
and remove the now-redundant tests in conversion.sql.
Also, remove some infrastructure in src/backend/utils/mb/conversion_procs
for building conversion.sql's list of tests. That was unmaintained, and
had not corresponded to the actual contents of conversion.sql since 2007
or perhaps even further back.
The order of inclusion of .o files makes a difference in linker output;
not a functional difference, but still a bitwise difference, which annoys
some packagers who would like reproducible builds.
Report and patch by Christoph Berg
A pointless and confusing error message is shown to the user when
attempting to identify a 9.3 or older remote server with a 9.5/9.6
pg_receivexlog, because the return signature of IDENTIFY_SYSTEM was
changed in 9.4. There's no good reason for the warning message, so
shuffle code around to keep it quiet.
(pg_recvlogical is also affected by this commit, but since it obviously
cannot work with 9.3 that doesn't actually matter much.)
Backpatch to 9.5.
Reported by Marco Nenciarini, who also wrote the initial patch. Further
tweaked by Robert Haas and Fujii Masao; reviewed by Michael Paquier and
Craig Ringer.
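
The compatibility hazard, in a hedged libpq-level sketch (assumes a
replication connection; error handling trimmed): IDENTIFY_SYSTEM grew a
fourth column, the database name, in 9.4, so the field count should be
checked before reaching for it rather than warned about.

    #include <stdio.h>
    #include "libpq-fe.h"

    static void
    identify_system(PGconn *conn)
    {
        PGresult   *res = PQexec(conn, "IDENTIFY_SYSTEM");

        if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) == 1)
        {
            const char *sysid = PQgetvalue(res, 0, 0);
            const char *tli = PQgetvalue(res, 0, 1);
            const char *xlogpos = PQgetvalue(res, 0, 2);
            /* pre-9.4 servers return only three columns */
            const char *dbname = (PQnfields(res) >= 4) ?
                PQgetvalue(res, 0, 3) : NULL;

            printf("sysid=%s tli=%s pos=%s db=%s\n",
                   sysid, tli, xlogpos, dbname ? dbname : "(none)");
        }
        PQclear(res);
    }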
In light of commit ea0d494dae, it seems like a good idea to add
a regression test that will complain about random functions taking or
returning cstring. Only I/O support functions and encoding conversion
functions should be declared that way.
While at it, add some checks that encoding conversion functions are
declared properly. Since pg_conversion isn't populated manually,
it's not quite as necessary to check its contents as it is for catalogs
like pg_proc; but one thing we definitely have not tested in the past
is whether the identified conproc for a conversion actually does that
conversion vs. some other one.
Using cstring as the input type was a poor decision, because that's not
really a full-fledged type. In particular, it lacks implicit coercions
from text or varchar, meaning that usages like to_regproc('foo'||'bar')
wouldn't work; basically the only case that did work without explicit
casting was a simple literal constant argument.
The lack of field complaints about this suggests that hardly anyone
is using these functions, so hopefully fixing it won't cause much of
a compatibility problem. They've only been there since 9.4, anyway.
Petr Korobeinikov
While the in-core authentication mechanism doesn't need to access
pg_shseclabel at all, it's reasonable to think that an authentication
hook will want to look at the label for the role logging in, or for rows
in other catalogs used during the authentication phase of startup.
Catalog version bumped, because this changes the "is nailed" status for
pg_shseclabel.
Author: Adam Brightwell
This requires adding some more infrastructure to handle both case-sensitive
and case-insensitive matching, as well as the ability to match a prefix of
a previous word. So it ends up being about a wash line-count-wise, but
it's just as big a readability win here as in the SQL tab completion rules.
Michael Paquier, some adjustments by me
In the refactoring in commit d37b816dc9,
we mostly kept to the original design whereby only the last few words
on the line were matched to identify a completable pattern. However,
after commit d854118c8d, there's really
no reason to do it like that: where it's sensible, we can use patterns
that expect to match the entire input line. And mostly, it's sensible.
Matching the entire line greatly reduces the odds of a false match that
leads to offering irrelevant completions. Moreover (though I've not
tried to measure this), it should make tab completion faster since
many of the patterns will be discarded after a single integer comparison
that finds that the wrong number of words appear on the line.
There are certain identifiable places where we still need to use
TailMatches because the statement in question is allowed to appear
embedded in a larger statement. These are just a small minority of
the existing patterns, though, so the benefit of switching where
possible is large.
It's possible that this patch has removed some within-line matching
behaviors that are in fact desirable, but we can put those back when
we get complaints. Most of the removed behaviors are certainly silly.
Michael Paquier, with some further adjustments by me
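
To see why whole-line matching can be rejected so cheaply, here is a toy
standalone model of the two styles (not psql's actual macros, which work
on its tokenized input):

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    static bool
    words_equal(const char *const *words, const char *const *pat,
                int npat, int offset)
    {
        for (int i = 0; i < npat; i++)
            if (strcmp(words[offset + i], pat[i]) != 0)
                return false;
        return true;
    }

    /* Must account for every word: one count comparison rejects it. */
    static bool
    match_whole(const char *const *words, int nwords,
                const char *const *pat, int npat)
    {
        return nwords == npat && words_equal(words, pat, npat, 0);
    }

    /* Anchors only at the end, for statements embeddable in others. */
    static bool
    match_tail(const char *const *words, int nwords,
               const char *const *pat, int npat)
    {
        return nwords >= npat &&
            words_equal(words, pat, npat, nwords - npat);
    }

    int
    main(void)
    {
        const char *line[] = {"EXPLAIN", "ALTER", "SYSTEM"};
        const char *pat[] = {"ALTER", "SYSTEM"};

        printf("whole: %d, tail: %d\n",
               match_whole(line, 3, pat, 2),    /* 0: word counts differ */
               match_tail(line, 3, pat, 2));    /* 1: line ends with it */
        return 0;
    }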
Commit 43cd468cf0 added some wording to create_policy.sgml purporting
to warn users against a race condition of the sort that had been noted some
time ago by Peter Geoghegan. However, that warning was far too vague to be
useful (or at least, I completely failed to grasp what it was on about).
Since the problem case occurs with a security design pattern that lots of
people are likely to try to use, we need to be as clear as possible about
it. Provide a concrete example in the main-line docs in place of the
original warning.
Some time back we agreed that row_security=off should not be a way to
bypass RLS entirely, but only a way to get an error if it was being
applied. However, the code failed to act that way for table owners.
Per discussion, this is a must-fix bug for 9.5.0.
Adjust the logic in rls.c to behave as expected; also, modify the
error message to be more consistent with the new interpretation.
The regression tests need minor corrections as well. Also update
the comments about row_security in ddl.sgml to be correct. (The
official description of the GUC in config.sgml is already correct.)
I failed to resist the temptation to do some other very minor
cleanup as well, such as getting rid of a duplicate extern declaration.
Aside from any consistency arguments, this is logically necessary because
the I/O functions for these types also handle numeric OID values. Without
a quoting rule it is impossible to distinguish numeric OIDs from role or
namespace names that happen to contain only digits.
Also change the to_regrole and to_regnamespace functions to dequote their
arguments. While not logically essential, this seems like a good idea
since the other to_reg* functions do it. Anyone who really wants raw
lookup of an uninterpreted name can fall back on the time-honored solution
of (SELECT oid FROM pg_namespace WHERE nspname = whatever).
Report and patch by Jim Nasby, reviewed by Michael Paquier
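
A standalone sketch of the ambiguity being resolved (an illustrative
helper, not the actual reg* input code): without a quoting rule, a role
actually named 123 is indistinguishable from OID 123.

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    /*
     * If an unquoted reg* input is all digits, the I/O functions take
     * it as a numeric OID; the only way to name a role called "123"
     * is to quote it, so to_regrole and friends must dequote too.
     */
    static bool
    looks_like_oid(const char *s)
    {
        size_t      n = strlen(s);

        return n > 0 && strspn(s, "0123456789") == n;
    }

    int
    main(void)
    {
        printf("%d %d\n",
               looks_like_oid("123"),        /* 1: parsed as OID 123 */
               looks_like_oid("\"123\""));   /* 0: a name, once dequoted */
        return 0;
    }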
Can't release the AccessExclusiveLock on the target table until commit.
Otherwise there is a race condition whereby other backends might service
our cache invalidation signals before they can actually see the updated
catalog rows.
Just to add insult to injury, RemovePolicyById was closing the rel (with
incorrect lock drop) and then passing the now-dangling rel pointer to
CacheInvalidateRelcache. Probably the reason this doesn't fall over on
CLOBBER_CACHE buildfarm members is that some outer level of the DROP logic
is still holding the rel open ... but it'd have bit us on the arse
eventually, no doubt.
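
The corrected shape of such code, as a simplified backend-idiom sketch
(drop_something_on is a made-up name; the real logic lives in
RemovePolicyById):

    #include "postgres.h"

    #include "access/heapam.h"
    #include "utils/inval.h"

    /*
     * Send the invalidation while the rel is still open, and keep the
     * AccessExclusiveLock until commit by closing with NoLock, so no
     * other backend can rebuild its relcache entry from
     * not-yet-committed catalog state.
     */
    static void
    drop_something_on(Oid relid)
    {
        Relation    rel = heap_open(relid, AccessExclusiveLock);

        /* ... delete the catalog rows ... */

        CacheInvalidateRelcache(rel);

        heap_close(rel, NoLock);    /* lock released at transaction end */
    }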
The CHECK_IS_BINARY_UPGRADE macro is not sufficient security protection
if we're going to dereference pass-by-reference arguments before it runs.
But in any case we really need to explicitly check PG_ARGISNULL for all
the arguments of a non-strict function, not only the ones we expect null
values for.
Oversight in commits 30982be4e5 and
f92fc4c95d. Found by Andreas Seltenreich.
(The other usages in pg_upgrade_support.c seem safe.)
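
The rule being enforced, as an fmgr-level sketch (the function and its
error wording are invented for illustration): a function not marked
STRICT can be called with any argument NULL, so each argument must pass
PG_ARGISNULL() before anything dereferences it.

    #include "postgres.h"
    #include "fmgr.h"

    PG_FUNCTION_INFO_V1(binary_upgrade_example);

    Datum
    binary_upgrade_example(PG_FUNCTION_ARGS)
    {
        Oid         objid;

        /* test for NULL before any argument access or other checks */
        if (PG_ARGISNULL(0))
            elog(ERROR, "argument must not be null");
        objid = PG_GETARG_OID(0);

        /* ... CHECK_IS_BINARY_UPGRADE-style guard, then real work ... */

        (void) objid;
        PG_RETURN_VOID();
    }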
pgwin32_recv() has treated a non-error return of zero bytes from WSARecv()
as being a reason to block ever since the current implementation was
introduced in commit a4c40f140d. However, so far as one can tell
from Microsoft's documentation, that is just wrong: what it means is
graceful connection closure (in stream protocols) or receipt of a
zero-length message (in message protocols), and neither case should result
in blocking here. The only reason the code worked at all was that control
then fell into the retry loop, which did *not* treat zero bytes specially,
so we'd get out after only wasting some cycles. But as of 9.5 we do not
normally reach the retry loop and so the bug is exposed, as reported by
Shay Rojansky and diagnosed by Andres Freund.
Remove the unnecessary test on the byte count, and rearrange the code
in the retry loop so that it looks identical to the initial sequence.
Back-patch to 9.5. The code is wrong all the way back, AFAICS, but
since it's relatively harmless in earlier branches we'll leave it alone.
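
The semantics at issue, in a portable sketch using plain recv() rather
than the WSARecv-based pgwin32_recv itself (wait_for_socket_readable is
a placeholder):

    #include <errno.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    /*
     * A return of 0 is graceful connection closure and must be passed
     * up immediately; only a would-block condition justifies waiting
     * for the socket to become readable and retrying.
     */
    static ssize_t
    recv_with_wait(int sock, char *buf, size_t len)
    {
        for (;;)
        {
            ssize_t     n = recv(sock, buf, len, 0);

            if (n >= 0)
                return n;       /* includes n == 0: EOF, don't block */
            if (errno != EWOULDBLOCK && errno != EAGAIN)
                return -1;      /* real error */
            /* wait_for_socket_readable(sock); then retry */
        }
    }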
Commit c7e27becd2 fixed this on the backend side, but we neglected
the fact that several code paths in pg_dump were printing reloptions
values that had not gotten massaged by ruleutils. Apply essentially the
same quoting logic in those places, too.
spg_text_inner_consistent is capable of reconstructing an empty string
to pass down to the next index level; this happens if we have an empty
string coming in, no prefix, and a dummy node label. (In practice, what
is needed to trigger that is insertion of a whole bunch of empty-string
values.) Then, we will arrive at the next level with in->level == 0
and a non-NULL (but zero-length) in->reconstructedValue, which is valid
but which the Assert tests weren't expecting.
Per report from Andreas Seltenreich. This has no impact in non-Assert
builds, so should not be a problem in production, but back-patch to
all affected branches anyway.
In passing, remove a couple of useless variable initializations and
shorten the code by not duplicating DatumGetPointer() calls.
As pointed out by Michael Paquier, recovery_min_apply_delay didn't exist
in 9.0-9.3, making the release note text not very useful. Instead make it
talk about recovery_target_xid, which did exist then.
9.0 is already out of support, but we can fix the text in the newer
branches' copies of its release notes.
The need for this is shown by the files it missed in Bruce's recent run.
I fixed it so that it will actually adjust the case when needed.
In passing, also make it skip .po files, since those will just get
overwritten anyway from the translation repository.
On closer inspection, the reason copyright.pl was missing files is
that it was looking for 'Copyright (c)' while the files had 'Copyright (C)'.
Fix that, and update a couple more that grepping for that revealed.
flatten_reloptions() supposed that it didn't really need to do anything
beyond inserting commas between reloption array elements. However, in
principle the value of a reloption could be nearly anything, since the
grammar allows a quoted string there. Any restrictions on it would come
from validity checking appropriate to the particular option, if any.
A reloption value that isn't a simple identifier or number could thus lead
to dump/reload failures due to syntax errors in CREATE statements issued
by pg_dump. We've gotten away with not worrying about this so far with
the core-supported reloptions, but extensions might allow reloption values
that cause trouble, as in bug #13840 from Kouhei Sutou.
To fix, split the reloption array elements explicitly, and then convert
any value that doesn't look like a safe identifier to a string literal.
(The details of the quoting rule could be debated, but this way is safe
and requires little code.) While we're at it, also quote reloption names
if they're not safe identifiers; that may not be a likely problem in the
field, but we might as well try to be bulletproof here.
It's been like this for a long time, so back-patch to all supported
branches.
Kouhei Sutou, adjusted some by me
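
The quoting rule, as a standalone simplification (the real code is in
flatten_reloptions() in ruleutils.c and uses the backend's quoting
helpers): anything that isn't a plain identifier or number becomes a
single-quoted literal with embedded quotes doubled.

    #include <ctype.h>
    #include <stdbool.h>
    #include <stdio.h>

    static bool
    is_safe_value(const char *s)
    {
        if (*s == '\0')
            return false;
        for (const char *p = s; *p; p++)
            if (!islower((unsigned char) *p) &&
                !isdigit((unsigned char) *p) &&
                *p != '_' && *p != '.')
                return false;
        return true;
    }

    static void
    emit_reloption_value(const char *val)
    {
        if (is_safe_value(val))
            printf("%s", val);
        else
        {
            putchar('\'');
            for (const char *p = val; *p; p++)
            {
                if (*p == '\'')
                    putchar('\'');  /* double embedded quotes */
                putchar(*p);
            }
            putchar('\'');
        }
    }

    int
    main(void)
    {
        emit_reloption_value("on");             /* on */
        putchar('\n');
        emit_reloption_value("hello world");    /* 'hello world' */
        putchar('\n');
        return 0;
    }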
A report from Andy Colson showed that gincostestimate() was not being
nearly paranoid enough about whether to believe the statistics it finds in
the index metapage. The problem is that the metapage stats (other than the
pending-pages count) are only updated by VACUUM, and in the worst case
could still reflect the index's original empty state even when it has grown
to many entries. We attempted to deal with that by scaling up the stats to
match the current index size, but if nEntries is zero then scaling it up
still gives zero. Moreover, the proportion of pages that are entry pages
vs. data pages vs. pending pages is unlikely to be estimated very well by
scaling if the index is now orders of magnitude larger than before.
We can improve matters by expanding the use of the rule-of-thumb estimates
I introduced in commit 7fb008c5ee: if the index has grown by more
than a cutoff amount (here set at 4X growth) since VACUUM, then use the
rule-of-thumb numbers instead of scaling. This might not be exactly right
but it seems much less likely to produce insane estimates.
I also improved both the scaling estimate and the rule-of-thumb estimate
to account for numPendingPages, since it's reasonable to expect that count
to be accurate in any case, and certainly pages that are in the pending list
are neither entry nor data pages.
As a somewhat separate issue, adjust the estimation equations that are
concerned with extra fetches for partial-match searches. These equations
suppose that a fraction partialEntries / numEntries of the entry and data
pages will be visited as a consequence of a partial-match search. Now,
it's physically impossible for that fraction to exceed one, but our
estimate of partialEntries is mostly bunk, and our estimate of numEntries
isn't exactly gospel either, so we could arrive at a silly value. In the
example presented by Andy we were coming out with a value of 100, leading
to insane cost estimates. Clamp the fraction to one to avoid that.
Like the previous patch, back-patch to all supported branches; this
problem can be demonstrated in one form or another in all of them.
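
Both adjustments fit in a few lines of estimation pseudologic (a
simplification; the variable names only approximate gincostestimate's):

    #include <stdio.h>

    #define Min(a,b) ((a) < (b) ? (a) : (b))

    /*
     * Never let the partial-match fraction exceed 1.0, no matter how
     * bogus partialEntries and numEntries turn out to be.
     */
    static double
    partial_match_fraction(double partialEntries, double numEntries)
    {
        if (numEntries <= 0)
            return 1.0;
        return Min(partialEntries / numEntries, 1.0);
    }

    int
    main(void)
    {
        double      pagesAtLastVacuum = 10.0;
        double      pagesNow = 50000.0;

        /* grown past 4X since VACUUM: distrust the metapage stats */
        int         use_rule_of_thumb = (pagesNow > pagesAtLastVacuum * 4);

        printf("rule-of-thumb: %d, fraction: %g\n",
               use_rule_of_thumb,
               partial_match_fraction(5000.0, 50.0));   /* clamped to 1 */
        return 0;
    }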
Recovery does not achieve its goal of zeroing all pg_multixact entries
whose accompanying WAL records never reached disk. Remove that claim
and justify its expendability. Detail the need for TrimMultiXact(),
which has little in common with the TrimCLOG() rationale. Merge two
tightly-related comments. Stop presenting pg_multixact as specific to
heap_lock_tuple(); PostgreSQL 9.3 extended its use to heap_update().
Noticed while investigating a report from Andres Freund.
Empty ulink elements default to displaying the URL, so there is no need
to specify the URL again. This was already done for most occurrences,
but some cases didn't follow this convention.
postgresImportForeignSchema pays attention to IMPORT's EXCEPT and LIMIT TO
options, but only as an efficiency hack, not for correctness' sake. The
FDW documentation does explain that, but someone using postgres_fdw.c
as a coding guide might not remember it, so let's add a comment here.
Per question from Regina Obe.
Fix an oversight in commit 321eed5f0f: replacing an operator's
selectivity functions needs to result in a corresponding update in
pg_depend. We have a function that can handle that, but it was not
called by AlterOperator().
To fix this without enlarging pg_operator.h's #include list beyond
what clients can safely include, split off the function definitions
into a new file pg_operator_fn.h, similarly to what we've done for
some other catalog header files. It's not entirely clear whether
any client-side code needs to include pg_operator.h, but it seems
prudent to assume that there is some such code somewhere.
In the extreme edge case where contended pages are the only ones that
escape being scanned, the previous commit would have allowed us to think
that relfrozenxid could be advanced, which is exactly wrong.
VACUUM can skip heap pages altogether when there's a run of consecutive
pages that are all-visible according to the visibility map. This causes it
to not update its nonempty_pages count, just as if those pages were empty,
which means that at the end we will think they are candidates for deletion.
Thus, we may take the table's AccessExclusive lock only to find that no
pages are really truncatable. This usually causes no real problems on a
master server, thanks to the lock being acquired only conditionally; but on
hot-standby servers, the same lock must be acquired unconditionally which
can result in unnecessary query cancellations.
To improve matters, force examination of the table's last page whenever
we reach there with a nonempty_pages count that would allow a truncation
attempt. If it's not empty, we'll advance nonempty_pages and thereby
prevent the truncation attempt.
If we are unable to acquire cleanup lock on that page, there's no need to
force it, unless we're doing an anti-wraparound vacuum. We can just check
for tuples with a shared buffer lock and then give up. (When we are doing
an anti-wraparound vacuum, and decide it's okay to skip the page because it
contains no freezable tuples, this patch still improves matters because
nonempty_pages is properly updated, which it was not before.)
Since only the last page is special-cased in this way, we might attempt a
truncation that will release many fewer pages than the normal heuristic
would suggest; at worst, only one page would be truncated. But that seems
all right, because the situation won't repeat during the next vacuum.
The real problem with the old logic is that the useless truncation attempt
happens every time we vacuum, so long as the state of the last few dozen
pages doesn't change.
This is a longstanding deficiency, but since the consequences aren't very
severe in most scenarios, I'm not going to risk a back-patch.
Jeff Janes and Tom Lane
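
The added control flow amounts to something like this (a rough
standalone sketch; the real test in lazy_scan_heap also involves the
truncation-threshold heuristics):

    #include <stdbool.h>
    #include <stdio.h>

    typedef unsigned int BlockNumber;

    /*
     * If we are about to skip the table's last page but nonempty_pages
     * would otherwise permit a truncation attempt, scan that page
     * anyway: a nonempty last page then bumps nonempty_pages and
     * suppresses the useless attempt.
     */
    static bool
    should_scan_last_page(BlockNumber blkno, BlockNumber nblocks,
                          BlockNumber nonempty_pages)
    {
        bool        is_last_page = (blkno == nblocks - 1);
        bool        might_truncate = (nonempty_pages < nblocks);  /* rough */

        return is_last_page && might_truncate;
    }

    int
    main(void)
    {
        printf("%d\n", should_scan_last_page(99, 100, 60));   /* 1 */
        return 0;
    }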
Improve markup, particularly of the table of functions; add or improve
examples for some of the functions; wordsmith some of the function
descriptions.
The rationale for the way targetlist processing is done wasn't clearly
stated anywhere, and I for one had forgotten some of the details. Having
just painfully re-learned them, add some breadcrumbs for the next person.