Commit 3c163a7fc's original choice to ignore all #define symbols whose
names begin with underscore turns out to be too simplistic. On Windows,
some Perl installations are built with -D_USE_32BIT_TIME_T, and we must
absorb that or we get the wrong result for sizeof(PerlInterpreter).
This effectively re-reverts commit ef58b87df, which injected that symbol
in a hacky way, making it apply to all of Postgres not just PL/Perl.
More significantly, it did so on *all* 32-bit Windows builds, even when
the Perl build to be used did not select this option; so that it fails
to work properly with some newer Perl builds.
By making this change, we would be introducing an ABI break in 32-bit
Windows builds; but fortunately we have not used type time_t in any
exported Postgres APIs in a long time. So it should be OK, both for
PL/Perl itself and for third-party extensions, if an extension library
is built with a different _USE_32BIT_TIME_T setting than the core code.
Patch by me, based on research by Ashutosh Sharma and Robert Haas.
Back-patch to all supported branches, as commit 3c163a7fc was.
Discussion: https://postgr.es/m/CANFyU97OVQ3+Mzfmt3MhuUm5NwPU=-FtbNH5Eb7nZL9ua8=rcA@mail.gmail.com
The sole useful effect of this function, to check that no catcache
entries have positive refcounts at transaction end, has really been
obsolete since we introduced ResourceOwners in PG 8.1. We reduced the
checks to assertions years ago, so that the function was a complete
no-op in production builds. There have been previous discussions about
removing it entirely, but consensus up to now was that it had some small
value as a cross-check for bugs in the ResourceOwner logic.
However, it now emerges that it's possible to trigger these assertions
if you hit an assert-enabled backend with SIGTERM during a call to
SearchCatCacheList, because that function temporarily increases the
refcounts of entries it's intending to add to a catcache list construct.
In a normal ERROR scenario, the extra refcounts are cleaned up by
SearchCatCacheList's PG_CATCH block; but in a FATAL exit we do a
transaction abort and exit without ever executing PG_CATCH handlers.
There's a case to be made that this is a generic hazard and we should
consider restructuring elog(FATAL) handling so that pending PG_CATCH
handlers do get run. That's pretty scary though: it could easily create
more problems than it solves. Preliminary stress testing by Andreas
Seltenreich suggests that there are not many live problems of this ilk,
so we rejected that idea.
There are more-localized ways to fix the problem; the most principled
one would be to use PG_ENSURE_ERROR_CLEANUP instead of plain PG_TRY.
But adding cycles to SearchCatCacheList isn't very appealing. We could
also weaken the assertions in AtEOXact_CatCache in some more or less
ad-hoc way, but that just makes its raison d'etre even less compelling.
In the end, the most reasonable solution seems to be to just remove
AtEOXact_CatCache altogether, on the grounds that it's not worth trying
to fix it. It hasn't found any bugs for us in many years.
Per report from Jeevan Chalke. Back-patch to all supported branches.
Discussion: https://postgr.es/m/CAM2+6=VEE30YtRQCZX7_sCFsEpoUkFBV1gZazL70fqLn8rcvBA@mail.gmail.com
At least on my machine, a run with code coverage enabled produces some
".gcov" files whose names begin with ".". "rm -f *.gcov" fails to match
those, so they don't get cleaned up by "make clean". Fix it.
Perusal of the code coverage report shows that the existing regression
test cases for LIMIT/OFFSET don't exercise the nodeLimit code paths
involving backwards scan, empty results, or null values of LIMIT/OFFSET.
Improve the coverage.
Perusal of the code coverage report shows that the existing regression
test cases for INTERSECT and EXCEPT seemingly all prefer the SETOP_HASHED
implementation. Add some test cases in which we force use of the
SETOP_SORTED mode.
Since PostgreSQL 9.6, rolreplication no longer determines whether a role
can run pg_start_backup() and pg_stop_backup(), so remove that.
Add that this attribute determines whether a role can create and drop
replication slots.
Reported-by: Fujii Masao <masao.fujii@gmail.com>
Previously the -M switch had to appear before any switch that directly
or indirectly specified a benchmarking script. This was both confusing
and inadequately documented, as per gripe from Tatsuo Ishii. We can
remove the restriction at the cost of making an extra pass over the
lists of SQL commands, which seems like a cheap price (the string scans
themselves likely cost much more). The change is just to not extract
parameters from the SQL commands until we have finished parsing the
switches and know the final value of -M.
Per discussion, we'll treat this as a low-grade bug fix and sneak it
into v10, rather than holding it for v11.
Tom Lane, reviewed by Tatsuo Ishii and Fabien Coelho
Discussion: https://postgr.es/m/20170802.110328.1963639094551443169.t-ishii@sraoss.co.jp
Discussion: https://postgr.es/m/10208.1502465077@sss.pgh.pa.us
Various bugs can cause crashes, so don't use that function before ICU
53. It will fall back to the code path used for other encodings.
Since we now tie the function availability to an ICU version, we don't
need the configure test anymore. That also resolves the issue that the
test result was previously hardcoded for Windows.
researched by Daniel Verite <daniel@manitou-mail.org>, Peter Geoghegan
<pg@bowt.ie>, Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/f1438ec6-22aa-4029-9a3b-26f79d330e72%40manitou-mail.org
The previous message didn't mention the name of the table or the
bounds. Put the table name in the primary error message and the
bounds in the detail message.
Amit Langote, changed slightly by me. Suggestions on the exac
phrasing from Tom Lane, David G. Johnston, and Dean Rasheed.
Discussion: http://postgr.es/m/CA+Tgmoae6bpwVa-1BMaVcwvCCeOoJ5B9Q9-RHWo-1gJxfPBZ5Q@mail.gmail.com
Many places that mentioned only Gather should also mention Gather
Merge, or should be phrased in a more neutral way. Be more clear
about the fact that max_parallel_workers_per_gather affects the number
of workers the planner may want to use. Fix a typo. Explain how
Gather Merge works. Adjust wording around parallel scans to be a bit
more clear. Adjust wording around parallel-restricted operations for
the fact that uncorrelated subplans are no longer restricted.
Patch by me, reviewed by Erik Rijkers
Discussion: http://postgr.es/m/CA+TgmoZsTjgVGn=ei5ht-1qGFKy_m1VgB3d8+Rg304hz91N5ww@mail.gmail.com
We must advance the oldest XID that can be safely looked up in clog
*before* truncating CLOG, and the oldest XID that can't be reused
*after* truncating CLOG. This assertion, and the accompanying
comment, are confused; remove them.
Reported by Neha Sharma.
Discussion: http://postgr.es/m/CANiYTQumC3T=UMBMd1Hor=5XWZYuCEQBioL3ug0YtNQCMMT5wQ@mail.gmail.com
find_composite_type_dependencies correctly found columns that are of
the specified type, and columns that are of arrays of that type, but
not columns that are domains or ranges over the given type, its array
type, etc. The most general way to handle this seems to be to assume
that any type that is directly dependent on the specified type can be
treated as a container type, and processed recursively (allowing us
to handle nested cases such as ranges over domains over arrays ...).
Since a type's array type already has such a dependency, we can drop
the existing special case for the array type.
The very similar logic in get_rels_with_domain was likewise a few
bricks shy of a load, as it supposed that a directly dependent type
could *only* be a sub-domain. This is already wrong for ranges over
domains, and it'll someday be wrong for arrays over domains.
Add test cases illustrating the problems, and back-patch to all
supported branches.
Discussion: https://postgr.es/m/15268.1502309024@sss.pgh.pa.us
FreeBSD's make, for one, sets the MAKELEVEL environment variable when
invoking commands. In the special Makefile we provide to hand off control
from a non-GNU make to GNU make, this causes GNU make to think it is a
child make invocation rather than top-level. That interferes with the hack
added in commit dcae5facc to cause the temp-install tree to be made only by
the top-level invocation of gmake. Unset the variable to prevent that.
Likewise unset MAKEFLAGS, which FreeBSD's make also sets, and which could
easily confuse gmake. There are no reports of actual trouble from that,
but it seems better to be proactive.
Back-patch to 9.5 where dcae5facc came in.
Thomas Munro, hacked a bit more by me
Discussion: https://postgr.es/m/CAEepm=1ueww35AXTkt1A3gyzZUqv5XCzh8RUNvJZAQAW=eOhVw@mail.gmail.com
Commit 1efc7e538 did a poor job of emulating existing logic for touching
Datums that might be expanded-object pointers. It didn't check for typlen
being -1 first, which meant it could crash on fixed-length pass-by-ref
values, and probably on cstring values as well. It also didn't use
DatumGetPointer before VARATT_IS_EXTERNAL_EXPANDED, which while currently
harmless is not according to documentation nor prevailing style.
I also think the lack of any explanation as to why datumSerialize makes
these particular nonobvious choices is pretty awful, so fix that.
Per report from Jarred Ward. Back-patch to 9.6 where this code came in.
Discussion: https://postgr.es/m/6F61E6D2-2F5E-4794-9479-A429BE1CEA4B@simple.com
In commit 5c77690f6, we added polling in front of most of the
get_slot_xmins calls in 001_stream_rep.pl, but today's results from
buildfarm member nightjar show that at least one more poll loop
is needed.
Proactively add a poll loop before the next-to-last get_slot_xmins call
as well. It may be that there is no race condition there because the
standby_2 server is shut down at that point, but I'm quite tired of
fighting with this test script. The empirical evidence that it's safe,
from the buildfarm, is no stronger than the evidence for the other
call that nightjar just proved unsafe.
The only remaining get_slot_xmins calls without wait_slot_xmins
protection are the first two, which should be OK since nothing has
happened at that point. It's tempting to ignore that special case
and merge get_slot_xmins and wait_slot_xmins into a single function.
I didn't go that far though.
Discussion: https://postgr.es/m/18436.1502228036@sss.pgh.pa.us
Similar to what was fixed in commit 9915de6c1c for replication slots,
but this time it's related to replication origins: DROP SUBSCRIPTION
attempts to drop the replication origin, but that fails if the
replication worker process hasn't yet marked it unused. This causes
failures in the buildfarm:
ERROR: could not drop replication origin with OID 1, in use by PID 34069
Like the aforementioned commit, fix by having the process running DROP
SUBSCRIPTION sleep until the worker marks the the replication origin
struct as free. This uses a condition variable on each replication
origin shmem state struct, so that the session trying to drop can sleep
and expect to be awakened by the process keeping the origin open.
Also fix a SGML markup in the previous commit.
Discussion: https://postgr.es/m/20170808001433.rozlseaf4m2wkw3n@alvherre.pgsql
In commit 9915de6c1c, we introduced a new wait point for replication
slots and incorrectly labelled it as wait event PG_WAIT_LOCK. That's
wrong, so invent an appropriate new wait event instead, and document it
properly.
While at it, fix numerous other problems in the vicinity:
- two different walreceiver wait events were being mixed up in a single
wait event (which wasn't documented either); split it out so that they
can be distinguished, and document the new events properly.
- ParallelBitmapPopulate was documented but didn't exist.
- ParallelBitmapScan was not documented (I think this should be called
"ParallelBitmapScanInit" instead.)
- Logical replication wait events weren't documented
- various symbols had been added in dartboard order in various places.
Put them in alphabetical order instead, as was originally intended.
Discussion: https://postgr.es/m/20170808181131.mu4fjepuh5m75cyq@alvherre.pgsql
The check for IPC::Run we added in commit c254970ad is useful in simple
cases, but there are real use-cases where "prove" is coming from a
different Perl installation than the "perl" we want to use to build.
In such cases asking whether "perl" knows about IPC::Run is irrelevant
and can cause an unnecessary configure failure. Hence, if user has
specified a value for PROVE, skip the IPC::Run check. Per discussion
with Andrew Dunstan.
Discussion: https://postgr.es/m/E1dcE5n-0005Sk-UE@gemulon.postgresql.org
This would lead to failures if local and remote tables have a different
column order. The tests previously didn't catch that because they only
tested the initial data copy. So add another test that exercises the
apply worker.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
The relation attribute map was not initialized for dropped columns,
leading to errors later on.
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Reported-by: Scott Milliken <scott@deltaex.com>
Bug: #14769
lo_put() surely should require UPDATE permission, the same as lowrite(),
but it failed to check for that, as reported by Chapman Flack. Oversight
in commit c50b7c09d; backpatch to 9.4 where that was introduced.
Tom Lane and Michael Paquier
Security: CVE-2017-7548
Commit 3eefc51053 claimed to make
pg_user_mappings enforce the qualifications user_mapping_options had
been enforcing, but its removal of a longstanding restriction left them
distinct when the current user is the subject of a mapping yet has no
server privileges. user_mapping_options emits no rows for such a
mapping, but pg_user_mappings includes full umoptions. Change
pg_user_mappings to show null for umoptions. Back-patch to 9.2, like
the above commit.
Reviewed by Tom Lane. Reported by Jeff Janes.
Security: CVE-2017-7547
Some authentication methods allowed it, others did not. In the client-side,
libpq does not even try to authenticate with an empty password, which makes
using empty passwords hazardous: an administrator might think that an
account with an empty password cannot be used to log in, because psql
doesn't allow it, and not realize that a different client would in fact
allow it. To clear that confusion and to be be consistent, disallow empty
passwords in all authentication methods.
All the authentication methods that used plaintext authentication over the
wire, except for BSD authentication, already checked that the password
received from the user was not empty. To avoid forgetting it in the future
again, move the check to the recv_password_packet function. That only
forbids using an empty password with plaintext authentication, however.
MD5 and SCRAM need a different fix:
* In stable branches, check that the MD5 hash stored for the user does not
not correspond to an empty string. This adds some overhead to MD5
authentication, because the server needs to compute an extra MD5 hash, but
it is not noticeable in practice.
* In HEAD, modify CREATE and ALTER ROLE to clear the password if an empty
string, or a password hash that corresponds to an empty string, is
specified. The user-visible behavior is the same as in the stable branches,
the user cannot log in, but it seems better to stop the empty password from
entering the system in the first place. Secondly, it is fairly expensive to
check that a SCRAM hash doesn't correspond to an empty string, because
computing a SCRAM hash is much more expensive than an MD5 hash by design,
so better avoid doing that on every authentication.
We could clear the password on CREATE/ALTER ROLE also in stable branches,
but we would still need to check at authentication time, because even if we
prevent empty passwords from being stored in pg_authid, there might be
existing ones there already.
Reported by Jeroen van der Ham, Ben de Graaff and Jelte Fennema.
Security: CVE-2017-7546
We're planning to put an underscore before the major version number in
branch names for v10 and later. Make sure the recipe in RELEASE_CHANGES
reflects that.
In passing, add a reminder to consider doing pgindent right before
the branch.
Discussion: https://postgr.es/m/E1dAkjZ-0003MG-0U@gemulon.postgresql.org
The callers for GetOldestSafeDecodingTransactionId() all inverted the
argument for the argument introduced in 2bef06d516. Luckily this
appears to be inconsequential for the moment, as we wait for
concurrent in-progress transaction when assembling a
snapshot. Additionally this could only make a difference when adding a
second logical slot, because only a pre-existing slot could cause an
issue by lowering the returned xid dangerously much.
Reported-By: Antonin Houska
Discussion: https://postgr.es/m/32704.1496993134@localhost
Backport: 9.4-, where 2bef06d516 was backpatched to.