While this isn't a lot of code, it's been essentially untestable for
a very long time, because libpq doesn't support anything older than
protocol 2.0, and has not since release 6.3. There's no reason to
believe any other client-side code still uses that protocol, either.
Discussion: <2661.1475849167@sss.pgh.pa.us>
SCO OpenServer and SCO UnixWare are more or less dead platforms.
We have never had a buildfarm member testing the "sco" port, and
the last "unixware" member was last heard from in 2012, so it's
fair to doubt that the code even compiles anymore on either one.
Remove both ports. We can always undo this if someone shows up
with an interest in maintaining and testing these platforms.
Discussion: <17177.1476136994@sss.pgh.pa.us>
It was perhaps not entirely clear that internal self-references shouldn't
be schema-qualified even if the view name is written with a schema.
Spell it out.
Discussion: <871sznz69m.fsf@metapensiero.it>
Since commit ecb0d20a9 hasn't crashed and burned, here's the promised
docs update for it.
In addition to explaining that Linux and FreeBSD ports now use POSIX
semaphores, I did some wordsmithing on pre-existing wording; in
particular trying to clarify which SysV parameters need to be set with
an eye to total usage across all applications.
Upcoming changes to the hash table code used, among others, for grouping
and set operations will change the output order for a few queries. To
make it less likely that actual bugs are hidden between regression test
ordering changes, and to make the tests robust against platform
dependant ordering, add ORDER BYs guaranteeing the output order.
As it's possible that some of the changes expose platform dependant
ordering, push this earlier, to let the buildfarm shake out potentially
unstable results.
Discussion: <20160727004333.r3e2k2y6fvk2ntup@alap3.anarazel.de>
Ordinarily there would not be an async result sitting around at this
point, but it appears that in corner cases there can be. Considering
all the work we're about to launch, it's hardly going to cost anything
noticeable to check.
It's been like this forever, so back-patch to all supported branches.
Report: <CAD-Qf1eLUtBOTPXyFQGW-4eEsop31tVVdZPu4kL9pbQ6tJPO8g@mail.gmail.com>
We've had support for using unnamed POSIX semaphores instead of System V
semaphores for quite some time, but it was not used by default on any
platform. Since many systems have rather small limits on the number of
SysV semaphores allowed, it seems desirable to switch to POSIX semaphores
where they're available and don't create performance or kernel resource
problems. Experimentation by me shows that unnamed POSIX semaphores
are at least as good as SysV semaphores on Linux, and we previously had
a report from Maksym Sobolyev that FreeBSD is significantly worse with
SysV semaphores than POSIX ones. So adjust those two platforms to use
unnamed POSIX semaphores, if configure can find the necessary library
functions. If this goes well, we may switch other platforms as well,
but it would be advisable to test them individually first.
It's not currently contemplated that we'd encourage users to select
a semaphore API for themselves, but anyone who wants to experiment
can add PREFERRED_SEMAPHORES=UNNAMED_POSIX (or NAMED_POSIX, or SYSV)
to their configure command line to do so.
I also tweaked configure to report which API it's selected, mainly
so that we can tell that from buildfarm reports.
I did not touch the user documentation's discussion about semaphores;
that will need some adjustment once the dust settles.
Discussion: <8536.1475704230@sss.pgh.pa.us>
The transfunction was told that its first argument and result were
of the window function output type, not the aggregate state type.
This'd only matter if the transfunction consults get_fn_expr_argtype,
which typically only polymorphic functions would do.
Although we have several regression tests around polymorphic aggs,
none of them detected this mistake --- in fact, they still didn't
fail when I injected the same mistake into nodeAgg.c. So add some
more tests covering both plain agg and window-function-agg cases.
Per report from Sebastian Luque. Back-patch to 9.6 where the error
was introduced (by sloppy refactoring in commit 804163bc2, looks like).
Report: <87int2qkat.fsf@gmail.com>
Historically, we've allowed users to add a CHECK constraint to a child
table and then add an identical CHECK constraint to the parent. This
results in "merging" the two constraints so that the pre-existing
child constraint ends up with both conislocal = true and coninhcount > 0.
However, if you tried to do it in the other order, you got a duplicate
constraint error. This is problematic for pg_dump, which needs to issue
separated ADD CONSTRAINT commands in some cases, but has no good way to
ensure that the constraints will be added in the required order.
And it's more than a bit arbitrary, too. The goal of complaining about
duplicated ADD CONSTRAINT commands can be served if we reject the case of
adding a constraint when the existing one already has conislocal = true;
but if it has conislocal = false, let's just make the ADD CONSTRAINT set
conislocal = true. In this way, either order of adding the constraints
has the same end result.
Another problem was that the code allowed creation of a parent constraint
marked convalidated that is merged with a child constraint that is
!convalidated. In this case, an inheritance scan of the parent table could
emit some rows violating the constraint condition, which would be an
unexpected result given the marking of the parent constraint as validated.
Hence, forbid merging of constraints in this case. (Note: valid child and
not-valid parent seems fine, so continue to allow that.)
Per report from Benedikt Grundmann. Back-patch to 9.2 where we introduced
possibly-not-valid check constraints. The second bug obviously doesn't
apply before that, and I think the first doesn't either, because pg_dump
only gets into this situation when dealing with not-valid constraints.
Report: <CADbMkNPT-Jz5PRSQ4RbUASYAjocV_KHUWapR%2Bg8fNvhUAyRpxA%40mail.gmail.com>
Discussion: <22108.1475874586@sss.pgh.pa.us>
The need for this was previously obscured even on picky platforms
by the hack we used to support direct cross-module references in
the transforms contrib modules. Now that that hack is gone, the
undefined symbol is exposed, as reported by Robert Haas.
Back-patch to 9.5 where we started to use -Wl,-undefined,dynamic_lookup.
I'm a bit surprised that the older branches don't seem to contain
any gettext references in this module, but since they don't fail
at build time, they must not. (We might be able to get away with
leaving this alone in 9.5/9.6, but I think it's cleaner if the
reference gets resolved at link time.)
Report: <CA+TgmoaHJKU5kcWZcYduATYVT7Mnx+8jUnycaYYL7OtCwCigug@mail.gmail.com>
I somehow had assumed that in the spinlock (in turn possibly using
semaphores) based fallback atomics implementation 32 bit writes could be
done without a lock. As far as the write goes that's correct, since
postgres supports only platforms with single-copy atomicity for aligned
32bit writes. But writing without holding the spinlock breaks
read-modify-write operations like pg_atomic_compare_exchange_u32(),
since they'll potentially "miss" a concurrent write, which can't happen
in actual hardware implementations.
In 9.6+ when using the fallback atomics implementation this could lead
to buffer header locks not being properly marked as released, and
potentially some related state corruption. I don't see a related danger
in 9.5 (earliest release with the API), because pg_atomic_write_u32()
wasn't used in a concurrent manner there.
The state variable of local buffers, before this change, were
manipulated using pg_atomic_write_u32(), to avoid unnecessary
synchronization overhead. As that'd not be the case anymore, introduce
and use pg_atomic_unlocked_write_u32(), which does not correctly
interact with RMW operations.
This bug only caused issues when postgres is compiled on platforms
without atomics support (i.e. no common new platform), or when compiled
with --disable-atomics, which explains why this wasn't noticed in
testing.
Reported-By: Tom Lane
Discussion: <14947.1475690465@sss.pgh.pa.us>
Backpatch: 9.5-, where the atomic operations API was introduced.
0xc19c is not a valid UTF-8 byte sequence. It doesn't do any harm, AFAICS,
but it's surely not intentional. No backpatching though, just to be sure.
In the passing, also add a file header comment to the file, like the
UCS_to_SJIS.pl script would produce. (The file was originally created with
UCS_to_SJIS.pl, but has been modified by hand since then. That's
questionable, but I'll leave fixing that for later.)
Kyotaro Horiguchi
Discussion: <20160907.155050.233844095.horiguchi.kyotaro@lab.ntt.co.jp>
Recent Perl and/or new Linux distributions are starting to remove "." from
the @INC list by default. That breaks pg_rewind and ssl test suites, which
use helper perl modules that reside in the same directory. To fix, add the
current source directory explicitly to prove's include dir.
The vcregress.pl script probably also needs something like this, but I
wasn't able to remove '.' from @INC on Windows to test this, and don't want
to try doing that blindly.
Discussion: <20160908204529.flg6nivjuwp5vaoy@alap3.anarazel.de>
On buildfarm member cockatiel, that library is in /usr/bin.
(Possibly we should look at $PATH on this platform?)
Per off-list report from Andrew Dunstan.
getBlobs' queries for pre-9.0 servers were broken in two ways:
the 7.x/8.x query uses DISTINCT so it can't have unspecified-type
NULLs in the target list, and both that query and the 7.0 one
failed to provide the correct output column labels, so that the
subsequent code to extract data from the PGresult would fail.
Back-patch to 9.6 where the breakage was introduced (by commit 23f34fa4b).
Amit Langote and Tom Lane
Discussion: <0a3e7a0e-37bd-8427-29bd-958135862f0a@lab.ntt.co.jp>
They are supposed to be mutually exclusive, but there was no check for
that.
Michael Banck
Discussion: <20161007103414.GD12247@nighthawk.caipicrew.dd-dns.de>
Leaving the error in the error queue used to be harmless, because the
X509_STORE_load_locations() call used to be the last step in
initialize_SSL(), and we would clear the queue before the next
SSL_connect() call. But previous commit moved things around. The symptom
was that if a CRL file was not found, and one of the subsequent
initialization steps, like loading the client certificate or private key,
failed, we would incorrectly print the "no such file" error message from
the earlier X509_STORE_load_locations() call as the reason.
Backpatch to all supported versions, like the previous patch.
There were several issues with the old coding:
1. There was a race condition, if two threads opened a connection at the
same time. We used a mutex around SSL_CTX_* calls, but that was not
enough, e.g. if one thread SSL_CTX_load_verify_locations() with one
path, and another thread set it with a different path, before the first
thread got to establish the connection.
2. Opening two different connections, with different sslrootcert settings,
seemed to fail outright with "SSL error: block type is not 01". Not sure
why.
3. We created the SSL object, before calling SSL_CTX_load_verify_locations
and SSL_CTX_use_certificate_chain_file on the SSL context. That was
wrong, because the options set on the SSL context are propagated to the
SSL object, when the SSL object is created. If they are set after the
SSL object has already been created, they won't take effect until the
next connection. (This is bug #14329)
At least some of these could've been fixed while still using a shared
context, but it would've been more complicated and error-prone. To keep
things simple, let's just use a separate SSL context for each connection,
and accept the overhead.
Backpatch to all supported versions.
Report, analysis and test case by Kacper Zuk.
Discussion: <20160920101051.1355.79453@wrigleys.postgresql.org>
If you point pg_rewind to a server that is using synchronous replication,
with "pg_rewind --source-server=...", and the replication is not working
for some reason, pg_rewind will get stuck because it creates a temporary
table, which needs to be replicated. You could call broken replication a
pilot error, but pg_rewind is often used in special circumstances, when
there are changes to the replication setup.
We don't do any "real" updates, and we don't care about fsyncing or
replicating the operations on the temporary tables, so fix that by
setting synchronous_commit off.
Michael Banck, Michael Paquier. Backpatch to 9.5, where pg_rewind was
introduced.
Discussion: <20161005143938.GA12247@nighthawk.caipicrew.dd-dns.de>
LogicalTapeRewind() should not allocate large read buffer, if the tape
is completely empty. The calling code relies on that, for its
calculation of how much memory to allocate for the read buffers. That
lead to massive overallocation of memory, if maxTapes was high, but
only a few tapes were actually used.
Reported by Tomas Vondra
Discussion: <7303da46-daf7-9c68-3cc1-9f83235cf37e@2ndquadrant.com>
We don't need this anymore, and it prevents build-time error checking
that's usually good to have, so remove it. Undoes one change of commit
cac765820.
Unfortunately, it's much harder to get a similar effect on other common
platforms, because we don't want the linker to throw errors for symbols
that will be resolved in the core backend. Only macOS and AIX expect the
core backend executable to be available while linking loadable modules,
so only these platforms can usefully throw errors for unresolved symbols
at link time.
Discussion: <2652.1475512158@sss.pgh.pa.us>
Per discussion with Andrew Dunstan, it seems there are three peculiarities
of the Python installation on MinGW (or at least, of the instance on
buildfarm member frogmouth). First, the library name doesn't contain
"2.7" but just "27". It looks like we can deal with that by consulting
get_config_vars('VERSION') to see whether a dot should be used or not.
Second, the library might be in c:/Windows/System32, analogously to it
possibly being in /usr/lib on Unix-oid platforms. Third, it's apparently
not standard to use the prefix "lib" on the file name. This patch will
accept files with or without "lib", but the first part of that may well
be dead code.
Most Unix-oid platforms provide a symlink "libfoo.so" -> "libfoo.so.n.n"
to allow the linker to resolve a reference "-lfoo" to a version-numbered
shared library. OpenBSD has apparently hacked ld(1) to do this internally,
because there are no such symlinks to be found in their library
directories. Tweak the new code in PGAC_CHECK_PYTHON_EMBED_SETUP to cope.
Per buildfarm member curculio.
Windows apparently has a constant named WAIT_TIMEOUT, and some of these
other names are pretty generic, too. Insert "PG_" at the front of each
name in order to disambiguate.
Michael Paquier
Just turning the crank on the project started in commit d51924be8.
These cases turn out to be exact subsets of the boilerplate needed
for hstore_plpython.
Discussion: <2652.1475512158@sss.pgh.pa.us>
Older versions of Python produce garbage (or at least useless) values of
get_config_vars('LDLIBRARY'). Newer versions produce garbage (or at least
useless) values of get_config_vars('SO'), which was defeating our configure
logic that attempted to identify where the Python shlib really is. The net
result, at least with a stock Python 3.5 installation on macOS, was that
we were linking against a static library in the mistaken belief that it was
a shared library. This managed to work, if you count statically absorbing
libpython into plpython.so as working. But it no longer works as of commit
d51924be8, because now we get separate static copies of libpython in
plpython.so and hstore_plpython.so, and those can't interoperate on the
same data. There are some other infelicities like assuming that nobody
ever installs a private version of Python on a macOS machine.
Hence, forget about looking in $python_configdir for the Python shlib;
as far as I can tell no version of Python has ever put one there, and
certainly no currently-supported version does. Also, rather than relying
on get_config_vars('SO'), just try all the possibilities for shlib
extensions. Also, rather than trusting Py_ENABLE_SHARED, believe we've
found a shlib only if it has a recognized extension. Last, explicitly
cope with the possibility that the shlib is really in /usr/lib and
$python_libdir is a red herring --- this is the actual situation on older
macOS, but we were only accidentally working with it.
Discussion: <5300.1475592228@sss.pgh.pa.us>
WaitLatch, WaitLatchOrSocket, and WaitEventSetWait now taken an
additional wait_event_info parameter; legal values are defined in
pgstat.h. This makes it possible to uniquely identify every point in
the core code where we are waiting for a latch; extensions can pass
WAIT_EXTENSION.
Because latches were the major wait primitive not previously covered
by this patch, it is now possible to see information in
pg_stat_activity on a large number of important wait events not
previously addressed, such as ClientRead, ClientWrite, and SyncRep.
Unfortunately, many of the wait events added by this patch will fail
to appear in pg_stat_activity because they're only used in background
processes which don't currently appear in pg_stat_activity. We should
fix this either by creating a separate view for such information, or
else by deciding to include them in pg_stat_activity after all.
Michael Paquier and Robert Haas, reviewed by Alexander Korotkov and
Thomas Munro.
In commit d51924be8, I overlooked the need to provide linkage for
PLyUnicode_FromStringAndSize, because that's only used (and indeed
only exists) in Python 3 builds.
In light of the need to #if this item, rearrange the ordering of
the code related to each function pointer, so as not to need more
#if's than absolutely necessary.
Per buildfarm.
Before initializing iteration over a subtransaction's changes, the last
few changes were not spilled to disk. That's correct if the transaction
didn't spill to disk, but otherwise... This bug can lead to missed or
misorderd subtransaction contents when they were spilled to disk.
Move spilling of the remaining in-memory changes to
ReorderBufferIterTXNInit(), where it can easily be applied to the top
transaction and, if present, subtransactions.
Since this code had too many bugs already, noticeably increase test
coverage.
Fixes: #14319
Reported-By: Huan Ruan
Discussion: <20160909012610.20024.58169@wrigleys.postgresql.org>
Backport: 9,4-, where logical decoding was added
Previously, on most platforms, we allowed hstore_plpython's references
to hstore and plpython to be unresolved symbols at link time, trusting
the dynamic linker to resolve them when the module is loaded. This
has a number of problems, the worst being that the dynamic linker
does not know where the references come from and can do nothing but
fail if those other modules haven't been loaded. We've more or less
gotten away with that for the limited use-case of datatype transform
modules, but even there, it requires some awkward hacks, most recently
commit 83c249200.
Instead, let's not treat these references as linker-resolvable at all,
but use function pointers that are manually filled in by the module's
_PG_init function. There are few enough contact points that this
doesn't seem unmaintainable, at least for these use-cases. (Note that
the same technique wouldn't work at all for decoupling from libpython
itself, but fortunately that's just a standard shared library and can
be linked to normally.)
This is an initial patch that just converts hstore_plpython. If the
buildfarm doesn't find any fatal problems, I'll work on the other
transform modules soon.
Tom Lane, per an idea of Andres Freund's.
Discussion: <2652.1475512158@sss.pgh.pa.us>
Commit 88e982302 invented GUC_UNIT_XSEGS for min_wal_size and max_wal_size,
but neglected to make it display sensibly in pg_settings.unit (by adding a
case to the switch in GetConfigOptionByNum). Fix that, and adjust said
switch to throw a run-time error the next time somebody forgets.
In passing, avoid using a static buffer for the output string --- the rest
of this function pstrdup's from a local buffer, and I see no very good
reason why the units code should do it differently and less safely.
Per report from Otar Shavadze. Back-patch to 9.5 where the new unit type
was added.
Report: <CAG-jOyA=iNFhN+yB4vfvqh688B7Tr5SArbYcFUAjZi=0Exp-Lg@mail.gmail.com>
Attempting to COPY a subset of columns from a table with RLS enabled
would fail due to an invalid query being constructed (using a single
ColumnRef with the list of fields to exact in 'fields', but that's for
the different levels of an indirection for a single column, not for
specifying multiple columns).
Correct by building a ColumnRef and then RestTarget for each column
being requested and then adding those to the targetList for the select
query. Include regression tests to hopefully catch if this is broken
again in the future.
Patch-By: Adam Brightwell
Reviewed-By: Michael Paquier
pg_upgrade checks whether all the shared libraries used in the old cluster
are also available in the new one by issuing LOAD for each library name.
Previously, it cared not what order it did the LOADs in. Ideally it
should not have to care, but currently the transform modules in contrib
fail unless both the language and datatype modules they depend on are
loaded first. A backend-side solution for that looks possible but
probably not back-patchable, so as a stopgap measure, let's do the LOAD
tests in order by library name length. That should fix the problem for
reasonably-named transform modules, eg "hstore_plpython" will be loaded
after both "hstore" and "plpython". (Yeah, it's a hack.)
In a larger sense, having a predictable order of these probes is a good
thing, since it will make upgrades predictably work or not work in the
face of inter-library dependencies. Also, this patch replaces O(N^2)
de-duplication logic with O(N log N) logic, which could matter in
installations with very many databases. So I don't foresee reverting this
even after we have a proper fix for the library-dependency problem.
In passing, improve a couple of SQL queries used here.
Per complaint from Andrew Dunstan that pg_upgrade'ing the transform contrib
modules failed. Back-patch to 9.5 where transform modules were introduced.
Discussion: <f7ac29f3-515c-2a44-21c5-ec925053265f@dunslane.net>
Don't pre-read tuples into SortTuple slots during merge. Instead, use the
memory for larger read buffers in logtape.c. We're doing the same number
of READTUP() calls either way, but managing the pre-read SortTuple slots
is much more complicated. Also, the on-tape representation is more compact
than SortTuples, so we can fit more pre-read tuples into the same amount
of memory this way. And we have better cache-locality, when we use just a
small number of SortTuple slots.
Now that we only hold one tuple from each tape in the SortTuple slots, we
can greatly simplify the "batch memory" management. We now maintain a
small set of fixed-sized slots, to hold the tuples, and fall back to
palloc() for larger tuples. We use this method during all merge phases,
not just the final merge, and also when randomAccess is requested, and
also in the TSS_SORTEDONTAPE case. In other words, it's used whenever we
do an external sort.
Reviewed by Peter Geoghegan and Claudio Freire.
Discussion: <CAM3SWZTpaORV=yQGVCG8Q4axcZ3MvF-05xe39ZvORdU9JcD6hQ@mail.gmail.com>
Without this, an extension containing an access method is not properly
dumped/restored during pg_upgrade --- the AM ends up not being a member
of the extension after upgrading.
Another oversight in commit 473b93287, reported by Andrew Dunstan.
Report: <f7ac29f3-515c-2a44-21c5-ec925053265f@dunslane.net>
Fixes errors introduced in commit bc34223bc, as detected by Coverity.
In passing, report ENOSPC for a short write while padding a new wal file in
open_walfile, make certain that close_walfile closes walfile in all cases,
and improve a couple of comments.
Michael Paquier and Tom Lane
In standard Unix builds, postmaster child processes do ClosePostmasterPorts
immediately after InitPostmasterChild, that is almost immediately after
being spawned. This is important because we don't want children holding
open the postmaster's end of the postmaster death watch pipe.
However, in EXEC_BACKEND builds, SubPostmasterMain was postponing this
responsibility significantly, in order to make it slightly more convenient
to pass the right flag value to ClosePostmasterPorts. This is bad,
particularly seeing that process_shared_preload_libraries() might invoke
nearly-arbitrary code. Rearrange so that we do it as soon as we've
fetched the socket FDs via read_backend_variables().
Also move the comment explaining about randomize_va_space to before the
call of PGSharedMemoryReAttach, which is where it's relevant. The old
placement was appropriate when the reattach happened inside
CreateSharedMemoryAndSemaphores, but that was a long time ago.
Back-patch to 9.3; the patch doesn't apply cleanly before that, and
it doesn't seem worth a lot of effort given that we've had no actual
field complaints traceable to this.
Discussion: <4157.1475178360@sss.pgh.pa.us>
collect_corrupt_items() failed to initialize tuple.t_self. While
HeapTupleSatisfiesVacuum() doesn't actually use that value, it does
Assert that it's valid, so that the code would dump core if ip_posid
chanced to be zero. (That's somewhat unlikely, which probably explains
how this got missed. In any case it wouldn't matter for field use.)
Also, collect_corrupt_items was returning the wrong TIDs, that is the
contents of t_ctid rather than the tuple's own location. This would
be the same thing in simple cases, but it could be wrong if, for
example, a past update attempt had been rolled back, leaving a live
tuple whose t_ctid doesn't point at itself.
Also, in pg_visibility(), guard against trying to read a page past
the end of the rel. The VM code handles inquiries beyond the end
of the map by silently returning zeroes, and it seems like we should
do the same thing here.
I ran into the assertion failure while using pg_visibility to check
pg_upgrade's behavior, and then noted the other problems while
reading the code.
Report: <29043.1475288648@sss.pgh.pa.us>