Eliminate dependencies on "which", as we don't really need that to be
installed for proper testing. Don't number the tests, as that increases
the footprint of every patch that wants to add or remove tests. Make
the test output more informative, so that it's a bit easier to see what
went right (or wrong). Spelling and grammar improvements.
contrib/xml2 can get by without libxslt; the relevant features just
won't work. But if doesn't have libxml2, or if sepgsql doesn't have
libselinux, the link succeeds but the module then fails to work at load
time. To avoid that, link the require libraries unconditionally, so
that it will be clear at link-time that there is a problem.
Per discussion with Tom Lane and KaiGai Kohei.
Now that we have a test that requires nondefault settings to pass, it seems
like we'd better mention that detail in the directions about how to run the
tests.
Also do some very minor copy-editing.
order of begin, prepare, and commit of three concurrent transactions that
have conflicts between them.
The test runs for a quite long time, and the expected output file is huge,
but this test caught some serious bugs during development, so seems
worthwhile to keep. The test uses prepared transactions, so it fails if the
server has max_prepared_transactions=0. Because of that, it's marked as
"ignore" in the schedule file.
Dan Ports
Perhaps we ought to add some other kind of documentation here instead,
but for now let's get rid of this woefully obsolete description of the
sinval machinery.
Because of ABI tagging, the library version number might no longer be
exactly the Python version number, so do extra lookups. This affects
installations without a shared library, such as ActiveState's
installer.
Also update the way to detect the location of the 'config' directory,
which can also be versioned.
Ashesh Vashi
Module initialization functions in Python 3 must have external
linkage, because PyMODINIT_FUNC does dllexport on Windows-like
platforms. Without this change, the build with Python 3 fails on
Windows.
Dropped columns within a composite type were not handled correctly.
Also, we did not check for whether a composite result type had changed
since we cached the information about it.
Jan Urbański, per a bug report from Jean-Baptiste Quenot
This requires adjusting the API for syscache callback functions: they now
get a hash value, not a TID, to identify the target tuple. Most of them
weren't paying any attention to that argument anyway, but plancache did
require a small amount of fixing.
Also, improve performance a trifle by avoiding sending duplicate inval
messages when a heap_update isn't changing the catcache lookup columns.
The TID isn't stable enough: we might queue an sinval event before a VACUUM
FULL, and then process it afterwards, when the target tuple no longer has
the same TID. So we must invalidate entries on the basis of hash value
only. The old coding can be shown to result in various bizarre,
hard-to-reproduce errors in the presence of concurrent VACUUM FULLs on
system catalogs, and could easily result in permanent catalog corruption,
up to and including complete loss of tables.
This commit is just a minimal fix that removes the unsafe comparison.
We should remove transmission of the tuple TID from sinval messages
altogether, and then arrange to suppress the extra message in the common
case of a heap_update that doesn't change the key hashvalue. But that's
going to be much more invasive, and will only produce a probably-marginal
performance gain, so it doesn't seem like material for a back-patch.
Back-patch to 9.0. Before that, VACUUM FULL refused to do any tuple moving
if it found any INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples (and
CLUSTER would give up altogether), so there was no risk of moving a tuple
that might be the subject of an unsent sinval message.
We have to be sure that we have revalidated each nailed-in-cache relcache
entry before we try to use it to load data for some other relcache entry.
The introduction of "mapped relations" in 9.0 broke this, because although
we updated the state kept in relmapper.c early enough, we failed to
propagate that information into relcache entries soon enough; in
particular, we could try to fetch pg_class rows out of pg_class before
we'd updated its relcache entry's rd_node.relNode value from the map.
This bug accounts for Dave Gould's report of failures after "vacuum full
pg_class", and I believe that there is risk for other system catalogs
as well.
The core part of the fix is to copy relmapper data into the relcache
entries during "phase 1" in RelationCacheInvalidate(), before they'll be
used in "phase 2". To try to future-proof the code against other similar
bugs, I also rearranged the order in which nailed relations are visited
during phase 2: now it's pg_class first, then pg_class_oid_index, then
other nailed relations. This should ensure that RelationClearRelation can
apply RelationReloadIndexInfo to all nailed indexes without risking use
of not-yet-revalidated relcache entries.
Back-patch to 9.0 where the relation mapper was introduced.
This works around the problem that a catalog cache entry might contain a
toast pointer that we try to dereference just as a VACUUM FULL completes
on that catalog. We will see the sinval message on the cache entry when
we acquire lock on the toast table, but by that point we've already told
tuptoaster.c "here's the pointer to fetch", so it's difficult from a code
structural standpoint to update the pointer before we use it. Much less
painful to ensure that toast pointers are not invalidated in the first
place. We have to add a bit of code to deal with the case that a value
that previously wasn't toasted becomes so; but that should be a
seldom-exercised corner case, so the inefficiency shouldn't be significant.
Back-patch to 9.0. In prior versions, we didn't allow CLUSTER on system
catalogs, and VACUUM FULL didn't result in reassignment of toast OIDs, so
there was no problem.
The previous code tried to synchronize by unlinking the init file twice,
but that doesn't actually work: it leaves a window wherein a third process
could read the already-stale init file but miss the SI messages that would
tell it the data is stale. The result would be bizarre failures in catalog
accesses, typically "could not read block 0 in file ..." later during
startup.
Instead, hold RelCacheInitLock across both the unlink and the sending of
the SI messages. This is more straightforward, and might even be a bit
faster since only one unlink call is needed.
This has been wrong since it was put in (in 2002!), so back-patch to all
supported releases.
When streaming including WAL, the size estimate will always be incorrect,
since we don't know how much WAL is included. To make sure the output doesn't
look completely unreasonable, this patch increases the total size whenever we
go past the estimate, to make sure we never go above 100%.
This makes it clearer that the error message is perhaps not supposed
to be understood by users, and it also makes it somewhat clearer that
it was not accidentally omitted from translation.
Idea from Heikki Linnakangas, except that we don't mark "Reason code"
for translation at this point, because that would make the
implementation too cumbersome.
When updating or deleting a system catalog tuple, it's necessary to acquire
RowExclusiveLock on the catalog before looking up the tuple; otherwise a
concurrent VACUUM FULL on the catalog might move the tuple to a different
TID before we can apply the update. Coding patterns that find the tuple
via a table scan aren't at risk here, but when obtaining the tuple from a
catalog cache, correct ordering is important; and several routines in
foreigncmds.c got it wrong. Noted while running the regression tests in
parallel with VACUUM FULL of assorted system catalogs.
For consistency I moved all the heap_open calls to the starts of their
functions, including a couple for which there was no actual bug.
Back-patch to 8.4 where foreigncmds.c was added.
The statement start timestamp was not set before initiating the transaction
that is used to look up client authentication information in pg_authid.
In consequence, enable_sig_alarm computed a wrong value (far in the past)
for statement_fin_time. That didn't have any immediate effect, because the
timeout alarm was set without reference to statement_fin_time; but if we
subsequently blocked on a lock for a short time, CheckStatementTimeout
would consult the bogus value when we cancelled the lock timeout wait,
and then conclude we'd timed out, leading to immediate failure of the
connection attempt. Thus an innocent "vacuum full pg_authid" would cause
failures of concurrent connection attempts. Noted while testing other,
more serious consequences of vacuum full on system catalogs.
We should set the statement timestamp before StartTransactionCommand(),
so that the transaction start timestamp is also valid. I'm not sure if
there are any non-cosmetic effects of it not being valid, but the xact
timestamp is at least sent to the statistics machinery.
Back-patch to 9.0. Before that, the client authentication timeout was done
outside any transaction and did not depend on this state to be valid.
poll() is preferred over select() on platforms where both are available,
because it tends to be a bit faster and it doesn't have an arbitrary limit
on the range of FD numbers that can be accessed. The FD range limit does
not appear to be a risk factor for any 9.1 usages, so this doesn't need to
be back-patched, but we need to have it in place if we keep on expanding
the uses of WaitLatch.
Instead of displaying comments on an arbitrary subset of the object
types which support them, make \dd display comments on exactly those
object types which don't have their own backlash commands. We now
regard the display of comments as properly the job of the relevant
backslash command (though many of them do so only in verbose mode)
rather than something that \dd should be responsible for. However,
a handful of object types have no backlash command, so make \dd
give information about those.
Josh Kupershmidt
The latch infrastructure is now capable of detecting all cases where the
walsender loop needs to wake up, so there is no reason to have an arbitrary
timeout.
Also, modify the walsender loop logic to follow the standard pattern of
ResetLatch, test for work to do, WaitLatch. The previous coding was both
hard to follow and buggy: it would sometimes busy-loop despite having
nothing available to do, eg between receipt of a signal and the next time
it was caught up with new WAL, and it also had interesting choices like
deciding to update to WALSNDSTATE_STREAMING on the strength of information
known to be obsolete.
This is in hopes of learning more about what causes "pgstat wait timeout"
warnings in the buildfarm. This patch should probably be reverted once
we've learned what we can. As coded, it will result in regression test
"failures" at half the delay that the existing code does, so I expect
to see a few more than before.
In pursuit of this (and with the expectation that WaitLatch will be needed
in more places), convert the latch field that was already added to PGPROC
for sync rep into a generic latch that is activated for all PGPROC-owning
processes, and change many of the standard backend signal handlers to set
that latch when a signal happens. This will allow WaitLatch callers to be
wakened properly by these signals.
In passing, fix a whole bunch of signal handlers that had been hacked to do
things that might change errno, without adding the necessary save/restore
logic for errno. Also make some minor fixes in unix_latch.c, and clean
up bizarre and unsafe scheme for disowning the process's latch. Much of
this has to be back-patched into 9.1.
Peter Geoghegan, with additional work by Tom