Commit Graph

330 Commits

Author SHA1 Message Date
Alvaro Herrera
0ac5ad5134 Improve concurrency of foreign key locking
This patch introduces two additional lock modes for tuples: "SELECT FOR
KEY SHARE" and "SELECT FOR NO KEY UPDATE".  These don't block each
other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
FOR UPDATE".  UPDATE commands that do not modify the values stored in
the columns that are part of the key of the tuple now grab a SELECT FOR
NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
with tuple locks of the FOR KEY SHARE variety.

Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
means the concurrency improvement applies to them, which is the whole
point of this patch.

The added tuple lock semantics require some rejiggering of the multixact
module, so that the locking level that each transaction is holding can
be stored alongside its Xid.  Also, multixacts now need to persist
across server restarts and crashes, because they can now represent not
only tuple locks, but also tuple updates.  This means we need more
careful tracking of lifetime of pg_multixact SLRU files; since they now
persist longer, we require more infrastructure to figure out when they
can be removed.  pg_upgrade also needs to be careful to copy
pg_multixact files over from the old server to the new, or at least part
of multixact.c state, depending on the versions of the old and new
servers.

Tuple time qualification rules (HeapTupleSatisfies routines) need to be
careful not to consider tuples with the "is multi" infomask bit set as
being only locked; they might need to look up MultiXact values (i.e.
possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
whereas they previously were assured to only use information readily
available from the tuple header.  This is considered acceptable, because
the extra I/O would involve cases that would previously cause some
commands to block waiting for concurrent transactions to finish.

Another important change is the fact that locking tuples that have
previously been updated causes the future versions to be marked as
locked, too; this is essential for correctness of foreign key checks.
This also causes additional WAL-logging (there was previously a single
WAL record for a locked tuple; now there is one for each updated copy
of the tuple).

With all this in place, contention related to tuples being checked by
foreign key rules should be much reduced.

As a bonus, the old behavior that a subtransaction grabbing a stronger
tuple lock than the parent (sub)transaction held on a given tuple and
later aborting caused the weaker lock to be lost, has been fixed.

Many new spec files were added for isolation tester framework, to ensure
overall behavior is sane.  There's probably room for several more tests.

There were several reviewers of this patch; in particular, Noah Misch
and Andres Freund spent considerable time in it.  Original idea for the
patch came from Simon Riggs, after a problem report by Joel Jacobson.
Most code is from me, with contributions from Marti Raudsepp, Alexander
Shulgin, Noah Misch and Andres Freund.

This patch was discussed in several pgsql-hackers threads; the most
important start at the following message-ids:
	AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
	1290721684-sup-3951@alvh.no-ip.org
	1294953201-sup-2099@alvh.no-ip.org
	1320343602-sup-2290@alvh.no-ip.org
	1339690386-sup-8927@alvh.no-ip.org
	4FE5FF020200002500048A3D@gw.wicourts.gov
	4FEAB90A0200002500048B7D@gw.wicourts.gov
2013-01-23 12:04:59 -03:00
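
The interaction of the four tuple lock modes described in this commit can be sketched as a conflict table. The table below follows the row-level lock conflict matrix given in the PostgreSQL documentation; the Python names (`CONFLICTS`, `blocks`) are illustrative only and do not correspond to anything in the server source.

```python
# Row-level lock modes, listed from weakest to strongest.
KEY_SHARE = "FOR KEY SHARE"
SHARE = "FOR SHARE"
NO_KEY_UPDATE = "FOR NO KEY UPDATE"
UPDATE = "FOR UPDATE"

# For each mode already held on a tuple, the requested modes that must wait.
# (Hypothetical representation; the server encodes this differently.)
CONFLICTS = {
    KEY_SHARE: {UPDATE},
    SHARE: {NO_KEY_UPDATE, UPDATE},
    NO_KEY_UPDATE: {SHARE, NO_KEY_UPDATE, UPDATE},
    UPDATE: {KEY_SHARE, SHARE, NO_KEY_UPDATE, UPDATE},
}


def blocks(held, requested):
    """Return True if a holder of `held` makes a request for `requested` wait."""
    return requested in CONFLICTS[held]
```

Under this table a foreign key check (FOR KEY SHARE) and an UPDATE that leaves key columns alone (FOR NO KEY UPDATE) do not block each other, which is the concurrency gain the commit message describes.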
Bruce Momjian
861ad67bd9 pg_upgrade: remove --single-transaction usage
With AtEOXact applied, --single-transaction makes pg_restore slower, and
has the potential to require lock table configuration, so remove the
argument.

Per suggestion from Tom.
2013-01-22 22:27:16 -05:00
Bruce Momjian
600250d0ed Improve pg_upgrade error report
If the cluster alignments don't match, output this suggestion:

	Likely one cluster is a 32-bit install, the other 64-bit
2013-01-18 09:26:55 -05:00
Andrew Dunstan
4ae5ee6c9b Extend and improve use of EXTRA_REGRESS_OPTS.
This is now used by ecpg tests, and not clobbered by pg_upgrade
tests. This change won't affect anything that doesn't set this
environment variable, but will enable the buildfarm to control
exactly what port regression test installs will be running on,
and thus to detect possible rogue postmasters more easily.

Backpatch to release 9.2 where EXTRA_REGRESS_OPTS was first used.
2013-01-12 08:28:58 -05:00
Bruce Momjian
a89c46f9bc Allow parallel copy/link in pg_upgrade
This patch implements parallel copying/linking of files by tablespace
using the --jobs option in pg_upgrade.
2013-01-09 08:57:47 -05:00
Tom Lane
78a5e738e9 Prevent creation of postmaster's TCP socket during pg_upgrade testing.
On non-Windows machines, we use the Unix socket for connections to test
postmasters, so there is no need to create a TCP socket.  Furthermore,
doing so causes failures due to port conflicts if two builds are carried
out concurrently on one machine.  (If the builds are done in different
chroots, which is standard practice at least in Red Hat distros, there
is no risk of conflict on the Unix socket.)  Suppressing the TCP socket
by setting listen_addresses to empty has long been standard practice
for pg_regress, and pg_upgrade knows about this too ... but pg_upgrade's
test.sh didn't get the memo.

Back-patch to 9.2, and also sync the 9.2 version of the script with HEAD
as much as practical.
2013-01-03 18:34:51 -05:00
Bruce Momjian
bcbe99244f Adjust a few pg_upgrade functions to return void.
Adjust pg_upgrade page conversion functions (which are not used) to
return void so transfer_all_new_dbs can return void.
2013-01-02 21:20:20 -05:00
Bruce Momjian
bd61a623ac Update copyrights for 2013
Fully update git head, and update back branches in ./COPYRIGHT and
legal.sgml files.
2013-01-01 17:15:01 -05:00
Bruce Momjian
6f1b9e4efd Add pg_upgrade --jobs parameter
Add pg_upgrade --jobs, which allows parallel dump/restore of databases,
which improves performance.
2012-12-26 19:26:30 -05:00
Bruce Momjian
dc9896a245 Avoid using NAMEDATALEN in pg_upgrade
Because the client encoding might not match the server encoding,
pg_upgrade can't allocate NAMEDATALEN bytes for storage of database,
relation, and namespace identifiers.  Instead pg_strdup() the memory and
free it.

Also add C comment in initdb.c about safe NAMEDATALEN usage.
2012-12-20 13:56:31 -05:00
Bruce Momjian
345fb82f16 Add pg_upgrade comment about mismatch error
Add comment stating that constraint and index names must match.
2012-12-20 07:37:27 -05:00
Bruce Momjian
e95c4bd113 Fix pg_upgrade for invalid indexes
All versions of pg_upgrade upgraded invalid indexes caused by CREATE
INDEX CONCURRENTLY failures and marked them as valid.  The patch adds a
check to all pg_upgrade versions and throws an error during upgrade or
--check.

Backpatch to 9.2, 9.1, 9.0.  Patch slightly adjusted.
2012-12-11 15:09:22 -05:00
Bruce Momjian
acdb8c2259 Fix pg_upgrade -O/-o options
Fix previous commit that added synchronous_commit=off, but broke -O/-o
due to missing space in argument passing.

Backpatch to 9.2.
2012-12-10 23:03:25 -05:00
Bruce Momjian
6dd9584507 Improve pg_upgrade's status display
Pg_upgrade displays file names during copy and database names during
dump/restore.  Andrew Dunstan identified three bugs:

*  long file names were being truncated to 60 _leading_ characters, which
   often do not change for long file names

*  file names were truncated to 60 characters in log files

*  carriage returns were being output to log files

This commit fixes these --- it prints 60 _trailing_ characters to the
status display, and full path names without carriage returns to log
files.  It also suppresses status output to the log file unless verbose
mode is used.
2012-12-07 12:26:13 -05:00
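
The truncation change above can be illustrated with a short sketch. It assumes only what the commit message states (a 60-character status field, full paths and no carriage returns in log files); the function names are hypothetical and the real code is C.

```python
STATUS_WIDTH = 60  # width quoted in the commit message


def status_text(path, width=STATUS_WIDTH):
    """Return the trailing `width` characters of `path` for the status line.

    The tail of a long path is the part that changes from file to file,
    so it is more informative than the leading characters.
    """
    return path if len(path) <= width else path[-width:]


def log_text(path):
    """Return the full path with carriage returns removed, for log files."""
    return path.replace("\r", "")
```

For example, `status_text("/very/long/.../base/16384/24576")` keeps the relfilenode at the end of the path visible, whereas keeping the leading characters would show the same prefix for every file.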
Bruce Momjian
c47d261c07 In pg_upgrade testing script, turn off command echo at the end so status
report is clearer.
2012-12-04 08:17:45 -05:00
Bruce Momjian
2f22765607 Restore set -x in pg_upgrade/test.sh, so the user can see what is being
executed.
2012-12-03 23:44:23 -05:00
Bruce Momjian
630cd14426 Add initdb --sync-only option to sync the data directory to durable
storage.

Have pg_upgrade use it, and enable server options fsync=off and
full_page_writes=off.

Document that users turning fsync from off to on should run initdb
--sync-only.

[ Previous commit was incorrectly applied as a git merge. ]
2012-12-03 22:47:59 -05:00
Bruce Momjian
25d1ed04a2 Revert initdb --sync-only patch that had incorrect commit messages. 2012-12-03 22:46:51 -05:00
Bruce Momjian
4d88bc8f2b dummy commit 2012-12-03 22:45:02 -05:00
Bruce Momjian
cd7569a546 dummy commit 2012-12-03 22:45:02 -05:00
Bruce Momjian
452739df82 In pg_upgrade, remove 'set -x' from test script. 2012-12-01 12:07:14 -05:00
Bruce Momjian
1c59e37665 Revert:
In pg_upgrade, remove pg_restore's --single-transaction option,
    as it throws errors in certain cases.
2012-12-01 10:21:45 -05:00
Bruce Momjian
209772350b Remove pg_restore's --single-transaction option, as it throws errors in
certain cases.
2012-12-01 09:58:00 -05:00
Bruce Momjian
5eeab9c85c In pg_upgrade, improve status wording now that we have per-database
status output for dump/restore.
2012-11-30 22:32:25 -05:00
Bruce Momjian
12ee6ec71f In pg_upgrade, dump each database separately and use
--single-transaction to restore each database schema.  This yields
performance improvements for databases with many tables.  Also, remove
split_old_dump() as it is no longer needed.
2012-11-30 16:30:13 -05:00
Andrew Dunstan
abece8af17 Clean environment for pg_upgrade test.
This removes existing PG settings from the environment for
pg_upgrade tests, just like pg_regress does.
2012-11-30 07:54:24 -05:00
Bruce Momjian
6b711cf37c In pg_upgrade, simplify function copy_file() by using pg_malloc() and
centralizing error/shutdown code.
2012-11-24 22:39:03 -05:00
Bruce Momjian
16e1ae77f9 In pg_upgrade, fix a few places that used malloc/free rather than
pg_malloc/pg_free.
2012-11-24 22:12:39 -05:00
Bruce Momjian
b55743a5df In pg_upgrade, report errno string if file existence check returns an
error and errno != ENOENT.
2012-11-19 16:41:58 -05:00
Bruce Momjian
546d65d55f In pg_upgrade, add third meaningless parameter to open(). 2012-11-14 19:01:29 -05:00
Bruce Momjian
29add0de49 In pg_upgrade, copy fsm, vm, and extent files by checking for file
existence via open(), rather than collecting a directory listing and
looking up matching relfilenode files with sequential scans of the
array.  This speeds up pg_upgrade by 2x for a large number of tables,
e.g. 16k.

Per observation by Ants Aasma.
2012-11-14 17:32:07 -05:00
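
The idea of probing for each candidate file directly, instead of listing the directory and scanning the result, can be sketched as follows. This is a rough illustration, not the pg_upgrade code: the function name is hypothetical, and the naming scheme (`_fsm` and `_vm` suffixes, numbered segment files such as `16384.1`) is assumed from PostgreSQL's on-disk layout.

```python
import os


def existing_relation_files(base_path):
    """Yield the files of one relation by probing each candidate name.

    Each probe is a single open() call, so the cost per relation does not
    depend on how many other files live in the same directory.
    """
    for suffix in ("", "_fsm", "_vm"):
        segno = 0
        while True:
            name = base_path + suffix + (f".{segno}" if segno else "")
            try:
                fd = os.open(name, os.O_RDONLY)
            except FileNotFoundError:
                break  # no further segments for this fork
            os.close(fd)
            yield name
            segno += 1
```

Errors other than "file not found" are allowed to propagate here, so an unreadable file is reported rather than silently skipped.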
Bruce Momjian
dec10ba4c5 Mark pg_upgrade's free_db_and_rel_infos() as a static function. 2012-11-13 21:10:40 -05:00
Bruce Momjian
ed5699dd1b In pg_upgrade, set synchronous_commit=off for the new cluster, to
improve performance when restoring the schema from the old cluster.

Backpatch to 9.2.
2012-11-06 14:28:57 -05:00
Andrew Dunstan
2f2be7473b Use a more portable platform test. 2012-10-18 16:14:11 -04:00
Bruce Momjian
a9701a1d7d In pg_upgrade, issue proper error message when we can't open PG_VERSION.
Backpatch to 9.2.
2012-10-10 13:53:03 -04:00
Bruce Momjian
ce75457949 In pg_upgrade, use full path name for analyze_new_cluster.sh script.
Backpatch to 9.2.
2012-10-02 21:18:43 -04:00
Tom Lane
09ac603c36 Work around unportable behavior of malloc(0) and realloc(NULL, 0).
On some platforms these functions return NULL, rather than the more common
practice of returning a pointer to a zero-sized block of memory.  Hack our
various wrapper functions to hide the difference by substituting a size
request of 1.  This is probably not so important for the callers, who
should never touch the block anyway if they asked for size 0 --- but it's
important for the wrapper functions themselves, which mistakenly treated
the NULL result as an out-of-memory failure.  This broke at least pg_dump
for the case of no user-defined aggregates, as per report from
Matthew Carrington.

Back-patch to 9.2 to fix the pg_dump issue.  Given the lack of previous
complaints, it seems likely that there is no live bug in previous releases,
even though some of these functions were in place before that.
2012-10-02 17:32:42 -04:00
Bruce Momjian
8a7598091a In pg_upgrade, improve error reporting when the number of relation
objects does not match between the old and new clusters.

Backpatch to 9.2.
2012-10-02 11:53:45 -04:00
Bruce Momjian
ac96b851ec Adjust pg_upgrade query so toast tables related to system catalog schema
entries are not dumped.  This fixes an error caused by
dropping/recreating the information_schema, but other failures were also
possible.

Backpatch to 9.2.
2012-10-02 11:46:08 -04:00
Bruce Momjian
b61837a49f In pg_upgrade, try to convert the locale names to canonical form before
comparison;  also report the old/new values if they don't match.

Backpatch to 9.2.
2012-10-02 11:42:34 -04:00
Peter Eisentraut
10bfe81dee pg_upgrade test: Disable fsync in initdb and postgres calls
This mirrors the behavior of pg_regress and makes the test run much
faster.
2012-09-26 22:41:57 -04:00
Peter Eisentraut
5cfd5bb15b pg_upgrade: Remove check for pg_config
It is no longer used, but was still being checked for.

bug #7548 from Reinhard Max
2012-09-18 22:04:28 -04:00
Andrew Dunstan
f8c81c5dde In pg_upgrade, try a few times to open a log file.
If we call pg_ctl stop, the server might continue and thus
hold a log file for a short time after it has deleted its pid file,
(which is when pg_ctl will exit), and so a subsequent attempt to
open the log file might fail.

We therefore try to open it a few times, sleeping one second between
tries, to give the server time to exit.

This corrects an error that was observed on the buildfarm.

Backpatched to 9.2.
2012-09-05 23:14:49 -04:00
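
A minimal sketch of the retry approach described above. The one-second pause comes from the commit message; the attempt count, the function name, and the injectable `opener` and `sleep` parameters are assumptions made for illustration.

```python
import time


def open_with_retry(path, mode="a", attempts=5, delay=1.0,
                    opener=open, sleep=time.sleep):
    """Open `path`, retrying a few times if the open fails.

    A server that is still shutting down may hold the log file briefly,
    so a failed open is retried after `delay` seconds instead of being
    treated as fatal right away.
    """
    for attempt in range(attempts):
        try:
            return opener(path, mode)
        except OSError:
            if attempt == attempts - 1:
                raise  # out of retries: report the last failure
            sleep(delay)
```

The last failure is re-raised unchanged, so the caller still sees the underlying error if the file never becomes available.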
Andrew Dunstan
f8f5cf33a3 Fix pg_upgrade test script's line end handling on Windows.
Call pg_dumpall using -f switch instead of redirection, to avoid
writing the output in text mode and generating spurious carriage
returns. Remove the carriage-return-ignoring hack introduced by
commit e442b0f0c6.

Backpatch to 9.2.
2012-09-05 18:00:31 -04:00
Andrew Dunstan
ea0b414a0d Fix line end mishandling in pg_upgrade on Windows.
pg_upgrade opened the output from pg_dumpall in text mode and
wrote the split files in text mode. This caused unwanted eating
of intended carriage returns on input and production of spurious
carriage returns on output. To avoid this, open all these files
in binary mode. On non-Windows platforms, this change has no
effect.

Backpatch to 9.0. On 9.0 and 9.1, we also switch from redirecting
pg_dumpall's output to using pg_dumpall's -f switch, for the same
reason.
2012-09-05 17:41:43 -04:00
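
The effect of text mode on line endings can be reproduced on any platform with a small sketch. It is written in Python rather than C, and `newline="\r\n"` is used to emulate Windows text-mode output; the helper name `roundtrip` is hypothetical.

```python
import os
import tempfile


def roundtrip(data: bytes, binary: bool) -> bytes:
    """Write `data` to a temporary file and return the bytes found on disk."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        if binary:
            with open(path, "wb") as f:
                f.write(data)
        else:
            # Emulate Windows text mode: every "\n" written becomes "\r\n".
            with open(path, "w", newline="\r\n", encoding="latin-1") as f:
                f.write(data.decode("latin-1"))
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

In binary mode the bytes come back unchanged. In the emulated text mode a line that already ends in a carriage return and newline gains a second carriage return, which is the kind of spurious output the commit avoids.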
Tom Lane
b98fd52a55 Silence -Wunused-result warning in contrib/pg_upgrade.
This is just neatnik-ism, but since we do it for comparable code in elog.c,
we may as well do it here.
2012-09-05 14:37:22 -04:00
Bruce Momjian
022cd22f0f In pg_upgrade, document why we can't issue \n\n in the command logfile
on Windows.  Slightly clean up log output on Windows given this
restriction.

Backpatch to 9.2.
2012-09-05 00:01:13 -04:00
Andrew Dunstan
2042185baf Fix transcription error. 2012-09-04 09:39:49 -04:00
Andrew Dunstan
0829c7087e Fix command echoing in pg_upgrade's analyze script for Windows. 2012-09-04 05:49:22 -04:00
Andrew Dunstan
2f0c7d5854 Indent fix_path_separator() header properly. 2012-09-03 22:59:19 -04:00