Commit Graph

35860 Commits

Author SHA1 Message Date
Peter Eisentraut
a06af43695 doc: Fix DocBook table column count declaration
This was broken in d6464fdc0a.
2013-12-10 21:47:45 -05:00
Robert Haas
66abc2608c Add a new reloption, user_catalog_table.
When this reloption is set and wal_level=logical is configured,
we'll record the CIDs stamped by inserts, updates, and deletes to
the table just as we would for an actual catalog table.  This will
allow logical decoding to use historical MVCC snapshots to access
such tables just as they access ordinary catalog tables.

Replication solutions built around the logical decoding machinery
will likely need to set this operation for their configuration
tables; it might also be needed by extensions which perform table
access in their output functions.

Andres Freund, reviewed by myself and others.
2013-12-10 19:17:34 -05:00
Robert Haas
e55704d8b2 Add new wal_level, logical, sufficient for logical decoding.
When wal_level=logical, we'll log columns from the old tuple as
configured by the REPLICA IDENTITY facility added in commit
07cacba983.  This makes it possible
a properly-configured logical replication solution to correctly
follow table updates even if they change the chosen key columns,
or, with REPLICA IDENTITY FULL, even if the table has no key at
all.  Note that updates which do not modify the replica identity
column won't log anything extra, making the choice of a good key
(i.e. one that will rarely be changed) important to performance
when wal_level=logical is configured.

Each insert, update, or delete to a catalog table will also log
the CMIN and/or CMAX values of stamped by the current transaction.
This is necessary because logical decoding will require access to
historical snapshots of the catalog in order to decode some data
types, and the CMIN/CMAX values that we may need in order to judge
row visibility may have been overwritten by the time we need them.

Andres Freund, reviewed in various versions by myself, Heikki
Linnakangas, KONDO Mitsumasa, and many others.
2013-12-10 19:01:40 -05:00
Tom Lane
9ec6199d18 Fix possible crash with nested SubLinks.
An expression such as WHERE (... x IN (SELECT ...) ...) IN (SELECT ...)
could produce an invalid plan that results in a crash at execution time,
if the planner attempts to flatten the outer IN into a semi-join.
This happens because convert_testexpr() was not expecting any nested
SubLinks and would wrongly replace any PARAM_SUBLINK Params belonging
to the inner SubLink.  (I think the comment denying that this case could
happen was wrong when written; it's certainly been wrong for quite a long
time, since very early versions of the semijoin flattening logic.)

Per report from Teodor Sigaev.  Back-patch to all supported branches.
2013-12-10 16:10:17 -05:00
Noah Misch
53685d7981 Rename TABLE() to ROWS FROM().
SQL-standard TABLE() is a subset of UNNEST(); they deal with arrays and
other collection types.  This feature, however, deals with set-returning
functions.  Use a different syntax for this feature to keep open the
possibility of implementing the standard TABLE().
2013-12-10 09:34:37 -05:00
Bruce Momjian
01cc1fecfd pgcrypto docs: update cpu type used in duration testing 2013-12-09 16:12:24 -05:00
Bruce Momjian
d6464fdc0a pgcrypto docs: update encryption timings and add relative times
Miles Elam
2013-12-09 16:10:47 -05:00
Robert Haas
d9250da032 Fixups for dsm.c's file descriptor handling.
Per complaint from Tom Lane.
2013-12-09 11:15:19 -05:00
Magnus Hagander
33d3f5594a Fix pg_stat_statements build on 32-bit systems
Peter Geoghegan
2013-12-08 11:59:07 +01:00
Joe Conway
d6ca510d9d Fix performance regression in dblink connection speed.
Previous commit e5de601267 modified dblink
to ensure client encoding matched the server. However the added
PQsetClientEncoding() call added significant overhead. Restore original
performance in the common case where client encoding already matches
server encoding by doing nothing in that case. Applies to all active
branches.

Issue reported and work sponsored by Zonar Systems.
2013-12-07 17:00:26 -08:00
Magnus Hagander
54aa5ef7f2 Fix a couple of typos
Noted by Peter Geoghegan
2013-12-07 23:08:17 +01:00
Peter Eisentraut
3164721462 SSL: Support ECDH key exchange
This sets up ECDH key exchange, when compiling against OpenSSL that
supports EC.  Then the ECDHE-RSA and ECDHE-ECDSA cipher suites can be
used for SSL connections.  The latter one means that EC keys are now
usable.

The reason for EC key exchange is that it's faster than DHE and it
allows to go to higher security levels where RSA will be horribly slow.

There is also new GUC option ssl_ecdh_curve that specifies the curve
name used for ECDH.  It defaults to "prime256v1", which is the most
common curve in use in HTTPS.

From: Marko Kreen <markokr@gmail.com>
Reviewed-by: Adrian Klaver <adrian.klaver@gmail.com>
2013-12-07 15:11:44 -05:00
Fujii Masao
91484409bd Expose qurey ID in pg_stat_statements view.
The query ID is the internal hash identifier of the statement,
and was not available in pg_stat_statements view so far.

Daniel Farina, Sameer Thakur and Peter Geoghegan, reviewed by me.
2013-12-08 02:06:02 +09:00
Peter Eisentraut
ef3267523d SSL: Add configuration option to prefer server cipher order
By default, OpenSSL (and SSL/TLS in general) lets the client cipher
order take priority.  This is OK for browsers where the ciphers were
tuned, but few PostgreSQL client libraries make the cipher order
configurable.  So it makes sense to have the cipher order in
postgresql.conf take priority over client defaults.

This patch adds the setting "ssl_prefer_server_ciphers" that can be
turned on so that server cipher order is preferred.  Per discussion,
this now defaults to on.

From: Marko Kreen <markokr@gmail.com>
Reviewed-by: Adrian Klaver <adrian.klaver@gmail.com>
2013-12-07 08:13:50 -05:00
Bruce Momjian
8fe3d90d34 docs: update partition encryption options
Text from Adam Vande More
2013-12-06 09:47:39 -05:00
Bruce Momjian
fa4add50c4 docs: clarify SSL certificate authority chain docs
Previously, the requirements of how intermediate certificates were
handled and their chain to root certificates was unclear.
2013-12-06 09:42:08 -05:00
Alvaro Herrera
312bde3d40 Fix improper abort during update chain locking
In 247c76a989, I added some code to do fine-grained checking of
MultiXact status of locking/updating transactions when traversing an
update chain.  There was a thinko in that patch which would have the
traversing abort, that is return HeapTupleUpdated, when the other
transaction is a committed lock-only.  In this case we should ignore it
and return success instead.  Of course, in the case where there is a
committed update, HeapTupleUpdated is the correct return value.

A user-visible symptom of this bug is that in REPEATABLE READ and
SERIALIZABLE transaction isolation modes spurious serializability errors
can occur:
  ERROR:  could not serialize access due to concurrent update

In order for this to happen, there needs to be a tuple that's key-share-
locked and also updated, and the update must abort; a subsequent
transaction trying to acquire a new lock on that tuple would abort with
the above error.  The reason is that the initial FOR KEY SHARE is seen
as committed by the new locking transaction, which triggers this bug.
(If the UPDATE commits, then the serialization error is correctly
reported.)

When running a query in READ COMMITTED mode, what happens is that the
locking is aborted by the HeapTupleUpdated return value, then
EvalPlanQual fetches the newest version of the tuple, which is then the
only version that gets locked.  (The second time the tuple is checked
there is no misbehavior on the committed lock-only, because it's not
checked by the code that traverses update chains; so no bug.) Only the
newest version of the tuple is locked, not older ones, but this is
harmless.

The isolation test added by this commit illustrates the desired
behavior, including the proper serialization errors that get thrown.

Backpatch to 9.3.
2013-12-05 17:47:51 -03:00
Tom Lane
74242c23c1 Clear retry flags properly in replacement OpenSSL sock_write function.
Current OpenSSL code includes a BIO_clear_retry_flags() step in the
sock_write() function.  Either we failed to copy the code correctly, or
they added this since we copied it.  In any case, lack of the clear step
appears to be the cause of the server lockup after connection loss reported
in bug #8647 from Valentine Gogichashvili.  Assume that this is correct
coding for all OpenSSL versions, and hence back-patch to all supported
branches.

Diagnosis and patch by Alexander Kukushkin.
2013-12-05 12:48:28 -05:00
Alvaro Herrera
07aeb1fec5 Avoid resetting Xmax when it's a multi with an aborted update
HeapTupleSatisfiesUpdate can very easily "forget" tuple locks while
checking the contents of a multixact and finding it contains an aborted
update, by setting the HEAP_XMAX_INVALID bit.  This would lead to
concurrent transactions not noticing any previous locks held by
transactions that might still be running, and thus being able to acquire
subsequent locks they wouldn't be normally able to acquire.

This bug was introduced in commit 1ce150b7bb; backpatch this fix to 9.3,
like that commit.

This change reverts the change to the delete-abort-savept isolation test
in 1ce150b7bb, because that behavior change was caused by this bug.

Noticed by Andres Freund while investigating a different issue reported
by Noah Misch.
2013-12-05 12:21:55 -03:00
Bruce Momjian
86ef4796f5 build: pass EXTRA_REGRESS_OPTS to secondary regression tests
Christoph Berg
2013-12-04 10:14:45 -05:00
Bruce Momjian
5043fc8251 doc: split long query into multiple lines
Report from Erik Rijkers
2013-12-04 10:03:13 -05:00
Peter Eisentraut
dfd5151c58 Fix whitespace 2013-12-03 22:57:08 -05:00
Heikki Linnakangas
9e857436ef Don't include unused space in LOG_NEWPAGE records.
This is the same trick we use when taking a full page image of a buffer
passed to XLogInsert.
2013-12-04 00:10:47 +02:00
Heikki Linnakangas
22122c83f1 Fix full-page writes of internal GIN pages.
Insertion to a non-leaf GIN page didn't make a full-page image of the page,
which is wrong. The code used to do it correctly, but was changed (commit
853d1c3103) because the redo-routine didn't
track incomplete splits correctly when the page was restored from a full
page image. Of course, that was not right way to fix it, the redo routine
should've been fixed instead. The redo-routine was surreptitiously fixed
in 2010 (commit 4016bdef8a), so all we need
to do now is revert the code that creates the record to its original form.

This doesn't change the format of the WAL record.

Backpatch to all supported versions.
2013-12-03 23:16:01 +02:00
Bruce Momjian
4a8adfd4d0 C comment: again update comment for pg_fe_sendauth for error cases 2013-12-03 11:42:18 -05:00
Bruce Momjian
6a6b7bbb81 Update C comment for pg_fe_getauthname
This function no longer takes an argument.
2013-12-03 11:33:46 -05:00
Bruce Momjian
9e0a97f1c8 libpq: change PQconndefaults() to ignore invalid service files
Previously missing or invalid service files returned NULL.  Also fix
pg_upgrade to report "out of memory" for a null return from
PQconndefaults().

Patch by Steve Singer, rewritten by me
2013-12-03 11:12:25 -05:00
Peter Eisentraut
95e3d50539 doc: Refine documentation about recovery command exist status
Add more documentation about how different exit codes and signals are
handled in each case.

Reviewed-by: Peter Geoghegan <pg@heroku.com>
2013-12-02 22:31:41 -05:00
Peter Eisentraut
fef88b3fda Report exit code from external recovery commands properly
When an external recovery command such as restore_command or
archive_cleanup_command fails, report the exit code properly,
distinguishing signals and normal exists, using the existing
wait_result_to_str() facility, instead of just reporting the return
value from system().

Reviewed-by: Peter Geoghegan <pg@heroku.com>
2013-12-02 22:31:05 -05:00
Tom Lane
7ab321404c Fix crash in assign_collations_walker for EXISTS with empty SELECT list.
We (I think I, actually) forgot about this corner case while coding
collation resolution.  Per bug #8648 from Arjen Nienhuis.
2013-12-02 20:28:45 -05:00
Tom Lane
02bb4bbc66 Update release notes for 9.3.2, 9.2.6, 9.1.11, 9.0.15, 8.4.19. 2013-12-02 15:53:55 -05:00
Bruce Momjian
54916b99f7 doc: update wording of ineffective SET and ABORT commands
Wording by Alvaro Herrera
2013-12-02 12:51:58 -05:00
Tom Lane
b8b7b723f2 Improve draft release notes.
Per suggestions from Andres Freund.  Also fix spelling of
Sergey Burladyan's name.
2013-12-02 12:17:46 -05:00
Tom Lane
7a1e34d371 Increase git_changelog's timestamp_slop from 10 min to 1 day.
Many committers seem to now be using a work flow in which back-patched
commits are timestamped minutes or even hours apart in different branches
(most likely because they commit in one branch before starting work on
the next one).  git_changelog was failing to merge its reports in such
cases, so increase the max time it's willing to merge commits across.
I considered getting rid of the limit altogether, but that produces
some odd results in terms of how the merged commit gets sorted relative
to unrelated commits.
2013-12-02 11:33:49 -05:00
Robert Haas
c6d4b1dd3e Flag mmap implemenation of dynamic shared memory as resize-capable.
Error noted by Heikki Linnakangas
2013-12-02 11:18:54 -05:00
Robert Haas
a8656a3ab0 Make NUM_TOCHAR_prepare and NUM_TOCHAR_finish macros declare "len".
Remove the variable from the enclosing scopes so that nothing can be
relying on it.  The net result of this refactoring is that we get rid
of a few unnecessary strlen() calls.

Original patch from Greg Jaskiewicz, substantially expanded by me.
2013-12-02 10:51:06 -05:00
Robert Haas
9d140f7be2 Avoid out-of-bounds read in errfinish if error_stack_depth < 0.
If errordata_stack_depth < 0, we won't find that out and correct the
problem until CHECK_STACK_DEPTH() is invoked.  In the meantime,
elevel will be set based on an invalid read.  This is probably
harmless in practice, but it seems cleaner this way.

Xi Wang
2013-12-02 10:42:01 -05:00
Peter Eisentraut
3e3520cf7a Translation updates 2013-12-02 00:17:07 -05:00
Tom Lane
23e796de15 Draft release notes for 9.3.2.
I'm putting these up for review before I start to extract the relevant
subsets for the older branches.  It'll be easier to make any suggested
wording improvements at this stage.
2013-12-01 18:46:16 -05:00
Peter Eisentraut
3c81b5c1d2 doc: Disable preface.autolabel in XSLT
The makes the output more consistent with the existing DSSSL setup.
2013-12-01 17:13:23 -05:00
Tom Lane
335470251d Update time zone data files to tzdata release 2013h.
DST law changes in Argentina, Brazil, Jordan, Libya, Liechtenstein,
Morocco, Palestine.  New timezone abbreviations WIB, WIT, WITA for
Indonesia.
2013-12-01 14:11:44 -05:00
Tom Lane
4796035402 Editorial corrections to the October 2013 minor-release notes.
This is mostly to fix incorrect migration instructions: since the preceding
minor releases advised reindexing some GIST indexes, it's important that
we back-link to that advice rather than earlier instances.

Also improve some bug descriptions and fix a few typos.

No back-patch yet; these files will get copied into the back branches
later in the release process.
2013-11-30 16:57:25 -05:00
Bruce Momjian
e7d56aee2d pg_upgrade: Handle default_transaction_read_only settings
Setting default_transaction_read_only=true could prevent pg_upgrade from
completing, so prepend default_transaction_read_only=false to
PGOPTIONS.
2013-11-30 16:50:33 -05:00
Kevin Grittner
4bd371f6f8 Fix pg_dumpall to work for databases flagged as read-only.
pg_dumpall's charter is to be able to recreate a database cluster's
contents in a virgin installation, but it was failing to honor that
contract if the cluster had any ALTER DATABASE SET
default_transaction_read_only settings.  By including a SET command
for the connection for each connection opened by pg_dumpall output,
errors are avoided and the source cluster is successfully
recreated.

There was discussion of whether to also set this for the connection
applying pg_dump output, but it was felt that it was both less
appropriate in that context, and far easier to work around.

Backpatch to all supported branches.
2013-11-30 11:24:56 -06:00
Peter Eisentraut
34fa72ec9c Remove use of obsolescent Autoconf macros
Remove the use of the following macros, which are obsolescent according
to the Autoconf documentation:

- AC_C_CONST
- AC_C_STRINGIZE
- AC_C_VOLATILE
- AC_FUNC_MEMCMP
2013-11-30 09:17:08 -05:00
Peter Eisentraut
1eafea5d1b doc: Simplify handling of variablelists in XSLT build
The previously used custom template is no longer necessary because
parameters provided by the standard style sheet can achieve the same
outcome.
2013-11-29 22:42:47 -05:00
Alvaro Herrera
2393c7d102 Fix a couple of bugs in MultiXactId freezing
Both heap_freeze_tuple() and heap_tuple_needs_freeze() neglected to look
into a multixact to check the members against cutoff_xid.  This means
that a very old Xid could survive hidden within a multi, possibly
outliving its CLOG storage.  In the distant future, this would cause
clog lookup failures:
ERROR:  could not access status of transaction 3883960912
DETAIL:  Could not open file "pg_clog/0E78": No such file or directory.

This mostly was problematic when the updating transaction aborted, since
in that case the row wouldn't get pruned away earlier in vacuum and the
multixact could possibly survive for a long time.  In many cases, data
that is inaccessible for this reason way can be brought back
heuristically.

As a second bug, heap_freeze_tuple() didn't properly handle multixacts
that need to be frozen according to cutoff_multi, but whose updater xid
is still alive.  Instead of preserving the update Xid, it just set Xmax
invalid, which leads to both old and new tuple versions becoming
visible.  This is pretty rare in practice, but a real threat
nonetheless.  Existing corrupted rows, unfortunately, cannot be repaired
in an automated fashion.

Existing physical replicas might have already incorrectly frozen tuples
because of different behavior than in master, which might only become
apparent in the future once pg_multixact/ is truncated; it is
recommended that all clones be rebuilt after upgrading.

Following code analysis caused by bug report by J Smith in message
CADFUPgc5bmtv-yg9znxV-vcfkb+JPRqs7m2OesQXaM_4Z1JpdQ@mail.gmail.com
and privately by F-Secure.

Backpatch to 9.3, where freezing of MultiXactIds was introduced.

Analysis and patch by Andres Freund, with some tweaks by Álvaro.
2013-11-29 21:47:25 -03:00
Alvaro Herrera
1ce150b7bb Don't TransactionIdDidAbort in HeapTupleGetUpdateXid
It is dangerous to do so, because some code expects to be able to see what's
the true Xmax even if it is aborted (particularly while traversing HOT
chains).  So don't do it, and instead rely on the callers to verify for
abortedness, if necessary.

Several race conditions and bugs fixed in the process.  One isolation test
changes the expected output due to these.

This also reverts commit c235a6a589, which is no longer necessary.

Backpatch to 9.3, where this function was introduced.

Andres Freund
2013-11-29 21:47:21 -03:00
Alvaro Herrera
1df0122daa Truncate pg_multixact/'s contents during crash recovery
Commit 9dc842f08 of 8.2 era prevented MultiXact truncation during crash
recovery, because there was no guarantee that enough state had been
setup, and because it wasn't deemed to be a good idea to remove data
during crash recovery anyway.  Since then, due to Hot-Standby, streaming
replication and PITR, the amount of time a cluster can spend doing crash
recovery has increased significantly, to the point that a cluster may
even never come out of it.  This has made not truncating the content of
pg_multixact/ not defensible anymore.

To fix, take care to setup enough state for multixact truncation before
crash recovery starts (easy since checkpoints contain the required
information), and move the current end-of-recovery actions to a new
TrimMultiXact() function, analogous to TrimCLOG().

At some later point, this should probably done similarly to the way
clog.c is doing it, which is to just WAL log truncations, but we can't
do that for the back branches.

Back-patch to 9.0.  8.4 also has the problem, but since there's no hot
standby there, it's much less pressing.  In 9.2 and earlier, this patch
is simpler than in newer branches, because multixact access during
recovery isn't required.  Add appropriate checks to make sure that's not
happening.

Andres Freund
2013-11-29 21:47:15 -03:00
Alvaro Herrera
f54106f77e Fix full-table-vacuum request mechanism for MultiXactIds
While autovacuum dutifully launched anti-multixact-wraparound vacuums
when the multixact "age" was reached, the vacuum code was not aware that
it needed to make them be full table vacuums.  As the resulting
partial-table vacuums aren't capable of actually increasing relminmxid,
autovacuum continued to launch anti-wraparound vacuums that didn't have
the intended effect, until age of relfrozenxid caused the vacuum to
finally be a full table one via vacuum_freeze_table_age.

To fix, introduce logic for multixacts similar to that for plain
TransactionIds, using the same GUCs.

Backpatch to 9.3, where permanent MultiXactIds were introduced.

Andres Freund, some cleanup by Álvaro
2013-11-29 21:47:13 -03:00