We must only set the bit(s) for the strongest lock held in the tuple;
otherwise, a multixact containing members with exclusive lock and
key-share lock will behave as though only a share lock is held.
This bug was introduced in commit 0ac5ad5134, somewhere along
development, when we allowed a singleton FOR SHARE lock to be
implemented without a MultiXact by using a multi-bit pattern.
I overlooked that GetMultiXactIdHintBits() needed to be tweaked as well.
Previously, we could have the bits for FOR KEY SHARE and FOR UPDATE
simultaneously set and it wouldn't cause a problem.
Per report from digoal@126.com
Previous patch to skip checkpoints at end of recovery didn't
correctly perform crash recovery, fumbling the timeline switch.
Now we record the minRecoveryPointTLI of the newly selected
timeline, so that we crash recover to the correct timeline.
Bug report from Fujii Masao, investigated by me.
It's not sensible for an interval that's used as a time zone value to be
larger than a day. When we changed the interval type to contain a separate
day field, check_timezone() was adjusted to reject nonzero day values, but
timetz_izone(), timestamp_izone(), and timestamptz_izone() evidently were
overlooked.
While at it, make the error messages for these three cases consistent.
This ensure the version number increases over time. The first three digits
in the version number is still set to the actual PostgreSQL version
number, but the last one is intended to be an ever increasing build number,
which previosly failed when it changed between 1, 2 and 3 digits long values.
Noted by Deepak
The new option specifies length of aggregation interval (in
seconds). May be used only together with -l. With this option, the log
contains per-interval summary (number of transactions, min/max latency
and two additional fields useful for variance estimation).
Patch contributed by Tomas Vondra, reviewed by Pavel Stehule. Slight
change by Tatsuo Ishii, suggested by Robert Hass to emit an error
message indicating that the option is not currently supported on
Windows.
exec_simple_check_plan and exec_eval_simple_expr attempted to call
GetCachedPlan directly. This meant that if an error was thrown during
planning, the resulting context traceback would not include the line
normally contributed by _SPI_error_callback. This is already inconsistent,
but just to be really odd, a re-execution of the very same expression
*would* show the additional context line, because we'd already have cached
the plan and marked the expression as non-simple.
The problem is easy to demonstrate in 9.2 and HEAD because planning of a
cached plan doesn't occur at all until GetCachedPlan is done. In earlier
versions, it could only be an issue if initial planning had succeeded, then
a replan was forced (already somewhat improbable for a simple expression),
and the replan attempt failed. Since the issue is mainly cosmetic in older
branches anyway, it doesn't seem worth the risk of trying to fix it there.
It is worth fixing in 9.2 since the instability of the context printout can
affect the results of GET STACKED DIAGNOSTICS, as per a recent discussion
on pgsql-novice.
To fix, introduce a SPI function that wraps GetCachedPlan while installing
the correct callback function. Use this instead of calling GetCachedPlan
directly from plpgsql.
Also introduce a wrapper function for extracting a SPI plan's
CachedPlanSource list. This lets us stop including spi_priv.h in
pl_exec.c, which was never a very good idea from a modularity standpoint.
In passing, fix a similar inconsistency that could occur in SPI_cursor_open,
which was also calling GetCachedPlan without setting up a context callback.
Such cases should work, but the grammar failed to accept them because of
our ancient precedence hacks to convince bison that extra parentheses
around a sub-SELECT in an expression are unambiguous. (Formally, they
*are* ambiguous, but we don't especially care whether they're treated as
part of the sub-SELECT or part of the expression. Bison cares, though.)
Fix by adding a redundant-looking production for this case.
This is a fine example of why fixing shift/reduce conflicts via
precedence declarations is more dangerous than it looks: you can easily
cause the parser to reject cases that should work.
This has been wrong since commit 3db4056e22
or maybe before, and apparently some people have been working around it
by inserting no-op casts. That method introduces a dump/reload hazard,
as illustrated in bug #7838 from Jan Mate. Hence, back-patch to all
active branches.
This patch addresses the problem that applications currently have to
extract object names from possibly-localized textual error messages,
if they want to know for example which index caused a UNIQUE_VIOLATION
failure. It adds new error message fields to the wire protocol, which
can carry the name of a table, table column, data type, or constraint
associated with the error. (Since the protocol spec has always instructed
clients to ignore unrecognized field types, this should not create any
compatibility problem.)
Support for providing these new fields has been added to just a limited set
of error reports (mainly, those in the "integrity constraint violation"
SQLSTATE class), but we will doubtless add them to more calls in future.
Pavel Stehule, reviewed and extensively revised by Peter Geoghegan, with
additional hacking by Tom Lane.
Beyond 21474, the number of accounts exceed the range for int4. Change the
initialization code to use bigint for account id columns when scale is large
enough, and switch to using int64s for the variables in pgbench code. The
threshold where we switch to bigints is set at 20000, because that's easier
to remember and document than 21474, and ensures that there is some headroom
when int4s are used.
Greg Smith, with various changes by Euler Taveira de Oliveira, Gurjeet
Singh and Satoshi Nagayasu.
touched any temporary tables.
We could try harder, and keep track of whether we've inserted to any temp
tables, rather than accessed them, and which temp tables have been inserted
to. But this is dead simple, and already covers many interesting scenarios.
pg_ctl promote -m fast will skip the checkpoint at end of recovery so that we
can achieve very fast failover when the apply delay is low. Write new WAL record
XLOG_END_OF_RECOVERY to allow us to switch timeline correctly for downstream log
readers. If we skip synchronous end of recovery checkpoint we request a normal
spread checkpoint so that the window of re-recovery is low.
Simon Riggs and Kyotaro Horiguchi, with input from Fujii Masao.
Review by Heikki Linnakangas
Give away ownership of shared objects (databases, tablespaces) along
with local objects, per original code intention. Try to make the
documentation clearer, too.
Per discussion about DROP OWNED's brokenness, in bug #7748.
This is not backpatched because it'd require some refactoring of the
ALTER/SET OWNER code for databases and tablespaces.
My "fix" for bugs #7578 and #6116 on DROP OWNED at fe3b5eb08a not only
misstated that it applied to REASSIGN OWNED (which it did not affect),
but it also failed to fix the problems fully, because I didn't test the
case of owned shared objects. Thus I created a new bug, reported by
Thomas Kellerer as #7748, which would cause DROP OWNED to fail with a
not-for-user-consumption error message. The code would attempt to drop
the database, which not only fails to work because the underlying code
does not support that, but is a pretty dangerous and undesirable thing
to be doing as well.
This patch fixes that bug by having DROP OWNED only attempt to process
shared objects when grants on them are found, ignoring ownership.
Backpatch to 8.3, which is as far as the previous bug was backpatched.
If a PL/Python function raises an SPIError (or one if its subclasses)
directly with python's raise statement, treat it the same as an SPIError
generated internally. In particular, if the user sets the sqlstate
attribute, preserve that.
Oskari Saarenmaa and Jan Urbański, reviewed by Karl O. Pinc.
The SQL standard does not have general functions-in-FROM, but it does
allow UNNEST() there (see the <collection derived table> production),
and the semantics of that are defined to include lateral references.
So spec compliance requires allowing lateral references within UNNEST()
even without an explicit LATERAL keyword. Rather than making UNNEST()
a special case, it seems best to extend this flexibility to any
function-in-FROM. We'll still allow LATERAL to be written explicitly
for clarity's sake, but it's now a noise word in this context.
In theory this change could result in a change in behavior of existing
queries, by allowing what had been an outer reference in a function-in-FROM
to be captured by an earlier FROM-item at the same level. However, all
pre-9.3 PG releases have a bug that causes them to match variable
references to earlier FROM-items in preference to outer references (and
then throw an error). So no previously-working query could contain the
type of ambiguity that would risk a change of behavior.
Per a suggestion from Andrew Gierth, though I didn't use his patch.
DROP IF EXISTS with a missing schema in commit
7e2322dff3 applies not only to tables, but
to DROP IF EXISTS with missing schemas for indexes, views, sequences,
and foreign tables. Yeah!
Previously non-honored FREEZE mode was ignored. This also issues an
appropriate error message based on the cause of the failure, per
suggestion from Tom. Additional regression test case added.
Previously, CREATE TABLE IF EXIST threw an error if the schema was
nonexistent. This was done by passing 'missing_ok' to the function that
looks up the schema oid.
plpython tried to use a single cache entry for a trigger function, but it
needs a separate cache entry for each table the trigger is applied to,
because there is table-dependent data in there. This was done correctly
before 9.1, but commit 46211da1b8 broke it
by simplifying the lookup key from "function OID and triggered table OID"
to "function OID and is-trigger boolean". Go back to using both OIDs
as the lookup key. Per bug report from Sandro Santilli.
Andres Freund
In the initial implementation of plan caching, we saved the active
search_path when a plan was first cached, then reinstalled that path
anytime we needed to reparse or replan. The idea of that was to try to
reselect the same referenced objects, in somewhat the same way that views
continue to refer to the same objects in the face of schema or name
changes. Of course, that analogy doesn't bear close inspection, since
holding the search_path fixed doesn't cope with object drops or renames.
Moreover sticking with the old path seems to create more surprises than
it avoids. So instead of doing that, consider that the cached plan depends
on search_path, and force reparse/replan if the active search_path is
different than it was when we last saved the plan.
This gets us fairly close to having "transparency" of plan caching, in the
sense that the cached statement acts the same as if you'd just resubmitted
the original query text for another execution. There are still some corner
cases where this fails though: a new object added in the search path
schema(s) might capture a reference in the query text, but we'd not realize
that and force a reparse. We might try to fix that in the future, but for
the moment it looks too expensive and complicated.
When descending the tree for an insert, and there are multiple equally good
pages we could insert to, make the choice in random. Previously, we would
always choose the tuple with lowest offset number. That meant that when two
non-leaf pages overlap - in the extreme case they might have exactly the same
key - all but the first such page went unused. That wasn't optimal for space
usage; if you deleted some tuples from the non-first pages, the space would
never be reused.
With this patch, the other pages are sometimes chosen too, although there's
still a heavy bias towards low-offset tuples, so that we don't lose cache
locality when doing a lot of inserts with similar keys.
Original idea by Alexander Korotkov, although this patch version was written
by me and copy-edited by Tom Lane.
Previously, the VARIADIC labeling was effectively ignored, but now these
functions act as though the array elements had all been given as separate
arguments.
Pavel Stehule
Since 9.0, the count parameter has only limited the number of tuples
actually returned by the executor. It doesn't affect the behavior of
INSERT/UPDATE/DELETE unless RETURNING is specified, because without
RETURNING, the ModifyTable plan node doesn't return control to execMain.c
for each tuple. And we only check the limit at the top level.
While this behavioral change was unintentional at the time, discussion of
bug #6572 led us to the conclusion that we prefer the new behavior anyway,
and so we should just adjust the docs to match rather than change the code.
Accordingly, do that. Back-patch as far as 9.0 so that the docs match the
code in each branch.
This ensures that mapping of non-ascii prompts
to the correct code page occurs.
Bug report and original patch from Alexander Law,
reviewed and reworked by Noah Misch.
Backpatch to all live branches.
Tuples marked SELECT FOR UPDATE in a cluster that's later processed by
pg_upgrade would have a different infomask bit pattern than those
produced by 9.3dev; that bit pattern was being seen as "dead" by HEAD
(because they would fail the "is this tuple locked" test, and so the
visibility rules would thing they're updated, even though there's no
HEAP_UPDATED version of them). In other words, some rows could silently
disappear after pg_upgrade.
With this new definition, those tuples become visible again.
This is breakage resulting from my commit 0ac5ad5134.
This makes 9.3 -> 9.3 upgrades work when they cross the commit that
added persistent multixacts; early 9.3 pg_controldata did not have the
required oldestMultiXact line, and so would fail to upgrade.
per Bruce Momjian
The machinery around XLOG_HEAP2_CLEANUP_INFO failed
to correctly pass through the necessary information
on latestRemovedXid, avoiding cancellations in some
infrequent concurrent update/cleanup scenarios.
Backpatchable fix to 9.0
Detailed bug report and fix by Noah Misch,
backpatchable version by me.
When pg_upgrade can't find required pg_controldata information, report
_which_ cluster is failing, with this message:
The %s cluster lacks some required control information: