Previous code had old/new prefixes on option values, e.g.
--old-datadir=OLDDATADIR. Remove them, for simplicity; now:
--old-datadir=DATADIR. Also update docs to do the same.
On immediate shutdown, or during a restart-after-crash sequence,
postmaster used to send SIGQUIT (and then abandon ship if shutdown); but
this is not a good strategy if backends don't die because of that
signal. (This might happen, for example, if a backend gets tangled
trying to malloc() due to gettext(), as in an example illustrated by
MauMau.) This causes problems when later trying to restart the server,
because some processes are still attached to the shared memory segment.
Instead of just abandoning such backends to their fates, we now have
postmaster hang around for a little while longer, send a SIGKILL after
some reasonable waiting period, and then exit. This makes immediate
shutdown more reliable.
There is disagreement on whether it's best for postmaster to exit after
sending SIGKILL, or to stick around until all children have reported
death. If this controversy is resolved differently than what this patch
implements, it's an easy change to make.
Bug reported by MauMau in message 20DAEA8949EC4E2289C6E8E58560DEC0@maumau
MauMau and Álvaro Herrera
Change -u (user) option to -U, for consistency with other tools like
pg_dump and psql. Also expand --user to --username, again for
consistency.
BACKWARD INCOMPATIBILITY
This results in a slightly less specific error message when OVER
is used in a context where we don't accept window functions, but
per discussion, it's worth it to get the benefit of not needing
to reserve this keyword any more. This same refactoring will
also let us avoid reserving some other keywords that we expect
to add in upcoming patches (specifically, IGNORE, RESPECT, and
FILTER).
Troels Nielsen, with minor changes by me
In some cases, the use of these macros may be preferable to Assert()
or AssertMacro(), since this way the caller can set the trap message.
Andres Freund and Robert Haas
On many platforms the OS will round the sleep time to millisecond
resolution, but there is no reason for us to pre-emptively round the
argument to pg_usleep.
When the delay was measured in milliseconds and started from 1 ms, it
sometimes took many attempts until the logic that increases the delay by
multiplying with a random value between 1 and 2 actually managed to bump it
from 1 ms to 2 ms. That lead to a sequence of 1 ms waits until the delay
started to increase. This wasn't really a problem but it looked odd if you
observed the waits. There is no measurable difference in performance, but
it's more readable this way.
Jeff Janes
The MaxAllocSize guard is convenient for most callers, because it
reduces the need for careful attention to overflow, data type selection,
and the SET_VARSIZE() limit. A handful of callers are happy to navigate
those hazards in exchange for the ability to allocate a larger chunk.
Introduce MemoryContextAllocHuge() and repalloc_huge(). Use this in
tuplesort.c and tuplestore.c, enabling internal sorts of up to INT_MAX
tuples, a factor-of-48 increase. In particular, B-tree index builds can
now benefit from much-larger maintenance_work_mem settings.
Reviewed by Stephen Frost, Simon Riggs and Jeff Janes.
When there's a comment on an index that was created with UNIQUE or PRIMARY
KEY constraint syntax, we need to label the comment as depending on the
constraint not the index, since only the constraint object actually appears
in the dump. This incorrect dependency can lead to parallel pg_restore
trying to restore the comment before the index has been created, per bug
#8257 from Lloyd Albin.
This patch fixes pg_dump to produce the right dependency in dumps made
in the future. Usually we also try to hack pg_restore to work around
bogus dependencies, so that existing (wrong) dumps can still be restored in
parallel mode; but that doesn't seem practical here since there's no easy
way to relate the constraint dump entry to the comment after the fact.
Andres Freund
On Unix-ish platforms, EWOULDBLOCK may be the same as EAGAIN, which is
*not* a success return, at least not on Linux. We need to treat it as a
failure to avoid giving a misleading error message. Per the Single Unix
Spec, only EINPROGRESS and EINTR returns indicate that the connection
attempt is in progress.
On Windows, on the other hand, EWOULDBLOCK (WSAEWOULDBLOCK) is the expected
case. We must accept EINPROGRESS as well because Cygwin will return that,
and it doesn't seem worth distinguishing Cygwin from native Windows here.
It's not very clear whether EINTR can occur on Windows, but let's leave
that part of the logic alone in the absence of concrete trouble reports.
Also, remove the test for errno == 0, effectively reverting commit
da9501bddb, which AFAICS was just a thinko;
or at best it might have been a workaround for a platform-specific bug,
which we can hope is gone now thirteen years later. In any case, since
libpq makes no effort to reset errno to zero before calling connect(),
it seems unlikely that that test has ever reliably done anything useful.
Andres Freund and Tom Lane
Adjust the wording in the first para of "Sequence Manipulation Functions"
so that neither of the link phrases in it break across line boundaries,
in either A4- or US-page-size PDF output. This fixes a reported build
failure for the 9.3beta2 A4 PDF docs, and future-proofs this particular
para against causing similar problems in future. (Perhaps somebody will
fix this issue in the SGML/TeX documentation tool chain someday, but I'm
not holding my breath.)
Back-patch to all supported branches, since the same problem could rise up
to bite us in future updates if anyone changes anything earlier than this
in func.sgml.
Valgrind "client requests" in aset.c and mcxt.c teach Valgrind and its
Memcheck tool about the PostgreSQL allocator. This makes Valgrind
roughly as sensitive to memory errors involving palloc chunks as it is
to memory errors involving malloc chunks. Further client requests in
PageAddItem() and printtup() verify that all bits being added to a
buffer page or furnished to an output function are predictably-defined.
Those tests catch failures of C-language functions to fully initialize
the bits of a Datum, which in turn stymie optimizations that rely on
_equalConst(). Define the USE_VALGRIND symbol in pg_config_manual.h to
enable these additions. An included "suppression file" silences nominal
errors we don't plan to fix.
Reviewed in earlier versions by Peter Geoghegan and Korry Douglas.
Move some repeated debugging code into functions and store intermediates
in variables where not presently necessary. No code-generation changes
in a production build, and no functional changes. This simplifies and
focuses the main patch.
Every other core buffer page consumer initializes the bytes it furnishes
to PageAddItem(). For consistency, do the same here. No back-patch;
regardless, we couldn't count on the fix so long as binary upgrade can
carry forward affected index builds.
GNU gettext selects a default encoding for the messages it emits in a
platform-specific manner; it uses the Windows ANSI code page on Windows
and follows LC_CTYPE on other platforms. This is inconvenient for
PostgreSQL server processes, so realize consistent cross-platform
behavior by calling bind_textdomain_codeset() on Windows each time we
permanently change LC_CTYPE. This primarily affects SQL_ASCII databases
and processes like the postmaster that do not attach to a database,
making their behavior consistent with PostgreSQL on non-Windows
platforms. Messages from SQL_ASCII databases use the encoding implied
by the database LC_CTYPE, and messages from non-database processes use
LC_CTYPE from the postmaster system environment. PlatformEncoding
becomes unused, so remove it.
Make write_console() prefer WriteConsoleW() to write() regardless of the
encodings in use. In this situation, write() will invariably mishandle
non-ASCII characters.
elog.c has assumed that messages conform to the database encoding.
While usually true, this does not hold for SQL_ASCII and MULE_INTERNAL.
Introduce MessageEncoding to track the actual encoding of message text.
The present consumers are Windows-specific code for converting messages
to UTF16 for use in system interfaces. This fixes the appearance in
Windows event logs and consoles of translated messages from SQL_ASCII
processes like the postmaster. Note that SQL_ASCII inherently disclaims
a strong notion of encoding, so non-ASCII byte sequences interpolated
into messages by %s may yet yield a nonsensical message. MULE_INTERNAL
has similar problems at present, albeit for a different reason: its lack
of libiconv support or a conversion to UTF8.
Consequently, one need no longer restart Windows with a different
Windows ANSI code page to broadly test backend logging under a given
language. Changing the user's locale ("Format") is enough. Several
accounts can simultaneously run postmasters under different locales, all
correctly logging localized messages to Windows event logs and consoles.
Alexander Law and Noah Misch
Clang 3.3 correctly complains that a variable of type enum
MultiXactStatus cannot hold a value of -1, which makes sense. Change
the declared type of the variable to int instead, and apply casting as
necessary to avoid the warning.
Per notice from Andres Freund
In binary upgrade mode, we need to recreate and then drop dropped
columns so that all the columns get the right attribute number. This is
true for foreign tables as well as for native tables. For foreign
tables we have been getting the first part right but not the second,
leading to bogus columns in the upgraded database. Fix this all the way
back to 9.1, where foreign tables were introduced.
In replication, when we shutdown the master, walsender tries to send
all the outstanding WAL records to the standby, and then to exit. This
basically means that all the WAL records are fully synced between
two servers after the clean shutdown of the master. So, after
promoting the standby to new master, we can restart the stopped
master as new standby without the need for a fresh backup from
new master.
But there was one problem so far: though walsender tries to send all
the outstanding WAL records, it doesn't wait for them to be replicated
to the standby. Then, before receiving all the WAL records,
walreceiver can detect the closure of connection and exit. We cannot
guarantee that there is no missing WAL in the standby after clean
shutdown of the master. In this case, backup from new master is
required when restarting the stopped master as new standby.
This patch fixes this problem. It just changes walsender so that it
waits for all the outstanding WAL records to be replicated to the
standby before closing the replication connection.
Per discussion, this is a fix that needs to get backpatched rather than
new feature. So, back-patch to 9.1 where enough infrastructure for
this exists.
Patch by me, reviewed by Andres Freund.
Follow-up to commit 873ab97219, in which
I noted that WaitLatch was a better solution in the commit log message,
but neglected to add any documentation in the code.
In some cases with higher numbers of subtransactions
it was possible for us to incorrectly initialize
subtrans leading to complaints of missing pages.
Bug report by Sergey Konoplev
Analysis and fix by Andres Freund
If there is no <date> element, the publication date for the EPUB
manifest is taken from the copyright year. But something like
"1996-2013" is not a legal date specification. So the EPUB output
currently fails epubcheck.
Put in a separate <date> element with the current year. Put it in
legal.sgml, because copyright.pl already instructs to update that
manually, so it hopefully won't be missed.
Most of the documentation uses "single-user mode", so use that in the
code as well. Adjust the documentation to match the new error message
wording. Also add a documentation index entry for "single-user mode".
Based-on-patch-by: Jeff Janes <jeff.janes@gmail.com>
In Danish collations, there are letter combinations which sort
higher than 'Z'. A test for values > 'WA' was picking up rows
where the value started with 'AA', causing the test to fail.
Backpatch to 9.2, where the failing test was added.
Per report from Svenne Krap and analysis by Jeff Janes
ALTER TABLE .. VALIDATE CONSTRAINT previously
gave incorrect details about lock levels and
therefore incomplete reasons to use the option.
Initial bug report and fix from Marko Tiikkaja
Reworded by me to include comments by Kevin Grittner
MarkBufferDirtyHint() writes WAL, and should know if it's got a
standard buffer or not. Currently, the only callers where buffer_std
is false are related to the FSM.
In passing, rename XLOG_HINT to XLOG_FPI, which is more descriptive.
Back-patch to 9.3.
This avoids platform-dependent behavior wherein pg_sleep() might fail to be
interrupted by statement timeout, query cancel, SIGTERM, etc. Also, since
there's no reason to wake up once a second any more, we can reduce the
power consumption of a sleeping backend a tad.
Back-patch to 9.3, since use of SA_RESTART for SIGALRM makes this a bigger
issue than it used to be.
The exclusion of SIGALRM dates back to Berkeley days, when Postgres used
SIGALRM in only one very short stretch of code. Nowadays, allowing it to
interrupt kernel calls doesn't seem like a very good idea, since its use
for statement_timeout means SIGALRM could occur anyplace in the code, and
there are far too many call sites where we aren't prepared to deal with
EINTR failures. When third-party code is taken into consideration, it
seems impossible that we ever could be fully EINTR-proof, so better to
use SA_RESTART always and deal with the implications of that. One such
implication is that we should not assume pg_usleep() will be terminated
early by a signal. Therefore, long sleeps should probably be replaced
by WaitLatch operations where practical.
Back-patch to 9.3 so we can get some beta testing on this change.