Commit 1b86c81d2d fixed the decoding of toasted columns for the rows
contained in one xl_heap_multi_insert record. But that's not actually
enough, because heap_multi_insert() will actually first toast all
passed in rows and then emit several *_multi_insert records; one for
each page it fills with tuples.
Add a XLOG_HEAP_LAST_MULTI_INSERT flag which is set in
xl_heap_multi_insert->flag denoting that this multi_insert record is
the last emitted by one heap_multi_insert() call. Then use that flag
in decode.c to only set clear_toast_afterwards in the right situation.
Expand the number of rows inserted via COPY in the corresponding
regression test to make sure that more than one heap page is filled
with tuples by one heap_multi_insert() call.
Backpatch to 9.4 like the previous commit.
This command provides an automated way to create foreign table definitions
that match remote tables, thereby reducing tedium and chances for error.
In this patch, we provide the necessary core-server infrastructure and
implement the feature fully in the postgres_fdw foreign-data wrapper.
Other wrappers will throw a "feature not supported" error until/unless
they are updated.
Ronan Dunklau and Michael Paquier, additional work by me
Previously, when calculations on the need for toast tables changed,
pg_upgrade could not handle cases where the new cluster needed a TOAST
table and the old cluster did not. (It already handled the opposite
case.) This fixes the "OID mismatch" error typically generated in this
case.
Backpatch through 9.2
When decoding the results of a HEAP2_MULTI_INSERT (currently only
generated by COPY FROM) toast columns for all but the last tuple
weren't replaced by their actual contents before being handed to the
output plugin. The reassembled toast datums where disregarded after
every REORDER_BUFFER_CHANGE_(INSERT|UPDATE|DELETE) which is correct
for plain inserts, updates, deletes, but not multi inserts - there we
generate several REORDER_BUFFER_CHANGE_INSERTs for a single
xl_heap_multi_insert record.
To solve the problem add a clear_toast_afterwards boolean to
ReorderBufferChange's union member that's used by modifications. All
row changes but multi_inserts always set that to true, but
multi_insert sets it only for the last change generated.
Add a regression test covering decoding of multi_inserts - there was
none at all before.
Backpatch to 9.4 where logical decoding was introduced.
Bug found by Petr Jelinek.
The isxdigit() calls relied on undefined behavior. The isascii() call
was well-defined, but our prevailing style is to include the cast.
Back-patch to 9.4, where the isxdigit() calls were introduced.
Historically these database properties could be manipulated only by
manually updating pg_database, which is error-prone and only possible for
superusers. But there seems no good reason not to allow database owners to
set them for their databases, so invent CREATE/ALTER DATABASE options to do
that. Adjust a couple of places that were doing it the hard way to use the
commands instead.
Vik Fearing, reviewed by Pavel Stehule
The output buffer size in unaccent_lexize() was calculated as input string
length times pg_database_encoding_max_length(), which effectively assumes
that replacement strings aren't more than one character. While that was
all that we previously documented it to support, the code actually has
always allowed replacement strings of arbitrary length; so if you tried
to make use of longer strings, you were at risk of buffer overrun. To fix,
use an expansible StringInfo buffer instead of trying to determine the
maximum space needed a-priori.
This would be a security issue if unaccent rules files could be installed
by unprivileged users; but fortunately they can't, so in the back branches
the problem can be labeled as improper configuration by a superuser.
Nonetheless, a memory stomp isn't a nice way of reacting to improper
configuration, so let's back-patch the fix.
We were already issuing a WARNING, albeit only elog not ereport, for
duplicate source strings; so warning rather than just being stoically
silent seems like the best thing to do here. Arguably both of these
complaints should be upgraded to ERRORs, but that might be more
behavioral change than people want.
Note: the faulty line is already printed via an errcontext hook,
so there's no need for more information than these messages provide.
This could be useful in languages where diacritic signs are represented as
separate characters; more generally it supports using unaccent dictionaries
for substring substitutions beyond narrowly conceived "diacritic removal".
In any case, since the rule-file parser doesn't complain about
multi-character source strings, it behooves us to do something unsurprising
with them.
This is useful in languages where diacritic signs are represented as
separate characters; it's also one step towards letting unaccent be used
for arbitrary substring substitutions.
In passing, improve the user documentation for unaccent, which was sadly
vague about some important details.
Mohammad Alhashash, reviewed by Abhijit Menon-Sen
There were some C comments that hadn't been updated from the switch of
using only pg_dumpall to using pg_dump and pg_dumpall, so update them.
Also, don't bother using --schema-only for pg_dumpall --globals-only.
Backpatch through 9.4
This function continued to use it after heap_endscan() freed it. In
passing, don't explicit create a strategy here. Instead, use the one
created by heap_beginscan_strat(), if any. Back-patch to 9.2, where use
of a BufferAccessStrategy here was introduced.
This fixes a bug that caused vacuum to fail when the '0000' files left
by initdb were accessed as part of vacuum's cleanup of old pg_multixact
files.
Backpatch through 9.3
dblink uses a short-lived data conversion memory context. However it
was not deleted when no longer needed, leading to a noticeable memory
leak under some circumstances. Plug the hole, along with minor
refactoring. Backpatch to 9.2 where the leak was introduced.
Report and initial patch by MauMau. Reviewed/modified slightly by
Tom Lane and me.
Give passwords to each user created in support of an ECPG connection
test case. Use SET SESSION AUTHORIZATION, not a fresh connection, to
reduce privileges during a dblink test case.
To test against such a server, both the "make installcheck-world"
environment and the postmaster environment must provide the default
user's password; $PGPASSFILE is the principal way to do so. (The
postmaster environment needs it for dblink and postgres_fdw tests.)
Arrange for postmaster child processes to respond to two environment
variables, PG_OOM_ADJUST_FILE and PG_OOM_ADJUST_VALUE, to determine whether
they reset their OOM score adjustments and if so to what. This is superior
to the previous design involving #ifdef's in several ways. The behavior is
now available in a default build, and both ends of the adjustment --- the
original adjustment of the postmaster's level and the subsequent
readjustment by child processes --- can now be controlled in one place,
namely the postmaster launch script. So it's no longer necessary for the
launch script to act on faith that the server was compiled with the
appropriate options. In addition, if someone wants to use an OOM score
other than zero for the child processes, that doesn't take a recompile
anymore; and we no longer have to cater separately to the two different
historical kernel APIs for this adjustment.
Gurjeet Singh, somewhat revised by me
This SQL-standard feature allows a sub-SELECT yielding multiple columns
(but only one row) to be used to compute the new values of several columns
to be updated. While the same results can be had with an independent
sub-SELECT per column, such a workaround can require a great deal of
duplicated computation.
The standard actually says that the source for a multi-column assignment
could be any row-valued expression. The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too. However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs. For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".
Since most of the system thinks AND and OR are N-argument expressions
anyway, let's have the grammar generate a representation of that form when
dealing with input like "x AND y AND z AND ...", rather than generating
a deeply-nested binary tree that just has to be flattened later by the
planner. This avoids stack overflow in parse analysis when dealing with
queries having more than a few thousand such clauses; and in any case it
removes some rather unsightly inconsistencies, since some parts of parse
analysis were generating N-argument ANDs/ORs already.
It's still possible to get a stack overflow with weirdly parenthesized
input, such as "x AND (y AND (z AND ( ... )))", but such cases are not
mainstream usage. The maximum depth of parenthesization is already
limited by Bison's stack in such cases, anyway, so that the limit is
probably fairly platform-independent.
Patch originally by Gurjeet Singh, heavily revised by me
Any OS user able to access the socket can connect as the bootstrap
superuser and proceed to execute arbitrary code as the OS user running
the test. Protect against that by placing the socket in a temporary,
mode-0700 subdirectory of /tmp. The pg_regress-based test suites and
the pg_upgrade test suite were vulnerable; the $(prove_check)-based test
suites were already secure. Back-patch to 8.4 (all supported versions).
The hazard remains wherever the temporary cluster accepts TCP
connections, notably on Windows.
As a convenient side effect, this lets testing proceed smoothly in
builds that override DEFAULT_PGSOCKET_DIR. Popular non-default values
like /var/run/postgresql are often unwritable to the build user.
Security: CVE-2014-0067
187492b6c2 changed pgstat.c so that
the stats files were saved into $PGDATA/pg_stat directory when the server
was shutdowned. But it accidentally forgot to change the location of
pg_stat_statements permanent stats file. This commit fixes pg_stat_statements
so that its stats file is also saved into $PGDATA/pg_stat at shutdown.
Since this fix changes the file layout, we don't back-patch it to 9.3
where this oversight was introduced.
The original coding in contrib/uuid-ossp created and destroyed a uuid_t
object (or, in some cases, even two of them) each time it was called.
This is not the intended usage: you're supposed to keep the uuid_t object
around so that the library can cache its state across uses. (Other UUID
libraries seem to keep equivalent state behind-the-scenes in static
variables, but OSSP chose differently.) Aside from being quite inefficient,
creating a new uuid_t loses knowledge of the previously generated UUID,
which in theory could result in duplicate V1-style UUIDs being created
on sufficiently fast machines.
On at least some platforms, creating a new uuid_t also draws some entropy
from /dev/urandom, leaving less for the rest of the system. This seems
sufficiently unpleasant to justify back-patching this change.
The previous version of these tests expected uuid_generate_v1() to always
emit MAC addresses with the local-admin and multicast address bits zero.
However, several of the buildfarm critters are reporting values with the
local-admin bit set. (Perhaps they're running inside VMs or jails.)
And a couple are reporting values with the multicast bit set, probably
meaning that the UUID library couldn't read the system MAC address.
Also, it emerges that if OSSP UUID can't read the system MAC address, it
falls back to V1MC behavior wherein the whole node field gets randomized
each time, breaking the test that expected the node field to remain stable
in V1 output. (It looks like e2fs doesn't behave that way, though.)
It's not entirely clear why we can't get a system MAC address, since the
buildfarm scripts would not work without internet access. Nonetheless,
the regression tests had better cope with the case, so adjust the tests
to expect these behaviors.
This reverts commit 45b7abe59e.
It turns out that the %name-prefix syntax without "=" does not work
at all in pre-2.4 Bison. We are not prepared to make such a large
jump in minimum required Bison version just to suppress a warning
message in a version hardly any developers are using yet.
When 3.0 gets more popular, we'll figure out a way to deal with this.
In the meantime, BISONFLAGS=-Wno-deprecated is recommendable for
anyone using 3.0 who doesn't want to see the warning.
%name-prefix doesn't use an "=" sign according to the Bison docs, but it
silently accepted one anyway, until Bison 3.0. This was originally a
typo of mine in commit 012abebab1, and we
seem to have slavishly copied the error into all the other grammar files.
Per report from Vik Fearing; analysis by Peter Eisentraut.
Back-patch to all active branches, since somebody might try to build
a back branch with up-to-date tools.
On reflection, the timestamp-advances test might fail if we're unlucky
enough for the time_mid field to change between two calls, since uuid_cmp
is just bytewise comparison and the field ordering has more significant
fields later. Build some field extraction functions so we can do a more
honest test of that. Also check that the version and reserved fields
contain what they should.
The V5 (SHA1 hashing) code wrote 20 bytes into a 16-byte local variable.
This had accidentally failed to fail in my testing and Matteo's, but
buildfarm results exposed the problem.
Allow the contrib/uuid-ossp extension to be built atop any one of these
three popular UUID libraries. (The extension's name is now arguably a
misnomer, but we'll keep it the same so as not to cause unnecessary
compatibility issues for users.)
We would not normally consider a change like this post-beta1, but the issue
has been forced by our upgrade to autoconf 2.69, whose more rigorous header
checks are causing OSSP's header files to be rejected on some platforms.
It's been foreseen for some time that we'd have to move away from depending
on OSSP UUID due to lack of upstream maintenance, so this is a down payment
on that problem.
While at it, add some simple regression tests, in hopes of catching any
major incompatibilities between the three implementations.
Matteo Beccati, with some further hacking by me
Commit 090d0f2050 added new code showing
how it can be useful to set bgw_notify_pid to a non-zero value, but it
failed to make sure that the existing call to RegisterBackgroundWorker
initialized the new field at all.
Report and patch by Shigeru Hanada.
On Mingw, it seems that scanf() doesn't necessarily accept the same format
codes that printf() does, and in particular it may fail to recognize %llu
even though printf() does. Since configure only probes printf() behavior
while setting up the INT64_FORMAT macros, this means it's unsafe to use
those macros with scanf(). We had only one instance of such a coding
pattern, in contrib/pg_stat_statements, so change that code to avoid
the problem.
Per buildfarm warnings. Back-patch to 9.0 where the troublesome code
was introduced.
Michael Paquier
Change the total-transactions counters from int32 to int64 to accommodate
cases where we do more than 2^31 transactions during a run. This patch
does not change the INT_MAX limit on explicit "-t" parameters, but it
does allow the product of the -t and -c parameters to exceed INT_MAX, or
allow a -T limit that is large enough that more than 2^31 transactions
can be completed. While pgbench did not actually fail in such cases,
it did print an incorrect total-transactions count, and some of the
derived numbers such as TPS would have been wrong as well.
Tomas Vondra
C89 says that compound initializers may only contain constant expressions;
a restriction violated by commit 89d00cbe. While we've had no actual field
complaints about this, C89 is still the project standard, and it's not
saving all that much code to break compatibility here. So let's adhere to
the old restriction.
In passing, replace a bunch of hardwired constants "256" with
sizeof(target-variable), just because the latter is more readable and
less breakable. And const-ify where possible.
Back-patch to 9.3 where the nonportable code was added.
Andres Freund and Tom Lane
gbt_macad_union also allocated 12-byte structs where we really need 16.
Per report from Andres Freund. No back-patch since there's no current
risk of a real problem.
The macaddr opclass stores two macaddr structs (each of size 6) in an
index column that's declared as being of type gbtreekey16, ie 16 bytes.
In the original coding this led to passing a palloc'd value of size 12
to the index insertion code, so that data would be fetched past the
end of the allocated value during index tuple construction. This makes
valgrind unhappy. In principle it could result in a SIGSEGV, though
with the current implementation of palloc there's no risk since
the 12-byte request size would be rounded up to 16 bytes anyway.
To fix, add a field to struct gbtree_ninfo showing the declared size of
the index datums, and use that in the palloc requests; and use palloc0
to be sure that any wasted bytes are cleanly initialized.
Per report from Andres Freund. No back-patch since there's no current
risk of a real problem.
pg_stat_replication shows connected replication clients. The ddl test case
never has any replication clients connected, so querying pg_stat_replication
is pointless. To check that a slot has been dropped correctly, query
pg_replication_slots instead.
Andres Freund
The code expands a varbit gist leaf key to a node key by copying the bit
data twice in a varlen datum, as both the lower and upper key. The lower key
was expanded to INTALIGN size, but the padding bytes were not initialized.
That's a problem because when the lower/upper keys are compared, the padding
bytes are used compared too, when the values are otherwise equal. That could
lead to incorrect query results.
REINDEX is advised for any btree_gist indexes on bit or bit varying data
type, to fix any garbage padding bytes on disk.
Per Valgrind, reported by Andres Freund. Backpatch to all supported
versions.