If we're skipping past a certain TID, avoid decoding posting list segments
that only contain smaller TIDs.
Extracted from Alexander Korotkov's fast scan patch, heavily modified.
In a multi-key search, ie. something like "col @> 'foo' AND col @> 'bar'",
as soon as we find the next item that matches the first criteria, we don't
need to check the second criteria for TIDs smaller the first match. That
saves a lot of effort, especially if one of the terms is rare, while the
second occurs very frequently.
Based on ideas from Alexander Korotkov's fast scan patch.
This patch adds an option, huge_tlb_pages, which allows requesting the
shared memory segment to be allocated using huge pages, by using the
MAP_HUGETLB flag in mmap(). This can improve performance.
The default is 'try', which means that we will attempt using huge pages,
and fall back to non-huge pages if it doesn't work. Currently, only Linux
has MAP_HUGETLB. On other platforms, the default 'try' behaves the same as
'off'.
In the passing, don't try to round the mmap() size to a multiple of
pagesize. mmap() doesn't require that, and there's no particular reason for
PostgreSQL to do that either. When using MAP_HUGETLB, however, round the
request size up to nearest 2MB boundary. This is to work around a bug in
some Linux kernel versions, but also to avoid wasting memory, because the
kernel will round the size up anyway.
Many people were involved in writing this patch, including Christian Kruse,
Richard Poole, Abhijit Menon-Sen, reviewed by Peter Geoghegan, Andres Freund
and me.
json_build_array() and json_build_object allow for the construction of
arbitrarily complex json trees. json_object() turns a one or two
dimensional array, or two separate arrays, into a json_object of
name/value pairs, similarly to the hstore() function.
json_object_agg() aggregates its two arguments into a single json object
as name value pairs.
Catalog version bumped.
Andrew Dunstan, reviewed by Marko Tiikkaja.
Per the expanded comment-
As we're just trying to reset these to go to DEVNULL, there's not
much point in checking for failure from the close/dup2 calls here,
if they fail then presumably the file descriptors are closed and
any writes will go into the bitbucket anyway.
Pointed out by Tom.
It's worth distinguishing these cases from run-of-the-mill wrong-password
problems, since users have been known to waste lots of time pursuing the
wrong theory about what's failing. Now, our longstanding policy about how
to report authentication failures is that we don't really want to tell the
*client* such things, since that might be giving information to a bad guy.
But there's nothing wrong with reporting the details to the postmaster log,
and indeed the comments in this area of the code contemplate that
interesting details should be so reported. We just weren't handling these
particular interesting cases usefully.
To fix, add infrastructure allowing subroutines of ClientAuthentication()
to return a string to be added to the errdetail_log field of the main
authentication-failed error report. We might later want to use this to
report other subcases of authentication failure the same way, but for the
moment I just dealt with password cases.
Per discussion of a patch from Josh Drake, though this is not what
he proposed.
This change allows us to eliminate the previous limit on stored query
length, and it makes the shared-memory hash table very much smaller,
allowing more statements to be tracked. (The default value of
pg_stat_statements.max is therefore increased from 1000 to 5000.)
In typical scenarios, the hash table can be large enough to hold all the
statements commonly issued by an application, so that there is little
"churn" in the set of tracked statements, and thus little need to do I/O
to the file.
To further reduce the need for I/O to the query-texts file, add a way
to retrieve all the columns of the pg_stat_statements view except for
the query text column. This is probably not of much interest for human
use but it could be exploited by programs, which will prefer using the
queryid anyway.
Ordinarily, we'd need to bump the extension version number for the latter
change. But since we already advanced pg_stat_statements' version number
from 1.1 to 1.2 in the 9.4 development cycle, it seems all right to just
redefine what 1.2 means.
Peter Geoghegan, reviewed by Pavel Stehule
This makes it possible to store lwlocks as part of some other data
structure in the main shared memory segment, or in a dynamic shared
memory segment. There is still a main LWLock array and this patch does
not move anything out of it, but it provides necessary infrastructure
for doing that in the future.
This change is likely to increase the size of LWLockPadded on some
platforms, especially 32-bit platforms where it was previously only
16 bytes.
Patch by me. Review by Andres Freund and KaiGai Kohei.
Fix integer overflow issue noted by Magnus Hagander, as well as a bunch
of other infelicities in commit ee1e5662d8
and its unreasonably large number of followups.
Move allocation to after we check the remote server version, to avoid
a possible, very minor, memory leak. This makes us more consistent
throughout as most places in pg_dump are done in the same way (due, in
part, to previous fixes like this).
Spotted by the Coverity scanner.
Consistently check the dup2() call results throughout syslogger.c.
It's pretty unlikely that they'll error out, but if they do,
ereport(FATAL) instead of blissfully continuing on.
Spotted by the Coverity scanner.
From the Department of Nitpicking, be consistent with other escaping
and use 'E' instead of 'e' to escape the string in the example docs
for GET DISAGNOSTICS stack = PG_CONTEXT.
Noticed by Department Chief Magnus Hagander.
Mention that CREATE TABLE LIKE INCLUDING DEFAULTS creates a link between
the original and new tables if a default function modifies the database,
like nextval().
This allows ending recovery as a consistent state has been reached. Without
this, there was no easy way to e.g restore an online backup, without
replaying any extra WAL after the backup ended.
MauMau and me.
Per report from Jeffrey Walton, libpq has been accepting only TLSv1
exactly. Along the lines of the backend code, libpq will now support
new versions as OpenSSL adds them.
Marko Kreen, reviewed by Wim Lewis.
During parallel pg_dump, a worker process closing the connection caused
a minor memory leak (particularly minor as we are likely about to exit
anyway). Instead, free the memory in this case prior to returning NULL
to indicate connection closed.
Spotting by the Coverity scanner.
Back patch to 9.3 where this was introduced.
The maxoff field is not used in the new, compressed page format. Let's
reset it when converting an old-format page to the new format. The code
won't care either way, but this makes it possible to use the field for
something else in the future.
When vacuuming a data leaf page, any compressed posting lists that are not
modified, are copied back to the buffer from a later location in the same
buffer rather than from a palloc'd copy. IOW, they are just moved
downwards in the same buffer. Because the source and destination addresses
can overlap, we must use memmove rather than memcpy.
Report and fix by Alexander Korotkov.
Add the ability to specify the objects to move by who those objects are
owned by (as relowner) and change ALL to mean ALL objects. This
makes the command always operate against a well-defined set of objects
and not have the objects-to-be-moved based on the role of the user
running the command.
Per discussion with Simon and Tom.
Since C99, it's been standard for printf and friends to accept a "z" size
modifier, meaning "whatever size size_t has". Up to now we've generally
dealt with printing size_t values by explicitly casting them to unsigned
long and using the "l" modifier; but this is really the wrong thing on
platforms where pointers are wider than longs (such as Win64). So let's
start using "z" instead. To ensure we can do that on all platforms, teach
src/port/snprintf.c to understand "z", and add a configure test to force
use of that implementation when the platform's version doesn't handle "z".
Having done that, modify a bunch of places that were using the
unsigned-long hack to use "z" instead. This patch doesn't pretend to have
gotten everyplace that could benefit, but it catches many of them. I made
an effort in particular to ensure that all uses of the same error message
text were updated together, so as not to increase the number of
translatable strings.
It's possible that this change will result in format-string warnings from
pre-C99 compilers. We might have to reconsider if there are any popular
compilers that will warn about this; but let's start by seeing what the
buildfarm thinks.
Andres Freund, with a little additional work by me
The Sparc machines in the buildfarm are crashing because of misaligned
access to posting lists stored in entry tuples.
I accidentally removed a critical SHORTALIGN() from ginFormTuple, as part
of the packed posting lists patch. Perhaps I thought it was unnecessary,
because the index_form_tuple() call above the SHORTALIGN already aligned
the size, missing the fact that the null-category byte makes it misaligned
again (I think the SHORTALIGN is indeed unnecessary if there's no null-
category byte, but let's just play it safe...)
Some cases were still reporting errors and aborting, instead of a NOTICE
that the object was being skipped. This makes it more difficult to
cleanly handle pg_dump --clean, so change that to instead skip missing
objects properly.
Per bug #7873 reported by Dave Rolsky; apparently this affects a large
number of users.
Authors: Pavel Stehule and Dean Rasheed. Some tweaks by Álvaro Herrera
There was a bug in the psql's meta command \conninfo. When the
IP address was specified in the hostaddr and psql used it to create
a connection (i.e., psql -d "hostaddr=xxx"), \conninfo could not
display that address. This is because \conninfo got the connection
information only from PQhost() which could not return hostaddr.
This patch adds PQhostaddr(), and changes \conninfo so that it
can display not only the host name that PQhost() returns but also
the IP address which PQhostaddr() returns.
The bug has existed since 9.1 where \conninfo was introduced.
But it's too late to add new libpq function into the released versions,
so no backpatch.
In the platform that doesn't support Unix-domain socket, when
neither host nor hostaddr are specified, the default host
'localhost' is used to connect to the server and PQhost() must
return that, but it didn't. This patch fixes PQhost() so that
it returns the default host in that case.
Also this patch fixes PQhost() so that it doesn't return
Unix-domain socket directory path in the platform that doesn't
support Unix-domain socket.
Back-patch to all supported versions.