2
0
mirror of https://git.postgresql.org/git/postgresql.git synced 2025-03-07 19:47:50 +08:00
Commit Graph

36351 Commits

Author SHA1 Message Date
Tom Lane
8db652359c Fix enforcement of restrictions inside regexp lookaround constraints.
Lookahead and lookbehind constraints aren't allowed to contain backrefs,
and parentheses within them are always considered non-capturing.  Or so
says the manual.  But the regexp parser forgot about these rules once
inside a parenthesized subexpression, so that constructs like (\w)(?=(\1))
were accepted (but then not correctly executed --- a case like this acted
like (\w)(?=\w), without any enforcement that the two \w's match the same
text).  And in (?=((foo))) the innermost parentheses would be counted as
capturing parentheses, though no text would ever be captured for them.

To fix, properly pass down the "type" argument to the recursive invocation
of parse().

Back-patch to all supported branches; it was agreed that silent
misexecution of such patterns is worse than throwing an error, even though
new errors in minor releases are generally not desirable.
2015-11-07 12:43:24 -05:00
Kevin Grittner
909a83df56 Add RMV to list of commands taking AE lock.
Backpatch to 9.3, where it was initially omitted.

Craig Ringer, with minor adjustment by Kevin Grittner
2015-11-02 06:26:49 -06:00
Kevin Grittner
18479293cb Fix serialization anomalies due to race conditions on INSERT.
On insert the CheckForSerializableConflictIn() test was performed
before the page(s) which were going to be modified had been locked
(with an exclusive buffer content lock).  If another process
acquired a relation SIReadLock on the heap and scanned to a page on
which an insert was going to occur before the page was so locked,
a rw-conflict would be missed, which could allow a serialization
anomaly to be missed.  The window between the check and the page
lock was small, so the bug was generally not noticed unless there
was high concurrency with multiple processes inserting into the
same table.

This was reported by Peter Bailis as bug , by Sean Chittenden
as bug , and by others.

The race condition was eliminated in heap_insert() by moving the
check down below the acquisition of the buffer lock, which had been
the very next statement.  Because of the loop locking and unlocking
multiple buffers in heap_multi_insert() a check was added after all
inserts were completed.  The check before the start of the inserts
was left because it might avoid a large amount of work to detect a
serialization anomaly before performing the all of the inserts and
the related WAL logging.

While investigating this bug, other SSI bugs which were even harder
to hit in practice were noticed and fixed, an unnecessary check
(covered by another check, so redundant) was removed from
heap_update(), and comments were improved.

Back-patch to all supported branches.

Kevin Grittner and Thomas Munro
2015-10-31 14:46:57 -05:00
Robert Haas
71b1a0ab5f Fix incorrect message in ATWrongRelkindError.
Mistake introduced by commit 3bf3ab8c56.

Etsuro Fujita
2015-10-28 11:54:30 +01:00
Noah Misch
7cecabca8f Fix back-patch of commit 8e3b4d9d40.
master emits an extra context message compared to 9.5 and earlier.
2015-10-20 00:58:40 -04:00
Noah Misch
0c7390d624 Eschew "RESET statement_timeout" in tests.
Instead, use transaction abort.  Given an unlucky bout of latency, the
timeout would cancel the RESET itself.  Buildfarm members gharial,
lapwing, mereswine, shearwater, and sungazer witness that.  Back-patch
to 9.1 (all supported versions).  The query_canceled test still could
timeout before entering its subtransaction; for whatever reason, that
has yet to happen on the buildfarm.
2015-10-20 00:37:45 -04:00
Tom Lane
e69d4756ef Fix incorrect handling of lookahead constraints in pg_regprefix().
pg_regprefix was doing nothing with lookahead constraints, which would
be fine if it were the right kind of nothing, but it isn't: we have to
terminate our search for a fixed prefix, not just pretend the LACON arc
isn't there.  Otherwise, if the current state has both a LACON outarc and a
single plain-color outarc, we'd falsely conclude that the color represents
an addition to the fixed prefix, and generate an extracted index condition
that restricts the indexscan too much.  (See added regression test case.)

Terminating the search is conservative: we could traverse the LACON arc
(thus assuming that the constraint can be satisfied at runtime) and then
examine the outarcs of the linked-to state.  But that would be a lot more
work than it seems worth, because writing a LACON followed by a single
plain character is a pretty silly thing to do.

This makes a difference only in rather contrived cases, but it's a bug,
so back-patch to all supported branches.
2015-10-19 13:54:54 -07:00
Michael Meskes
defd2ecf4f Fix order of arguments in ecpg generated typedef command. 2015-10-18 10:16:49 +02:00
Tom Lane
4528f9d69d Miscellaneous cleanup of regular-expression compiler.
Revert our previous addition of "all" flags to copyins() and copyouts();
they're no longer needed, and were never anything but an unsightly hack.

Improve a couple of infelicities in the REG_DEBUG code for dumping
the NFA data structure, including adding code to count the total
number of states and arcs.

Add a couple of missed error checks.

Add some more documentation in the README file, and some regression tests
illustrating cases that exceeded the state-count limit and/or took
unreasonable amounts of time before this set of patches.

Back-patch to all supported branches.
2015-10-16 15:52:12 -04:00
Tom Lane
ad5e5a62af Improve memory-usage accounting in regular-expression compiler.
This code previously counted the number of NFA states it created, and
complained if a limit was exceeded, so as to prevent bizarre regex patterns
from consuming unreasonable time or memory.  That's fine as far as it went,
but the code paid no attention to how many arcs linked those states.  Since
regexes can be contrived that have O(N) states but will need O(N^2) arcs
after fixempties() processing, it was still possible to blow out memory,
and take a long time doing it too.  To fix, modify the bookkeeping to count
space used by both states and arcs.

I did not bother with including the "color map" in the accounting; it
can only grow to a few megabytes, which is not a lot in comparison to
what we're allowing for states+arcs (about 150MB on 64-bit machines
or half that on 32-bit machines).

Looking at some of the larger real-world regexes captured in the Tcl
regression test suite suggests that the most that is likely to be needed
for regexes found in the wild is under 10MB, so I believe that the current
limit has enough headroom to make it okay to keep it as a hard-wired limit.

In connection with this, redefine REG_ETOOBIG as meaning "regular
expression is too complex"; the previous wording of "nfa has too many
states" was already somewhat inapropos because of the error code's use
for stack depth overrun, and it was not very user-friendly either.

Back-patch to all supported branches.
2015-10-16 15:36:17 -04:00
Tom Lane
2a8d6e4d06 Improve performance of pullback/pushfwd in regular-expression compiler.
The previous coding would create a new intermediate state every time it
wanted to interchange the ordering of two constraint arcs.  Certain regex
features such as \Y can generate large numbers of parallel constraint arcs,
and if we needed to reorder the results of that, we created unreasonable
numbers of intermediate states.  To improve matters, keep a list of
already-created intermediate states associated with the state currently
being considered by the outer loop; we can re-use such states to place all
the new arcs leading to the same destination or source.

I also took the trouble to redefine push() and pull() to have a less risky
API: they no longer delete any state or arc that the caller might possibly
have a pointer to, except for the specifically-passed constraint arc.
This reduces the risk of re-introducing the same type of error seen in
the failed patch for CVE-2007-4772.

Back-patch to all supported branches.
2015-10-16 15:11:49 -04:00
Tom Lane
677e64cb80 Improve performance of fixempties() pass in regular-expression compiler.
The previous coding took something like O(N^4) time to fully process a
chain of N EMPTY arcs.  We can't really do much better than O(N^2) because
we have to insert about that many arcs, but we can do lots better than
what's there now.  The win comes partly from using mergeins() to amortize
de-duplication of arcs across multiple source states, and partly from
exploiting knowledge of the ordering of arcs for each state to avoid
looking at arcs we don't need to consider during the scan.  We do have
to be a bit careful of the possible reordering of arcs introduced by
the sort-merge coding of the previous commit, but that's not hard to
deal with.

Back-patch to all supported branches.
2015-10-16 14:58:11 -04:00
Tom Lane
296241635b Fix O(N^2) performance problems in regular-expression compiler.
Change the singly-linked in-arc and out-arc lists to be doubly-linked,
so that arc deletion is constant time rather than having worst-case time
proportional to the number of other arcs on the connected states.

Modify the bulk arc transfer operations copyins(), copyouts(), moveins(),
moveouts() so that they use a sort-and-merge algorithm whenever there's
more than a small number of arcs to be copied or moved.  The previous
method is O(N^2) in the number of arcs involved, because it performs
duplicate checking independently for each copied arc.  The new method may
change the ordering of existing arcs for the destination state, but nothing
really cares about that.

Provide another bulk arc copying method mergeins(), which is unused as
of this commit but is needed for the next one.  It basically is like
copyins(), but the source arcs might not all come from the same state.

Replace the O(N^2) bubble-sort algorithm used in carcsort() with a qsort()
call.

These changes greatly improve the performance of regex compilation for
large or complex regexes, at the cost of extra space for arc storage during
compilation.  The original tradeoff was probably fine when it was made, but
now we care more about speed and less about memory consumption.

Back-patch to all supported branches.
2015-10-16 14:43:18 -04:00
Tom Lane
6e4dda7963 Fix regular-expression compiler to handle loops of constraint arcs.
It's possible to construct regular expressions that contain loops of
constraint arcs (that is, ^ $ AHEAD BEHIND or LACON arcs).  There's no use
in fully traversing such a loop at execution, since you'd just end up in
the same NFA state without having consumed any input.  Worse, such a loop
leads to infinite looping in the pullback/pushfwd stage of compilation,
because we keep pushing or pulling the same constraints around the loop
in a vain attempt to move them to the pre or post state.  Such looping was
previously recognized in CVE-2007-4772; but the fix only handled the case
of trivial single-state loops (that is, a constraint arc leading back to
its source state) ... and not only that, it was incorrect even for that
case, because it broke the admittedly-not-very-clearly-stated API contract
of the pull() and push() subroutines.  The first two regression test cases
added by this commit exhibit patterns that result in assertion failures
because of that (though there seem to be no ill effects in non-assert
builds).  The other new test cases exhibit multi-state constraint loops;
in an unpatched build they will run until the NFA state-count limit is
exceeded.

To fix, remove the code added for CVE-2007-4772, and instead create a
general-purpose constraint-loop-breaking phase of regex compilation that
executes before we do pullback/pushfwd.  Since we never need to traverse
a constraint loop fully, we can just break the loop at any chosen spot,
if we add clone states that can replicate any sequence of arc transitions
that would've traversed just part of the loop.

Also add some commentary clarifying why we have to have all these
machinations in the first place.

This class of problems has been known for some time --- we had a report
from Marc Mamin about two years ago, for example, and there are related
complaints in the Tcl bug tracker.  I had discussed a fix of this kind
off-list with Henry Spencer, but didn't get around to doing something
about it until the issue was rediscovered by Greg Stark recently.

Back-patch to all supported branches.
2015-10-16 14:14:41 -04:00
Tom Lane
bc6b03bb81 On Windows, ensure shared memory handle gets closed if not being used.
Postmaster child processes that aren't supposed to be attached to shared
memory were not bothering to close the shared memory mapping handle they
inherit from the postmaster process.  That's mostly harmless, since the
handle vanishes anyway when the child process exits -- but the syslogger
process, if used, doesn't get killed and restarted during recovery from a
backend crash.  That meant that Windows doesn't see the shared memory
mapping as becoming free, so it doesn't delete it and the postmaster is
unable to create a new one, resulting in failure to recover from crashes
whenever logging_collector is turned on.

Per report from Dmitry Vasilyev.  It's a bit astonishing that we'd not
figured this out long ago, since it's been broken from the very beginnings
of out native Windows support; probably some previously-unexplained trouble
reports trace to this.

A secondary problem is that on Cygwin (perhaps only in older versions?),
exec() may not detach from the shared memory segment after all, in which
case these child processes did remain attached to shared memory, posing
the risk of an unexpected shared memory clobber if they went off the rails
somehow.  That may be a long-gone bug, but we can deal with it now if it's
still live, by detaching within the infrastructure introduced here to deal
with closing the handle.

Back-patch to all supported branches.

Tom Lane and Amit Kapila
2015-10-13 11:21:33 -04:00
Tom Lane
dfe572de00 Fix "pg_ctl start -w" to test child process status directly.
pg_ctl start with -w previously relied on a heuristic that the postmaster
would surely always manage to create postmaster.pid within five seconds.
Unfortunately, that fails much more often than we would like on some of the
slower, more heavily loaded buildfarm members.

We have known for quite some time that we could remove the need for that
heuristic on Unix by using fork/exec instead of system() to launch the
postmaster.  This allows us to know the exact PID of the postmaster, which
allows near-certain verification that the postmaster.pid file is the one
we want and not a leftover, and it also lets us use waitpid() to detect
reliably whether the child postmaster has exited or not.

What was blocking this change was not wanting to rewrite the Windows
version of start_postmaster() to avoid use of CMD.EXE.  That's doable
in theory but would require fooling about with stdout/stderr redirection,
and getting the handling of quote-containing postmaster switches to
stay the same might be rather ticklish.  However, we realized that
we don't have to do that to fix the problem, because we can test
whether the shell process has exited as a proxy for whether the
postmaster is still alive.  That doesn't allow an exact check of the
PID in postmaster.pid, but we're no worse off than before in that
respect; and we do get to get rid of the heuristic about how long the
postmaster might take to create postmaster.pid.

On Unix, this change means that a second "pg_ctl start -w" immediately
after another such command will now reliably fail, whereas previously
it would succeed if done within two seconds of the earlier command.
Since that's a saner behavior anyway, it's fine.  On Windows, the case can
still succeed within the same time window, since pg_ctl can't tell that the
earlier postmaster's postmaster.pid isn't the pidfile it is looking for.
To ensure stable test results on Windows, we can insert a short sleep into
the test script for pg_ctl, ensuring that the existing pidfile looks stale.
This hack can be removed if we ever do rewrite start_postmaster(), but that
no longer seems like a high-priority thing to do.

Back-patch to all supported versions, both because the current behavior
is buggy and because we must do that if we want the buildfarm failures
to go away.

Tom Lane and Michael Paquier
2015-10-12 18:30:37 -04:00
Andrew Dunstan
f844f0ef18 Factor out encoding specific tests for json
This lets us remove the large alternative results files for the main
json and jsonb tests, which makes modifying those tests simpler for
committers and patch submitters.

Backpatch to 9.4 for jsonb and 9.3 for json.
2015-10-07 22:19:38 -04:00
Tom Lane
32e53593f9 Improve documentation of the role-dropping process.
In general one may have to run both REASSIGN OWNED and DROP OWNED to get
rid of all the dependencies of a role to be dropped.  This was alluded to
in the REASSIGN OWNED man page, but not really spelled out in full; and in
any case the procedure ought to be documented in a more prominent place
than that.  Add a section to the "Database Roles" chapter explaining this,
and do a bit of wordsmithing in the relevant commands' man pages.
2015-10-07 16:12:06 -04:00
Tom Lane
31bc563b9b Perform an immediate shutdown if the postmaster.pid file is removed.
The postmaster now checks every minute or so (worst case, at most two
minutes) that postmaster.pid is still there and still contains its own PID.
If not, it performs an immediate shutdown, as though it had received
SIGQUIT.

The original goal behind this change was to ensure that failed buildfarm
runs would get fully cleaned up, even if the test scripts had left a
postmaster running, which is not an infrequent occurrence.  When the
buildfarm script removes a test postmaster's $PGDATA directory, its next
check on postmaster.pid will fail and cause it to exit.  Previously, manual
intervention was often needed to get rid of such orphaned postmasters,
since they'd block new test postmasters from obtaining the expected socket
address.

However, by checking postmaster.pid and not something else, we can provide
additional robustness: manual removal of postmaster.pid is a frequent DBA
mistake, and now we can at least limit the damage that will ensue if a new
postmaster is started while the old one is still alive.

Back-patch to all supported branches, since we won't get the desired
improvement in buildfarm reliability otherwise.
2015-10-06 17:15:27 -04:00
Tom Lane
f5bbaeef1a Stamp 9.3.10. 2015-10-05 15:14:02 -04:00
Peter Eisentraut
cc0c8ec9fc doc: Update URLs of external projects 2015-10-05 12:32:10 -04:00
Tom Lane
57b02827f3 Fix insufficiently-portable regression test case.
Some of the buildfarm members are evidently miserly enough of stack space
to pass the originally-committed form of this test.  Increase the
requirement 10X to hopefully ensure that it fails as-expected everywhere.

Security: CVE-2015-5289
2015-10-05 12:19:15 -04:00
Peter Eisentraut
921c18c150 Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 576bd3231176cdea570609e7fd16152bf2e5e15a
2015-10-05 11:01:00 -04:00
Tom Lane
f795753663 Last-minute updates for release notes.
Add entries for security and not-quite-security issues.

Security: CVE-2015-5288, CVE-2015-5289
2015-10-05 10:57:47 -04:00
Andres Freund
e946cc601c Remove outdated comment about relation level autovacuum freeze limits.
The documentation for the autovacuum_multixact_freeze_max_age and
autovacuum_freeze_max_age relation level parameters contained:
"Note that while you can set autovacuum_multixact_freeze_max_age very
small, or even zero, this is usually unwise since it will force frequent
vacuuming."
which hasn't been true since these options were made relation options,
instead of residing in the pg_autovacuum table (834a6da4f7).

Remove the outdated sentence. Even the lowered limits from 2596d70 are
high enough that this doesn't warrant calling out the risk in the CREATE
TABLE docs.

Per discussion with Tom Lane and Alvaro Herrera

Discussion: 26377.1443105453@sss.pgh.pa.us
Backpatch: 9.0- (in parts)
2015-10-05 16:51:04 +02:00
Noah Misch
28dea9485e Prevent stack overflow in query-type functions.
The tsquery, ltxtquery and query_int data types have a common ancestor.
Having acquired check_stack_depth() calls independently, each was
missing at least one call.  Back-patch to 9.0 (all supported versions).
2015-10-05 10:06:34 -04:00
Noah Misch
9286ff78f8 Prevent stack overflow in container-type functions.
A range type can name another range type as its subtype, and a record
type can bear a column of another record type.  Consequently, functions
like range_cmp() and record_recv() are recursive.  Functions at risk
include operator family members and referents of pg_type regproc
columns.  Treat as recursive any such function that looks up and calls
the same-purpose function for a record column type or the range subtype.
Back-patch to 9.0 (all supported versions).

An array type's element type is never itself an array type, so array
functions are unaffected.  Recursion depth proportional to array
dimensionality, found in array_dim_to_jsonb(), is fine thanks to MAXDIM.
2015-10-05 10:06:34 -04:00
Noah Misch
f8862172e6 Prevent stack overflow in json-related functions.
Sufficiently-deep recursion heretofore elicited a SIGSEGV.  If an
application constructs PostgreSQL json or jsonb values from arbitrary
user input, application users could have exploited this to terminate all
active database connections.  That applies to 9.3, where the json parser
adopted recursive descent, and later versions.  Only row_to_json() and
array_to_json() were at risk in 9.2, both in a non-security capacity.
Back-patch to 9.2, where the json type was introduced.

Oskari Saarenmaa, reviewed by Michael Paquier.

Security: CVE-2015-5289
2015-10-05 10:06:34 -04:00
Noah Misch
cc1210f0aa pgcrypto: Detect and report too-short crypt() salts.
Certain short salts crashed the backend or disclosed a few bytes of
backend memory.  For existing salt-induced error conditions, emit a
message saying as much.  Back-patch to 9.0 (all supported versions).

Josh Kupershmidt

Security: CVE-2015-5288
2015-10-05 10:06:34 -04:00
Andres Freund
3933417141 Re-Align *_freeze_max_age reloption limits with corresponding GUC limits.
In 020235a575 I lowered the autovacuum_*freeze_max_age minimums to
allow for easier testing of wraparounds. I did not touch the
corresponding per-table limits. While those don't matter for the purpose
of wraparound, it seems more consistent to lower them as well.

It's noteworthy that the previous reloption lower limit for
autovacuum_multixact_freeze_max_age was too high by one magnitude, even
before 020235a575.

Discussion: 26377.1443105453@sss.pgh.pa.us
Backpatch: back to 9.0 (in parts), like the prior patch
2015-10-05 11:57:11 +02:00
Tom Lane
04811c350b Release notes for 9.5beta1, 9.4.5, 9.3.10, 9.2.14, 9.1.19, 9.0.23. 2015-10-04 19:38:00 -04:00
Tom Lane
0867e0ad53 Further twiddling of nodeHash.c hashtable sizing calculation.
On reflection, the submitted patch didn't really work to prevent the
request size from exceeding MaxAllocSize, because of the fact that we'd
happily round nbuckets up to the next power of 2 after we'd limited it to
max_pointers.  The simplest way to enforce the limit correctly is to
round max_pointers down to a power of 2 when it isn't one already.

(Note that the constraint to INT_MAX / 2, if it were doing anything useful
at all, is properly applied after that.)
2015-10-04 15:55:07 -04:00
Tom Lane
45dd7cdbab Fix possible "invalid memory alloc request size" failure in nodeHash.c.
Limit the size of the hashtable pointer array to not more than
MaxAllocSize.  We've seen reports of failures due to this in HEAD/9.5,
and it seems possible in older branches as well.  The change in
NTUP_PER_BUCKET in 9.5 may have made the problem more likely, but
surely it didn't introduce it.

Tomas Vondra, slightly modified by me
2015-10-04 14:17:14 -04:00
Tom Lane
0f6a046b67 Update time zone data files to tzdata release 2015g.
DST law changes in Cayman Islands, Fiji, Moldova, Morocco, Norfolk Island,
North Korea, Turkey, Uruguay.  New zone America/Fort_Nelson for Canadian
Northern Rockies.
2015-10-02 19:16:06 -04:00
Tom Lane
4175cc604f Add recursion depth protection to LIKE matching.
Since MatchText() recurses, it could in principle be driven to stack
overflow, although quite a long pattern would be needed.
2015-10-02 15:00:52 -04:00
Tom Lane
9ed207ae99 Add recursion depth protections to regular expression matching.
Some of the functions in regex compilation and execution recurse, and
therefore could in principle be driven to stack overflow.  The Tcl crew
has seen this happen in practice in duptraverse(), though their fix was
to put in a hard-wired limit on the number of recursive levels, which is
not too appetizing --- fortunately, we have enough infrastructure to check
the actually available stack.  Greg Stark has also seen it in other places
while fuzz testing on a machine with limited stack space.  Let's put guards
in to prevent crashes in all these places.

Since the regex code would leak memory if we simply threw elog(ERROR),
we have to introduce an API that checks for stack depth without throwing
such an error.  Fortunately that's not difficult.
2015-10-02 14:51:58 -04:00
Tom Lane
6b3810d0a4 Fix potential infinite loop in regular expression execution.
In cfindloop(), if the initial call to shortest() reports that a
zero-length match is possible at the current search start point, but then
it is unable to construct any actual match to that, it'll just loop around
with the same start point, and thus make no progress.  We need to force the
start point to be advanced.  This is safe because the loop over "begin"
points has already tried and failed to match starting at "close", so there
is surely no need to try that again.

This bug was introduced in commit e2bd904955,
wherein we allowed continued searching after we'd run out of match
possibilities, but evidently failed to think hard enough about exactly
where we needed to search next.

Because of the way this code works, such a match failure is only possible
in the presence of backrefs --- otherwise, shortest()'s judgment that a
match is possible should always be correct.  That probably explains how
come the bug has escaped detection for several years.

The actual fix is a one-liner, but I took the trouble to add/improve some
comments related to the loop logic.

After fixing that, the submitted test case "()*\1" didn't loop anymore.
But it reported failure, though it seems like it ought to match a
zero-length string; both Tcl and Perl think it does.  That seems to be from
overenthusiastic optimization on my part when I rewrote the iteration match
logic in commit 173e29aa5d: we can't just
"declare victory" for a zero-length match without bothering to set match
data for capturing parens inside the iterator node.

Per fuzz testing by Greg Stark.  The first part of this is a bug in all
supported branches, and the second part is a bug since 9.2 where the
iteration rewrite happened.
2015-10-02 14:26:36 -04:00
Tom Lane
384ce1b756 Add some more query-cancel checks to regular expression matching.
Commit 9662143f0c added infrastructure to
allow regular-expression operations to be terminated early in the event
of SIGINT etc.  However, fuzz testing by Greg Stark disclosed that there
are still cases where regex compilation could run for a long time without
noticing a cancel request.  Specifically, the fixempties() phase never
adds new states, only new arcs, so it doesn't hit the cancel check I'd put
in newstate().  Add one to newarc() as well to cover that.

Some experimentation of my own found that regex execution could also run
for a long time despite a pending cancel.  We'd put a high-level cancel
check into cdissect(), but there was none inside the core text-matching
routines longest() and shortest().  Ordinarily those inner loops are very
very fast ... but in the presence of lookahead constraints, not so much.
As a compromise, stick a cancel check into the stateset cache-miss
function, which is enough to guarantee a cancel check at least once per
lookahead constraint test.

Making this work required more attention to error handling throughout the
regex executor.  Henry Spencer had apparently originally intended longest()
and shortest() to be incapable of incurring errors while running, so
neither they nor their subroutines had well-defined error reporting
behaviors.  However, that was already broken by the lookahead constraint
feature, since lacon() can surely suffer an out-of-memory failure ---
which, in the code as it stood, might never be reported to the user at all,
but just silently be treated as a non-match of the lookahead constraint.
Normalize all that by inserting explicit error tests as needed.  I took the
opportunity to add some more comments to the code, too.

Back-patch to all supported branches, like the previous patch.
2015-10-02 13:45:39 -04:00
Tom Lane
71d9523d77 Docs: add disclaimer about hazards of using regexps from untrusted sources.
It's not terribly hard to devise regular expressions that take large
amounts of time and/or memory to process.  Recent testing by Greg Stark has
also shown that machines with small stack limits can be driven to stack
overflow by suitably crafted regexps.  While we intend to fix these things
as much as possible, it's probably impossible to eliminate slow-execution
cases altogether.  In any case we don't want to treat such things as
security issues.  The history of that code should already discourage
prudent DBAs from allowing execution of regexp patterns coming from
possibly-hostile sources, but it seems like a good idea to warn about the
hazard explicitly.

Currently, similar_escape() allows access to enough of the underlying
regexp behavior that the warning has to apply to SIMILAR TO as well.
We might be able to make it safer if we tightened things up to allow only
SQL-mandated capabilities in SIMILAR TO; but that would be a subtly
non-backwards-compatible change, so it requires discussion and probably
could not be back-patched.

Per discussion among pgsql-security list.
2015-10-02 13:30:43 -04:00
Tom Lane
7e1e1c9d18 Fix pg_dump to handle inherited NOT VALID check constraints correctly.
This case seems to have been overlooked when unvalidated check constraints
were introduced, in 9.2.  The code would attempt to dump such constraints
over again for each child table, even though adding them to the parent
table is sufficient.

In 9.2 and 9.3, also fix contrib/pg_upgrade/Makefile so that the "make
clean" target fully cleans up after a failed test.  This evidently got
dealt with at some point in 9.4, but it wasn't back-patched.  I ran into
it while testing this fix ...

Per bug  from Ingmar Brouns.
2015-10-01 16:19:49 -04:00
Tom Lane
c3e847902f Fix documentation error in commit 8703059c6b.
Etsuro Fujita spotted a thinko in the README commentary.
2015-10-01 10:32:14 -04:00
Fujii Masao
2d57d886fa Fix mention of htup.h in storage.sgml
Previously it was documented that the details on HeapTupleHeaderData
struct could be found in htup.h. This is not correct because it's now
defined in htup_details.h.

Back-patch to 9.3 where the definition of HeapTupleHeaderData struct
was moved from htup.h to htup_details.h.

Michael Paquier
2015-10-01 23:13:26 +09:00
Tom Lane
aad86c518d Improve LISTEN startup time when there are many unread notifications.
If some existing listener is far behind, incoming new listener sessions
would start from that session's read pointer and then need to advance over
many already-committed notification messages, which they have no interest
in.  This was expensive in itself and also thrashed the pg_notify SLRU
buffers a lot more than necessary.  We can improve matters considerably
in typical scenarios, without much added cost, by starting from the
furthest-ahead read pointer, not the furthest-behind one.  We do have to
consider only sessions in our own database when doing this, which requires
an extra field in the data structure, but that's a pretty small cost.

Back-patch to 9.0 where the current LISTEN/NOTIFY logic was introduced.

Matt Newell, slightly adjusted by me
2015-09-30 23:32:23 -04:00
Tom Lane
f60b2e2d48 Fix plperl to handle non-ASCII error message texts correctly.
We were passing error message texts to croak() verbatim, which turns out
not to work if the text contains non-ASCII characters; Perl mangles their
encoding, as reported in bug  from Michal Leinweber.  To fix, convert
the text into a UTF8-encoded SV first.

It's hard to test this without risking failures in different database
encodings; but we can follow the lead of plpython, which is already
assuming that no-break space (U+00A0) has an equivalent in all encodings
we care about running the regression tests in (cf commit 2dfa15de5).

Back-patch to 9.1.  The code is quite different in 9.0, and anyway it seems
too risky to put something like this into 9.0's final minor release.

Alex Hunsaker, with suggestions from Tim Bunce and Tom Lane
2015-09-29 10:52:22 -04:00
Andrew Dunstan
729d0056b4 Fix compiler warning about unused function in non-readline case.
Backpatch to all live branches to keep the code in sync.
2015-09-28 18:34:31 -04:00
Tom Lane
1bcc9e60a7 Second try at fixing O(N^2) problem in foreign key references.
This replaces ill-fated commit 5ddc72887a,
which was reverted because it broke active uses of FK cache entries.  In
this patch, we still do nothing more to invalidatable cache entries than
mark them as needing revalidation, so we won't break active uses.  To keep
down the overhead of InvalidateConstraintCacheCallBack(), keep a list of
just the currently-valid cache entries.  (The entries are large enough that
some added space for list links doesn't seem like a big problem.)  This
would still be O(N^2) when there are many valid entries, though, so when
the list gets too long, just force the "sinval reset" behavior to remove
everything from the list.  I set the threshold at 1000 entries, somewhat
arbitrarily.  Possibly that could be fine-tuned later.  Another item for
future study is whether it's worth adding reference counting so that we
could safely remove invalidated entries.  As-is, problem cases are likely
to end up with large and mostly invalid FK caches.

Like the previous attempt, backpatch to 9.3.

Jan Wieck and Tom Lane
2015-09-25 13:16:31 -04:00
Tom Lane
b7d17eca57 Further fix for psql's code for locale-aware formatting of numeric output.
(Third time's the charm, I hope.)

Additional testing disclosed that this code could mangle already-localized
output from the "money" datatype.  We can't very easily skip applying it
to "money" values, because the logic is tied to column right-justification
and people expect "money" output to be right-justified.  Short of
decoupling that, we can fix it in what should be a safe enough way by
testing to make sure the string doesn't contain any characters that would
not be expected in plain numeric output.
2015-09-25 12:20:46 -04:00
Tom Lane
9c547c939c Further fix for psql's code for locale-aware formatting of numeric output.
On closer inspection, those seemingly redundant atoi() calls were not so
much inefficient as just plain wrong: the author of this code either had
not read, or had not understood, the POSIX specification for localeconv().
The grouping field is *not* a textual digit string but separate integers
encoded as chars.

We'll follow the existing code as well as the backend's cash.c in only
honoring the first group width, but let's at least honor it correctly.

This doesn't actually result in any behavioral change in any of the
locales I have installed on my Linux box, which may explain why nobody's
complained; grouping width 3 is close enough to universal that it's barely
worth considering other cases.  Still, wrong is wrong, so back-patch.
2015-09-25 00:00:58 -04:00
Tom Lane
7e327ecd2b Fix psql's code for locale-aware formatting of numeric output.
This code did the wrong thing entirely for numbers with an exponent
but no decimal point (e.g., '1e6'), as reported by Jeff Janes in
bug .  More generally, it made lots of unverified assumptions
about what the input string could possibly look like.  Rearrange so
that it only fools with leading digits that it's directly verified
are there, and an immediately adjacent decimal point.  While at it,
get rid of some useless inefficiencies, like converting the grouping
count string to integer over and over (and over).

This has been broken for a long time, so back-patch to all supported
branches.
2015-09-24 23:01:04 -04:00
Tom Lane
b7dcb2dd4a Improve handling of collations in contrib/postgres_fdw.
If we have a local Var of say varchar type with default collation, and
we apply a RelabelType to convert that to text with default collation, we
don't want to consider that as creating an FDW_COLLATE_UNSAFE situation.
It should be okay to compare that to a remote Var, so long as the remote
Var determines the comparison collation.  (When we actually ship such an
expression to the remote side, the local Var would become a Param with
default collation, meaning the remote Var would in fact control the
comparison collation, because non-default implicit collation overrides
default implicit collation in parse_collate.c.)  To fix, be more precise
about what FDW_COLLATE_NONE means: it applies either to a noncollatable
data type or to a collatable type with default collation, if that collation
can't be traced to a remote Var.  (When it can, FDW_COLLATE_SAFE is
appropriate.)  We were essentially using that interpretation already at
the Var/Const/Param level, but we weren't bubbling it up properly.

An alternative fix would be to introduce a separate FDW_COLLATE_DEFAULT
value to describe the second situation, but that would add more code
without changing the actual behavior, so it didn't seem worthwhile.

Also, since we're clarifying the rule to be that we care about whether
operator/function input collations match, there seems no need to fail
immediately upon seeing a Const/Param/non-foreign-Var with nondefault
collation.  We only have to reject if it appears in a collation-sensitive
context (for example, "var IS NOT NULL" is perfectly safe from a collation
standpoint, whatever collation the var has).  So just set the state to
UNSAFE rather than failing immediately.

Per report from Jeevan Chalke.  This essentially corrects some sloppy
thinking in commit ed3ddf918b, so back-patch
to 9.3 where that logic appeared.
2015-09-24 12:47:30 -04:00