Since collation is effectively an argument, not a property of the function,
FmgrInfo is really the wrong place for it; and this becomes critical in
cases where a cached FmgrInfo is used for varying purposes that might need
different collation settings. Fix by passing it in FunctionCallInfoData
instead. In particular this allows a clean fix for bug #5970 (record_cmp
not working). This requires touching a bit more code than the original
method, but nobody ever thought that collations would not be an invasive
patch...
The recent patch to remove gcc 4.6 warnings created some new ones, at
least on my rather old gcc version. Try to make everybody happy by
casting to "void" when we just want to discard the result.
In particular, if we don't have real ndistinct estimates for both sides,
fall back to assuming that half of the left-hand rows have join partners.
This is what was done in 8.2 and 8.3 (cf nulltestsel() in those versions).
It's pretty stupid but it won't lead us to think that an antijoin produces
no rows out, as seen in recent example from Uwe Schroeder.
If the referencing and referenced columns have different collations,
the parser will be unable to resolve which collation to use unless it's
helped out in this way. The effects are sometimes masked, if we end up
using a non-collation-sensitive plan; but if we do use a mergejoin
we'll see a failure, as recently noted by Robert Haas.
The SQL spec states that the referenced column's collation should be used
to resolve RI checks, so that's what we do. Note however that we currently
don't append a COLLATE clause when writing a query that examines only the
referencing column. If we ever support collations that have varying
notions of equality, that will have to be changed. For the moment, though,
it's preferable to leave it off so that we can use a normal index on the
referencing column.
This warning is new in gcc 4.6 and part of -Wall. This patch cleans
up most of the noise, but there are some still warnings that are
trickier to remove.
This is necessary, not optional, now that ILIKE and regexes are collation
aware --- else we might derive a wrong comparison constant for index
optimized pattern matches.
entry's commitSeqNo to that of the old one being transferred, or take
the minimum commitSeqNo if it is merging two lock entries.
Also, CreatePredicateLock should initialize commitSeqNo for to
InvalidSerCommitSeqNo instead of to 0. (I don't think using 0 would
actually affect anything, but we should be consistent.)
I also added a couple of assertions I used to track this down: a
lock's commitSeqNo should never be zero, and it should be
InvalidSerCommitSeqNo if and only if the lock is not held by
OldCommittedSxact.
Dan Ports, to fix leak of predicate locks reported by YAMAMOTO Takashi.
This involves getting the character classification and case-folding
functions in the regex library to use the collations infrastructure.
Most of this work had been done already in connection with the upper/lower
and LIKE logic, so it was a simple matter of transposition.
While at it, split out these functions into a separate source file
regc_pg_locale.c, so that they can be correctly labeled with the Postgres
project's license rather than the Scriptics license. These functions are
100% Postgres-written code whereas what remains in regc_locale.c is still
mostly not ours, so lumping them both under the same copyright notice was
getting more and more misleading.
The original collation patch only fixed the multi-byte code path.
This change also ensures that ILIKE's idea of the case-folding rules
is exactly the same as str_tolower's.
Tweak the test so that it does not depend on the platform using ".utf8" as
the extension signifying that a locale uses UTF8 encoding. For the most
part this just requires using the abbreviated collation names "en_US" etc,
though I had to work a bit harder on the collation creation tests.
This opens the door to using the test on platforms that spell locales
differently, for example ".utf-8" or ".UTF-8". Also, the test is now
somewhat useful with server encodings other than UTF8; though depending on
which encoding is selected, different subsets of it will fail for lack of
character set support.
Remove crude hack that tried to propagate collation through a
function-returning-record, ie, from the function's arguments to individual
fields selected from its result record. That is just plain inconsistent,
because the function result is composite and cannot have a collation;
and there's no hope of making this kind of action-at-a-distance work
consistently. Adjust regression test cases that expected this to happen.
Meanwhile, the behavior of casting to a domain with a declared collation
stays the same as it was, since that seemed to be the consensus.
"Unusable" collations are those not matching the current database's
encoding. The former behavior inconsistently showed such collations
some of the time, depending on the details of the pattern argument.
Get rid of bogus collation test in match_special_index_operator (even for
ILIKE, the pattern match operator's collation doesn't matter here, and even
if it did the test was testing the wrong thing).
Fix broken looping logic in expand_indexqual_rowcompare.
Add collation check in match_clause_to_ordering_op.
Make naming and argument ordering more consistent; improve comments.
The previous coding worked only if ltproc->fn_collation was always either
DEFAULT_COLLATION_OID or a C-compatible locale. While that's true at the
moment, it wasn't documented (and in fact wasn't true when this code was
committed...). But it only takes a couple more lines to make its internal
caching behavior locale-aware, so let's do that.
Otherwise, the SLRU machinery can get confused and think that the SLRU
has wrapped around. Along the way, regardless of whether we're
truncating all of the SLRU or just some of it, flush pages after
truncating, rather than before.
Kevin Grittner
Honor index column's collation spec if there is one, don't go to the
expense of calling get_typcollation when we can reasonably assume that
all GIN storage types will use default collation, and be sure to set
a collation for the comparePartialFn too.
All the other fields of the constant are being extracted from the syscache
entry we already have, so handle collation similarly. (There don't seem
to be any other uses for the new function at the moment.)
Per discussion, the original behavior seems too noisy. But if things
are so broken that none of the locales reported by "locale -a" are usable,
that's probably worth warning about.
When a regular lock is held, SSI can use that in lieu of a predicate lock
to detect rw conflicts; but if the regular lock is being taken by a
subtransaction, we can't assume that it'll commit, so releasing the
parent transaction's lock in that case is a no-no.
Kevin Grittner
As noted by Thom Brown, this confuses the DocBook index processor; it
fails to merge entries that differ only in whitespace, and sorts them
unexpectedly as well. Seems like a toolchain bug, but I'm not going to
hold my breath waiting for a fix.
Note: easiest way to find these is to look for double spaces in HTML.index.
Per a discussion with Gavin Flower. This barely scratches the surface
of potential WITH (something RETURNING) use cases, of course, but it's
one of the simplest compelling examples I can think of.
If we call hash_search() with HASH_ENTER, it will bail out rather than
return NULL, so it's redundant to check for NULL again in the caller.
Thus, in cases where we believe it's impossible for the hash table to run
out of slots anyway, we can simplify the code slightly.
On the flip side, in cases where it's theoretically possible to run out of
space, we don't want to rely on dynahash.c to throw an error; instead,
we pass HASH_ENTER_NULL and throw the error ourselves if a NULL comes
back, so that we can provide a more descriptive error message.
Kevin Grittner
Remove the hard-wired assumption that __mips__ (and only __mips__) lacks
dlopen in FreeBSD and OpenBSD. This assumption is outdated at least for
OpenBSD, as per report from an anonymous 9.1 tester. We can perfectly well
use HAVE_DLOPEN instead to decide which code to use.
Some other cosmetic adjustments to make freebsd.c, netbsd.c, and openbsd.c
exactly alike.
The original coding supposed that a dump TOC file could never contain lines
longer than 1K. The folly of that was exposed by a recent report from
Per-Olov Esgard. We only really need to see the first dozen or two bytes
of each line, since we're just trying to read off the numeric ID at the
start of the line; so there's no need for a particularly huge buffer.
What there is a need for is logic to not process continuation bufferloads.
Back-patch to all supported branches, since it's always been like this.
This is needed only in 9.1 because only 9.0 had this and no one is
upgrading from a 9.0 beta to 9.0 anymore. We basically don't backpatch
9.0 beta fixes at this point.
Previous patches took care of assorted places that call transformExpr from
outside the main parser, but I overlooked the fact that some places use
transformWhereClause as a shortcut for transformExpr + coerce_to_boolean.
In particular this broke collation-sensitive index WHERE clauses, as per
report from Thom Brown. Trigger WHEN and rule WHERE clauses too.
I'm not forcing initdb for this fix, but any affected indexes, triggers,
or rules will need to be dropped and recreated.
The previous functions of assign hooks are now split between check hooks
and assign hooks, where the former can fail but the latter shouldn't.
Aside from being conceptually clearer, this approach exposes the
"canonicalized" form of the variable value to guc.c without having to do
an actual assignment. And that lets us fix the problem recently noted by
Bernd Helmle that the auto-tune patch for wal_buffers resulted in bogus
log messages about "parameter "wal_buffers" cannot be changed without
restarting the server". There may be some speed advantage too, because
this design lets hook functions avoid re-parsing variable values when
restoring a previous state after a rollback (they can store a pre-parsed
representation of the value instead). This patch also resolves a
longstanding annoyance about custom error messages from variable assign
hooks: they should modify, not appear separately from, guc.c's own message
about "invalid parameter value".
When we release and reacquire SerializableXactHashLock, we must recheck
whether an R/W conflict still needs to be flagged, because it could have
changed under us in the meantime. And when we release the partition
lock, we must re-walk the list of predicate locks from the beginning,
because our pointer could get invalidated under us.
Bug report #5952 by Yamamoto Takashi. Patch by Kevin Grittner.