page about the maximum UTF8 sequence length we support (4 bytes since 8.1,
3 before that). pg_utf2wchar_with_len never got updated to support 4-byte
characters at all, and in any case had a buffer-overrun risk in that it
could produce multiple pg_wchars from what mblen claims to be just one UTF8
character. The only reason we don't have a major security hole is that most
callers allocate worst-case output buffers; the sole exception in released
versions appears to be pre-8.2 iwchareq() (ie, ILIKE), which can be crashed
due to zeroing out its return address --- but AFAICS that can't be exploited
for anything more than a crash, due to inability to control what gets written
there. Per report from James Russell and Michael Fuhr.
Pre-8.1 the risk is much less, but I still think pg_utf2wchar_with_len's
behavior given an incomplete final character risks buffer overrun, so
back-patch that logic change anyway.
This patch also makes sure that UTF8 sequences exceeding the supported
length (whichever it is) are consistently treated as error cases, rather
than being treated like a valid shorter sequence in some places.
involving unions of types having typmods. Variants of the failure are known
to occur in 8.1 and up; not sure if it's possible in 8.0 and 7.4, but since
the code exists that far back, I'll just patch 'em all. Per report from
Brian Hurt.
FAMILY; and add FAMILY option to CREATE OPERATOR CLASS to allow adding a
class to a pre-existing family. Per previous discussion. Man, what a
tedious lot of cutting and pasting ...
which I had removed in the first cut of the EquivalenceClass rewrite to
simplify that patch a little. But it's still important --- in a four-way
join problem mergejoinscansel() was eating about 40% of the planning time
according to gprof. Also, improve the EquivalenceClass code to re-use
join RestrictInfos rather than generating fresh ones for each join
considered. This saves some memory space but more importantly improves
the effectiveness of caching planning info in RestrictInfos.
columns procost and prorows, to allow simple user adjustment of the estimated
cost of a function call, as well as control of the estimated number of rows
returned by a set-returning function. We might eventually wish to extend this
to allow function-specific estimation routines, but there seems to be
consensus that we should try a simple constant estimate first. In particular
this provides a relatively simple way to control the order in which different
WHERE clauses are applied in a plan node, which is a Good Thing in view of the
fact that the recent EquivalenceClass planner rewrite made that much less
predictable than before.
provide just a boolean 'amcanorder', instead of fields that specify the
sort operator strategy numbers. We have decided to require ordering-capable
AMs to use btree-compatible strategy numbers, so the old fields are
overkill (and indeed misleading about what's allowed).
match the postgresql.conf file. Also add units to descriptions that
lacked them. Wording improvements. Mention pg_settings.unit as the way
to find the default units for setting.
Backpatch to 8.2.X.
representation of equivalence classes of variables. This is an extensive
rewrite, but it brings a number of benefits:
* planner no longer fails in the presence of "incomplete" operator families
that don't offer operators for every possible combination of datatypes.
* avoid generating and then discarding redundant equality clauses.
* remove bogus assumption that derived equalities always use operators
named "=".
* mergejoins can work with a variety of sort orders (e.g., descending) now,
instead of tying each mergejoinable operator to exactly one sort order.
* better recognition of redundant sort columns.
* can make use of equalities appearing underneath an outer join.
currentMarkData from IndexScanDesc to the opaque structs for the
AMs that need this information (currently gist and hash).
Patch from Heikki Linnakangas, fixes by Neil Conway.