This isn't fully tested yet; in particular, I'm not sure that the
"foo--unpackaged--1.0.sql" scripts are OK. But it's time to get some
buildfarm cycles on it.
sepgsql is not converted to an extension, mainly because it seems to
require a very nonstandard installation process.
Dimitri Fontaine and Tom Lane
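A hedged illustration of how the unpackaged scripts are meant to be used (hstore stands in for any converted contrib module; this example is not taken from the commit itself):

    CREATE EXTENSION hstore FROM unpackaged;  -- absorb an existing pre-extension install via hstore--unpackaged--1.0.sql
    CREATE EXTENSION hstore;                  -- a fresh install runs hstore--1.0.sql instead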
in the formerly-always-blank columns just to the left and right of the data.
Different marking is used for a line break caused by a newline in the data
than for a straight wraparound. A newline break is signaled by a "+" in the
right margin column in ASCII mode, or a carriage return arrow in UNICODE mode.
Wraparound is signaled by a dot in the right margin as well as the following
left margin in ASCII mode, or an ellipsis symbol in the same places in UNICODE
mode. "\pset linestyle old-ascii" is added to make the previous behavior
available if anyone really wants it.
In passing, this commit also cleans up a few regression test files that
had unintended spacing differences from the current actual output.
Roger Leigh, reviewed by Gabrielle Roth and other members of PDXPUG.
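For reference, a minimal psql sketch of the linestyle settings described above (the table output itself is omitted):

    \pset linestyle ascii      -- default: "." and "+" wrap/newline markers in the margin columns
    \pset linestyle unicode    -- ellipsis and carriage-return symbols instead
    \pset linestyle old-ascii  -- the previous display behavior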
compatibility package. This supports importing dumps from past versions
using tsearch2, and provides the old names and API for most functions
that were changed. (rewrite(ARRAY[...]) is a glaring omission, though.)
Pavel Stehule and Tom Lane
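A hedged sketch of what the compatibility layer allows; rank() here is the old tsearch2 name that the package maps onto the core ts_rank():

    SELECT rank(to_tsvector('english', 'an example document'),
                to_tsquery('english', 'example'));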
It required some changes in the lexize algorithm, but the interface with
dictionaries stays compatible with old dictionaries.
Funded by Georgia Public Library Service and LibLime, Inc.
1) rank_cd now uses lexeme weights
2) rank_cd and rank can use any combination of normalization methods (see the sketch below):
     no normalization
     normalization by log(length of document)
     normalization by length of document
     normalization by number of unique words in document
     normalization by log(number of unique words in document)
     normalization by number of covers (rank_cd only)
Improve cover search.
TODO: changes in documentation
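A usage sketch of the weighted, normalized ranking; table and column names are hypothetical, the function is shown with its current core spelling ts_rank_cd (rank_cd at the time), and the flag bits follow the later documented values (2 = divide by document length, 8 = divide by number of unique words):

    SELECT ts_rank_cd('{0.1, 0.2, 0.4, 1.0}',               -- lexeme weights for labels D, C, B, A
                      to_tsvector('english', body),
                      to_tsquery('english', 'foo & bar'),
                      2 | 8)                                 -- combine two normalization methods
    FROM documents;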
single-byte encodings, so we should have snowball for every encoding.
I hope this finalizes the multibyte support work in tsearch2, but testing is needed...
- supports multibyte encodings
- more strict rules for lexemes
- flex isn't used
Add:
- tsquery plainto_tsquery(text)
The function makes a tsquery from plain text.
- &&, ||, !! operations on tsquery, for combining a tsquery
  from its parts: 'foo & bar' || 'asd' => 'foo & bar | asd'
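A hedged sketch of these constructors and operators (an English configuration is assumed; results are shown loosely, as in the example above):

    SELECT plainto_tsquery('english', 'the quick brown fox');  -- => 'quick & brown & fox'
    SELECT to_tsquery('foo & bar') || to_tsquery('asd');       -- => 'foo & bar | asd'
    SELECT to_tsquery('foo') && to_tsquery('bar');             -- => 'foo & bar'
    SELECT !! to_tsquery('foo');                               -- => '!foo'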
1 Comparison operation for tsquery
2 Btree index on tsquery
3 numnode(tsquery) - returns 'length' of tsquery
4 tsquery @ tsquery, tsquery ~ tsquery - contains / is-contained-in for tsquery.
  Note: they don't guarantee an exact result, only MAY BE, so they are
  useful only for speeding up the rewrite functions
5 GiST index support for @,~
6 rewrite():
select rewrite(orig, what, to);
select rewrite(ARRAY[orig, what, to]) from tsquery_table;
select rewrite(orig, 'select what, to from tsquery_table;');
7 significantly improve cover algorithm
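A hedged sketch of items 3, 4, and 6; tsquery_table and its columns are hypothetical, and in current core releases rewrite() is spelled ts_rewrite() while @ and ~ became @> and <@:

    SELECT numnode(to_tsquery('foo & bar'));             -- 3 nodes: two lexemes plus one '&'
    SELECT to_tsquery('foo & bar') @ to_tsquery('foo');  -- inexact "contains" test
    SELECT rewrite(to_tsquery('a & b'),
                   to_tsquery('a'),
                   to_tsquery('new | thing'));           -- 'a' is replaced by '( new | thing )'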
literally.
Add GUC variables:
"escape_string_warning" - warn about backslashes in non-E strings
"escape_string_syntax" - supports E'' syntax?
"standard_compliant_strings" - treats backslashes literally in ''
Update code to use E'' when escapes are used.
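A brief sketch of the warning GUC and the E'' syntax, assuming the pre-standard behavior in which plain '' strings still process backslashes:

    SET escape_string_warning = on;
    SELECT E'line one\nline two';   -- E'' string: the \n escape is always processed
    SELECT 'line one\nline two';    -- non-E string: triggers the new warning about the backslash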
The 'word' variable there is initialised from
the prs->words array, but immediately after,
that array may be reallocated, thus leaving
word pointing to unallocated memory.