It required some changes in lexize algorithm, but interface with
dictionaries stays compatible with old dictionaries.
Funded by Georgia Public Library Service and LibLime, Inc.
1) rank_cd now use weight of lexemes
2) rank_cd and rank can use any combination of normalization methods:
no normalization
normalization by log(length of document)
-----/------- by length of document
-----/------- by number of unique word in document
-----/------- by log(number of unique word in document)
-----/------- by number of covers (only rank_cd)
Improve cover's search.
TODO: changes in documentation
singlebyte encodings, so we should have snowball for every encodings.
I hope that finalize multibyte support work in tsearch2, but testing is needed...
- supports multibyte encodings
- more strict rules for lexemes
- flex isn't used
Add:
- tsquery plainto_tsquery(text)
Function makes tsquery from plain text.
- &&, ||, !! operation for tsquery for combining
tsquery from it's parts: 'foo & bar' || 'asd' => 'foo & bar | asd'
1 Comparison operation for tsquery
2 Btree index on tsquery
3 numnode(tsquery) - returns 'length' of tsquery
4 tsquery @ tsquery, tsquery ~ tsquery - contains, contained for tsquery.
Note: They don't gurantee exact result, only MAY BE, so it
useful only for speed up rewrite functions
5 GiST index support for @,~
6 rewrite():
select rewrite(orig, what, to);
select rewrite(ARRAY[orig, what, to]) from tsquery_table;
select rewrite(orig, 'select what, to from tsquery_table;');
7 significantly improve cover algorithm
that return INTERNAL without also having INTERNAL arguments. Since the
functions in question aren't meant to be called by hand anyway, I just
redeclared them to take 'internal' instead of 'text'. Also add code
to ProcedureCreate() to enforce the restriction, as I should have done
to start with :-(
boxes. Change interface to user-defined GiST support methods union and
picksplit. Now instead of bytea struct it used special GistEntryVector
structure.
Bug reference: 1081
Logged by: Aarjav Trivedi
Email address: aarjav@cc.gatech.edu
PostgreSQL version: 7.4
Operating system: Linux
Description: Spelling error in tsearch2.sql leading to problems
with
tsearch
Details:
On line 620 of tsearch2.sql which is required to install and run
TSEARCH,
REATE FUNCTION tsstat_in(cstring)
should be
CREATE FUNCTION tsstat_in(cstring)
because of this error, TSEARCH fails to work as specified,