Commit Graph

14 Commits

Author SHA1 Message Date
Bruce Momjian
f99a569a2e pgindent run for 8.2. 2006-10-04 00:30:14 +00:00
Tom Lane
ae643747b1 Fix a passel of recently-committed violations of the rule 'thou shalt
have no other gods before c.h'.  Also remove some demonstrably redundant
#include lines, mostly of <errno.h> which was added to c.h years ago.
2006-07-14 05:28:29 +00:00
Teodor Sigaev
04e9704b9e Now ispell dictionary can eat dictionaries in MySpell format,
used by OpenOffice. Dictionaries are placed at
http://lingucomponent.openoffice.org/spell_dic.html
Dictionary automatically recognizes format of files.

Warning. MySpell's format has limitation with compound
word support: it's impossible to mark affix as
compound-only affix. So for norwegian, german etc
languages it's recommended to use original ispell format.
For that reason I don't want to remove my2ispell
scripts, it's has workaround at least for norwegian language.
2006-06-09 13:25:59 +00:00
Teodor Sigaev
46a25ce6a9 1 Fix bug with very short word: prefix and suffix might be overlapped,
sorry but fix can't be applyed to previous version: it's require
  refill tsvector...
2 Small optimize of load time for huge dictionaries
3 use palloc instead of malloc during load dict file
2006-02-09 18:04:20 +00:00
Bruce Momjian
1dc3498251 Standard pgindent run for 8.1. 2005-10-15 02:49:52 +00:00
Teodor Sigaev
324300bc7c improve support of agglutinative languages (query with compound words).
regression=# select to_tsquery( '\'fotballklubber\'');
                   to_tsquery
------------------------------------------------
 'fotball' & 'klubb' | 'fot' & 'ball' & 'klubb'
(1 row)

So, changed interface to dictionaries, lexize method of dictionary shoud return
pointer to aray of TSLexeme structs instead of char**. Last element should
have TSLexeme->lexeme == NULL.

typedef struct {
        /* number of variant of split word , for example
                Word 'fotballklubber' (norwegian) has two varian to split:
                ( fotball, klubb ) and ( fot, ball, klubb ). So, dictionary
                should return:
                nvariant        lexeme
                1               fotball
                1               klubb
                2               fot
                2               ball
                2               klubb

        */
        uint16  nvariant;

        /* currently unused */
        uint16  flags;

        /* C-string */
        char    *lexeme;
} TSLexeme;
2005-01-25 15:24:38 +00:00
Teodor Sigaev
5b354d2c7e Fixes:
1 Report error message instead of do nothing in case of error in regex
2 Malloced storage for mask, find and repl part of Affix. This parts may be
  large enough in real life (for example in czech, thanks to moje <moje@kalhotky.net>)
2005-01-11 16:07:55 +00:00
Bruce Momjian
b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Teodor Sigaev
de55c0cef6 1 Fix affixes with void replacement (AFAIK, it's only russian)
2 Optimize regex execution
2004-06-23 11:06:11 +00:00
Teodor Sigaev
11864ab657 Win32 related patch by Darko Prenosil. Small correct by teodor 2004-05-31 13:29:43 +00:00
Teodor Sigaev
565dc5d1ae Fix integer types to use definition from c.h. Per bug report by Patrick Boulay <patrick.boulay@medrium.com> 2003-12-10 15:54:58 +00:00
Teodor Sigaev
c63c1946a2 Optimize. Improve ispell support for compound words. This work was sponsored by ABC Startsiden AS. 2003-11-17 17:34:35 +00:00
Bruce Momjian
089003fb46 pgindent run. 2003-08-04 00:43:34 +00:00
Teodor Sigaev
b88605337e tsearch2 module 2003-07-21 10:27:44 +00:00