Commit Graph

78 Commits

Author SHA1 Message Date
Teodor Sigaev
324300bc7c improve support of agglutinative languages (query with compound words).
regression=# select to_tsquery( '\'fotballklubber\'');
                   to_tsquery
------------------------------------------------
 'fotball' & 'klubb' | 'fot' & 'ball' & 'klubb'
(1 row)

So, changed interface to dictionaries, lexize method of dictionary shoud return
pointer to aray of TSLexeme structs instead of char**. Last element should
have TSLexeme->lexeme == NULL.

typedef struct {
        /* number of variant of split word , for example
                Word 'fotballklubber' (norwegian) has two varian to split:
                ( fotball, klubb ) and ( fot, ball, klubb ). So, dictionary
                should return:
                nvariant        lexeme
                1               fotball
                1               klubb
                2               fot
                2               ball
                2               klubb

        */
        uint16  nvariant;

        /* currently unused */
        uint16  flags;

        /* C-string */
        char    *lexeme;
} TSLexeme;
2005-01-25 15:24:38 +00:00
Teodor Sigaev
fe30edbab8 Change
typedef struct {} WordEntryPos;
to
typedef uint16 WordEntryPos
according to http://www.pgsql.ru/db/mw/msg.html?mid=2035188

Require re-fill all tsvector fields and reindex tsvector indexes.
2005-01-25 12:36:25 +00:00
Teodor Sigaev
5b354d2c7e Fixes:
1 Report error message instead of do nothing in case of error in regex
2 Malloced storage for mask, find and repl part of Affix. This parts may be
  large enough in real life (for example in czech, thanks to moje <moje@kalhotky.net>)
2005-01-11 16:07:55 +00:00
Neil Conway
1b3e769682 This patch makes some cleanups to contrib/ to silence some sparse
warnings:

- remove pointless "extern" keyword from some function definitions in
contrib/tsearch2

- use "NULL" not "0" as NULL pointer in contrib/tsearch,
contrib/tsearch2, contrib/pgbench, and contrib/vacuumlo
2004-11-09 06:09:40 +00:00
Tom Lane
0636d55843 Fix some more 'old-style parameter declaration' warnings. 2004-10-25 02:30:29 +00:00
Tom Lane
3fdd33ab99 Avoid macro-redefinition warnings on Windows, per Andrew Dunstan. 2004-10-21 19:49:27 +00:00
Tom Lane
380bd04c16 Standardize on using the Min, Max, and Abs macros that are in our c.h file,
getting rid of numerous ad-hoc versions that have popped up in various
places.  Shortens code and avoids conflict with Windows min() and max()
macros.
2004-10-21 19:28:36 +00:00
Tom Lane
d3e36da789 Make the standard stopword files be sought relative to share_dir, so
that a tsearch2 installation can be relocatable.
2004-10-17 23:09:31 +00:00
Tom Lane
b04e70b11e Adjust tsearch2.sql to avoid use of COPY FROM STDIN, so as to
simplify life for the Win32 installer.  Per Dave Page.
2004-09-14 03:58:54 +00:00
Tom Lane
b2c4071299 Redesign query-snapshot timing so that volatile functions in READ COMMITTED
mode see a fresh snapshot for each command in the function, rather than
using the latest interactive command's snapshot.  Also, suppress fresh
snapshots as well as CommandCounterIncrement inside STABLE and IMMUTABLE
functions, instead using the snapshot taken for the most closely nested
regular query.  (This behavior is only sane for read-only functions, so
the patch also enforces that such functions contain only SELECT commands.)
As per my proposal of 6-Sep-2004; I note that I floated essentially the
same proposal on 19-Jun-2002, but that discussion tailed off without any
action.  Since 8.0 seems like the right place to be taking possibly
nontrivial backwards compatibility hits, let's get it done now.
2004-09-13 20:10:13 +00:00
Bruce Momjian
15d3f9f6b7 Another pgindent run with lib typedefs added. 2004-08-30 02:54:42 +00:00
Bruce Momjian
b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Bruce Momjian
ee85595d46 > Please find enclose a submission to fix these problems.
>
> The patch adds missing the "libpgport.a" file to the installation under
> "install-all-headers". It is needed by some contribs. I install the
> library in "pkglibdir", but I was wondering whether it should be "libdir"?
> I was wondering also whether it would make sense to have a "libpgport.so"?
>
> It fixes various macros which are used by contrib makefiles, especially
> libpq_*dir and LDFLAGS when used under PGXS. It seems to me that they are
> needed to
>
> It adds the ability to test and use PGXS with contribs, with "make
> USE_PGXS=1". Without the macro, this is exactly as before, there should be
> no difference, esp. wrt the vpath feature that seemed broken by previous
> submission. So it should not harm anybody, and it is useful at least to me.
>
> It fixes some inconsistencies in various contrib makefiles
> (useless override, ":=" instead of "=").

Fabien COELHO
2004-08-20 20:13:10 +00:00
Tom Lane
fcbc438727 Label CVS tip as 8.0devel instead of 7.5devel. Adjust various comments
and documentation to reference 8.0 instead of 7.5.
2004-08-04 21:34:35 +00:00
Tom Lane
959b353db2 Fix misspellings: langauge -> language. 2004-07-04 23:34:24 +00:00
Teodor Sigaev
bb89237531 1 Eliminate duplicate field HLWORD->skip
2 Rework support for html tags in parser
3 add HighlightAll to headline function for generating highlighted
  whole text with saved html tags
2004-06-28 16:19:09 +00:00
Teodor Sigaev
df9d87f608 Previous commit wasnt full... 2004-06-23 11:29:58 +00:00
Teodor Sigaev
de55c0cef6 1 Fix affixes with void replacement (AFAIK, it's only russian)
2 Optimize regex execution
2004-06-23 11:06:11 +00:00
Teodor Sigaev
09bc52fe73 Fix stupid bug in installcheck 2004-06-23 09:43:43 +00:00
Teodor Sigaev
f35e8d8431 Fix silly bug 2004-06-01 10:24:25 +00:00
Teodor Sigaev
553bc41633 1 add namespaces as Tom suggest http://www.pgsql.ru/db/mw/msg.html?mid=1987703
2 remove select qeury in inserts
2004-05-31 16:51:56 +00:00
Teodor Sigaev
7cb55d21ed Fix memory leak with pg_regexec 2004-05-31 13:55:19 +00:00
Teodor Sigaev
d222bb4d5e Fix memory leak with pg_regcomp 2004-05-31 13:52:57 +00:00
Teodor Sigaev
11864ab657 Win32 related patch by Darko Prenosil. Small correct by teodor 2004-05-31 13:29:43 +00:00
Neil Conway
72b6ad6313 Use the new List API function names throughout the backend, and disable the
list compatibility API by default. While doing this, I decided to keep
the llast() macro around and introduce llast_int() and llast_oid() variants.
2004-05-30 23:40:41 +00:00
Teodor Sigaev
a6ea6457fa Stat function now can show statistics per weight of lexemes 2004-05-28 15:36:49 +00:00
Teodor Sigaev
89a8e15671 Update docs from Andrew J. Kopciuch <akopciuch@bddf.ca> 2004-05-14 10:39:25 +00:00
Tom Lane
a90b2a035f Suppress 'uninitialized variable' warning emitted by some (not all)
versions of gcc.  The code is correct AFAICS, but it requires slightly
more analysis than usual to see that the variable can't be used uninitialized.
2004-05-07 13:09:12 +00:00
Teodor Sigaev
d1eb9fede5 Use regprocedure type instead of oid. Usefull for human read and dump/restore 2004-05-07 11:19:06 +00:00
Tom Lane
0bd61548ab Solve the 'Turkish problem' with undesirable locale behavior for case
conversion of basic ASCII letters.  Remove all uses of strcasecmp and
strncasecmp in favor of new functions pg_strcasecmp and pg_strncasecmp;
remove most but not all direct uses of toupper and tolower in favor of
pg_toupper and pg_tolower.  These functions use the same notions of
case folding already developed for identifier case conversion.  I left
the straight locale-based folding in place for situations where we are
just manipulating user data and not trying to match it to built-in
strings --- for example, the SQL upper() function is still locale
dependent.  Perhaps this will prove not to be what's wanted, but at
the moment we can initdb and pass regression tests in Turkish locale.
2004-05-07 00:24:59 +00:00
Tom Lane
89ee5b89a6 Fix some more compatibility issues (ctype.h macros must never be passed
signed chars...)
2004-04-02 00:41:18 +00:00
Tom Lane
47fe0517fc Fix some portability issues (reliance on gcc-isms). 2004-04-01 23:44:38 +00:00
Tom Lane
375369acd1 Replace TupleTableSlot convention for whole-row variables and function
results with tuples as ordinary varlena Datums.  This commit does not
in itself do much for us, except eliminate the horrid memory leak
associated with evaluation of whole-row variables.  However, it lays the
groundwork for allowing composite types as table columns, and perhaps
some other useful features as well.  Per my proposal of a few days ago.
2004-04-01 21:28:47 +00:00
Teodor Sigaev
f2c064afcb Cleanup vectors of GISTENTRY and eliminate problem with 64-bit strict-aligned
boxes. Change interface to user-defined GiST support methods union and
picksplit. Now instead of bytea struct it used special GistEntryVector
structure.
2004-03-30 15:45:33 +00:00
Teodor Sigaev
eebdfcdbe6 1 Minimize memory allocation for void (but not null) value.
2 Add silly ordering for ts_vector to aim grouping, union, except etc. Don't use BTree opclass (tsvector_ops).
2004-03-25 16:56:10 +00:00
Tom Lane
fa96a5e15b Add %option nodefault to all our flex lexers. Fix a couple of rule gaps
exposed thereby.  AFAICT these would not lead to any worse problems than
junk emitted on the backend's stdout, but we should have the option to
catch possible worse errors in future.
2004-02-24 22:06:32 +00:00
Bruce Momjian
1d567aee07 The following bug has been logged online:
Bug reference:      1081
Logged by:          Aarjav Trivedi

Email address:      aarjav@cc.gatech.edu

PostgreSQL version: 7.4

Operating system:   Linux

Description:        Spelling error in tsearch2.sql leading to problems
with
tsearch

Details:

On line 620 of tsearch2.sql which is required to install and run
TSEARCH,

REATE FUNCTION tsstat_in(cstring)

should be

CREATE FUNCTION tsstat_in(cstring)

because of this error, TSEARCH fails to work as specified,
2004-02-20 20:42:29 +00:00
Teodor Sigaev
125d69cd9b Fix signed char in comparison and check memory allocation 2003-12-18 19:27:53 +00:00
Teodor Sigaev
565dc5d1ae Fix integer types to use definition from c.h. Per bug report by Patrick Boulay <patrick.boulay@medrium.com> 2003-12-10 15:54:58 +00:00
Teodor Sigaev
a5a68766e1 One more fix confusion 2003-12-05 15:37:51 +00:00
Teodor Sigaev
8f678600c2 Avoid confusion start_parse_str function with tsearch V1 2003-12-05 14:27:42 +00:00
Teodor Sigaev
6de3fe3c0d Avoid conflict strndup with glibc 2003-12-04 12:21:11 +00:00
Teodor Sigaev
32580efafb Fix for word with several infinitives 2003-12-03 16:07:48 +00:00
PostgreSQL Daemon
969685ad44 $Header: -> $PostgreSQL Changes ... 2003-11-29 19:52:15 +00:00
Teodor Sigaev
cabdf460d3 Fix free instead of pfree 2003-11-28 12:09:02 +00:00
Teodor Sigaev
baeab89de6 Fixes about word with several infiniteve 2003-11-27 16:04:40 +00:00
Teodor Sigaev
f5e8bfb4b1 Utility for convert myspell dictionaries to ispell, full README will be later 2003-11-26 14:06:16 +00:00
Teodor Sigaev
4ca765f9aa Ignore too long lexeme 2003-11-25 13:33:15 +00:00
Teodor Sigaev
c63c1946a2 Optimize. Improve ispell support for compound words. This work was sponsored by ABC Startsiden AS. 2003-11-17 17:34:35 +00:00
Bruce Momjian
6000e32805 I've not changed any malloc/calloc to palloc. It looks to me that these memory
areas are for the lifetime of the backend and in the interests of not breaking
something that's not broken I left alone.

Note for anyone reading this and wanting it for tsearch-v2-stable (i.e. for 7.3
backend) this patch probably will not apply cleanly to that source. It should
be simple enough to see what's going on and apply the changes by hand if need
be.


--
Nigel J. Andrews
2003-09-29 18:54:38 +00:00