postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2025-03-07 19:47:50 +08:00

Author	SHA1	Message	Date
Tom Lane	ca71131eeb	Introduce t_isalnum() to replace t_isalpha() \|\| t_isdigit() tests. ts_locale.c omitted support for "isalnum" tests, perhaps on the grounds that there were initially no use-cases for that. However, both ltree and pg_trgm need such tests, and we do also have one use-case now in the core backend. The workaround of testing isalpha and isdigit separately seems quite inefficient, especially when dealing with multibyte characters; so let's fill in the missing support. Discussion: https://postgr.es/m/2548310.1664999615@sss.pgh.pa.us	2022-10-06 11:08:56 -04:00
Alexander Korotkov	935f666650	Handle equality operator in contrib/pg_trgm Obviously, in order to equality operator be satisfiable, target string must contain all the trigrams of the search string. On this base, we implement equality operator in GiST/GIN indexes with recheck. Discussion: https://postgr.es/m/CAOBaU_YWwtT7tdggtROacjdOdeYHCz-tmSwuC-j-TOG-g97J0w%40mail.gmail.com Author: Julien Rouhaud Reviewed-by: Tom Lane, Alexander Korotkov, Georgios Kokolatos, Erik Rijkers	2020-11-15 08:52:35 +03:00
Tom Lane	0da06d9faf	Get rid of trailing semicolons in C macro definitions. Writing a trailing semicolon in a macro is almost never the right thing, because you almost always want to write a semicolon after each macro call instead. (Even if there was some reason to prefer not to, pgindent would probably make a hash of code formatted that way; so within PG the rule should basically be "don't do it".) Thus, if we have a semi inside the macro, the compiler sees "something;;". Much of the time the extra empty statement is harmless, but it could lead to mysterious syntax errors at call sites. In perhaps an overabundance of neatnik-ism, let's run around and get rid of the excess semicolons whereever possible. The only thing worse than a mysterious syntax error is a mysterious syntax error that only happens in the back branches; therefore, backpatch these changes where relevant, which is most of them because most of these mistakes are old. (The lack of reported problems shows that this is largely a hypothetical issue, but still, it could bite us in some future patch.) John Naylor and Tom Lane Discussion: https://postgr.es/m/CACPNZCs0qWTqJ2QUSGJ07B7uvAvzMb-KbG2q+oo+J3tsWN5cqw@mail.gmail.com	2020-05-01 17:28:00 -04:00
Alexander Korotkov	911e702077	Implement operator class parameters PostgreSQL provides set of template index access methods, where opclasses have much freedom in the semantics of indexing. These index AMs are GiST, GIN, SP-GiST and BRIN. There opclasses define representation of keys, operations on them and supported search strategies. So, it's natural that opclasses may be faced some tradeoffs, which require user-side decision. This commit implements opclass parameters allowing users to set some values, which tell opclass how to index the particular dataset. This commit doesn't introduce new storage in system catalog. Instead it uses pg_attribute.attoptions, which is used for table column storage options but unused for index attributes. In order to evade changing signature of each opclass support function, we implement unified way to pass options to opclass support functions. Options are set to fn_expr as the constant bytea expression. It's possible due to the fact that opclass support functions are executed outside of expressions, so fn_expr is unused for them. This commit comes with some examples of opclass options usage. We parametrize signature length in GiST. That applies to multiple opclasses: tsvector_ops, gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and gist_hstore_ops. Also we parametrize maximum number of integer ranges for gist__int_ops. However, the main future usage of this feature is expected to be json, where users would be able to specify which way to index particular json parts. Catversion is bumped. Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru Author: Nikita Glukhov, revised by me Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera	2020-03-30 19:17:23 +03:00
Tom Lane	8255c7a5ee	Phase 2 pgindent run for v12. Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com	2019-05-22 13:04:48 -04:00
Teodor Sigaev	be8a7a6866	Add strict_word_similarity to pg_trgm module strict_word_similarity is similar to existing word_similarity function but it takes into account word boundaries to compute similarity. Author: Alexander Korotkov Review by: David Steele, Liudmila Mantrova, me Discussion: https://www.postgresql.org/message-id/flat/CY4PR17MB13207ED8310F847CF117EED0D85A0@CY4PR17MB1320.namprd17.prod.outlook.com	2018-03-21 14:57:42 +03:00
Tom Lane	c7b8998ebb	Phase 2 of pgindent updates. Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit `e3860ffa4d` wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:19:25 -04:00
Teodor Sigaev	f576b17cd6	Add word_similarity to pg_trgm contrib module. Patch introduces a concept of similarity over string and just a word from another string. Version of extension is not changed because 1.2 was already introduced in 9.6 release cycle, so, there wasn't a public version. Author: Alexander Korotkov, Artur Zakirov	2016-03-16 18:59:21 +03:00
Teodor Sigaev	5871b88487	GUC variable pg_trgm.similarity_threshold insead of set_limit() Use GUC variable pg_trgm.similarity_threshold insead of set_limit()/show_limit() which was introduced when defining GUC varuables by modules was absent. Author: Artur Zakirov	2016-03-16 17:44:58 +03:00
Tom Lane	09d8d110a6	Use FLEXIBLE_ARRAY_MEMBER in a bunch more places. Replace some bogus "x[1]" declarations with "x[FLEXIBLE_ARRAY_MEMBER]". Aside from being more self-documenting, this should help prevent bogus warnings from static code analyzers and perhaps compiler misoptimizations. This patch is just a down payment on eliminating the whole problem, but it gets rid of a lot of easy-to-fix cases. Note that the main problem with doing this is that one must no longer rely on computing sizeof(the containing struct), since the result would be compiler-dependent. Instead use offsetof(struct, lastfield). Autoconf also warns against spelling that offsetof(struct, lastfield[0]). Michael Paquier, review and additional fixes by me.	2015-02-20 00:11:42 -05:00
Tom Lane	6f5b8beb64	Make contrib/pg_trgm also support regex searches with GiST indexes. This wasn't addressed in the original patch, but it doesn't take very much additional code to cover the case, so let's get it done. Since pg_trgm 1.1 hasn't been released yet, I just changed the definition of what's in it, rather than inventing a 1.2.	2013-04-10 13:31:02 -04:00
Tom Lane	3ccae48f44	Support indexing of regular-expression searches in contrib/pg_trgm. This works by extracting trigrams from the given regular expression, in generally the same spirit as the previously-existing support for LIKE searches, though of course the details are far more complicated. Currently, only GIN indexes are supported. We might be able to make it work with GiST indexes later. The implementation includes adding API functions to backend/regex/ to provide a view of the search NFA created from a regular expression. These functions are meant to be generic enough to be supportable in a standalone version of the regex library, should that ever happen. Alexander Korotkov, reviewed by Heikki Linnakangas and Tom Lane	2013-04-09 01:06:54 -04:00
Peter Eisentraut	1b81c2fe6e	Remove many -Wcast-qual warnings This addresses only those cases that are easy to fix by adding or moving a const qualifier or removing an unnecessary cast. There are many more complicated cases remaining.	2011-09-11 21:54:32 +03:00
Bruce Momjian	bf50caf105	pgindent run before PG 9.1 beta 1.	2011-04-10 11:42:00 -04:00
Tom Lane	6e2f3ae884	Support LIKE and ILIKE index searches via contrib/pg_trgm indexes. Unlike Btree-based LIKE optimization, this works for non-left-anchored search patterns. The effectiveness of the search depends on how many trigrams can be extracted from the pattern. (The worst case, with no trigrams, degrades to a full-table scan, so this isn't a panacea. But it can be very useful.) Alexander Korotkov, reviewed by Jan Urbanski	2011-01-31 21:34:49 -05:00
Tom Lane	b525bf771e	Add KNNGIST support to contrib/pg_trgm. Teodor Sigaev, with some revision by Tom	2010-12-04 00:16:21 -05:00
Magnus Hagander	9f2e211386	Remove cvs keywords from all files.	2010-09-20 22:08:53 +02:00
Bruce Momjian	d747140279	8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list provided by Andrew.	2009-06-11 14:49:15 +00:00
Teodor Sigaev	b87b52bf04	Support of multibyte encoding for pg_trgm	2008-11-12 13:43:54 +00:00
Andrew Dunstan	53972b460c	Add $PostgreSQL$ markers to a lot of files that were missing them. This particular batch was just for .c and .h file. The changes were made with the following 2 commands: find . $ \( -name 'libstemmer' -o -name 'expected' -o -name 'ppport.h' $ -prune \) -o $ -name '.[ch]' $ $ -exec grep -q '\$PostgreSQL' {} \; -o -print $ \| while read file ; do head -n 1 < $file \| grep -q '^/\' && echo $file; done \| xargs -l sed -i -e '1s/^\// /' -e '1i/\n $PostgreSQL:$ \n ' find . $ \( -name 'libstemmer' -o -name 'expected' -o -name 'ppport.h' $ -prune \) -o $ -name '.[ch]' $ $ -exec grep -q '\$PostgreSQL' {} \; -o -print $ \| xargs -l sed -i -e '1i/\n $PostgreSQL:$ \n */'	2008-05-17 01:28:26 +00:00
Bruce Momjian	224f91f66d	Modify LOOPBYTE/LOOPBIT macros to be more logical; rather than have the for() body passed as a parameter, make the macros act as simple headers to code blocks. This allows pgindent to be run on these files.	2007-11-16 00:13:02 +00:00
Teodor Sigaev	15f91f2789	Add GIN support for pg_trgm. From Guillaume Smet <guillaume.smet@gmail.com> with minor editorization by me.	2007-03-14 14:15:40 +00:00
Tom Lane	9f652d430f	Fix up several contrib modules that were using varlena datatypes in not-so-obvious ways. I'm not totally sure that I caught everything, but at least now they pass their regression tests with VARSIZE/SET_VARSIZE defined to reverse byte order.	2007-02-28 22:44:38 +00:00
Tom Lane	ae643747b1	Fix a passel of recently-committed violations of the rule 'thou shalt have no other gods before c.h'. Also remove some demonstrably redundant #include lines, mostly of <errno.h> which was added to c.h years ago.	2006-07-14 05:28:29 +00:00
Bruce Momjian	ac230e7431	Alphabetically order reference to include files, "S"-"Z".	2006-07-11 18:26:11 +00:00
Tom Lane	33feb55c47	Replace bitwise looping with bytewise looping in hemdistsign and sizebitvec of tsearch2, as well as identical code in several other contrib modules. This provided about a 20X speedup in building a large tsearch2 index ... didn't try to measure its effects for other operations. Thanks to Stephan Vollmer for providing a test case.	2006-01-20 22:46:16 +00:00
Bruce Momjian	b6b71b85bc	Pgindent run for 8.0.	2004-08-29 05:07:03 +00:00
Teodor Sigaev	cbfa4092bb	trgm - Trigram matching for PostgreSQL -------------------------------------- The pg_trgm contrib module provides functions and index classes for determining the similarity of text based on trigram matching.	2004-05-31 17:18:12 +00:00

28 Commits