The old algorithm turned out not to be the usual CRC-32 algorithm used by
Ethernet et al. We were using a non-reflected lookup table with code meant
for a reflected lookup table. That's a strange combination that, as far as
I can see, does not correspond to any bit-wise CRC calculation, which makes
it difficult to reason about its properties. Although it has worked well in
practice, it seems safer to use a well-known algorithm.
Since we're changing the algorithm anyway, we might as well choose a
different polynomial. The Castagnoli polynomial has better error-detecting
properties than the traditional CRC-32 polynomial, even compared with a
correct implementation of the latter. Another reason for picking it is that
some new
CPUs have hardware support for calculating CRC-32C, but not CRC-32, let
alone our strange variant of it. This patch doesn't add any support for such
hardware, but a future patch could now do that.
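For reference, a minimal sketch of a conventional reflected, table-driven
CRC-32C computation; the Castagnoli polynomial in reflected form is
0x82F63B78, and the names below are illustrative, not the PostgreSQL
implementation:

    #include <stdint.h>
    #include <stddef.h>

    /* Reflected form of the Castagnoli polynomial 0x1EDC6F41. */
    #define CRC32C_POLY_REFLECTED 0x82F63B78U

    static uint32_t crc32c_table[256];

    /* Build the byte-at-a-time lookup table for the reflected algorithm. */
    static void
    crc32c_init_table(void)
    {
        for (uint32_t i = 0; i < 256; i++)
        {
            uint32_t crc = i;

            for (int bit = 0; bit < 8; bit++)
                crc = (crc & 1) ? (crc >> 1) ^ CRC32C_POLY_REFLECTED : crc >> 1;
            crc32c_table[i] = crc;
        }
    }

    /* Reflected CRC: shift right and index the table with the low byte. */
    static uint32_t
    crc32c(const void *data, size_t len)
    {
        const unsigned char *p = data;
        uint32_t    crc = 0xFFFFFFFFU;

        while (len-- > 0)
            crc = (crc >> 8) ^ crc32c_table[(crc ^ *p++) & 0xFF];
        return crc ^ 0xFFFFFFFFU;
    }

The well-known check value CRC-32C("123456789") = 0xE3069283 is a handy
sanity test for any implementation.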
The old algorithm is kept around for tsquery and pg_trgm, which use the
values in indexes that need to remain compatible so that pg_upgrade works.
While we're at it, share the old lookup table for CRC-32 calculation
between hstore, ltree and core. They all use the same table, so we might
as well.
Because of gcc -Wmissing-prototypes, all functions in dynamically
loadable modules must have a separate prototype declaration. The warning
is meant to detect global functions that are not declared in header files,
but in cases where a function is only called via dfmgr, such a declaration
is redundant. Besides filling up space with boilerplate, this requirement
is a frequent source of compiler warnings in extension modules.
We can fix that by creating the function prototype as part of the
PG_FUNCTION_INFO_V1 macro, which such modules have to use anyway. That
makes the code of modules cleaner, because there is one less place where
the entry points have to be listed, and creates an additional check that
functions have the right prototype.
Remove the now-redundant prototypes from contrib and other modules.
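For illustration, this is what a module function looks like once the macro
also emits the prototype; add_one is just an example name, not code from
any particular module:

    #include "postgres.h"
    #include "fmgr.h"

    PG_MODULE_MAGIC;

    /*
     * PG_FUNCTION_INFO_V1 now also declares
     * "extern Datum add_one(PG_FUNCTION_ARGS);", so no separate prototype
     * is needed before the definition below.
     */
    PG_FUNCTION_INFO_V1(add_one);

    Datum
    add_one(PG_FUNCTION_ARGS)
    {
        int32       arg = PG_GETARG_INT32(0);

        PG_RETURN_INT32(arg + 1);
    }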
Allow for the possibility that folding a string to lower case makes it
longer (due to replacing a character with a longer multibyte character).
This doesn't change the number of trigrams that will be extracted, but
it does affect the required size of an intermediate buffer in
generate_trgm(). Per bug #8821 from Ufuk Kayserilioglu.
Also install some checks that the input string length is not so large
as to cause overflow in the calculations of palloc request sizes.
Back-patch to all supported versions.
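A hedged sketch of the kind of guard involved, assuming a worst case of one
maximum-length multibyte character per input character; the helper name and
the exact limit test are illustrative, not the actual generate_trgm() code:

    #include "postgres.h"
    #include "mb/pg_wchar.h"        /* pg_database_encoding_max_length() */
    #include "utils/memutils.h"     /* MaxAllocSize */

    /* Illustrative helper: allocate a case-folding buffer whose size
     * computation cannot overflow the palloc request. */
    static char *
    alloc_folded_buffer(int slen)
    {
        int     max_growth = pg_database_encoding_max_length();

        /* Refuse inputs so long that slen * max_growth could overflow. */
        if ((Size) slen > (MaxAllocSize - 1) / max_growth)
            ereport(ERROR,
                    (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                     errmsg("out of memory")));

        return palloc((Size) slen * max_growth + 1);
    }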
This wasn't addressed in the original patch, but it doesn't take very
much additional code to cover the case, so let's get it done.
Since pg_trgm 1.1 hasn't been released yet, I just changed the definition
of what's in it, rather than inventing a 1.2.
This works by extracting trigrams from the given regular expression, in
much the same spirit as the pre-existing support for LIKE searches, though
of course the details are far more complicated.
Currently, only GIN indexes are supported. We might be able to make
it work with GiST indexes later.
The implementation includes adding API functions to backend/regex/
to provide a view of the search NFA created from a regular expression.
These functions are meant to be generic enough to be supportable in
a standalone version of the regex library, should that ever happen.
Alexander Korotkov, reviewed by Heikki Linnakangas and Tom Lane
contrib/pg_trgm's make_trigrams() was coded to ignore multibyte character
boundaries and just make trigrams from bytes if USE_WIDE_UPPER_LOWER wasn't
defined. This is a bit odd, since there's no obvious reason why trigram
compaction rules should depend on the presence of towlower() and friends.
What's more, there was an Assert() that would fail if that code path was
fed any multibyte characters.
We need to do something about this since the pending regex-indexing patch
has an assumption that you get just one "trgm" from any three characters.
The best solution seems to be to remove the USE_WIDE_UPPER_LOWER
dependency, which shouldn't really have been there in the first place.
The second loop in make_trigrams() is now just a fast path and not a
potentially incompatible algorithm.
If there is anybody still using Postgres on machines without wcstombs() or
towlower(), and they have non-ASCII data indexed by pg_trgm, they'll need
to REINDEX those indexes after pg_upgrade to 9.3, else searches may return
incorrect results. It seems likely that there are no such installations,
though.
In passing, rename cnt_trigram to compact_trigram, which seems to better
describe its functionality, and improve make_trigrams' test for whether it
has to use the slow path or not (per a suggestion from Alexander Korotkov).
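A hedged sketch of character-based (rather than byte-based) trigram
extraction, stepping by pg_mblen(); the function names are illustrative
stand-ins for make_trigrams()/compact_trigram(), not the actual code:

    #include "postgres.h"
    #include "mb/pg_wchar.h"        /* pg_mblen() */

    /* Illustrative stand-in for compact_trigram(): consume one window of
     * exactly three characters (bytelen bytes), produce one trigram. */
    static void
    sketch_compact_trigram(const char *start, int bytelen)
    {
        (void) start;
        (void) bytelen;
    }

    /* Walk the string by whole multibyte characters, so any three
     * consecutive characters yield exactly one trigram. */
    static int
    sketch_make_trigrams(const char *str, int len)
    {
        const char *boundary[3] = {NULL, NULL, NULL};
        const char *p = str;
        const char *endp = str + len;
        int         ntrigrams = 0;

        while (p < endp)
        {
            int     clen = pg_mblen(p);     /* bytes in this character */

            boundary[0] = boundary[1];
            boundary[1] = boundary[2];
            boundary[2] = p;

            if (boundary[0] != NULL)        /* full three-character window */
            {
                sketch_compact_trigram(boundary[0],
                                       (int) ((p + clen) - boundary[0]));
                ntrigrams++;
            }
            p += clen;
        }
        return ntrigrams;
    }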
Cases such as similarity('', '') produced a NaN result due to computing
0/0. Per discussion, make it return zero instead.
This appears to be the basic cause of bug #7867 from Michele Baravalle,
although it remains unclear why her installation doesn't think Cyrillic
letters are letters.
Back-patch to all active branches.
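A minimal sketch of the guarded computation; sketch_similarity is an
illustrative stand-in, not the actual cnt_sml() code:

    /* Trigram-set similarity as shared / (len1 + len2 - shared), returning
     * 0 instead of computing 0/0 = NaN when neither input has trigrams. */
    static float
    sketch_similarity(int shared, int len1, int len2)
    {
        int     union_size = len1 + len2 - shared;

        if (union_size == 0)
            return 0.0f;            /* e.g. similarity('', '') */
        return (float) shared / (float) union_size;
    }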
Extraction of trigrams did not process LIKE escape sequences properly, so
trigrams near an escape could be misidentified, resulting in incorrect
index search results.
Fujii Masao
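A hedged, byte-oriented sketch of the pattern splitting involved (multibyte
handling omitted); the helper name is hypothetical, not pg_trgm's actual
wildcard-handling code:

    #include <stddef.h>

    /* Copy the next literal segment of a LIKE pattern into buf, honoring
     * backslash escapes and treating unescaped '%' and '_' as segment
     * boundaries.  Returns the number of pattern bytes consumed. */
    static size_t
    next_literal_segment(const char *pattern, char *buf, size_t buflen)
    {
        size_t  in = 0;
        size_t  out = 0;

        if (buflen == 0)
            return 0;

        /* skip leading wildcards */
        while (pattern[in] == '%' || pattern[in] == '_')
            in++;

        while (pattern[in] != '\0' && out + 1 < buflen)
        {
            char    c = pattern[in];

            if (c == '\\')              /* escape: next byte is literal */
            {
                in++;
                if (pattern[in] == '\0')
                    break;
                c = pattern[in];
            }
            else if (c == '%' || c == '_')      /* wildcard ends segment */
                break;

            buf[out++] = c;
            in++;
        }
        buf[out] = '\0';
        return in;
    }

The idea is that trigrams are extracted only from the literal parts of the
pattern, so the escape byte itself must not leak into a trigram.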
Unlike Btree-based LIKE optimization, this works for non-left-anchored
search patterns. The effectiveness of the search depends on how many
trigrams can be extracted from the pattern. (The worst case, with no
trigrams, degrades to a full-table scan, so this isn't a panacea. But
it can be very useful.)
Alexander Korotkov, reviewed by Jan Urbanski
ways. I'm not totally sure that I caught everything, but at least now they pass
their regression tests with VARSIZE/SET_VARSIZE defined to reverse byte order.
Get rid of VARATT_SIZE and VARATT_DATA, which were simply redundant with
VARSIZE and VARDATA, and as a consequence almost no code was using the
longer names. Rename the length fields of struct varlena and various
derived structures to catch anyplace that was accessing them directly;
and clean up various places so caught. In itself this patch doesn't
change any behavior at all, but it is necessary infrastructure if we hope
to play any games with the representation of varlena headers.
Greg Stark and Tom Lane
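For illustration, the macro-based varlena access pattern these changes push
code toward, using VARSIZE/VARDATA/SET_VARSIZE rather than touching the
length field directly; make_text_copy is just an example name:

    #include "postgres.h"

    /* Build a text datum going only through the official macros. */
    static text *
    make_text_copy(const char *str, int len)
    {
        text   *result = (text *) palloc(VARHDRSZ + len);

        SET_VARSIZE(result, VARHDRSZ + len);    /* total size incl. header */
        memcpy(VARDATA(result), str, len);      /* payload after the header */
        return result;
    }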
suffix, to distinguish them from doubles. Make some function declarations
and definitions use the "const" qualifier for arguments consistently.
Ignore warning 4102 ("unreferenced label"), because such warnings
are always emitted by bison-generated code. Patch from Magnus Hagander.
more compliant with the error message style guide. In particular,
errdetail should begin with a capital letter and end with a period,
whereas errmsg should not. I also fixed a few related issues in
passing, such as fixing the repeated misspelling of "lexeme" in
contrib/tsearch2 (per Tom's suggestion).
--------------------------------------
The pg_trgm contrib module provides functions and index classes
for determining the similarity of text based on trigram
matching.