postgresql

mirror of https://git.postgresql.org/git/postgresql.git synced 2025-02-11 19:20:40 +08:00

Author	SHA1	Message	Date
Tom Lane	a5322ca10f	Make contrib/unaccent's unaccent() function work when not in search path. Since the fixes for CVE-2018-1058, we've advised people to schema-qualify function references in order to fix failures in code that executes under a minimal search_path setting. However, that's insufficient to make the single-argument form of unaccent() work, because it looks up the "unaccent" text search dictionary using the search path. The most expedient answer seems to be to remove the search_path dependency by making it look in the same schema that the unaccent() function itself is declared in. This will definitely work for the normal usage of this function with the unaccent dictionary provided by the extension. It's barely possible that there are people who were relying on the search-path-dependent behavior to select other dictionaries with the same name; but if there are any such people at all, they can still get that behavior by writing unaccent('unaccent', ...), or possibly unaccent('unaccent'::text::regdictionary, ...) if the lookup has to be postponed to runtime. Per complaint from Gunnlaugur Thor Briem. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAPs+M8LCex6d=DeneofdsoJVijaG59m9V0ggbb3pOH7hZO4+cQ@mail.gmail.com	2018-09-06 10:49:45 -04:00
Thomas Munro	5e8d670c31	Add Greek characters to unaccent.rules. Author: Tasos Maschalidis Reviewed-by: Michael Paquier, Tom Lane Discussion: https://postgr.es/m/153495048900.1368.11566580687623014380%40wrigleys.postgresql.org Discussion: https://postgr.es/m/VI1PR01MB38537EBD529FE5EE3FE9A5FEB5370%40VI1PR01MB3853.eurprd01.prod.exchangelabs.com	2018-09-02 07:12:24 +12:00
Tom Lane	fb8697b31a	Avoid unnecessary use of pg_strcasecmp for already-downcased identifiers. We have a lot of code in which option names, which from the user's viewpoint are logically keywords, are passed through the grammar as plain identifiers, and then matched to string literals during command execution. This approach avoids making words into lexer keywords unnecessarily. Some places matched these strings using plain strcmp, some using pg_strcasecmp. But the latter should be unnecessary since identifiers would have been downcased on their way through the parser. Aside from any efficiency concerns (probably not a big factor), the lack of consistency in this area creates a hazard of subtle bugs due to different places coming to different conclusions about whether two option names are the same or different. Hence, standardize on using strcmp() to match any option names that are expected to have been fed through the parser. This does create a user-visible behavioral change, which is that while formerly all of these would work: alter table foo set (fillfactor = 50); alter table foo set (FillFactor = 50); alter table foo set ("fillfactor" = 50); alter table foo set ("FillFactor" = 50); now the last case will fail because that double-quoted identifier is different from the others. However, none of our documentation says that you can use a quoted identifier in such contexts at all, and we should discourage doing so since it would break if we ever decide to parse such constructs as true lexer keywords rather than poor man's substitutes. So this shouldn't create a significant compatibility issue for users. Daniel Gustafsson, reviewed by Michael Paquier, small changes by me Discussion: https://postgr.es/m/29405B24-564E-476B-98C0-677A29805B84@yesql.se	2018-01-26 18:25:14 -05:00
Bruce Momjian	9d4649ca49	Update copyright for 2018 Backpatch-through: certain files through 9.3	2018-01-02 23:30:12 -05:00
Peter Eisentraut	0e1539ba0d	Add some const decorations to prototypes Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>	2017-11-10 13:38:57 -05:00
Tom Lane	ec0a69e49b	Extend the default rules file for contrib/unaccent with Vietnamese letters. Improve generate_unaccent_rules.py to handle composed characters whose base is another composed character rather than a plain letter. The net effect of this is to add a bunch of multi-accented Vietnamese characters to unaccent.rules. Original complaint from Kha Nguyen, diagnosis of the script's shortcoming by Thomas Munro. Dang Minh Huong and Michael Paquier Discussion: https://postgr.es/m/CALo3sF6EC8cy1F2JUz=GRf5h4LMUJTaG3qpdoiLrNbWEXL-tRg@mail.gmail.com	2017-08-16 16:51:56 -04:00
Tom Lane	382ceffdf7	Phase 3 of pgindent updates. Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:35:54 -04:00
Noah Misch	3a0d473192	Use wrappers of PG_DETOAST_DATUM_PACKED() more. This makes almost all core code follow the policy introduced in the previous commit. Specific decisions: - Text search support functions with char* and length arguments, such as prsstart and lexize, may receive unaligned strings. I doubt maintainers of non-core text search code will notice. - Use plain VARDATA() on values detoasted or synthesized earlier in the same function. Use VARDATA_ANY() on varlenas sourced outside the function, even if they happen to always have four-byte headers. As an exception, retain the universal practice of using VARDATA() on return values of SendFunctionCall(). - Retain PG_GETARG_BYTEA_P() in pageinspect. (Page images are too large for a one-byte header, so this misses no optimization.) Sites that do not call get_page_from_raw() typically need the four-byte alignment. - For now, do not change btree_gist. Its use of four-byte headers in memory is partly entangled with storage of 4-byte headers inside GBT_VARKEY, on disk. - For now, do not change gtrgm_consistent() or gtrgm_distance(). They incorporate the varlena header into a cache, and there are multiple credible implementation strategies to consider.	2017-03-12 19:35:34 -04:00
Peter Eisentraut	f21a563d25	Move some things from builtins.h to new header files This avoids that builtins.h has to include additional header files.	2017-01-20 20:29:53 -05:00
Bruce Momjian	1d25779284	Update copyright via script for 2017	2017-01-03 13:48:53 -05:00
Robert Haas	202ac08c08	Update unaccent extension for parallel query. All functions provided by this extension are PARALLEL SAFE. Andreas Karlsson	2016-06-14 14:55:49 -04:00
Teodor Sigaev	ce91b9209f	fix typo in comment	2016-03-16 17:18:14 +03:00
Teodor Sigaev	9a206d063c	Improve script generating unaccent rules Script now use the standard Unicode transliterator Latin-ASCII. Author: Leonard Benedetti	2016-03-16 16:47:03 +03:00
Bruce Momjian	ee94300446	Update copyright for 2016 Backpatch certain files through 9.1	2016-01-02 13:33:40 -05:00
Teodor Sigaev	1bbd52cb9a	Make unaccent handle all diacritics known to Unicode, and expand ligatures correctly Add Python script for buiding unaccent.rules from Unicode data. Don't backpatch because unaccent changes may require tsvector/index rebuild. Thomas Munro <thomas.munro@enterprisedb.com>	2015-09-04 12:51:53 +03:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Andres Freund	d153b80161	Fix typos in some error messages thrown by extension scripts when fed to psql. Some of the many error messages introduced in `458857cc` missed 'FROM unpackaged'. Also `e016b724` and `45ffeb7e` forgot to quote extension version numbers. Backpatch to 9.1, just like `458857cc` which introduced the messages. Do so because the error messages thrown when the wrong command is copy & pasted aren't easy to understand.	2014-08-25 18:30:37 +02:00
Noah Misch	0ffc201a51	Add file version information to most installed Windows binaries. Prominent binaries already had this metadata. A handful of minor binaries, such as pg_regress.exe, still lack it; efforts to eliminate such exceptions are welcome. Michael Paquier, reviewed by MauMau.	2014-07-14 14:07:52 -04:00
Tom Lane	5a421a47eb	Fix inadequately-sized output buffer in contrib/unaccent. The output buffer size in unaccent_lexize() was calculated as input string length times pg_database_encoding_max_length(), which effectively assumes that replacement strings aren't more than one character. While that was all that we previously documented it to support, the code actually has always allowed replacement strings of arbitrary length; so if you tried to make use of longer strings, you were at risk of buffer overrun. To fix, use an expansible StringInfo buffer instead of trying to determine the maximum space needed a-priori. This would be a security issue if unaccent rules files could be installed by unprivileged users; but fortunately they can't, so in the back branches the problem can be labeled as improper configuration by a superuser. Nonetheless, a memory stomp isn't a nice way of reacting to improper configuration, so let's back-patch the fix.	2014-07-01 11:23:21 -04:00
Tom Lane	03a25cec8d	Issue a WARNING about invalid rule file format in contrib/unaccent. We were already issuing a WARNING, albeit only elog not ereport, for duplicate source strings; so warning rather than just being stoically silent seems like the best thing to do here. Arguably both of these complaints should be upgraded to ERRORs, but that might be more behavioral change than people want. Note: the faulty line is already printed via an errcontext hook, so there's no need for more information than these messages provide.	2014-06-30 22:03:37 -04:00
Tom Lane	1b2488731c	Allow multi-character source strings in contrib/unaccent. This could be useful in languages where diacritic signs are represented as separate characters; more generally it supports using unaccent dictionaries for substring substitutions beyond narrowly conceived "diacritic removal". In any case, since the rule-file parser doesn't complain about multi-character source strings, it behooves us to do something unsurprising with them.	2014-06-30 21:46:29 -04:00
Tom Lane	97c40ce614	Allow empty replacement strings in contrib/unaccent. This is useful in languages where diacritic signs are represented as separate characters; it's also one step towards letting unaccent be used for arbitrary substring substitutions. In passing, improve the user documentation for unaccent, which was sadly vague about some important details. Mohammad Alhashash, reviewed by Abhijit Menon-Sen	2014-06-30 20:51:30 -04:00
Peter Eisentraut	e7128e8dbb	Create function prototype as part of PG_FUNCTION_INFO_V1 macro Because of gcc -Wmissing-prototypes, all functions in dynamically loadable modules must have a separate prototype declaration. This is meant to detect global functions that are not declared in header files, but in cases where the function is called via dfmgr, this is redundant. Besides filling up space with boilerplate, this is a frequent source of compiler warnings in extension modules. We can fix that by creating the function prototype as part of the PG_FUNCTION_INFO_V1 macro, which such modules have to use anyway. That makes the code of modules cleaner, because there is one less place where the entry points have to be listed, and creates an additional check that functions have the right prototype. Remove now redundant prototypes from contrib and other modules.	2014-04-18 00:03:19 -04:00
Bruce Momjian	7e04792a1c	Update copyright for 2014 Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.	2014-01-07 16:05:30 -05:00
Bruce Momjian	0dbf9a6a91	unaccent: Revert patch `9299f61798` The reverted patch to change functions from strict to immutable was incorrect and needs additional research.	2013-11-18 15:54:34 -05:00
Bruce Momjian	9299f61798	unaccent: mark unaccent() functions as immutable Suggestion from Pavel Stehule	2013-10-08 12:20:36 -04:00
Bruce Momjian	9af4159fce	pgindent run for release 9.3 This is the first run of the Perl-based pgindent script. Also update pgindent instructions.	2013-05-29 16:58:43 -04:00
Heikki Linnakangas	4b06c1820a	The data structure used in unaccent is a trie, not suffix tree. Fix the term used in variable and struct names, and comments. Alexander Korotkov	2013-05-08 20:58:50 +03:00
Bruce Momjian	bd61a623ac	Update copyrights for 2013 Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.	2013-01-01 17:15:01 -05:00
Peter Eisentraut	48658a1b81	Fix some typos Josh Kupershmidt	2012-04-22 19:23:47 +03:00
Bruce Momjian	e126958c2e	Update copyright notices for year 2012.	2012-01-01 18:01:58 -05:00
Tom Lane	ced3a93ccb	Fix assorted bugs in contrib/unaccent's configuration file parsing. Make it use t_isspace() to identify whitespace, rather than relying on sscanf which is known to get it wrong on some platform/locale combinations. Get rid of fixed-size buffers. Make it actually continue to parse the file after ignoring a line with untranslatable characters, as was obviously intended. The first of these issues is per gripe from J Smith, though not exactly either of his proposed patches.	2011-11-07 11:50:18 -05:00
Tom Lane	458857cc9d	Throw a useful error message if an extension script file is fed to psql. We have seen one too many reports of people trying to use 9.1 extension files in the old-fashioned way of sourcing them in psql. Not only does that usually not work (due to failure to substitute for MODULE_PATHNAME and/or @extschema@), but if it did work they'd get a collection of loose objects not an extension. To prevent this, insert an \echo ... \quit line that prints a suitable error message into each extension script file, and teach commands/extension.c to ignore lines starting with \echo. That should not only prevent any adverse consequences of loading a script file the wrong way, but make it crystal clear to users that they need to do it differently now. Tom Lane, following an idea of Andrew Dunstan's. Back-patch into 9.1 ... there is not going to be much value in this if we wait till 9.2.	2011-10-12 15:45:03 -04:00
Bruce Momjian	6416a82a62	Remove unnecessary #include references, per pgrminclude script.	2011-09-01 10:04:27 -04:00
Peter Eisentraut	f8ebe3bcc5	Support "make check" in contrib Added a new option --extra-install to pg_regress to arrange installing the respective contrib directory into the temporary installation. This is currently not yet supported for Windows MSVC builds. Updated the .gitignore files for contrib modules to ignore the leftovers of a temp-install check run. Changed the exit status of "make check" in a pgxs build (which still does nothing) to 0 from 1. Added "make check" in contrib to top-level "make check-world".	2011-04-25 22:27:11 +03:00
Peter Eisentraut	385942f46c	Refix the unaccent regression test on MSVC properly ... for some value of "properly". Instead of overriding REGRESS_OPTS, set the variables ENCODING and NO_LOCALE, which is more expressive and allows overriding by the user. Fix vcregress.pl to handle that.	2011-04-19 22:52:52 +03:00
Andrew Dunstan	b7b86924c6	Attempt to remedy buildfarm breakage caused by commit `f536d4194`.	2011-04-18 09:27:30 -04:00
Peter Eisentraut	f536d41942	Rename pg_regress option --multibyte to --encoding Also refactor things a little bit so that the same methods for setting test locale and encoding can be used everywhere.	2011-04-15 08:42:05 +03:00
Tom Lane	0024e34898	Fix upgrade of contrib/intarray and contrib/unaccent from 9.0. Take care of a couple of discrepancies between what you get from a fresh install and what the first-draft update-from-unpackaged scripts produced.	2011-02-17 17:45:09 -05:00
Tom Lane	029fac2264	Avoid use of CREATE OR REPLACE FUNCTION in extension installation files. It was never terribly consistent to use OR REPLACE (because of the lack of comparable functionality for data types, operators, etc), and experimentation shows that it's now positively pernicious in the extension world. We really want a failure to occur if there are any conflicts, else it's unclear what the extension-ownership state of the conflicted object ought to be. Most of the time, CREATE EXTENSION will fail anyway because of conflicts on other object types, but an extension defining only functions can succeed, with bad results.	2011-02-13 22:54:52 -05:00
Tom Lane	629b3af27d	Convert contrib modules to use the extension facility. This isn't fully tested as yet, in particular I'm not sure that the "foo--unpackaged--1.0.sql" scripts are OK. But it's time to get some buildfarm cycles on it. sepgsql is not converted to an extension, mainly because it seems to require a very nonstandard installation process. Dimitri Fontaine and Tom Lane	2011-02-13 22:54:49 -05:00
Bruce Momjian	5d950e3b0c	Stamp copyrights for year 2011.	2011-01-01 13:18:15 -05:00
Bruce Momjian	c0577c92a8	Mark unaccent functions as STABLE, rather than defaulting to VOLATILE.	2010-12-27 15:34:42 -05:00
Peter Eisentraut	fc946c39ae	Remove useless whitespace at end of lines	2010-11-23 22:34:55 +02:00
Tom Lane	cc2c8152e6	Some more gitignore cleanups: cover contrib and PL regression test outputs. Also do some further work in the back branches, where quite a bit wasn't covered by Magnus' original back-patch.	2010-09-22 17:22:40 -04:00
Magnus Hagander	fe9b36fd59	Convert cvsignore to gitignore, and add .gitignore for build targets.	2010-09-22 12:57:04 +02:00
Magnus Hagander	9f2e211386	Remove cvs keywords from all files.	2010-09-20 22:08:53 +02:00
Robert Haas	fd1843ff89	Standardize get_whatever_oid functions for other object types. - Rename TSParserGetPrsid to get_ts_parser_oid. - Rename TSDictionaryGetDictid to get_ts_dict_oid. - Rename TSTemplateGetTmplid to get_ts_template_oid. - Rename TSConfigGetCfgid to get_ts_config_oid. - Rename FindConversionByName to get_conversion_oid. - Rename GetConstraintName to get_constraint_oid. - Add new functions get_opclass_oid, get_opfamily_oid, get_rewrite_oid, get_rewrite_oid_without_relid, get_trigger_oid, and get_cast_oid. The name of each function matches the corresponding catalog. Thanks to KaiGai Kohei for the review.	2010-08-05 15:25:36 +00:00
Bruce Momjian	65e806cba1	pgindent run for 9.0	2010-02-26 02:01:40 +00:00
Bruce Momjian	0239800893	Update copyright for the year 2010.	2010-01-02 16:58:17 +00:00

1 2

56 Commits