2
0
mirror of https://git.postgresql.org/git/postgresql.git synced 2024-12-15 08:20:16 +08:00
Commit Graph

7 Commits

Author SHA1 Message Date
Thomas Munro
456e3718e7 Add combining characters to unaccent.rules.
Strip certain classes of combining characters, so that accents encoded
this way are removed.

Author: Hugh Ranalli
Discussion: https://postgr.es/m/15548-cef1b3f8de190d4f%40postgresql.org
2019-02-01 15:23:01 +01:00
Michael Paquier
e1c1d5444e Update unaccent rules with release 34 of CLDR for Latin-ASCII.xml
This has required an update of the python script generating the rules,
as its format has changed in release 29.  This release has also added
new punctuation and symbols, and a new set of rules has been generated
to include them.  The way to find newest versions of Latin-ASCII gets
also more clearly documented.

Author: Hugh Ranalli, Michael Paquier
Discussion: https://postgr.es/m/15548-cef1b3f8de190d4f@postgresql.org
2019-01-10 14:10:21 +09:00
Thomas Munro
5e8d670c31 Add Greek characters to unaccent.rules.
Author: Tasos Maschalidis
Reviewed-by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/153495048900.1368.11566580687623014380%40wrigleys.postgresql.org
Discussion: https://postgr.es/m/VI1PR01MB38537EBD529FE5EE3FE9A5FEB5370%40VI1PR01MB3853.eurprd01.prod.exchangelabs.com
2018-09-02 07:12:24 +12:00
Tom Lane
ec0a69e49b Extend the default rules file for contrib/unaccent with Vietnamese letters.
Improve generate_unaccent_rules.py to handle composed characters whose base
is another composed character rather than a plain letter.  The net effect
of this is to add a bunch of multi-accented Vietnamese characters to
unaccent.rules.

Original complaint from Kha Nguyen, diagnosis of the script's shortcoming
by Thomas Munro.

Dang Minh Huong and Michael Paquier

Discussion: https://postgr.es/m/CALo3sF6EC8cy1F2JUz=GRf5h4LMUJTaG3qpdoiLrNbWEXL-tRg@mail.gmail.com
2017-08-16 16:51:56 -04:00
Teodor Sigaev
9a206d063c Improve script generating unaccent rules
Script now use the standard Unicode transliterator Latin-ASCII.

Author: Leonard Benedetti
2016-03-16 16:47:03 +03:00
Teodor Sigaev
1bbd52cb9a Make unaccent handle all diacritics known to Unicode, and expand ligatures correctly
Add Python script for buiding unaccent.rules from Unicode data. Don't
backpatch because unaccent changes may require tsvector/index
rebuild.

Thomas Munro <thomas.munro@enterprisedb.com>
2015-09-04 12:51:53 +03:00
Teodor Sigaev
92e05bc6a5 Unaccent dictionary. 2009-08-18 10:34:39 +00:00