postgresql/contrib/unaccent
Michael Paquier e3dd7c06e6 Simplify a bit the special rules generating unaccent.rules
As noted by Thomas Munro, CLDR 36 has added SOUND RECORDING COPYRIGHT
(U+2117), and we use CLDR 41, so this can be removed from the set of
special cases.

The set of regression tests is expanded for degree signs, which are two
of the special cases, and for an unusual case with U+210C in Latin-ASCII.xml
that we discovered while looking into what could be done for
Cyrillic characters (this last part is material for a future patch and is
not tackled yet).
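
As an illustration of what the degree-sign special cases amount to, here is a
minimal sketch in Python. The mapping below is an assumption about the two
rules involved (U+2103 and U+2109 expanding to a degree sign followed by a
letter), and `SPECIAL_CASES` and `apply_rules` are hypothetical names, not
part of the unaccent module or of generate_unaccent_rules.py.

```python
from typing import Mapping

# Assumed rule pairs (source character -> replacement string); the real
# unaccent.rules file is the authority on the actual mappings.
SPECIAL_CASES = {
    "\u2103": "\u00b0C",  # DEGREE CELSIUS -> DEGREE SIGN + "C"
    "\u2109": "\u00b0F",  # DEGREE FAHRENHEIT -> DEGREE SIGN + "F"
}


def apply_rules(text: str, rules: Mapping[str, str]) -> str:
    """Replace each character that has a rule; leave all others untouched."""
    return "".join(rules.get(ch, ch) for ch in text)
```

A regression test for such rules would feed each source character through the
dictionary and compare the result against a stored expected value.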

While at it, some of the assertions of generate_unaccent_rules.py are
expanded to report the codepoint on which a failure is found, which is
useful for debugging.
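
The idea of an assertion that reports the offending codepoint can be sketched
as follows. This is only an illustration under stated assumptions:
`format_codepoint` and `base_letter` are hypothetical helpers written for this
example, and they do not reproduce the actual checks in
generate_unaccent_rules.py.

```python
import unicodedata


def format_codepoint(cp: int) -> str:
    """Render an integer code point in the usual U+XXXX notation."""
    return f"U+{cp:04X}"


def base_letter(cp: int) -> str:
    """Return the first character of the canonical decomposition of cp.

    Hypothetical helper: the assertion message names the code point that
    failed, so a broken input can be located without a debugger.
    """
    decomposition = unicodedata.normalize("NFD", chr(cp))
    assert len(decomposition) > 1, (
        f"codepoint {format_codepoint(cp)} has no canonical decomposition"
    )
    return decomposition[0]
```

Without the message, a failure would only show the assertion expression; with
it, the traceback states which character triggered the problem.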

Extracted from a larger patch by the same author.

Author: Przemysław Sztoch
Discussion: https://postgr.es/m/8478da0d-3b61-d24f-80b4-ce2f5e971c60@sztoch.pl
2022-07-05 16:17:51 +09:00
..
expected                    Simplify a bit the special rules generating unaccent.rules    2022-07-05 16:17:51 +09:00
sql                         Simplify a bit the special rules generating unaccent.rules    2022-07-05 16:17:51 +09:00
.gitignore                  Add support for automatically updating Unicode derived files  2020-01-09 10:08:14 +01:00
generate_unaccent_rules.py  Simplify a bit the special rules generating unaccent.rules    2022-07-05 16:17:51 +09:00
Makefile                    Make update-unicode target work in vpath builds               2022-03-25 09:47:50 +01:00
unaccent--1.0--1.1.sql
unaccent--1.1.sql
unaccent.c                  Update copyright for 2022                                     2022-01-07 19:04:57 -05:00
unaccent.control            Mark some contrib modules as "trusted".                       2020-02-13 15:02:35 -05:00
unaccent.rules              Re-update Unicode data to CLDR 39                             2022-03-10 14:09:21 +01:00