Included are patches intended to allow PostgreSQL to handle multi-byte character sets such as EUC (Extended Unix Code), Unicode and Mule internal code. With the MB patch you can use multi-byte character sets in regexp and LIKE. The encoding system is chosen at compile time. To enable the MB extension, define the variable "MB" in Makefile.global or in Makefile.custom. For further information, see README.mb under the doc directory. (Note that, unlike the "jp patch", I no longer use a modified GNU regexp. Instead I changed Henry Spencer's regexp, which comes with PostgreSQL.)
postgresql 6.3 multi-byte (MB) patch PL2 README          Mar 10 1998

        Tatsuo Ishii
        t-ishii@sra.co.jp
        http://www.sra.co.jp/people/t-ishii/PostgreSQL/

Introduction

The MB patch is intended to allow PostgreSQL to handle multi-byte
character sets such as EUC (Extended Unix Code), Unicode and Mule
internal code. With the MB patch you can use multi-byte character sets
in regexp and LIKE. The encoding system is chosen at compile time.

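To illustrate why regexp and LIKE need to know the encoding, here is a
minimal sketch in C that counts characters rather than bytes in an
EUC_JP string. It is not code from the patch: the function names
euc_jp_char_len and euc_jp_strlen are hypothetical, and the sketch
assumes well-formed input, apart from clamping a character that is
truncated at the end of the string.

```c
#include <stddef.h>
#include <string.h>

/*
 * Byte length of the EUC_JP character that starts at s.
 * Hypothetical helper for illustration only.
 */
size_t euc_jp_char_len(const unsigned char *s)
{
    if (*s == 0x8f)
        return 3;               /* SS3 prefix: JIS X 0212, three bytes */
    if (*s >= 0x80)
        return 2;               /* SS2 prefix (0x8e) or JIS X 0208, two bytes */
    return 1;                   /* ASCII, one byte */
}

/*
 * Number of characters (not bytes) in a NUL-terminated EUC_JP string.
 * A character cut off at the end of the string is counted once.
 */
size_t euc_jp_strlen(const char *str)
{
    const unsigned char *s = (const unsigned char *) str;
    size_t remaining = strlen(str);
    size_t count = 0;

    while (remaining > 0)
    {
        size_t len = euc_jp_char_len(s);

        if (len > remaining)
            len = remaining;    /* do not run past the terminator */
        s += len;
        remaining -= len;
        count++;
    }
    return count;
}
```

For the six-byte string "\xc6\xfc\xcb\xdc" "go" (two kanji followed by
"go"), strlen reports 6 while euc_jp_strlen reports 4. A byte-oriented
matcher would treat each kanji as two separate characters, which is the
kind of mismatch that multi-byte aware matching avoids.
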
The patch also fixes some problems concerning 8-bit single-byte
character sets, including ISO8859. (I would not say that all of the
problems have been fixed. I have only confirmed that the regression
tests ran fine and that a few French characters could be used with the
patch. Please let me know if you find any problems while using 8-bit
characters.)

How to use

After applying the MB patch, create src/Makefile.custom containing
the line:

        MB=encoding_system

where encoding_system is one of:

        EUC_JP          Japanese EUC
        EUC_CN          Chinese EUC
        EUC_KR          Korean EUC
        EUC_TW          Taiwan EUC
        UNICODE         Unicode (UTF-8)
        MULE_INTERNAL   Mule internal

Example:

        % cat Makefile.custom
        MB=EUC_JP

If MB is not defined, nothing changes except for better support for
8-bit single-byte character sets.

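As a rough picture of what "chosen at compile time" means, the sketch
below shows one way a make variable such as MB could reach the C code.
It assumes that the build passes the value to the compiler as -DMB=...
and that a header maps each encoding name to a number. Both are
assumptions made for illustration, as are the numeric values and the
function name mb_encoding_name.

```c
#include <stdio.h>

/* Hypothetical numeric codes for the names accepted by MB=... */
#define EUC_JP          1
#define EUC_CN          2
#define EUC_KR          3
#define EUC_TW          4
#define UNICODE         5
#define MULE_INTERNAL   6

/*
 * Return the name of the encoding fixed when this file was compiled.
 * Without -DMB=... the single-byte branch is the only code built.
 */
const char *mb_encoding_name(void)
{
#if !defined(MB)
    return "none";
#elif MB == EUC_JP
    return "EUC_JP";
#elif MB == EUC_CN
    return "EUC_CN";
#elif MB == EUC_KR
    return "EUC_KR";
#elif MB == EUC_TW
    return "EUC_TW";
#elif MB == UNICODE
    return "UNICODE";
#elif MB == MULE_INTERNAL
    return "MULE_INTERNAL";
#else
    return "unknown";
#endif
}
```

Compiling this with, for example, cc -DMB=EUC_JP makes
mb_encoding_name() return "EUC_JP". Because the choice is made by the
preprocessor, switching to another encoding means rebuilding.
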
References

These are good sources to start learning about the various kinds of
encoding systems.

ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf
        Detailed explanations of EUC_JP, EUC_CN, EUC_KR and EUC_TW
        appear in section 3.2.

Unicode: http://www.unicode.org/
        The home page of Unicode.

RFC 2044
        UTF-8 is defined here.

History

Mar 10, 1998    PL2 released
        * add regression tests for EUC_JP, EUC_CN and MULE_INTERNAL
        * add an English document (this file)
        * fix problems concerning 8-bit single-byte characters

Mar 1, 1998     PL1 released