mirror of
https://git.postgresql.org/git/postgresql.git
synced 2024-12-03 08:00:21 +08:00
Convert more charset/locale documentation to DocBook
parent
333cbc2dab
commit
0ba77c14aa
@ -1,113 +0,0 @@
|
||||
|
||||
PostgreSQL Charsets README
|
||||
Josef Balatka, <balatka@email.cz>
|
||||
Draft v0.1, Tue Jul 20 15:49:07 CEST 1999
|
||||
|
||||
This document is a brief overview of the national charset support
implemented in PostgreSQL ver. 6.5. It covers the relevant compilation
options and gives setup tips for particular uses.
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
|
||||
Table of Contents
|
||||
|
||||
1. Locale awareness
|
||||
|
||||
2. Single-byte charsets recoding
|
||||
|
||||
3. Multi-byte support/recoding
|
||||
|
||||
4. Credits
|
||||
|
||||
---------------------------------------------------------------------------
|
||||
|
||||
1. Locale awareness
|
||||
|
||||
The PostgreSQL server supports both a locale-aware mode and a
non-locale-aware mode (the default). You choose the mode during the
configuration stage of the installation with the --enable-locale option.
|
||||
|
||||
If you don't use --enable-locale, the multi-language code will not be
|
||||
compiled and PostgreSQL will behave as an ASCII compliant application.
|
||||
This mode is useful for its speed but only provided that you don't
|
||||
have to consider national specific chars.
|
||||
|
||||
With --enable-locale you will get a locale aware server using LC_*
|
||||
environment variables to determine how to process national specifics.
|
||||
In this case strcoll(3) and similar functions are used internally
|
||||
so speed is somewhat lower.
|
||||
|
||||
Notice here that --enable-locale is sufficient when all your clients
|
||||
use the same single-byte encoding as the database server does.
|
||||
|
||||
When your clients use an encoding different from the server's, you
additionally need the --enable-recode or --with-mb=<encoding> option on
the server side, or a client that does the recoding itself (e.g.
there is a PostgreSQL ODBC driver for Win32 that supports various
Cyrillic encodings). The --with-mb=<encoding> option is necessary for
multi-byte charset support.
|
||||
|
||||
|
||||
2. Single-byte charsets recoding
|
||||
|
||||
You can set up this feature with --enable-recode option. This option
|
||||
is described as 'enable Cyrillic recode support' which doesn't express
|
||||
all its power. It can be used for *any* single-byte charset recoding.
|
||||
|
||||
This method uses the charset.conf file located in the $PGDATA directory.
|
||||
It's a typical configuration text file where spaces and newlines
|
||||
separate items and records and # specifies comments. Three keywords
|
||||
with the following syntax are recognized here:
|
||||
|
||||
BaseCharset <server_charset>
|
||||
RecodeTable <from_charset> <to_charset> <file_name>
|
||||
HostCharset <host_spec> <host_charset>
|
||||
|
||||
BaseCharset defines encoding of the database server. All charset
|
||||
names are only used for mapping inside the charset.conf so you can
|
||||
freely use typing-friendly names.
|
||||
|
||||
RecodeTable records specify translation table between server and client.
|
||||
The file name is relative to the $PGDATA directory. Table file format
|
||||
is very simple. There are no keywords and characters are represented by
|
||||
a pair of decimal or hexadecimal (0x prefixed) values on single lines:
|
||||
|
||||
<char_value> <translated_char_value>
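The table format above can be illustrated with a short Python sketch. It is
not the server's own loader: the function name parse_recode_table and the
sample values are hypothetical, and the sketch assumes that bytes not listed
in the table map to themselves.

```python
def parse_recode_table(text):
    """Build a 256-byte translation table from 'char_value translated_char_value' lines."""
    table = bytearray(range(256))  # assumption: unlisted bytes are left unchanged
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        src, dst = line.split()
        # int(x, 0) accepts both decimal ("226") and 0x-prefixed hex ("0xE1")
        table[int(src, 0)] = int(dst, 0)
    return bytes(table)


# Hypothetical sample table: one hexadecimal pair and one decimal pair.
SAMPLE = """
0xE1 0xC1
226 194
"""

TABLE = parse_recode_table(SAMPLE)
# Recode three bytes; 0x41 is not in the table and passes through unchanged.
RECODED = bytes([0xE1, 226, 0x41]).translate(TABLE)
print(RECODED.hex())
```

Because every single-byte value maps to exactly one other value, the whole
table fits into 256 entries, which is why this approach cannot cover
multi-byte charsets.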
|
||||
|
||||
HostCharset records define IP address and charset. You can use a single
|
||||
IP address, an IP mask range starting from the given address or an IP
|
||||
interval (e.g. 127.0.0.1, 192.168.1.100/24, 192.168.1.20-192.168.1.40).
|
||||
|
||||
The charset.conf is always processed up to the end, so you can easily
|
||||
specify exceptions from the previous rules. In the src/data you will
|
||||
find charset.conf example and a few recoding tables.
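How the three keywords might fit together can be sketched in Python. This is
an illustration under stated assumptions, not the server's parser: the sample
file is hypothetical (it is not the one shipped in src/data), each line is
treated as one record, a /N suffix is read as a network mask, the last
matching HostCharset record wins, and a client that matches no record is
assumed to use the BaseCharset.

```python
import ipaddress

# Hypothetical configuration for illustration only.
SAMPLE_CONF = """
BaseCharset koi
RecodeTable koi win koi-win.tab
RecodeTable win koi win-koi.tab
HostCharset 192.168.1.0/24 win            # whole subnet uses "win"
HostCharset 192.168.1.20-192.168.1.40 koi # exception for one interval
"""


def host_matches(spec, client):
    """Match a client address against a single address, a mask, or an interval."""
    addr = ipaddress.ip_address(client)
    if "-" in spec:
        low, high = (ipaddress.ip_address(part) for part in spec.split("-"))
        return low <= addr <= high
    if "/" in spec:
        # strict=False tolerates host bits in the address part of the spec
        return addr in ipaddress.ip_network(spec, strict=False)
    return addr == ipaddress.ip_address(spec)


def parse_conf(text):
    """Return (base_charset, recode_tables, host_records) from charset.conf text."""
    base, tables, hosts = None, {}, []
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()  # '#' starts a comment
        if not fields:
            continue
        keyword, args = fields[0], fields[1:]
        if keyword == "BaseCharset":
            (base,) = args
        elif keyword == "RecodeTable":
            src, dst, filename = args
            tables[(src, dst)] = filename
        elif keyword == "HostCharset":
            spec, charset = args
            hosts.append((spec, charset))
        else:
            raise ValueError("unknown keyword: " + keyword)
    return base, tables, hosts


def client_charset(hosts, base, client):
    """Scan every record; a later match overrides an earlier one."""
    chosen = base  # assumption: unmatched clients use the server charset
    for spec, charset in hosts:
        if host_matches(spec, client):
            chosen = charset
    return chosen


BASE, TABLES, HOSTS = parse_conf(SAMPLE_CONF)
print(client_charset(HOSTS, BASE, "192.168.1.30"))
```

Scanning all records instead of stopping at the first match is what lets a
later, narrower record act as an exception to an earlier, broader one.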
|
||||
|
||||
As this solution is based on the client's IP address / charset mapping
there are obviously some restrictions as well. You can't use different
encodings on the same host at the same time. It's also inconvenient when
you boot your client hosts into several operating systems.
Nevertheless, when these restrictions are not limiting and you don't
need multi-byte chars, then it's a simple and effective solution.
|
||||
|
||||
|
||||
3. Multi-byte support/recoding
|
||||
|
||||
It's a new generation of charset encoding in PostgreSQL designed as a
|
||||
more complex solution supporting both single-byte and multi-byte chars.
|
||||
You can set up this feature with --with-mb=<encoding> option.
|
||||
|
||||
There is no IP mapping file and recoding is controlled through the new
|
||||
SQL statements. Recoding tables are included in the code. Many national
|
||||
charsets are already supported and further will follow.
|
||||
|
||||
See doc/README.mb, doc/README.mb.jp to get detailed instruction on how
|
||||
to use the multibyte support. In the file doc/README.locale there is
|
||||
a particular instruction on usage of the multibyte support with Cyrillic.
|
||||
|
||||
|
||||
4. Credits
|
||||
|
||||
I'd like to thank the PostgreSQL development team and all contributors
|
||||
for creating PostgreSQL. Thanks to Oleg Bartunov, Oleg Broytmann and
|
||||
Tatsuo Ishii for opening the door into the multi-language world.
|
||||
|
@ -1,107 +0,0 @@
|
||||
===========
|
||||
1999 Jul 21
|
||||
===========
|
||||
|
||||
Josef Balatka, <balatka@email.cz> asked us not to remove RECODE and sent me
|
||||
Czech ISO-8859-2 -> WIN-1250 translation table.
|
||||
RECODE no longer contains just Cyrillic RECODE and will stay in
PostgreSQL.
|
||||
|
||||
He also created some bits of documentation, mostly concerning RECODE -
|
||||
see README.Charsets.
|
||||
|
||||
|
||||
===========
|
||||
1999 Apr 14
|
||||
===========
|
||||
|
||||
Tatsuo Ishii <t-ishii@sra.co.jp> updated Multibyte support extending it
|
||||
to Cyrillic language. Now PostgreSQL supports KOI8-R, WIN-1251, ISO8859-5
|
||||
and CP866 (ALT) encodings.
|
||||
|
||||
Short instruction on using this feature follows. Longer discussion of
|
||||
Multibyte support is in README.mb.
|
||||
|
||||
WARNING! Now that Multibyte support exists, Cyrillic RECODE is declared
obsolete and will be removed from Postgres. If you are using RECODE, consider
switching to Multibyte support.
|
||||
|
||||
Instructions on how to prepare Postgres for Cyrillic Multibyte support.
|
||||
----------------------------------------------------------------------
|
||||
|
||||
First, you need to back up all your databases. I recommend backing up the
entire Postgres directory, including binaries and libraries, so that you can
easily restore it if something goes wrong.
|
||||
|
||||
Dump your data: pg_dumpall > dump.db
|
||||
|
||||
Stop postmaster.
|
||||
|
||||
Configure, compile and install Postgres. (I'll mostly talk about KOI8-R
|
||||
encoding, this is just to make examples a little more clear; you can use
|
||||
any supported encoding.)
|
||||
|
||||
cd src
|
||||
./configure --enable-locale --with-mb=KOI8
|
||||
make
|
||||
make install
|
||||
|
||||
Make sure you've backed up your databases. Double-check your backup. I
really mean it: make regular backups and test them from time to time with a
trial restore.
|
||||
|
||||
Remove your data directory (better, rename or move it).
|
||||
|
||||
Run initdb, specifying your primary encoding: initdb -e KOI8. If you omit
the encoding, the primary encoding from configure is used.
|
||||
|
||||
Start postmaster.
|
||||
|
||||
Create databases: createdb -e KOI8. Again, you can omit encoding -
|
||||
default encoding will be used. You are not forced to use the same encoding
|
||||
for all your databases - you can create different databases with different
|
||||
encodings.
|
||||
|
||||
Load your data from the dump you've created: psql < dump.db
|
||||
|
||||
That's all! Now you are ready to enjoy the full power of Multibyte
|
||||
support.
|
||||
|
||||
To use Multibyte support you do not need to do anything special: just
execute your queries. If the client program does not set an encoding, it gets
the data in the database encoding. The client may, however, ask Postgres to do
automatic server-to-client and client-to-server conversions. There are two ways
for a client program to declare its encoding:
|
||||
1) client explicitly executes the query SET CLIENT_ENCODING TO 'win';
|
||||
2) client started with environment variable set. Examples -
|
||||
using sh syntax:
|
||||
PGCLIENTENCODING='win'; export PGCLIENTENCODING
|
||||
using csh syntax:
|
||||
setenv PGCLIENTENCODING 'win'
|
||||
|
||||
Setting PGCLIENTENCODING even if you use the same client encoding as the
database avoids the overhead of asking for the database encoding while
initiating the connection, so it is a good idea to set it in any case.
|
||||
|
||||
Now you may run test suite and see Multibyte support in action. Go to
|
||||
.../src/test/locale and run
|
||||
make clean all test-koi2win
|
||||
|
||||
|
||||
===========
|
||||
1998 Nov 20
|
||||
===========
|
||||
|
||||
I extended locale support, originally written by Oleg Bartunov
|
||||
<oleg@sai.msu.su>. Now ORDER BY (if PostgreSQL configured with
|
||||
--enable-locale) uses strcoll() for all text fields: char(n), varchar(n),
|
||||
text.
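The effect of strcoll()-based ordering can be tried outside the database with
a small Python sketch. It only illustrates the libc facility; it is not
PostgreSQL code, and the helper name locale_sorted is hypothetical. The
example uses the "C" locale because it is always available.

```python
import functools
import locale


def locale_sorted(values, collate_locale):
    """Sort strings with strcoll() under the given LC_COLLATE locale."""
    saved = locale.setlocale(locale.LC_COLLATE)  # remember the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, collate_locale)
        return sorted(values, key=functools.cmp_to_key(locale.strcoll))
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # always restore it


WORDS = ["b", "a", "B"]
# In the "C" locale strings compare by character code, so "B" sorts before "a".
C_ORDER = locale_sorted(WORDS, "C")
print(C_ORDER)  # ['B', 'a', 'b']
```

Passing the name of a national locale installed on your system instead of "C"
should give that locale's dictionary order; which locale names exist depends
on the operating system.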
|
||||
|
||||
I included a test suite in .../src/test/locale. I didn't include it in
the regression tests because not many people require locale support. Read
.../src/test/locale/README for details on the test suite.
|
||||
|
||||
Many thanks to Oleg Bartunov (oleg@sai.msu.su) and Thomas G. Lockhart
|
||||
(lockhart@alumni.caltech.edu) for hints, tips, help and discussion.
|
||||
|
||||
Oleg.
|
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/Attic/admin.sgml,v 1.26 2000/09/12 05:37:07 thomas Exp $
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/Attic/admin.sgml,v 1.27 2000/09/30 16:58:20 petere Exp $
|
||||
|
||||
Postgres Administrator's Guide.
|
||||
Derived from postgres.sgml.
|
||||
@ -98,9 +98,9 @@ Derived from postgres.sgml.
|
||||
&intro-ag;
|
||||
&installation;
|
||||
&installw;
|
||||
&charset;
|
||||
&runtime;
|
||||
&client-auth;
|
||||
&charset;
|
||||
&manage-ag;
|
||||
&user-manag;
|
||||
&backup;
|
||||
|
@ -1,44 +1,235 @@
|
||||
<chapter id="charset">
|
||||
<title>Character Sets</title>
|
||||
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/charset.sgml,v 2.3 2000/09/30 16:58:20 petere Exp $ -->
|
||||
|
||||
<abstract>
|
||||
<para>
|
||||
Describes the available language and character set support in
|
||||
<productname>Postgres</productname>.
|
||||
</para>
|
||||
</abstract>
|
||||
<chapter id="charset">
|
||||
<title>Localization</>
|
||||
|
||||
<abstract>
|
||||
<para>
|
||||
Describes the available localization features from the point of
|
||||
view of the administrator.
|
||||
</para>
|
||||
</abstract>
|
||||
|
||||
<para>
|
||||
<productname>Postgres</productname> supports non-ASCII character
|
||||
sets with two approaches:
|
||||
<productname>Postgres</productname> supports localization with
|
||||
three approaches:
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
Using locale features in underlying
|
||||
system libraries. This allows single-byte character sets to be
|
||||
configured with a locale-specific collation order, provided that
|
||||
the underlying system supports the required locale. This
|
||||
technique supports only one character set per server, and can
|
||||
not support multi-byte character sets.
|
||||
Using the locale features of the operating system to provide
|
||||
locale-specific collation order, number formatting, and other
|
||||
aspects.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Using explicit multiple-byte character sets defined in the
|
||||
<productname>Postgres</productname> server. These character sets
|
||||
are also known to some client libraries. The number of character
|
||||
sets is fixed at the time the server is compiled, and internal
|
||||
operations such as string comparisons require expansion of each
|
||||
character into a 32-bit word.
|
||||
<productname>Postgres</productname> server to support languages
|
||||
that require more characters than will fit into a single byte,
|
||||
and to provide character set recoding between client and server.
|
||||
The number of supported character sets is fixed at the time the
|
||||
server is compiled, and internal operations such as string
|
||||
comparisons require expansion of each character into a 32-bit
|
||||
word.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Single byte character recoding provides a more light-weight
|
||||
solution for users of multiple, yet single-byte character sets.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
|
||||
|
||||
<sect1 id="locale">
|
||||
<title>Locale Support</title>
|
||||
|
||||
<para>
|
||||
<firstterm>Locale</> support refers to an application respecting
|
||||
cultural preferences regarding alphabets, sorting, number
|
||||
formatting, etc. <productname>PostgreSQL</> uses the standard ISO
|
||||
C and POSIX-like locale facilities provided by the server operating
|
||||
    system. For additional information refer to the documentation of your
|
||||
system.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>Overview</>
|
||||
|
||||
<para>
|
||||
     Locale support is not built into <productname>PostgreSQL</> by
|
||||
default; to enable it, supply the <option>--enable-locale</> option
|
||||
to the <filename>configure</> script:
|
||||
<informalexample>
|
||||
<screen>
|
||||
<prompt>$ </><userinput>./configure --enable-locale</>
|
||||
</screen>
|
||||
</informalexample>
|
||||
Locale support only affects the server; all clients are compatible
|
||||
with servers with or without locale support.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The information about which particular cultural rules to use is
|
||||
determined by standard environment variables. If you are getting
|
||||
localized behavior from other programs you probably have them set
|
||||
up already. The simplest way to set the localization information
|
||||
is the <envar>LANG</> variable, for example:
|
||||
<programlisting>
|
||||
export LANG=sv_SE
|
||||
</programlisting>
|
||||
This sets the locale to Swedish (<literal>sv</>) as spoken in
|
||||
Sweden (<literal>SE</>). Other possibilities might be
|
||||
<literal>en_US</> (U.S. English) and <literal>fr_CA</> (Canada,
|
||||
French). If more than one character set can be useful for a locale
|
||||
then the specifications look like this:
|
||||
<literal>cs_CZ.ISO8859-2</>. What locales are available under what
|
||||
names on your system depends on what was provided by the operating
|
||||
system vendor and what was installed.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Occasionally it is useful to mix rules from several locales, e.g.,
|
||||
     use U.S. rules but Spanish messages. To do that, there are
     environment variables that override the default of
|
||||
<envar>LANG</> for a particular category:
|
||||
|
||||
<informaltable>
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry>LC_COLLATE</>
|
||||
<entry>String sort order</>
|
||||
</row>
|
||||
<row>
|
||||
<entry>LC_CTYPE</>
|
||||
<entry>Character classification (What is a letter? What is the upper-case equivalent of this letter?)</>
|
||||
</row>
|
||||
<row>
|
||||
<entry>LC_MESSAGES</>
|
||||
<entry>Language of messages</>
|
||||
</row>
|
||||
<row>
|
||||
<entry>LC_MONETARY</>
|
||||
<entry>Formatting of currency amounts</>
|
||||
</row>
|
||||
<row>
|
||||
<entry>LC_NUMERIC</>
|
||||
<entry>Formatting of numbers</>
|
||||
</row>
|
||||
<row>
|
||||
<entry>LC_TIME</>
|
||||
<entry>Formatting of dates and times</>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</informaltable>
|
||||
|
||||
<envar>LC_MESSAGES</> only affects the messages that come from the
|
||||
operating system, not <productname>PostgreSQL</>.
|
||||
</para>
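The override rule can be sketched in a few lines of Python. This is an
illustration, not PostgreSQL or libc code; the function name effective_locale
is hypothetical. LC_ALL is not mentioned above: it is the standard POSIX
variable that overrides every category, and is included here as an
assumption about the platform.

```python
def effective_locale(category, env):
    """Return the locale a POSIX system would use for one LC_* category."""
    # Precedence: LC_ALL, then the category's own variable, then LANG.
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:  # unset or empty variables are skipped
            return value
    return "C"  # nothing set: behave as if there were no locale support


# U.S. rules in general, but Spanish messages.
ENV = {"LANG": "en_US", "LC_MESSAGES": "es_ES"}
print(effective_locale("LC_MESSAGES", ENV))
print(effective_locale("LC_COLLATE", ENV))
```

With this environment only the message category differs; every other
category falls back to the value of LANG.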
|
||||
|
||||
<para>
|
||||
If you want the system to behave as if it had no locale support,
|
||||
use the special locale <literal>C</> or <literal>POSIX</>, or
|
||||
simply unset all locale related variables.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Once you have chosen a set of localization rules this way you must
|
||||
keep them fixed for any particular database cluster. That means
|
||||
that the locales that were active when you ran <filename>initdb</>
|
||||
must be kept the same when you start the postmaster. Otherwise,
|
||||
the changed sort order can corrupt indexes or make your data
|
||||
disappear mysteriously. It is currently not possible to change the
|
||||
locales after database initialization or to use more than one set
|
||||
of locales for a given database cluster.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Benefits</>
|
||||
|
||||
<para>
|
||||
Locale support influences in particular the following features:
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
Sort order in <command>ORDER BY</> queries.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
The <function>to_char</> family of functions
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
The <literal>LIKE</> and <literal>~</> operators for pattern
|
||||
matching
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The only severe drawback of using the locale support in
|
||||
<productname>PostgreSQL</> is its speed. So use locale only if you
|
||||
actually need it.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Problems</>
|
||||
|
||||
<para>
|
||||
If locale support doesn't work in spite of the explanation above,
|
||||
check that the locale support in your operating system is okay.
|
||||
To check whether a given locale is installed and functional you
|
||||
     can use <application>Perl</>, for example. Perl also has support
     for locales, and if a locale is broken <command>perl -v</> will
     complain with something like this:
|
||||
<screen>
|
||||
<prompt>$</> <userinput>export LC_CTYPE='not_exist'</>
|
||||
<prompt>$</> <userinput>perl -v</>
|
||||
<computeroutput>
|
||||
perl: warning: Setting locale failed.
|
||||
perl: warning: Please check that your locale settings:
|
||||
LC_ALL = (unset),
|
||||
LC_CTYPE = "not_exist",
|
||||
LANG = (unset)
|
||||
are supported and installed on your system.
|
||||
perl: warning: Falling back to the standard locale ("C").
|
||||
</computeroutput>
|
||||
</screen>
|
||||
</para>
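The same check can be made from Python, whose locale module calls the
operating system's setlocale(). This is only a sketch of an alternative
probe; the helper name locale_available is hypothetical.

```python
import locale


def locale_available(name, category=locale.LC_CTYPE):
    """Return True if the operating system accepts the named locale."""
    saved = locale.setlocale(category)  # query the current setting
    try:
        locale.setlocale(category, name)
        return True
    except locale.Error:  # raised when the locale is missing or broken
        return False
    finally:
        locale.setlocale(category, saved)  # leave the process unchanged


print(locale_available("C"))
print(locale_available("not_exist"))
```

A locale name that is rejected here will not work for the server either,
since both rely on the locale data installed with the operating system.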
|
||||
|
||||
<para>
|
||||
Check that your locale files are in the right location. Possible
|
||||
locations include: <filename>/usr/lib/locale</filename> (Linux,
|
||||
Solaris), <filename>/usr/share/locale</filename> (Linux),
|
||||
<filename>/usr/lib/nls/loc</filename> (DUX 4.0). Check the locale
|
||||
man page of your system if you are not sure.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The directory <filename>src/test/locale</> contains a test suite
|
||||
for <productname>PostgreSQL</>'s locale support.
|
||||
</para>
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
|
||||
<sect1 id="multibyte">
|
||||
<title>Multi-byte Support</title>
|
||||
<title>Multibyte Support</title>
|
||||
|
||||
<note>
|
||||
<title>Author</title>
|
||||
@ -53,7 +244,7 @@
|
||||
</note>
|
||||
|
||||
<para>
|
||||
Multi-byte (<acronym>MB</acronym>) support is intended to allow
|
||||
Multibyte (<acronym>MB</acronym>) support is intended to allow
|
||||
<productname>Postgres</productname> to handle
|
||||
multiple-byte character sets such as EUC (Extended Unix Code), Unicode and
|
||||
Mule internal code. With <acronym>MB</acronym> enabled you can use multi-byte
|
||||
@ -680,7 +871,78 @@ SET CLIENT_ENCODING = 'WIN1250';
|
||||
</procedure>
|
||||
</sect2>
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
|
||||
<sect1 id="recode">
|
||||
<title>Single-byte character set recoding</>
|
||||
<!-- formerly in README.charsets, by Josef Balatka, <balatka@email.cz> -->
|
||||
|
||||
<para>
|
||||
You can set up this feature with the <option>--enable-recode</> option
|
||||
to <filename>configure</>. This option was formerly described as
|
||||
<quote>Cyrillic recode support</> which doesn't express all its
|
||||
power. It can be used for <emphasis>any</> single-byte character
|
||||
set recoding.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
    This method uses a <filename>charset.conf</> file located in
|
||||
the database directory (<envar>PGDATA</>). It's a typical
|
||||
configuration text file where spaces and newlines separate items
|
||||
and records and # specifies comments. Three keywords with the
|
||||
following syntax are recognized here:
|
||||
<synopsis>
|
||||
BaseCharset <replaceable>server_charset</>
|
||||
RecodeTable <replaceable>from_charset</> <replaceable>to_charset</> <replaceable>file_name</>
|
||||
HostCharset <replaceable>host_spec</> <replaceable>host_charset</>
|
||||
</synopsis>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<token>BaseCharset</> defines the encoding of the database server.
|
||||
All character set names are only used for mapping inside of
|
||||
<filename>charset.conf</> so you can freely use typing-friendly
|
||||
names.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<token>RecodeTable</> records specify translation tables between
|
||||
server and client. The file name is relative to the
|
||||
<envar>PGDATA</> directory. The table file format is very
|
||||
simple. There are no keywords and characters are represented by a
|
||||
pair of decimal or hexadecimal (0x prefixed) values on single
|
||||
lines:
|
||||
<synopsis>
|
||||
<replaceable>char_value</> <replaceable>translated_char_value</>
|
||||
</synopsis>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<token>HostCharset</> records define the client character set by IP
|
||||
address. You can use a single IP address, an IP mask range starting
|
||||
from the given address or an IP interval (e.g., 127.0.0.1,
|
||||
192.168.1.100/24, 192.168.1.20-192.168.1.40).
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The <filename>charset.conf</> file is always processed up to the
|
||||
end, so you can easily specify exceptions from the previous
|
||||
    rules. In <filename>src/data</> you will find an example <filename>charset.conf</> and a few
|
||||
recoding tables.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
As this solution is based on the client's IP address and character
|
||||
set mapping there are obviously some restrictions as well. You
|
||||
cannot use different encodings on the same host at the same
|
||||
time. It is also inconvenient when you boot your client hosts into
|
||||
    several operating systems. Nevertheless, when these restrictions are
    not limiting and you do not need multi-byte characters, then it is a
|
||||
simple and effective solution.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
</chapter>
|
||||
|
||||
<!-- Keep this comment at the end of the file
|
||||
Local variables:
|
||||
|
@ -1,4 +1,4 @@
|
||||
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/installation.sgml,v 1.21 2000/09/29 20:21:34 petere Exp $ -->
|
||||
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/installation.sgml,v 1.22 2000/09/30 16:58:20 petere Exp $ -->
|
||||
|
||||
<chapter id="installation">
|
||||
<title><![%flattext-install-include[<productname>PostgreSQL</> ]]>Installation Instructions</title>
|
||||
@ -447,8 +447,9 @@ su - postgres
|
||||
<term>--enable-recode</term>
|
||||
<listitem>
|
||||
<para>
|
||||
Enables character set recode support. See
|
||||
<filename>doc/README.Charsets</> for details on this feature.
|
||||
Enables single-byte character set recode support. See
|
||||
<![%flattext-install-include[the <citetitle>Administrator's Guide</citetitle>]]>
|
||||
<![%flattext-install-ignore[<xref linkend="recode">]]> about this feature.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
@ -459,7 +460,10 @@ su - postgres
|
||||
<para>
|
||||
Allows the use of multibyte character encodings. This is
|
||||
primarily for languages like Japanese, Korean, and Chinese.
|
||||
Read <filename>doc/README.mb</> for details.
|
||||
Read
|
||||
<![%flattext-install-include[the <citetitle>Administrator's Guide</citetitle>]]>
|
||||
<![%flattext-install-ignore[<xref linkend="multibyte">]]>
|
||||
for details.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/postgres.sgml,v 1.41 2000/09/12 05:37:09 thomas Exp $
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/postgres.sgml,v 1.42 2000/09/30 16:58:20 petere Exp $
|
||||
-->
|
||||
|
||||
<!doctype set PUBLIC "-//OASIS//DTD DocBook V3.1//EN" [
|
||||
@ -173,9 +173,9 @@ $Header: /cvsroot/pgsql/doc/src/sgml/postgres.sgml,v 1.41 2000/09/12 05:37:09 th
|
||||
-->
|
||||
&installation;
|
||||
&installw;
|
||||
&charset;
|
||||
&runtime;
|
||||
&client-auth;
|
||||
&charset;
|
||||
&manage-ag;
|
||||
&user-manag;
|
||||
&backup;
|
||||
|
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v 1.25 2000/09/29 20:21:34 petere Exp $
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v 1.26 2000/09/30 16:58:20 petere Exp $
|
||||
-->
|
||||
|
||||
<Chapter Id="runtime">
|
||||
@ -1553,126 +1553,6 @@ set semsys:seminfo_semmsl=32
|
||||
</sect1>
|
||||
|
||||
|
||||
<sect1 id="locale">
|
||||
<title>Locale Support</title>
|
||||
|
||||
<note>
|
||||
<title>Acknowledgement</title>
|
||||
<para>
|
||||
Written by Oleg Bartunov. See <ulink
|
||||
url="http://www.sai.msu.su/~megera/postgres/">Oleg's web
|
||||
page</ulink> for additional information on locale and Russian
|
||||
language support.
|
||||
</para>
|
||||
</note>
|
||||
|
||||
<para>
|
||||
While doing a project for a company in Moscow, Russia, I
|
||||
encountered the problem that <productname>Postgres</> had no
|
||||
support of national alphabets. After looking for possible
|
||||
workarounds I decided to develop support of locale myself. I'm not
|
||||
a C programmer but already had some experience with locale
|
||||
    programming when I worked with <productname>Perl</> (debugging) and
|
||||
<productname>Glimpse</>. After several days of digging through the
|
||||
    <productname>Postgres</> source tree I made very minor corrections
|
||||
to <filename>src/backend/utils/adt/varlena.c</> and
|
||||
<filename>src/backend/main/main.c</> and got what I needed! I did
|
||||
support only for <envar>LC_CTYPE</envar> and
|
||||
<envar>LC_COLLATE</envar>, but later <envar>LC_MONETARY</envar> was
|
||||
added by others. I got many messages from people about this patch
|
||||
so I decided to send it to developers and (to my surprise) it was
|
||||
incorporated into the <productname>Postgres</> distribution.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
People often complain that locale doesn't work for them. There are
|
||||
several common mistakes:
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
Didn't properly configure <productname>Postgres</> before
|
||||
compilation. You must run <filename>configure</> with the
|
||||
<option>--enable-locale</> option to enable locale support.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
       Didn't set up the environment correctly when starting postmaster. You
|
||||
must define environment variables <envar>LC_CTYPE</envar> and
|
||||
<envar>LC_COLLATE</envar> before running postmaster because
|
||||
       backend gets information about locale from the environment. I use
       the following shell script:
|
||||
<programlisting>
|
||||
#!/bin/sh
|
||||
|
||||
export LC_CTYPE=koi8-r
|
||||
export LC_COLLATE=koi8-r
|
||||
postmaster -B 1024 -S -D/usr/local/pgsql/data/ -o '-Fe'
|
||||
</programlisting>
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Broken locale support in the operating system (for example,
|
||||
       locale support in libc under Linux has changed several times and
       this caused a lot of problems). Perl also has locale support,
       and if the locale is broken <command>perl -v</> will complain
       with something like:
|
||||
<screen>
|
||||
<prompt>$</> <userinput>export LC_CTYPE='not_exist'</>
|
||||
<prompt>$</> <userinput>perl -v</>
|
||||
<computeroutput>
|
||||
perl: warning: Setting locale failed.
|
||||
perl: warning: Please check that your locale settings:
|
||||
LC_ALL = (unset),
|
||||
LC_CTYPE = "not_exist",
|
||||
LANG = (unset)
|
||||
are supported and installed on your system.
|
||||
perl: warning: Falling back to the standard locale ("C").
|
||||
</computeroutput>
|
||||
</screen>
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Wrong location of locale files. Possible locations include:
|
||||
<filename>/usr/lib/locale</filename> (Linux, Solaris),
|
||||
<filename>/usr/share/locale</filename> (Linux),
|
||||
<filename>/usr/lib/nls/loc</filename> (DUX 4.0).
|
||||
|
||||
Check <command>man locale</command> to find the correct
|
||||
location. Under Linux I made a symbolic link between
|
||||
<filename>/usr/lib/locale</filename> and
|
||||
<filename>/usr/share/locale</filename> to be sure that the next
|
||||
libc will not break my locale.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
|
||||
<formalpara>
|
||||
<title>What are the Benefits?</title>
|
||||
<para>
|
||||
     You can use the ~* and ORDER BY operators on strings that contain
     characters from national alphabets. Non-English users definitely
|
||||
need that.
|
||||
</para>
|
||||
</formalpara>
|
||||
|
||||
<formalpara>
|
||||
<title>What are the Drawbacks?</title>
|
||||
<para>
|
||||
There is one evident drawback of using locale - its speed! So, use
|
||||
locale only if you really need it.
|
||||
</para>
|
||||
</formalpara>
|
||||
</sect1>
|
||||
|
||||
|
||||
<sect1 id="postmaster-shutdown">
|
||||
<title>Shutting down the server</title>
|
||||
|
||||
|