From: t-ishii@sra.co.jp
Attached are patches to enhance the multi-byte support. (patches are
against 7/18 snapshot)
* determine encoding at initdb/createdb rather than compile time
Now initdb/createdb has an option to specify the encoding. Also, I
modified the syntax of CREATE DATABASE to accept encoding option. See
README.mb for more details.
For this purpose I have added new column "encoding" to pg_database.
Also pg_attribute and pg_class are changed to catch up the
modification to pg_database. Actually I haved added pg_database_mb.h,
pg_attribute_mb.h and pg_class_mb.h. These are used only when MB is
enabled. The reason having separate files is I couldn't find a way to
use ifdef or whatever in those files. I have to admit it looks
ugly. No way.
* support for PGCLIENTENCODING when issuing COPY command
commands/copy.c modified.
* support for SQL92 syntax "SET NAMES"
See gram.y.
* support for LATIN2-5
* add UNICODE regression test case
* new test suite for MB
New directory test/mb added.
* clean up source files
Basic idea is to have MB's own subdirectory for easier maintenance.
These are include/mb and backend/utils/mb.
Attached are diffs (from current cvs sources) to bring libpq.sgml
and libpq.3 up to date.
It appears that at various times in the past, people have made edits to
one or the other of these files but not both. I propagated some changes
from each into the other, but I don't think I caught every
inconsistency. It'd be real nice if the man pages could be
automatically generated from the SGML...
Making PQrequestCancel safe to call in a signal handler turned out to be
much easier than I feared. So here are the diffs.
Some notes:
* I modified the postmaster's packet "iodone" callback interface to allow
the callback routine to return a continue-or-drop-connection return
code; this was necessary to allow the connection to be closed after
receiving a Cancel, rather than proceeding to launch a new backend...
Being a neatnik, I also made the iodone proc have a typechecked
parameter list.
* I deleted all code I could find that had to do with OOB.
* I made some edits to ensure that all signals mentioned in the code
are referred to symbolically not by numbers ("SIGUSR2" not "2").
I think Bruce may have already done at least some of the same edits;
I hope that merging these patches is not too painful.
I have implemented a framework of encoding translation between the
backend and the frontend. Also I have added a new variable setting
command:
SET CLIENT_ENCODING TO 'encoding';
Other features include:
Latin1 support more 8 bit cleaness
See doc/README.mb for more details. Note that the pacthes are
against May 30 snapshot.
Tatsuo Ishii
Attached to the mail is locale-patch.tar.gz. In the archive
there are:
file README.locale
short description
directory src/test/locale
test suite; currently only koi8-r tests, but the suite can be
easily extended
file locale.patch
the very patch; to apply: patch < locale.patch; should be applied
to postgres-6.3.2 (at least I created it with 6.3.2 without any
additional
patches)
Files touched by the patch: src/include/utils/builtins.h
src/backend/utils/adt/char.c src/backend/utils/adt/varchar.c
src/backend/utils/adt/varlena.c
Oleg
pg_notifies statement is eliminated, and callbacks defined by
pg_listen are instead invoked automatically from the Tcl idle loop
whenever a NOTIFY message is received.
I have done only cursory testing, so there may be problems still
lurking (particularly on non-Unix machines?). But it seems to
work.
Patch is against today's cvs sources. Note that this will not work
with the 6.3.2 release since it depends on the new libpq.
The diffs are a bit large so I've gzipped them. A patch to update
libpgtcl.sgml is included too.
regards, tom lane
1. Rewritten libpq to allow asynchronous clients.
2. Implemented client side of cancel protocol in library,
and patched psql.c to send a cancel request upon SIGINT. The
backend doesn't notice it yet :-(
3. Implemented 'Z' protocol message addition and renaming of
copy in/out start messages. These are implemented conditionally,
ie, the client protocol version is checked; so the code should
still work with 1.0 clients.
4. Revised protocol and libpq sgml documents (don't have an SGML
compiler, though, so there may be some markup glitches here).
What remains to be done:
1. Implement addition of atttypmod field to RowDescriptor messages.
The client-side code is there but ifdef'd out. I have no idea
what to change on the backend side. The field should be sent
only if protocol >= 2.0, of course.
2. Implement backend response to cancel requests received as OOB
messages. (This prolly need not be conditional on protocol
version; just do it if you get SIGURG.)
3. Update libpq.3. (I'm hoping this can be generated mechanically
from libpq.sgml... if not, will do it by hand.) Is there any
other doco to fix?
4. Update non-libpq interfaces as necessary. I patched libpgtcl
so that it would compile, but haven't tested it. Dunno what
needs to be done with the other interfaces.
Have at it!
Tom Lane
Hi, here are patches I promised (against 6.3.2):
* character_length(), position(), substring() are now aware of
multi-byte characters
* add octet_length()
* add --with-mb option to configure
* new regression tests for EUC_KR
(contributed by "Soonmyung. Hong" <hong@lunaris.hanmesoft.co.kr>)
* add some test cases to the EUC_JP regression test
* fix problem in regress/regress.sh in case of System V
* fix toupper(), tolower() to handle 8bit chars
note that:
o patches for both configure.in and configure are
included. maybe the one for configure is not necessary.
o pg_proc.h was modified to add octet_length(). I used OIDs
(1374-1379) for that. Please let me know if these numbers are not
appropriate.
Ok, I have finally gotten all of the defines for Dec/Alpha and
Linux/Alpha sorted out as Marc asked. There is no longer any need for
'-Dalpha' or '-Dlinuxalpha' in either the Dec/Alpha or the Linux/Alpha
template files (./src/template/{alpha,linuxalpha}). I have replaced every
instance of 'alpha' or '__alpha__' with '__alpha', as that appears to be
the common symbol between C compilers on both operating systems (RH4.2 &
DecUnix 4.0b) for alpha.
Included are patches intended for allowing PostgreSQL to handle
multi-byte charachter sets such as EUC(Extende Unix Code), Unicode and
Mule internal code. With the MB patch you can use multi-byte character
sets in regexp and LIKE. The encoding system chosen is determined at
the compile time.
To enable the MB extension, you need to define a variable "MB" in
Makefile.global or in Makefile.custom. For further information please
take a look at README.mb under doc directory.
(Note that unlike "jp patch" I do not use modified GNU regexp any
more. I changed Henry Spencer's regexp coming with PostgreSQL.)
Subject: [PATCHES] Documentation update
This patch updates some of the documentation that comes with the
distribution. The following files are updated:
COPYRIGHT
README
doc/README.flex
doc/README.support
doc/bug.template