ctype_wchar_t_members.cc (test01): New file.

2000-08-30  Benjamin Kosnik  <bkoz@redhat.com>

	* testsuite/22_locale/ctype_wchar_t_members.cc (test01): New file.

	* docs/22_locale/codecvt.html: Re-number.
	* docs/22_locale/howto.html: Add entry for ctype
	documentation. Add entry for Nathan's introduction to locales
	paper.
	* docs/22_locale/ctype.html: New file. In progress...

	* docs/22_locale/codecvt.html: Formatting cleanups.
	* src/locale.cc (ctype<wchar_t>::do_is): Fix thinko.

From-SVN: r36082
Benjamin Kosnik 2000-08-31 01:17:53 +00:00
parent 83bbca3be7
commit e203a9886a
5 changed files with 320 additions and 24 deletions


@ -1,9 +1,22 @@
2000-08-30 Benjamin Kosnik <bkoz@redhat.com>
* testsuite/22_locale/ctype_wchar_t_members.cc (test01): New file.
* docs/22_locale/codecvt.html: Re-number.
* docs/22_locale/howto.html: Add entry for ctype
documentation. Add entry for Nathan's introduction to locales
paper.
* docs/22_locale/ctype.html: New file. In progress...
* docs/22_locale/codecvt.html: Formatting cleanups.
* src/locale.cc (ctype<wchar_t>::do_is): Fix thinko.
2000-08-30 Benjamin Kosnik <bkoz@redhat.com>
2000-08-30 Phil Edwards <pme@sources.redhat.com>
* docs/22_locale/codecvt.html: Behind-the-scenes ASCII->HTML
tweaks for certain browsers.
2000-08-29 Benjamin Kosnik <bkoz@redhat.com>
* bits/locale_facets.h (ctype<char>): Remove __table_type.
Add include for bits/std_cwctype.h, for wctype_t.
@ -14,13 +27,8 @@
* config/gnu-linux/ctype.cc: Tweak.
* testsuite/22_locale/ctype.cc: Tweak.
* bits/codecvt.h (__enc_traits): Mangle names.
* bits/codecvt.h (__enc_traits): Uglify names.
2000-08-30 Phil Edwards <pme@sources.redhat.com>
* docs/22_locale/codecvt.html: Behind-the-scenes ASCII->HTML
tweaks for certain browsers.
2000-08-28 Benjamin Kosnik <bkoz@purist.soma.redhat.com>
* docs/22_locale/codecvt.html: Add more bits, format.


@ -97,7 +97,7 @@ mcsrtombs and wcsrtombs in particular.</P>
<P>
<H2>
2. Some thoughts on what would be useful
3. Some thoughts on what would be useful
</H2>
Probably the most frequently asked question about code conversion is:
&quot;So dudes, what's the deal with Unicode strings?&quot; The dude part is
@ -208,7 +208,7 @@ mechanism may be required.
<P>
<H2>
3. Problems with &quot;C&quot; code conversions : thread safety, global
4. Problems with &quot;C&quot; code conversions : thread safety, global
locales, termination.
</H2>
@ -251,7 +251,7 @@ LC_CTYPE category implements.
<P>
<H2>
4. Design
5. Design
</H2>
The two required specializations are implemented as follows:
@ -370,7 +370,7 @@ codecvt usage.
<P>
<H2>
5. Examples
6. Examples
</H2>
<UL>
@ -431,7 +431,7 @@ More information can be found in the following testcases:
<P>
<H2>
6. Unresolved Issues
7. Unresolved Issues
</H2>
<UL>
<LI>
@ -474,7 +474,7 @@ More information can be found in the following testcases:
<P>
<H2>
7. Acknowledgments
8. Acknowledgments
</H2>
Ulrich Drepper for the iconv suggestions and patient answering of
late-night questions, Jason Merrill for the template partial
@ -482,7 +482,7 @@ specialization hints, language clarification, and wchar_t fixes.
<P>
<H2>
8. Bibliography / Referenced Documents
9. Bibliography / Referenced Documents
</H2>
Drepper, Ulrich, GNU libc (glibc) 2.2 manual. In particular, Chapters &quot;6. Character Set Handling&quot; and &quot;7 Locales and Internationalization&quot;


@ -0,0 +1,146 @@
<HTML>
<HEAD>
<H1>
Notes on the ctype implementation.
</H1>
</HEAD>
<I>
prepared by Benjamin Kosnik (bkoz@redhat.com) on August 30, 2000
</I>
<P>
<H2>
1. Abstract
</H2>
<P>
Woe is me.
</P>
<P>
<H2>
2. What the standard says
</H2>
<P>
<H2>
3. Problems with &quot;C&quot; ctype : global locales, termination.
</H2>
<P>
For the required specialization codecvt&lt;wchar_t, char, mbstate_t&gt; ,
conversions are made between the internal character set (always UCS4
on GNU/Linux) and whatever the currently selected locale for the
LC_CTYPE category implements.
<P>
<H2>
4. Design
</H2>
The two required specializations are implemented as follows:
<P>
<TT>
ctype&lt;char&gt;
</TT>
<P>
This is a simple specialization. Implementing this was a piece of cake.
<P>
<TT>
ctype&lt;wchar_t&gt;
</TT>
<P>
This specialization, by specifying all the template parameters, pretty
much ties the hands of implementors. As such, the implementation is
straightforward, involving mbsrtowcs for conversions from char to
wchar_t and wcsrtombs for conversions from wchar_t to char.
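<P>
A minimal sketch of a round trip through the C functions mbsrtowcs and
wcsrtombs, not the library's actual code. The helper names widen_string
and narrow_string are hypothetical, and the input is assumed to be valid
in the current LC_CTYPE locale (plain ASCII is valid in the default
"C" locale).
</P>

```cpp
#include <cwchar>
#include <string>

// Hypothetical helper: char -> wchar_t using mbsrtowcs.
std::wstring widen_string(const char* narrow)
{
  std::mbstate_t state = std::mbstate_t();   // initial conversion state
  const char* src = narrow;
  // A null destination asks only for the length, excluding the terminator.
  std::size_t len = std::mbsrtowcs(0, &src, 0, &state);
  if (len == static_cast<std::size_t>(-1))
    return std::wstring();                   // invalid multibyte sequence
  std::wstring wide(len, L'\0');
  state = std::mbstate_t();
  src = narrow;
  if (len > 0)
    std::mbsrtowcs(&wide[0], &src, len, &state);
  return wide;
}

// Hypothetical helper: wchar_t -> char using wcsrtombs.
std::string narrow_string(const wchar_t* wide)
{
  std::mbstate_t state = std::mbstate_t();
  const wchar_t* src = wide;
  std::size_t len = std::wcsrtombs(0, &src, 0, &state);
  if (len == static_cast<std::size_t>(-1))
    return std::string();                    // character has no narrow form
  std::string narrow(len, '\0');
  state = std::mbstate_t();
  src = wide;
  if (len > 0)
    std::wcsrtombs(&narrow[0], &src, len, &state);
  return narrow;
}
```

<P>
Both helpers depend on the global C locale, which is one of the
problems noted in section 3.
</P>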
<P>
Neither of these two required specializations deals with Unicode
characters. As such, libstdc++-v3 implements
<P>
<H2>
5. Examples
</H2>
<pre>
typedef ctype&lt;char&gt; cctype;
</pre>
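<P>
A short usage sketch for the wide-character facet, assuming the classic
"C" locale. The function names upcase and is_space and the typedef
wide_ctype are hypothetical; the member functions is and toupper are
the standard ctype interface.
</P>

```cpp
#include <locale>
#include <string>

typedef std::ctype<wchar_t> wide_ctype;   // hypothetical typedef name

// Upper-case a wide string in place using the range form of toupper.
std::wstring upcase(std::wstring s)
{
  const wide_ctype& ct = std::use_facet<wide_ctype>(std::locale::classic());
  if (!s.empty())
    ct.toupper(&s[0], &s[0] + s.size());
  return s;
}

// Classify a single wide character against the space mask.
bool is_space(wchar_t c)
{
  const wide_ctype& ct = std::use_facet<wide_ctype>(std::locale::classic());
  return ct.is(std::ctype_base::space, c);
}
```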
More information can be found in the following testcases:
<UL>
<LI> testsuite/22_locale/ctype_char_members.cc
<LI> testsuite/22_locale/ctype_wchar_t_members.cc
</UL>
<P>
<H2>
6. Unresolved Issues
</H2>
<UL>
<LI> how to deal with the global locale issue?
<LI> how to deal with different types than char, wchar_t?
<LI> codecvt/ctype overlap: narrow/widen
<LI> mask typedef in codecvt_base, argument types in codecvt.
what is known about this type?
<LI> why mask* argument in codecvt?
<LI> can this be made (more) generic? is there a simple way to
straighten out the configure-time mess that is a by-product of
this class?
<LI> get the ctype&lt;wchar_t&gt;::mask stuff under control. Need to
make some kind of static table, and not do lookup every time
somebody hits the do_is... functions. Too bad we can't just
redefine mask for ctype&lt;wchar_t&gt;
<LI> rename abstract base class. See if just smash-overriding
is a better approach. Clarify, add sanity to naming.
</UL>
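<P>
One way the static-table idea might look, as a sketch only and not
libstdc++ code. Everything in namespace sketch is hypothetical. It
assumes that only the first 128 wide characters are cached and that
classification comes from the &lt;cwctype&gt; functions under whichever
C locale is active when the table is first built.
</P>

```cpp
#include <cwctype>
#include <locale>

namespace sketch {   // hypothetical names, not libstdc++ internals

typedef std::ctype_base::mask mask;
static const unsigned long table_size = 128;

// Compute the full mask for one character by asking the C library once.
mask classify(std::wint_t c)
{
  mask m = mask();
  if (std::iswspace(c))  m |= std::ctype_base::space;
  if (std::iswprint(c))  m |= std::ctype_base::print;
  if (std::iswcntrl(c))  m |= std::ctype_base::cntrl;
  if (std::iswupper(c))  m |= std::ctype_base::upper;
  if (std::iswlower(c))  m |= std::ctype_base::lower;
  if (std::iswalpha(c))  m |= std::ctype_base::alpha;
  if (std::iswdigit(c))  m |= std::ctype_base::digit;
  if (std::iswpunct(c))  m |= std::ctype_base::punct;
  if (std::iswxdigit(c)) m |= std::ctype_base::xdigit;
  return m;
}

// Table of masks, filled once on construction.
struct mask_table
{
  mask entries[table_size];
  mask_table()
  {
    for (unsigned long i = 0; i < table_size; ++i)
      entries[i] = classify(static_cast<std::wint_t>(i));
  }
};

const mask_table& table()
{
  static const mask_table t;   // built on first use
  return t;
}

// Table lookup for cached characters, direct classification otherwise.
bool cached_is(mask m, wchar_t c)
{
  if (static_cast<unsigned long>(c) < table_size)
    return (table().entries[c] & m) != 0;
  return (classify(static_cast<std::wint_t>(c)) & m) != 0;
}

} // namespace sketch
```

<P>
The table trades a small amount of static storage for avoiding a C
library call per lookup. It does not address the global locale issue
listed above.
</P>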
<P>
<H2>
7. Acknowledgments
</H2>
Ulrich Drepper for patient answering of late-night questions, skeletal
examples, and C language expertise.
<P>
<H2>
8. Bibliography / Referenced Documents
</H2>
Drepper, Ulrich, GNU libc (glibc) 2.2 manual. In particular, Chapters &quot;6. Character Set Handling&quot; and &quot;7 Locales and Internationalization&quot;
<P>
Drepper, Ulrich, Numerous, late-night email correspondence
<P>
ISO/IEC 14882:1998 Programming languages - C++
<P>
ISO/IEC 9899:1999 Programming languages - C
<P>
Langer, Angelika and Klaus Kreft, Standard C++ IOStreams and Locales, Advanced Programmer's Guide and Reference, Addison Wesley Longman, Inc. 2000
<P>
Stroustrup, Bjarne, Appendix D, The C++ Programming Language, Special Edition, Addison Wesley, Inc. 2000
<P>
System Interface Definitions, Issue 6 (IEEE Std. 1003.1-200x)
The Open Group/The Institute of Electrical and Electronics Engineers, Inc.
http://www.opennc.org/austin/docreg.html


@ -9,14 +9,13 @@
<TITLE>libstdc++-v3 HOWTO: Chapter 22</TITLE>
<LINK REL="home" HREF="http://sources.redhat.com/libstdc++/docs/22_locale/">
<LINK REL=StyleSheet HREF="../lib3styles.css">
<!-- $Id: howto.html,v 1.2 2000/07/11 21:45:07 pme Exp $ -->
<!-- $Id: howto.html,v 1.3 2000/08/25 08:52:56 bkoz Exp $ -->
</HEAD>
<BODY>
<H1 CLASS="centered"><A NAME="top">Chapter 22: Localization</A></H1>
<P>Chapter 22 deals with the FORTRAN subroutines for automatically
transforming lemmings into gold.
<P>Chapter 22 deals with the C++ localization facilities.
</P>
@ -24,8 +23,10 @@
<HR>
<H1>Contents</H1>
<UL>
<LI><A HREF="#1">Stroustrup on Locales</A>
<LI><A HREF="#2">Notes on the codecvt implementation</A>
<LI><A HREF="#1">Bjarne Stroustrup on Locales</A>
<LI><A HREF="#2">Nathan Myers on Locales</A>
<LI><A HREF="#3">codecvt</A>
<LI><A HREF="#4">ctype</A>
</UL>
<HR>
@ -45,15 +46,51 @@
</P>
<HR>
<H2><A NAME="2">Notes on the codecvt implementation</A></H2>
<P> This document turned out to be larger than anticipated. As
such, it gets its own page, which can be found
<A HREF="codecvt.html">here</A>.
<H2><A NAME="2">Nathan Myers on Locales</A></H2>
<P> An article entitled "The Standard C++ Locale" was published in
Dr. Dobb's Journal and can be found
<A HREF="http://www.cantrip.org/locale.html">here</A>
</P>
<P>Return <A HREF="#top">to top of page</A> or
<A HREF="../faq/index.html">to the FAQ</A>.
</P>
<HR>
<H2><A NAME="3">codecvt</A></H2>
<P> Notes made during the implementation of codecvt can be found
<A HREF="codecvt.html">here</A>.
</P>
<P> The following is the abstract from the implementation notes:
<BLOCKQUOTE>
The standard class codecvt attempts to address conversions
between different character encoding schemes. In particular, the
standard attempts to detail conversions between the
implementation-defined wide characters (hereafter referred to as
wchar_t) and the standard type char that is so beloved in classic
&quot;C&quot; (which can now be referred to as narrow characters.)
This document attempts to describe how the GNU libstdc++-v3
implementation deals with the conversion between wide and narrow
characters, and also presents a framework for dealing with the huge
number of other encodings that iconv can convert, including Unicode
and UTF8. Design issues and requirements are addressed, and examples
of correct usage for both the required specializations for wide and
narrow characters and the implementation-provided extended
functionality are given.
</BLOCKQUOTE>
<P>Return <A HREF="#top">to top of page</A> or
<A HREF="../faq/index.html">to the FAQ</A>.
</P>
<HR>
<H2><A NAME="4">ctype</A></H2>
<P> Notes made during the implementation of ctype can be found
<A HREF="ctype.html">here</A>.
</P>
<P>Return <A HREF="#top">to top of page</A> or
<A HREF="../faq/index.html">to the FAQ</A>.
</P>
@ -64,7 +101,7 @@
Comments and suggestions are welcome, and may be sent to
<A HREF="mailto:pme@sources.redhat.com">Phil Edwards</A> or
<A HREF="mailto:gdr@egcs.cygnus.com">Gabriel Dos Reis</A>.
<BR> $Id: howto.html,v 1.2 2000/07/11 21:45:07 pme Exp $
<BR> $Id: howto.html,v 1.3 2000/08/25 08:52:56 bkoz Exp $
</EM></P>


@ -0,0 +1,105 @@
// 2000-09-01 Benjamin Kosnik <bkoz@redhat.com>
// Copyright (C) 2000 Free Software Foundation, Inc.
//
// This file is part of the GNU ISO C++ Library. This library is free
// software; you can redistribute it and/or modify it under the
// terms of the GNU General Public License as published by the
// Free Software Foundation; either version 2, or (at your option)
// any later version.
// This library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License along
// with this library; see the file COPYING. If not, write to the Free
// Software Foundation, 59 Temple Place - Suite 330, Boston, MA 02111-1307,
// USA.
// As a special exception, you may use this file as part of a free software
// library without restriction. Specifically, if other files instantiate
// templates or use macros or inline functions from this file, or you compile
// this file and link it with other files to produce an executable, this
// file does not by itself cause the resulting executable to be covered by
// the GNU General Public License. This exception does not however
// invalidate any other reasons why the executable file might be covered by
// the GNU General Public License.
// 22.2.1.1.1 ctype members (exercised here through ctype<wchar_t>)
#include <locale>
// NB: Don't include any other headers in this file.
#include <debug_assert.h>
class gnu_ctype: public std::ctype<wchar_t> {};
void test01()
{
bool test = true;
typedef wchar_t char_type;
const char_type strlit00[] = L"manilla, cebu, tandag PHILIPPINES";
const char_type strlit01[] = L"MANILLA, CEBU, TANDAG PHILIPPINES";
const char_type strlit02[] = L"manilla, cebu, tandag philippines";
const char_type c00 = L'S';
const char_type c10 = L's';
const char_type c20 = L'9';
const char_type c30 = L' ';
const char_type c40 = L'!';
const char_type c50 = L'F';
const char_type c60 = L'f';
const char_type c70 = L'X';
const char_type c80 = L'x';
gnu_ctype gctype;
char_type c100;
int len = std::char_traits<char_type>::length(strlit00);
char_type c_array[len + 1];
// bool is(mask m, char_type c) const;
VERIFY( gctype.is(std::ctype_base::space, c30) );
VERIFY( gctype.is(std::ctype_base::upper, c00) );
VERIFY( gctype.is(std::ctype_base::lower, c10) );
VERIFY( gctype.is(std::ctype_base::digit, c20) );
VERIFY( gctype.is(std::ctype_base::punct, c40) );
VERIFY( gctype.is(std::ctype_base::alpha, c50) );
VERIFY( gctype.is(std::ctype_base::alpha, c60) );
VERIFY( gctype.is(std::ctype_base::xdigit, c20) );
VERIFY( !gctype.is(std::ctype_base::xdigit, c80) );
VERIFY( gctype.is(std::ctype_base::alnum, c50) );
VERIFY( gctype.is(std::ctype_base::alnum, c20) );
VERIFY( gctype.is(std::ctype_base::graph, c40) );
VERIFY( gctype.is(std::ctype_base::graph, c20) );
// char_type toupper(char_type c) const
c100 = gctype.toupper(c10);
VERIFY( c100 == c00 );
// char_type tolower(char_type c) const
c100 = gctype.tolower(c00);
VERIFY( c100 == c10 );
// const char_type* toupper(char_type* low, const char_type* hi) const
std::char_traits<char_type>::copy(c_array, strlit02, len + 1);
gctype.toupper(c_array, c_array + len);
VERIFY( !std::char_traits<char_type>::compare(c_array, strlit01, len - 1) );
// const char_type* tolower(char_type* low, const char_type* hi) const
std::char_traits<char_type>::copy(c_array, strlit01, len + 1);
gctype.tolower(c_array, c_array + len);
VERIFY( !std::char_traits<char_type>::compare(c_array, strlit02, len - 1) );
#ifdef DEBUG_ASSERT
assert(test);
#endif
}
int main() {
test01();
return 0;
}