Notes on the locale implementation.
prepared by Benjamin Kosnik (bkoz@redhat.com) on August 8, 2001
1. Abstract
Describes the basic locale object, including the nested classes id and
facet and the reference-counted implementation object, class _Impl.
2. What the standard says
See Chapter 22 of the standard.
3. Problems with "C" locales: global locales, termination.
The major problem is fitting an object-oriented and non-global locale
design on top of POSIX and other relevant standards, which include the
Single Unix Specification (nee X/Open).
Because POSIX falls down so completely, portability is an issue.
4. Design
Class locale is non-templatized and has three distinct types nested
inside of it:
class facet
22.1.1.1.2 Class locale::facet
Facets actually implement locale functionality. For instance, a facet
called numpunct is the data object that can be queried for the
thousands separator used in the German locale.
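As a rough illustration of the numpunct query described above, the sketch below asks a locale's numpunct<char> facet for its thousands separator. The helper names thousands_sep_of and named_or_classic are invented for this example. The locale name "de_DE" mentioned afterwards is an assumption: whether it exists depends on which locales are installed on the system.

```cpp
#include <locale>
#include <stdexcept>

// Hypothetical helper: return the thousands separator that the
// numpunct<char> facet of loc reports.
char thousands_sep_of(const std::locale& loc)
{
  return std::use_facet<std::numpunct<char> >(loc).thousands_sep();
}

// Hypothetical helper: try to construct a named locale, falling back to
// the classic "C" locale when the name is not accepted (the constructor
// signals that by throwing std::runtime_error).
std::locale named_or_classic(const char* name)
{
  try
    {
      return std::locale(name);
    }
  catch (const std::runtime_error&)
    {
      return std::locale::classic();
    }
}
```

Calling thousands_sep_of(named_or_classic("de_DE")) reports the German separator on systems where that locale is installed. For the classic "C" locale the standard specifies ','.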
Literally, a facet is strictly defined as a class that is either:
- derived from locale::facet and containing
    public:
      static locale::id id;
- or derived from another facet
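The two cases in this definition can be sketched as follows. Both classes are hypothetical and exist only for illustration: greeting is a new facet with its own id, while upper_bool derives from the standard numpunct<char> facet and therefore reuses that facet's id.

```cpp
#include <cstddef>
#include <locale>
#include <string>

// Case 1 (hypothetical): a new facet. It derives from locale::facet and
// declares a public static locale::id named id.
class greeting : public std::locale::facet
{
public:
  static std::locale::id id;

  explicit greeting(const std::string& text, std::size_t refs = 0)
  : std::locale::facet(refs), text_(text) { }

  const std::string& text() const { return text_; }

private:
  std::string text_;
};

std::locale::id greeting::id;

// Case 2 (hypothetical): a facet derived from another facet. It inherits
// numpunct<char>::id, so installing it replaces the numpunct<char> facet
// of the resulting locale.
class upper_bool : public std::numpunct<char>
{
protected:
  std::string do_truename() const { return "TRUE"; }
  std::string do_falsename() const { return "FALSE"; }
};

// Build a copy of base that additionally carries a greeting facet.
std::locale with_greeting(const std::locale& base, const std::string& text)
{
  return std::locale(base, new greeting(text));
}
```

A locale built with with_greeting answers has_facet<greeting> with true, and use_facet<greeting> returns the installed object. A locale built as std::locale(base, new upper_bool) hands out the derived object whenever numpunct<char> is requested.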
The only other thing of interest in this class is the memory
management of facets. Each constructor of a facet class takes a
std::size_t __refs argument: if __refs == 0, the facet is deleted when
no longer used. If __refs == 1, the facet is not destroyed, even when
it is no longer referenced.
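The effect of the refs argument can be observed with a facet that counts its own destructor calls. This is a minimal sketch: the class tracked and both helper functions are hypothetical names made up for the example.

```cpp
#include <cstddef>
#include <locale>

// Hypothetical facet that records how often its destructor has run.
class tracked : public std::locale::facet
{
public:
  static std::locale::id id;
  static int destroyed;

  explicit tracked(std::size_t refs = 0) : std::locale::facet(refs) { }
  ~tracked() { ++destroyed; }
};

std::locale::id tracked::id;
int tracked::destroyed = 0;

// refs == 0: the locale machinery owns the facet and deletes it once the
// last locale holding it is gone. Returns the destructor count observed
// after that locale has been destroyed.
int destroyed_when_locale_owned()
{
  tracked::destroyed = 0;
  {
    std::locale loc(std::locale::classic(), new tracked(0));
  }
  return tracked::destroyed;
}

// refs == 1: the caller keeps ownership, so the facet can be a local
// object. Returns the destructor count observed after the locale is gone
// but while the facet object itself is still alive.
int destroyed_when_caller_owned()
{
  tracked::destroyed = 0;
  tracked mine(1);
  {
    std::locale loc(std::locale::classic(), &mine);
  }
  return tracked::destroyed;
}
```

With refs == 0 the facet must be allocated with new, because the library deletes it. With refs == 1 the caller is responsible for keeping the facet alive for as long as any locale refers to it.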
class id
Provides an index for looking up specific facets.
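To illustrate the lookup role of id, the sketch below checks whether use_facet finds a facet for a given type. The facet marker and the helper lookup_succeeds are hypothetical. The example relies only on the standard behaviour that use_facet throws std::bad_cast when the locale holds no facet for the requested id.

```cpp
#include <locale>
#include <typeinfo>

// Hypothetical facet whose only purpose is to own a fresh locale::id.
struct marker : std::locale::facet
{
  static std::locale::id id;
};

std::locale::id marker::id;

// Returns true if use_facet<Facet>(loc) succeeds, false if it throws
// std::bad_cast because loc holds no facet under Facet::id.
template<typename Facet>
bool lookup_succeeds(const std::locale& loc)
{
  try
    {
      (void) std::use_facet<Facet>(loc);
      return true;
    }
  catch (const std::bad_cast&)
    {
      return false;
    }
}
```

The classic locale holds the standard facets such as ctype<char>, so that lookup succeeds, while marker is found only in a locale it has been installed into, for example std::locale(std::locale::classic(), new marker).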
class _Impl
5. Examples
typedef __locale_t locale;
More information can be found in the following testcases:
- testsuite/22_locale/ctype_char_members.cc
- testsuite/22_locale/ctype_wchar_t_members.cc
6. Unresolved Issues
- locale -a displays available locales on Linux
- locale initialization: at what point does _S_classic,
_S_global get initialized? Can named locales assume this
initialization has already taken place?
- document how named locales error check when filling data
members. I.e., for a fr_FR locale that doesn't have
numpunct::truename(): does it use "true"? Or is it a blank
string? What's the convention?
- explain how locale aliasing happens. When does "de_DE"
use "de" information? What is the rule for locales composed of
just an ISO language code (say, "de") and locales with both an
ISO language code and ISO country code (say, "de_DE")?
- what should non-required facet instantiations do? If the
generic implementation is provided, then how do end users
provide specializations?
7. Acknowledgments
8. Bibliography / Referenced Documents
Drepper, Ulrich, GNU libc (glibc) 2.2 manual. In particular, Chapters "6. Character Set Handling" and "7. Locales and Internationalization"
Drepper, Ulrich, Numerous, late-night email correspondence
ISO/IEC 14882:1998 Programming languages - C++
ISO/IEC 9899:1999 Programming languages - C
Langer, Angelika and Klaus Kreft, Standard C++ IOStreams and Locales, Advanced Programmer's Guide and Reference, Addison Wesley Longman, Inc. 2000
Stroustrup, Bjarne, Appendix D, The C++ Programming Language, Special Edition, Addison Wesley, Inc. 2000
System Interface Definitions, Issue 6 (IEEE Std. 1003.1-200x)
The Open Group/The Institute of Electrical and Electronics Engineers, Inc.
http://www.opennc.org/austin/docreg.html