mirror of
git://gcc.gnu.org/git/gcc.git
synced 2025-01-28 20:54:08 +08:00
cfe17bd93b
* docs/html/17_intro/howto.html: Make "chapter 22 notes" a link. * docs/html/faq/index.html: Mention that GCC ships with a newer version of the library than the last snapshot. Make "see below" a link. Add missing <html> tag. * docs/html/faq/index.txt: Regenerate. * docs/html/22_locale/locale.html, docs/html/22_locale/messages.html, docs/html/23_containers/howto.html, docs/html/24_iterators/howto.html, docs/html/25_algorithms/howto.html, docs/html/26_numerics/howto.html, docs/html/27_io/howto.html, docs/html/ext/howto.html, docs/html/ext/sgiexts.html: Add missing <html> tag. From-SVN: r59000
452 lines
22 KiB
HTML
452 lines
22 KiB
HTML
<?xml version="1.0" encoding="ISO-8859-1"?>
|
|
<!DOCTYPE html
|
|
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
|
|
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
|
|
|
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
|
|
<head>
|
|
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
|
|
<meta name="AUTHOR" content="pme@gcc.gnu.org (Phil Edwards)" />
|
|
<meta name="KEYWORDS" content="HOWTO, libstdc++, GCC, g++, libg++, STL" />
|
|
<meta name="DESCRIPTION" content="HOWTO for the libstdc++ chapter 23." />
|
|
<meta name="GENERATOR" content="vi and eight fingers" />
|
|
<title>libstdc++-v3 HOWTO: Chapter 23</title>
|
|
<link rel="StyleSheet" href="../lib3styles.css" />
|
|
</head>
|
|
<body>
|
|
|
|
<h1 class="centered"><a name="top">Chapter 23: Containers</a></h1>
|
|
|
|
<p>Chapter 23 deals with container classes and what they offer.
|
|
</p>
|
|
|
|
|
|
<!-- ####################################################### -->
|
|
<hr />
|
|
<h1>Contents</h1>
|
|
<ul>
|
|
<li><a href="#1">Making code unaware of the container/array difference</a></li>
|
|
<li><a href="#2">Variable-sized bitmasks</a></li>
|
|
<li><a href="#3">Containers and multithreading</a></li>
|
|
<li><a href="#4">"Hinting" during insertion</a></li>
|
|
<li><a href="#5">Bitmasks and string arguments</a></li>
|
|
<li><a href="#6"><code>std::list::size()</code> is O(n)!</a></li>
|
|
<li><a href="#7">Space overhead management for vectors</a></li>
|
|
</ul>
|
|
|
|
<hr />
|
|
|
|
<!-- ####################################################### -->
|
|
|
|
<h2><a name="1">Making code unaware of the container/array difference</a></h2>
|
|
<p>You're writing some code and can't decide whether to use builtin
|
|
arrays or some kind of container. There are compelling reasons
|
|
to use one of the container classes, but you're afraid that you'll
|
|
eventually run into difficulties, change everything back to arrays,
|
|
and then have to change all the code that uses those data types to
|
|
keep up with the change.
|
|
</p>
|
|
<p>If your code makes use of the standard algorithms, this isn't as
|
|
scary as it sounds. The algorithms don't know, nor care, about
|
|
the kind of "container" on which they work, since the
|
|
algorithms are only given endpoints to work with. For the container
|
|
classes, these are iterators (usually <code>begin()</code> and
|
|
<code>end()</code>, but not always). For builtin arrays, these are
|
|
the address of the first element and the
|
|
<a href="../24_iterators/howto.html#2">past-the-end</a> element.
|
|
</p>
|
|
<p>Some very simple wrapper functions can hide all of that from the
|
|
rest of the code. For example, a pair of functions called
|
|
<code>beginof</code> can be written, one that takes an array, another
|
|
that takes a vector. The first returns a pointer to the first
|
|
element, and the second returns the vector's <code>begin()</code>
|
|
iterator.
|
|
</p>
|
|
<p>The functions should be made template functions, and should also
|
|
be declared inline. As pointed out in the comments in the code
|
|
below, this can lead to <code>beginof</code> being optimized out of
|
|
existence, so you pay absolutely nothing in terms of increased
|
|
code size or execution time.
|
|
</p>
|
|
<p>The result is that if all your algorithm calls look like
|
|
</p>
|
|
<pre>
|
|
std::transform(beginof(foo), endof(foo), beginof(foo), SomeFunction);</pre>
|
|
<p>then the type of foo can change from an array of ints to a vector
|
|
of ints to a deque of ints and back again, without ever changing any
|
|
client code.
|
|
</p>
|
|
<p>This author has a collection of such functions, called "*of"
|
|
because they all extend the builtin "sizeof". It started
|
|
with some Usenet discussions on a transparent way to find the length
|
|
of an array. A simplified and much-reduced version for easier
|
|
reading is <a href="wrappers_h.txt">given here</a>.
|
|
</p>
|
|
<p>Astute readers will notice two things at once: first, that the
|
|
container class is still a <code>vector<T></code> instead of a
|
|
more general <code>Container<T></code>. This would mean that
|
|
three functions for <code>deque</code> would have to be added, another
|
|
three for <code>list</code>, and so on. This is due to problems with
|
|
getting template resolution correct; I find it easier just to
|
|
give the extra three lines and avoid confusion.
|
|
</p>
|
|
<p>Second, the line
|
|
</p>
|
|
<pre>
|
|
inline unsigned int lengthof (T (&)[sz]) { return sz; } </pre>
|
|
<p>looks just weird! Hint: unused parameters can be left nameless.
|
|
</p>
|
|
<p>Return <a href="#top">to top of page</a> or
|
|
<a href="../faq/index.html">to the FAQ</a>.
|
|
</p>
|
|
|
|
<hr />
|
|
<h2><a name="2">Variable-sized bitmasks</a></h2>
|
|
<p>No, you cannot write code of the form
|
|
</p>
|
|
<!-- Careful, the leading spaces in PRE show up directly. -->
|
|
<pre>
|
|
#include <bitset>
|
|
|
|
void foo (size_t n)
|
|
{
|
|
std::bitset<n> bits;
|
|
....
|
|
} </pre>
|
|
<p>because <code>n</code> must be known at compile time. Your compiler is
|
|
correct; it is not a bug. That's the way templates work. (Yes, it
|
|
<em>is</em> a feature.)
|
|
</p>
|
|
<p>There are a couple of ways to handle this kind of thing. Please
|
|
consider all of them before passing judgement. They include, in
|
|
no particular order:
|
|
</p>
|
|
<ul>
|
|
<li>A very large N in <code>bitset<N></code>.</li>
|
|
<li>A container<bool>.</li>
|
|
<li>Extremely weird solutions.</li>
|
|
</ul>
|
|
<p><strong>A very large N in
|
|
<code>bitset<N></code>. </strong> It has
|
|
been pointed out a few times in newsgroups that N bits only takes up
|
|
(N/8) bytes on most systems, and division by a factor of eight is pretty
|
|
impressive when speaking of memory. Half a megabyte given over to a
|
|
bitset (recall that there is zero space overhead for housekeeping info;
|
|
it is known at compile time exactly how large the set is) will hold over
|
|
four million bits. If you're using those bits as status flags (e.g.,
|
|
"changed"/"unchanged" flags), that's a <em>lot</em>
|
|
of state.
|
|
</p>
|
|
<p>You can then keep track of the "maximum bit used" during some
|
|
testing runs on representative data, make note of how many of those bits
|
|
really need to be there, and then reduce N to a smaller number. Leave
|
|
some extra space, of course. (If you plan to write code like the
|
|
incorrect example above, where the bitset is a local variable, then you
|
|
may have to talk your compiler into allowing that much stack space;
|
|
there may be zero space overhead, but it's all allocated inside the
|
|
object.)
|
|
</p>
|
|
<p><strong>A container<bool>. </strong> The Committee
|
|
made provision
|
|
for the space savings possible with that (N/8) usage previously mentioned,
|
|
so that you don't have to do wasteful things like
|
|
<code>Container<char></code> or
|
|
<code>Container<short int></code>.
|
|
Specifically, <code>vector<bool></code> is required to be
|
|
specialized for that space savings.
|
|
</p>
|
|
<p>The problem is that <code>vector<bool></code> doesn't behave like a
|
|
normal vector anymore. There have been recent journal articles which
|
|
discuss the problems (the ones by Herb Sutter in the May and
|
|
July/August 1999 issues of
|
|
<u>C++ Report</u> cover it well). Future revisions of the ISO C++
|
|
Standard will change the requirement for <code>vector<bool></code>
|
|
specialization. In the meantime, <code>deque<bool></code> is
|
|
recommended (although its behavior is sane, you probably will not get
|
|
the space savings, but the allocation scheme is different than that
|
|
of vector).
|
|
</p>
|
|
<p><strong>Extremely weird solutions. </strong> If you have
|
|
access to
|
|
the compiler and linker at runtime, you can do something insane, like
|
|
figuring out just how many bits you need, then writing a temporary
|
|
source code file. That file contains an instantiation of
|
|
<code>bitset</code>
|
|
for the required number of bits, inside some wrapper functions with
|
|
unchanging signatures. Have your program then call the
|
|
compiler on that file using Position Independent Code, then open the
|
|
newly-created object file and load those wrapper functions. You'll have
|
|
an instantiation of <code>bitset<N></code> for the exact
|
|
<code>N</code>
|
|
that you need at the time. Don't forget to delete the temporary files.
|
|
(Yes, this <em>can</em> be, and <em>has been</em>, done.)
|
|
</p>
|
|
<!-- I wonder if this next paragraph will get me in trouble... -->
|
|
<p>This would be the approach of either a visionary genius or a raving
|
|
lunatic, depending on your programming and management style. Probably
|
|
the latter.
|
|
</p>
|
|
<p>Which of the above techniques you use, if any, are up to you and your
|
|
intended application. Some time/space profiling is indicated if it
|
|
really matters (don't just guess). And, if you manage to do anything
|
|
along the lines of the third category, the author would love to hear
|
|
from you...
|
|
</p>
|
|
<p>Also note that the implementation of bitset used in libstdc++-v3 has
|
|
<a href="../ext/sgiexts.html#ch23">some extensions</a>.
|
|
</p>
|
|
<p>Return <a href="#top">to top of page</a> or
|
|
<a href="../faq/index.html">to the FAQ</a>.
|
|
</p>
|
|
|
|
<hr />
|
|
<h2><a name="3">Containers and multithreading</a></h2>
|
|
<p>This section discusses issues surrounding the design of
|
|
multithreaded applications which use Standard C++ containers.
|
|
All information in this section is current as of the gcc 3.0
|
|
release and all later point releases. Although earlier gcc
|
|
releases had a different approach to threading configuration and
|
|
proper compilation, the basic code design rules presented here
|
|
were similar. For information on all other aspects of
|
|
multithreading as it relates to libstdc++, including details on
|
|
the proper compilation of threaded code (and compatibility between
|
|
threaded and non-threaded code), see Chapter 17.
|
|
</p>
|
|
<p>Two excellent pages to read when working with the Standard C++
|
|
containers and threads are
|
|
<a href="http://www.sgi.com/tech/stl/thread_safety.html">SGI's
|
|
http://www.sgi.com/tech/stl/thread_safety.html</a> and
|
|
<a href="http://www.sgi.com/tech/stl/Allocators.html">SGI's
|
|
http://www.sgi.com/tech/stl/Allocators.html</a>.
|
|
</p>
|
|
<p><em>However, please ignore all discussions about the user-level
|
|
configuration of the lock implementation inside the STL
|
|
container-memory allocator on those pages. For the sake of this
|
|
discussion, libstdc++-v3 configures the SGI STL implementation,
|
|
not you. This is quite different from how gcc pre-3.0 worked.
|
|
In particular, past advice was for people using g++ to
|
|
explicitly define _PTHREADS or other macros or port-specific
|
|
compilation options on the command line to get a thread-safe
|
|
STL. This is no longer required for any port and should no
|
|
longer be done unless you really know what you are doing and
|
|
assume all responsibility.</em>
|
|
</p>
|
|
<p>Since the container implementation of libstdc++-v3 uses the SGI
|
|
code, we use the same definition of thread safety as SGI when
|
|
discussing design. A key point that beginners may miss is the
|
|
fourth major paragraph of the first page mentioned above
|
|
("For most clients,"...), which points out that
|
|
locking must nearly always be done outside the container, by
|
|
client code (that'd be you, not us). There is a notable
|
|
exceptions to this rule. Allocators called while a container or
|
|
element is constructed uses an internal lock obtained and
|
|
released solely within libstdc++-v3 code (in fact, this is the
|
|
reason STL requires any knowledge of the thread configuration).
|
|
</p>
|
|
<p>For implementing a container which does its own locking, it is
|
|
trivial to provide a wrapper class which obtains the lock (as
|
|
SGI suggests), performs the container operation, and then
|
|
releases the lock. This could be templatized <em>to a certain
|
|
extent</em>, on the underlying container and/or a locking
|
|
mechanism. Trying to provide a catch-all general template
|
|
solution would probably be more trouble than it's worth.
|
|
</p>
|
|
<p>The STL implementation is currently configured to use the
|
|
high-speed caching memory allocator. Some people like to
|
|
test and/or normally run threaded programs with a different
|
|
default. For all details about how to globally override this
|
|
at application run-time see <a href="../ext/howto.html#3">here</a>.
|
|
</p>
|
|
<p>There is a better way (not standardized yet): It is possible to
|
|
force the malloc-based allocator on a per-case-basis for some
|
|
application code. The library team generally believes that this
|
|
is a better way to tune an application for high-speed using this
|
|
implementation of the STL. There is
|
|
<a href="../ext/howto.html#3">more information on allocators here</a>.
|
|
</p>
|
|
<p>Return <a href="#top">to top of page</a> or
|
|
<a href="../faq/index.html">to the FAQ</a>.
|
|
</p>
|
|
|
|
<hr />
|
|
<h2><a name="4">"Hinting" during insertion</a></h2>
|
|
<p>Section [23.1.2], Table 69, of the C++ standard lists this function
|
|
for all of the associative containers (map, set, etc):
|
|
</p>
|
|
<pre>
|
|
a.insert(p,t);</pre>
|
|
<p>where 'p' is an iterator into the container 'a', and 't' is the item
|
|
to insert. The standard says that "iterator p is a hint
|
|
pointing to where the insert should start to search," but
|
|
specifies nothing more. (LWG Issue #233, currently in review,
|
|
addresses this topic, but I will ignore it here because it is not yet
|
|
finalized.)
|
|
</p>
|
|
<p>Here we'll describe how the hinting works in the libstdc++-v3
|
|
implementation, and what you need to do in order to take advantage of
|
|
it. (Insertions can change from logarithmic complexity to amortized
|
|
constant time, if the hint is properly used.) Also, since the current
|
|
implementation is based on the SGI STL one, these points may hold true
|
|
for other library implementations also, since the HP/SGI code is used
|
|
in a lot of places.
|
|
</p>
|
|
<p>In the following text, the phrases <em>greater than</em> and <em>less
|
|
than</em> refer to the results of the strict weak ordering imposed on
|
|
the container by its comparison object, which defaults to (basically)
|
|
"<". Using those phrases is semantically sloppy, but I
|
|
didn't want to get bogged down in syntax. I assume that if you are
|
|
intelligent enough to use your own comparison objects, you are also
|
|
intelligent enough to assign "greater" and "lesser"
|
|
their new meanings in the next paragraph. *grin*
|
|
</p>
|
|
<p>If the <code>hint</code> parameter ('p' above) is equivalent to:
|
|
</p>
|
|
<ul>
|
|
<li><code>begin()</code>, then the item being inserted should have a key
|
|
less than all the other keys in the container. The item will
|
|
be inserted at the beginning of the container, becoming the new
|
|
entry at <code>begin()</code>.
|
|
</li>
|
|
<li><code>end()</code>, then the item being inserted should have a key
|
|
greater than all the other keys in the container. The item will
|
|
be inserted at the end of the container, becoming the new entry
|
|
at <code>end()</code>.
|
|
</li>
|
|
<li>neither <code>begin()</code> nor <code>end()</code>, then: Let <code>h</code>
|
|
be the entry in the container pointed to by <code>hint</code>, that
|
|
is, <code>h = *hint</code>. Then the item being inserted should have
|
|
a key less than that of <code>h</code>, and greater than that of the
|
|
item preceding <code>h</code>. The new item will be inserted
|
|
between <code>h</code> and <code>h</code>'s predecessor.
|
|
</li>
|
|
</ul>
|
|
<p>For <code>multimap</code> and <code>multiset</code>, the restrictions are
|
|
slightly looser: "greater than" should be replaced by
|
|
"not less than" and "less than" should be replaced
|
|
by "not greater than." (Why not replace greater with
|
|
greater-than-or-equal-to? You probably could in your head, but the
|
|
mathematicians will tell you that it isn't the same thing.)
|
|
</p>
|
|
<p>If the conditions are not met, then the hint is not used, and the
|
|
insertion proceeds as if you had called <code> a.insert(t) </code>
|
|
instead. (<strong>Note </strong> that GCC releases prior to 3.0.2
|
|
had a bug in the case with <code>hint == begin()</code> for the
|
|
<code>map</code> and <code>set</code> classes. You should not use a hint
|
|
argument in those releases.)
|
|
</p>
|
|
<p>This behavior goes well with other container's <code>insert()</code>
|
|
functions which take an iterator: if used, the new item will be
|
|
inserted before the iterator passed as an argument, same as the other
|
|
containers. The exception
|
|
(in a sense) is with a hint of <code>end()</code>: the new item will
|
|
actually be inserted after <code>end()</code>, but it also becomes the
|
|
new <code>end()</code>.
|
|
</p>
|
|
<p><strong>Note </strong> also that the hint in this implementation is a
|
|
one-shot. The insertion-with-hint routines check the immediately
|
|
surrounding entries to ensure that the new item would in fact belong
|
|
there. If the hint does not point to the correct place, then no
|
|
further local searching is done; the search begins from scratch in
|
|
logarithmic time. (Further local searching would only increase the
|
|
time required when the hint is too far off.)
|
|
</p>
|
|
<p>Return <a href="#top">to top of page</a> or
|
|
<a href="../faq/index.html">to the FAQ</a>.
|
|
</p>
|
|
|
|
<hr />
|
|
<h2><a name="5">Bitmasks and string arguments</a></h2>
|
|
<p>Bitmasks do not take char* nor const char* arguments in their
|
|
constructors. This is something of an accident, but you can read
|
|
about the problem: follow the library's "Links" from the
|
|
homepage, and from the C++ information "defect reflector"
|
|
link, select the library issues list. Issue number 116 describes the
|
|
problem.
|
|
</p>
|
|
<p>For now you can simply make a temporary string object using the
|
|
constructor expression:
|
|
</p>
|
|
<pre>
|
|
std::bitset<5> b ( std::string("10110") );
|
|
</pre>
|
|
instead of
|
|
<pre>
|
|
std::bitset<5> b ( "10110" ); // invalid
|
|
</pre>
|
|
<p>Return <a href="#top">to top of page</a> or
|
|
<a href="../faq/index.html">to the FAQ</a>.
|
|
</p>
|
|
|
|
<hr />
|
|
<h2><a name="6"><code>std::list::size()</code> is O(n)!</a></h2>
|
|
<p>Yes it is, and that's okay. This is a decision that we preserved when
|
|
we imported SGI's STL implementation. The following is quoted from
|
|
<a href="http://www.sgi.com/tech/stl/FAQ.html">their FAQ</a>:
|
|
</p>
|
|
<blockquote>
|
|
<p>The size() member function, for list and slist, takes time
|
|
proportional to the number of elements in the list. This was a
|
|
deliberate tradeoff. The only way to get a constant-time size() for
|
|
linked lists would be to maintain an extra member variable containing
|
|
the list's size. This would require taking extra time to update that
|
|
variable (it would make splice() a linear time operation, for example),
|
|
and it would also make the list larger. Many list algorithms don't
|
|
require that extra word (algorithms that do require it might do better
|
|
with vectors than with lists), and, when it is necessary to maintain
|
|
an explicit size count, it's something that users can do themselves.
|
|
</p>
|
|
<p>This choice is permitted by the C++ standard. The standard says that
|
|
size() "should" be constant time, and "should"
|
|
does not mean the same thing as "shall". This is the
|
|
officially recommended ISO wording for saying that an implementation
|
|
is supposed to do something unless there is a good reason not to.
|
|
</p>
|
|
<p>One implication of linear time size(): you should never write
|
|
</p>
|
|
<pre>
|
|
if (L.size() == 0)
|
|
...</pre>
|
|
Instead, you should write
|
|
<pre>
|
|
if (L.empty())
|
|
...</pre>
|
|
</blockquote>
|
|
<p>Return <a href="#top">to top of page</a> or
|
|
<a href="../faq/index.html">to the FAQ</a>.
|
|
</p>
|
|
|
|
<hr />
|
|
<h2><a name="7">Space overhead management for vectors</a></h2>
|
|
<p>In
|
|
<a href="http://gcc.gnu.org/ml/libstdc++/2002-04/msg00105.html">this
|
|
message to the list</a>, Daniel Kostecky announced work on an
|
|
alternate form of <code>std::vector</code> that would support hints
|
|
on the number of elements to be over-allocated. The design was also
|
|
described, along with possible implementation choices.
|
|
</p>
|
|
<p>The first two alpha releases were announced
|
|
<a href="http://gcc.gnu.org/ml/libstdc++/2002-07/msg00048.html">here</a>
|
|
and
|
|
<a href="http://gcc.gnu.org/ml/libstdc++/2002-07/msg00111.html">here</a>.
|
|
The releases themselves are available at
|
|
<a href="http://www.kotelna.sk/dk/sw/caphint/">
|
|
http://www.kotelna.sk/dk/sw/caphint/</a>.
|
|
</p>
|
|
<p>Return <a href="#top">to top of page</a> or
|
|
<a href="../faq/index.html">to the FAQ</a>.
|
|
</p>
|
|
|
|
|
|
<!-- ####################################################### -->
|
|
|
|
<hr />
|
|
<p class="fineprint"><em>
|
|
See <a href="../17_intro/license.html">license.html</a> for copying conditions.
|
|
Comments and suggestions are welcome, and may be sent to
|
|
<a href="mailto:libstdc++@gcc.gnu.org">the libstdc++ mailing list</a>.
|
|
</em></p>
|
|
|
|
|
|
</body>
|
|
</html>
|