The STL is based on generalization. Arrays are generalized into con-tainers
and parameterized on the types of objects they contain. Func-tions
are generalized into algorithms and parameterized on the types
of iterators they use. Pointers are generalized into iterators and
parameterized on the type of objects they point to.
That¡¯s just the beginning. Individual container types are generalized
into sequence and associative containers, and similar containers are
given similar functionality. Standard contiguous-memory containers
(see Item 1) offer random-access iterators, while standard node-based
containers (again, see Item 1) provide bidirectional iterators. Sequence
containers support push_front and/or push_back, while associative
containers don¡¯t. Associative containers offer logarithmic-time
lower_bound, upper_bound,andequal_range member functions, but
sequence containers don¡¯t.
With all this generalization going on, it¡¯s natural to want to join the
movement. This sentiment is laudable, and when you write your own
containers, iterators, and algorithms, you¡¯ll certainly want to pursue
it. Alas, many programmers try to pursue it in a different manner.
Instead of committing to particular types of containers in their soft-ware,
they try to generalize the notion of a container so that they can
use, say, a vector, but still preserve the option of replacing it with
something like a deque or a list later ¡ª all without changing the code
that uses it. That is, they strive to write container-independent code.
This kind of generalization, well-intentioned though it is, is almost
always misguided.
Even the most ardent advocate of container-independent code soon
realizes that it makes little sense to try to write software that will work
with both sequence and associative containers. Many member func-tions
exist for only one category of container, e.g., only sequence con-tainers
support push_front or push_back, and only associative
containers support count and lower_bound,etc.Even such basicsas
insert and erase have signatures and semantics that vary from category
to category. For example, when you insert an object into a sequence
container, it stays where you put it, but if you insert an object into anഊassociative container, the container moves the object to where it
belongs in the container¡¯s sort order. For another example, the form of
erase taking an iterator returns a new iterator when invoked on a
sequence container, but it returns nothing when invoked on an asso-ciative
container. (Item 9 gives an example of how this can affect the
code you write.)
Suppose, then, you aspire to write code that can be used with the
most common sequence containers: vector, deque,andlist. Clearly,
you must program to the intersection of their capabilities, and that
means nouses ofreserve or capacity (see Item 14), because deque and
list don¡¯t offer them. The presence of list also means you give up opera-tor[],
and you limit yourself to the capabilities of bidirectional itera-tors.
That, in turn, means you must stay away from algorithms that
demand random access iterators, including sort, stable_sort,
partial_sort,andnth_element (see Item 31).
On the other hand, your desire to support vector rules out use of
push_front and pop_front,andbothvector and deque put the kibosh on
splice and the member form of sort. In conjunction with the con-straints
above, this latter prohibition means that there is no form of
sort you can call on your ¡°generalized sequence container.¡±
That¡¯s the obvious stuff. If you violate any of those restrictions, your
code will fail to compile with at least one of the containers you want to
be able to use. The code that will compile is more insidious.
The main culprit is the different rules for invalidation of iterators,
pointers, and references that apply to different sequence containers.
To write code that will work correctly with vector, deque,andlist,you
must assume that any operation invalidating iterators, pointers, or
references in any of those containers invalidates them in the container
you¡¯re using. Thus, you must assume that every call to insert invali-dates
everything, because deque::insert invalidates all iterators and,
lacking the ability to call capacity, vector::insert must be assumed to
invalidate all pointers and references. (Item 1 explains that deque is
unique in sometimes invalidating its iterators without invalidating its
pointers and references.) Similar reasoning leads to the conclusion
that every call to erase must be assumed to invalidate everything.
Want more? You can¡¯t pass the data in the container to a C interface,
because only vector supports that (see Item 16). You can¡¯t instantiate
your container with bool asthe type of objects to be stored,because,
as Item 18 explains, vector<bool> doesn¡¯t always behave like a vector,
and it never actually stores bools. You can¡¯t assume list¡¯s constant-ഊtime insertions and erasures, because vector and deque take linear
time to perform those operations.
When all is said and done, you¡¯re left with a ¡°generalized sequence
container¡± where you can¡¯t call reserve, capacity, operator[], push_front,
pop_front, splice, or any algorithm requiring random access iterators; a
container where every call to insert and erase takes linear time and
invalidates all iterators, pointers, and references; and a container
incompatible with C where bools can¡¯t be stored. Is that really the
kind of container you want to use in your applications? I suspect not.
If you rein in your ambition and decide you¡¯re willing to drop support
for list, you still give up reserve, capacity, push_front,andpop_front;you
still must assume that all calls to insert and erase take linear time and
invalidate everything; you still lose layout compatibility with C; and
you still can¡¯t store bools.
If you abandon the sequence containers and shoot instead for code
that can work with different associative containers, the situation isn¡¯t
much better. Writing for both set and map is close to impossible,
because sets store single objects while maps storepairs of objects.
Even writing for both set and multiset (or map and multimap)is tough.
The insert member function taking only a value has different return
types for sets/maps than fortheirmulti cousins, and you must reli-giously
avoid making any assumptions about how many copies of a
value are stored in a container. With map and multimap,youmust
avoid using operator[], because that member function exists only for
map.
Face the truth: it¡¯s not worth it. The different containers are different,
and they have strengths and weaknesses that vary in significant
ways. They¡¯re not designed to be interchangeable, and there¡¯s little
you can do to paper that over. If you try, you¡¯re merely tempting fate,
and fate doesn¡¯tlike to be tempted.
Still, the day will dawn when you¡¯ll realize that a container choice you
made was, er, suboptimal, and you¡¯ll need to use a different container
type. You now know that when you change container types, you¡¯ll not
only need to fix whatever problems your compilers diagnose, you¡¯ll
also need to examine all the code using the container to see what
needs to be changed in light of the new container¡¯s performance char-acteristics
and rules for invalidation of iterators, pointers, and refer-ences.
If you switch from a vector to something else, you¡¯ll also have to
make sure you¡¯re no longer relying on vector¡¯s C-compatible memory
layout, and if you switch to a vector,you¡¯llhave to ensure that you¡¯re
not using it to store bools.ഊGiven the inevitability of having to change container types from time to time, you can facilitate such changes in the usual manner: by
encapsulating, encapsulating, encapsulating. One of the easiest ways
to do this is through the liberal use of typedefs for container and iter-ator
types. Hence, instead of writing this,
class Widget { ... };
vector<Widget> vw;
Widget bestWidget;
... // give bestWidget a value
vector<Widget>::iterator i = // find a Widget with the
find(vw.begin(), vw.end(), bestWidget); // same value as bestWidget
write this:
class Widget { ... };
typedef vector<Widget> WidgetContainer;
typedef WidgetContainer::iterator WCIterator;
WidgetContainer vw;
Widget bestWidget;
...
WCIterator i = find(vw.begin(), vw.end(), bestWidget);
This makes it a lot easier to change container types, something that¡¯s
especially convenient if the change in question is simply to add a cus-tom
allocator. (Such a change doesn¡¯t affect the rules for iterator/
pointer/reference invalidation.)
class Widget { ... };
template<typename T> // see Item 10 for why this
SpecialAllocator { ... }; // needs to be a template
typedef vector<Widget, SpecialAllocator<Widget> > WidgetContainer;
typedef WidgetContainer::iterator WCIterator;
WidgetContainer vw; // still works
Widget bestWidget;
...
WCIterator i = find(vw.begin(), vw.end(), bestWidget); // still works
If the encapsulating aspects of typedefs mean nothing to you, you¡¯re
still likely to appreciate the work they can save. For example, if you
have an object of typeഊmap<string,
vector<Widget>::iterator,
CIStringCompare> // CIStringCompare is ¡°case-//
insensitive string compare;¡±
// Item 19 describes it
and you want to walk through the map using const_iterators, do you
really want to spell out
map<string, vector<Widget>::iterator, CIStringCompare>::const_iterator
more than once? Once you¡¯ve used the STL a little while, you¡¯ll realize
that typedefs are your friends.
A typedef is just a synonym for some other type, so the encapsulation
it affords is purely lexical. A typedef doesn¡¯t prevent a client from
doing (or depending on) anything they couldn¡¯t already do (or depend
on). You need bigger ammunition if you want to limit client exposure
to the container choices you¡¯ve made. You need classes.
To limit the code that may require modification if you replace one con-tainer
type with another, hide the container in a class, and limit the
amount of container-specific information visible through the class
interface. For example, if you need to create a customer list, don¡¯t use
a list directly.Instead,create aCustomerList class, and hide a list in its
private section:
class CustomerList {
private:
typedef list<Customer> CustomerContainer;
typedef CustomerContainer::iterator CCIterator;
CustomerContainer customers;
public: // limit the amount of list-specific
... // information visible through
}; // this interface
At first, this may seem silly. After all a customer list is a list,right?
Well, maybe. Later you may discover that you don¡¯t need to insert or
erase customers from the middle of the list as often as you¡¯d antici-pated,
but you do need to quickly identify the top 20% of your cus-tomers
¡ª a task tailor-made for the nth_element algorithm (see
Item 31). But nth_element requires random access iterators. It won¡¯t
work with a list. In that case, your customer ¡°list¡± might be better
implemented as a vector or a deque.
When you consider this kind of change, you still have to check every
CustomerList member function and every friend to see how they¡¯ll be
affected (in terms of performance and iterator/pointer/reference
invalidation, etc.), but if you¡¯ve done a good job of encapsulating Cus-ഊtomerList¡¯s implementation details, the impact on CustomerList clients
should be small. You can¡¯t write container-independent code, but they
might be able to.