The Standard Librarian: Defining Iterators and Const Iterators
Matt Austern
Writing an iterator isn’t hard, and it’s a natural way to extend the C++ Standard library. But if you want to do it right, there are a few wrinkles you ought to know about.
The Standard C++ library is extensible by design: standard algorithms such as reverse and partition operate on the predefined containers like vector and list, and they can also operate on any user-defined data structure that supplies the appropriate iterators. Using the library effectively involves extending it.
By now, iterators are familiar to most C++ programmers. Iterators abstract the most basic properties of pointers: a forward iterator points to some object in a sequence, and it can be incremented so that it points to the next element in the sequence. (Stronger iterator categories, bidirectional iterators and random access iterators, provide additional means of traversing sequences. Weaker iterator categories are inappropriate for most data structures.)
Every iterator has a category type (std::forward_iterator_tag in the case of a forward iterator), a value type (the type of the object it points to), a difference type (an integer type that represents the distance between two elements of a sequence), and a pointer and reference type (pointer and reference to the iterator’s value type). Those types are accessed through the std::
iterator_traits class; the easiest way for you to provide them, when defining your own iterator class, is to provide them as nested typedefs: iterator_category, value_type, difference_type, pointer, and reference.
A forward iterator is any type that satisfies the requirements in §24.1.3 of the C++ Standard; that section of the Standard tells you what member functions and overloaded operators have to be defined. Once you’ve figured out what information an iterator must keep track of so that it can point to an element and so that it can find the next element, defining a forward iterator is just a matter of filling in those functions’ definitions.
Matched Pairs of Iterators
One complication is that it usually isn’t enough to define an iterator class. You probably need to define two iterator classes, one that permits modification of the object it points to (*i returns a reference to the object) and one that does not (*i returns a const reference). The library’s predefined container classes do this: the std::list class, for example, has a nested type iterator and a different nested type const_iterator; the latter can be used to traverse a const std::list. The value types of list<T>::iterator and list<T>::const_iterator are both T, but the reference type and pointer type differ: for list<T>::iterator they are T& and T* respectively, while for list<T>::const_iterator they are const T& and const T*. You can convert a list<T>::iterator to a list<T>::
const_iterator, but (for obvious reasons of const correctness) not the other way around.
Matched pairs of iterators are just as common in user-defined types as they are in predefined standard library types. Suppose, for example, that you’re defining a simple singly linked list class. You might start with something like this:
template <class T>
struct slist_node {
T val;
slist_node* next;
slist_node
(const T& t, slist_node* p)
: val(t), next(p) { }
};
template <class T> struct slist {
slist_node<T>* head;
...
};
An iterator class for slist can be equally simple:
template <class T>
struct slist_iterator {
typedef std::forward_iterator_tag
iterator_category;
typedef T value_type;
typedef std::ptrdiff_t
difference_type;
typedef T& reference;
typedef T* pointer;
slist_iterator(slist_node<T>* x=0)
: p(x) { }
slist_iterator
(const slist_iterator& i)
: p(i.p) { }
reference operator*() const
{ return p->val; }
pointer operator->() const
{ return &(p->val); }
slist_iterator& operator++() {
p = p->next;
return *this;
}
slist_iterator operator++(int) {
slist_iterator tmp(*this);
++*this;
return tmp;
}
slist_node<T>* p;
};
How should we define the matching const iterator? We could just define a separate slist_const_iterator class, but code duplication is wasteful and error-prone. The changes to turn slist_iterator into slist_const_iterator are tiny:
• Declare p to be of type const slist_
node<T>* instead of slist_node<T>*.
• Declare pointer and reference to be const T* and const T& instead of T* and T&.
• Define a converting constructor that takes an argument of type slist_iterator.
None of these are obstacles to defining a single class that can take the place of both slist_iterator and slist_const_
iterator. We define an iterator class with additional template parameters, and those parameters determine whether or not it’s a const iterator. We give the class a constructor that takes the non-const version as an argument; in one case it will be a copy constructor, in the other case a converting constructor. The other two differences just involve changing one type to another, so it’s easy to encapsulate those differences as template parameters.
Finally: what should those extra template parameters look like? In my book [1], I proposed explicitly passing the pointer and reference types as template parameters. That method is adequate, but it results in somewhat cumbersome type names; there’s a cleaner solution. We can provide just a single extra template parameter, a Boolean flag that determines whether or not we’re defining a const iterator, and then use a little bit of machinery, a ‘‘compile-time ?: operator’’ that selects one type or another based on that flag [2]. This is shown in Listing 1.
Equality Comparisons
We haven’t yet defined an equality operator. There’s still one snag, and you can even see it in some of the standard library’s predefined iterators. Try compiling this program:
#include <deque>
int main() {
std::deque<int> d;
std::deque<int>::const_iterator i = d.begin();
while (d.end() != i)
++d;
}
The program doesn’t do anything, but that’s not the point. The point is that, with many existing library implementations, it won’t even compile. Are those implementations buggy? Not necessarily; i is of type deque<int>::const_iterator and d.begin returns a deque<int>::iterator, and the C++ Standard isn’t completely clear whether or not an equality comparison between the two is guaranteed to work [3]. Even if the Standard doesn’t explicitly require this, however, it’s certainly more friendly if you support it in your own iterator classes.
You might wonder how this could possibly be a problem. After all, haven’t we already said that a container’s iterator type can always be converted to its const iterator type? If d.begin() can be converted into a deque<>::const_iterator, then why can’t you compare them?
The problem is that there are a number of different ways to define equality operators for an iterator; if they’re defined in either of the two most obvious ways, comparisons between a container’s iterator and const iterator types won’t work.
First, suppose operator== is defined as a member function. That’s not quite good enough. If i is a deque<>::const_iterator, and j is a deque<>::iterator, then i == j will work but j == i won’t. It’s simple enough to see the reason for the asymmetry: member functions are inherently asymmetrical. An expression like a.f(b) (or, in this case, j.operator==(i)) invokes a specific class’s member function. The compiler won’t try to convert a to some other class; conversions only get applied to the function’s arguments.
That’s obvious enough, so your next thought might be to define operator== as a non-member function. Unfortunately, doing that in the obvious way is even worse! A simple toy program illustrates the problem:
template <class T> struct A { };
template <class T> struct B {
B() { }
B(A<T>) { }
};
template <class T>
void f(B<T>, B<T>) { }
int main() {
A<int> a;
B<int> b(a); // OK, A is
// convertible to B
f(a, b); // Doesn’t work
}
It’s not good enough for A to be convertible to B. If f weren’t a template, there would be no problem: the compiler would apply the user-defined conversion from A<int> to B<int>. Since f depends on a template parameter T, however, another step has to come first: the compiler has to deduce a value for T that makes the function call match f’s argument list. In this case there can be no match: f’s declaration says that its arguments are of the same type, but we’re trying to call it with two different types. Template argument deduction requires an exact match [4]; user-defined conversions aren’t considered.
We can’t declare operator== as a member function, and we can’t declare it as a non-member function template. It would seem that what we need is a way to declare a whole family of non-template functions, one for every possible instantiation of the iterator class. This is an odd requirement, since a family of parameterized functions is just what templates are for, but the oddest part is that it’s actually possible.
It’s possible because of an obscure loophole in the way friend declarations of class templates work [5]. You can explicitly declare a friend function to be a template. If you don’t, however, and if it doesn’t match a previously declared function template, then the compiler assumes it to be an ordinary non-template function. For example:
template <class T> void g(T);
template <class T> struct X {
// f is a function template
friend void f<T>(T);
// g is a function template
friend void ::g(T);
// h is a non-template function
friend void h(T);
};
Usually this is just a nuisance: normally you want the compiler to treat something like h as the declaration of a function template, so you have to remember to declare it in such a way that it does get treated as one. Occasionally, however, this quirk is genuinely useful. If you define a function as part of a friend declaration, and if you declare it as a non-template friend, you’ll get a family of non-template functions — just what we needed for an iterator’s equality operator.
template <class T> struct X {
friend void f(T) { } // f is a non-template function
};
A complete definition of slist_iterator, including the equality operator, is shown in Listing 2.
Summary
When you write a container, or something like a container, it’s usually not enough to define a single iterator class. You need to define a matched pair of iterator and const iterator classes. Defining such a matched pair presents some implementation issues that don’t arise if you’re defining a single iterator class that stands on its own.
The two classes in the iterator/const iterator pair must have the same iterator category, difference type, and value type; the iterator class must be convertible to the const iterator class, but not vice versa. You can define the iterator and const iterator types as a single class, by adding an additional template parameter and using the choose template, or something like it, to define the iterator’s nested types appropriately. If you’re using one of the predefined container classes (string, vector, list, deque, set, map, multiset, multimap), you should avoid comparing one of its iterators to one of its const iterators. If d is a deque (as opposed to a const deque&), you shouldn’t write
std::deque<int>::const_iterator i = d.begin();
while (i != d.end())
...
You should write
std::deque<int>::iterator i = d.begin();
while (i != d.end())
...
instead. The C++ Standard doesn’t explicitly say that the first form should work, and, indeed, it does not work on all implementations. Your programs will be more portable if you avoid it. When you’re defining your own matched pair of iterator and const iterator classes, you can easily make sure that comparisons between the two will work properly; just make sure to define the comparison operators as non-template friend functions. o