Angelika Langer - Training & Consulting
HOME | COURSES | TALKS | ARTICLES | GENERICS | LAMBDAS | IOSTREAMS | ABOUT | CONTACT | Twitter | Lanyrd | Linkedin
 
HOME 

  OVERVIEW

  BY TOPIC
    JAVA
    C++

  BY COLUMN
    EFFECTIVE JAVA
    EFFECTIVE STDLIB

  BY MAGAZINE
    JAVA MAGAZIN
    JAVA SPEKTRUM
    JAVA WORLD
    JAVA SOLUTIONS
    JAVA PRO
    C++ REPORT
    CUJ
    OTHER
 

GENERICS 
LAMBDAS 
IOSTREAMS 
ABOUT 
CONTACT 
Iterators in the Standard C++ Library

Iterators in the Standard C++ Library
Iterators in the Standard C++ Library

C++ Report, Nov./Dec. 1996
Klaus Kreft & Angelika Langer


 
 

THE ISO/ANSI STANDARD C++ library will contain a set of general-purpose data structures and algorithms. They were proposed to the standards committee by Alexander Stepanov and Meng Lee of Hewlett-Packard in 1994. The proposal was based on what is known as the Standard Template Library (STL). The committee scrutinized the proposal, found it sound and convincing, and voted it in. Thus, the STL became part of the Standard C++ Library.
At the time the STL was proposed to the committee, it was made available to the public, thanks to Hewlett-Packard. C++ programmers started to experiment with it and use it. Independently, the standardization process continued; changes and additions were made since 1994. Today the containers and algorithms in the Standard C++ Library depart from the original they stem from. Nowadays we have both: the original STL, in parallel to the Standard C++ Library.
In a previous article [1] , we introduced you to the design of iterators in the STL; it was a tutorial on compile-time polymorphism. Compile-time polymorphism allows the C++ compiler to select which of your functions to invoke based on specific properties of their parameters without introducing the runtime overhead of inheritance and dynamic binding.

Until March 1996, iterators in the STL and iterators in the Standard C++ Library were almost identical. Today they differ. Bjarne Stroustrup proposed an alternative design for iterators in the Standard C++ Library, which was accepted by the standards committee.

In this article we will explain the new design of iterators in the Standard C++ Library. We want to introduce you to several general-purpose generic programming techniques and idioms that are used in the standard for building iterators. However, these techniques and idioms are not limited to this particular domain. In subsequent issues of the C++ Report, we'll continue to cover STL and the Standard C++ library in our new column entitled "Effective C++ Standard Library."

BACKGROUND Let us start with a short recap: What are iterators in STL and Standard C++ Library?

Iterators Iterators are pointer-like objects that allow programs to step through the elements of a container sequentially without exposing the underlying representation. Iterators can be advanced from one element to the next by incrementing them. Some iterators can also be decremented or allow arbitrary jumps from one element to another, as we will see later. When they are dereferenced, iterators yield a reference to a container element. In addition, they can be compared to each other for equality or inequality.

Iterators interact seamlessly with built-in C++ types. In particular, native C++ pointers are treated as iterators to C++ arrays. Naturally, all containers in the Standard C++ Library define an iterator type, i.e., a nested type iterator that represent their respective pointer-like type.

Iterator Categories Iterators fall into categories. This is because different algorithms impose different requirements on an iterator they use. For example, the find() algorithms needs an iterator that can be advanced by incrementing it, whereas the reverse() algorithm needs an iterator that can be decremented as well, etc. Ultimately, there are five categories of iterators in STL and Standard C++ Library:

  • input iterators
  • output iterators
  • forward iterators
  • bidirectional iterators
  • random access iterators
For detailed references consult an STL or C++ Standard Library manual or take a look at one of the publications we have listed as references. [2-6] The first sidebar gives a brief overview of Iterator categories.
An iterator category is an abstraction. It represents a set of requirements to an iterator.
Using the knowledge of an iterator's category one can provide optimized implementations of an algorithm. The advance() operation is an example. It increments (or decrements for negative n) an iterator.
template <class Iterator, class Distance>
inline void advance (Iterator& i, Distance n);
Obviously, there are many ways to do this. For a C++ array one would simply perform pointer arithmetic, i.e., add n to the C++ pointer.
i += n;
For a list, Iterators must step through the sequence and advance step-by-step.
if (n >= 0)
while (n--)
++i;
else
while (n++)
--i;
Observation 1 : The iterator category, which is an abstraction that represents a set of requirements to an iterator, is information related to an iterator. It is useful for providing optimized versions of an operation like advance();

Certain type information is related to an iterator. One of the questions that must be answered when designing the iterators in STL and the Standard C++ Library is: how can information related to a type be expressed in C++? Let's defer this discussion for a moment until we discuss other information related to an iterator.

TYPES ASSOCIATED WITH AN ITERATOR

There are two types that might vary depending on the iterator type:

The Distance Type An operation like advance() obviously needs an argument that indicates how far to advance the iterator:

template <class Iterator, class Distance>
inline void advance (Iterator& i, Distance n);
The type of this distance argument must represent the distance between any two iterators. Hence the distance type depends on the iterator type. For C++ pointers the distance type is the C++ type ptrdiff_t, which can represent the differenc between any two C++ pointers. Also, ptrdiff_t is the distance type of all other iterators in STL and Standard Library. However, the distance type in STL and the Standard C++ Library is not limited to ptrdiff_t.
ITERATOR CATEGORIES
There are five categories of Iterators in STL and the Standard C++ Library:
  • Input iterators allow algorithms to advance the iterator and give "read only" access to the value.
  • Output iterators allow algorithms to advance the iterator and give "write only" access to the value.
  • Forward iterators combine read and write access, but only in one direction (i.e., forward).
  • Bidirectional iterators allow algorithms to traverse the sequence in both directions, forward and backward.
  • Random access iterators allow jumps and "pointer arithmetics."
Each category adds new features to the previous one. The iterator categories obey the following order:

The Value Type An iterator can be dereferenced. It then returns a reference to a value stored in a container. The type of this referenced value also depends on the respective iterator. For example, if the iterator refers to a container holding integers, the value type will be int. More generally, if the iterator refers to a container that stores elements of an arbitrary type T, the value type will be T.

Observation 2 : Each iterator has two related types, its value type and its distance type.

So, we have found more information related to an iterator type. What is this information used for?

Value type and distance type are sometimes needed to implement algorithms. In STL and Standard C++ Library algorithms are separated from containers, i.e., an algorithm takes an iterator and uses it to access the container. No information about the container itself is available to an algorithm. This clear separation of containers and algorithms is the basic idea of Generic Programming, which is the key design idea behind the STL.

To implement algorithms only in terms of iterators it is often necessary to infer both the value type and the distance type from an iterator. Here is an example of a fictitious reverse algorithm:

template <class BidirectionalIterator>
void reverse(BidirectionalIterator first, BidirectionalIterator last)
{
distance_type n = distance(first, last);
--n;
while(n>0)
{ value_type tmp = *first;
*first++ = *--last;
*last = tmp;
n-=2;
}
}
Note that distance_type and value_type aren't declared anywhere. How do we know what they are? We need to find a way to make available the distance type and value type associated to the iterator type.

Also, we want to implement optimized versions of operations like advance() for different iterator categories. Here we would want to make use of the related iterator category information. Assuming the category could be represented as a C++ type a tentative implementation could look like this:

template <class InputIterator, class Distance>
inline void advance (InputIterator& i, Distance n)
{
__advance(i, n, iterator_category);
}
 
template <class RandomAccessIterator, class Distance>
inline void __advance (RandomAccessIterator& i, Distance n, random_access_iterator_category)
{
i += n;
}
template <class BidirectionalIterator, class Distance>
inline void __advance (BidirectionalIterator& i, Distance n,
bidirectional_iterator_category)
{
if (n>=0)
while (n--) ++i;
else
while (n++) --i;
}
Again, it is not entirely obvious how we can represent and eventually find the category information associated to an iterator type. We'll explain how this is done shortly.

THE REQUIREMENTS Basically, the requirements to the design of iterators in the STL and the Standard C++ Library are:

Algorithms shall be generic , i.e., they shall be implemented only in terms of iterators. Hence we need to deduce type information associated with an iterator type, like the value and distance type, directly from the iterator type itself.

Operations shall be optimized if possible . In other words, for reasons of efficiency we want to implement polymorphic operations that exhibit different behavior depending on some property of their argument types. For example, different versions of the advance() function, optimized for each iterator category, should be available.

There are additional requirements that stem from other design principles of the STL and the Standard C++ Library:

Everything must be resolved at compile-time rather than at runtime . The goal is to make components as fast as possible at run-time. Hence, the function dispatch of polymorphic operations mentioned above can be performed at compile-time.

C++ pointers are iterators on a C++ array . They stand on equal footing with all other iterator types. Hence, whatever solution we choose, it must work for C++ pointers as well.

The way in which information is associated to an iterator type is generic . For example, the fact that an iterator is associated to a certain value type has nothing to do with the fact that the iterator falls into a certain category.

In the July 1996 issue, [1] we explained the solution to the above requirements that was used in the original STL. Now, we'll explore the solution chosen for the Standard C++ Library.

THE SOLUTION CHOSEN FOR THE STANDARD C++ LIBRARY In the Standard C++ Library iterators have traits . The traits technique is a means to associate information with class types as well as with nonclass types. See the second sidebar for more information about the traits technique in general. Here are the iterator traits:

template <class Iterator>
struct iterator_traits
{
typedef Iterator::distance_type distance_type;
typedef Iterator::value_type value_type;
typedef Iterator::iterator_category iterator_category;
};
The iterator traits are a class template to be instantiated with an iterator type that provides all the information associated with the respective iterator type as typedefs. Therefore, all iterators must define nested types, like Iterator::value_type, for each piece of related information. The value and distance type of an iterator are types in the sense of the C++ type system. Similarly, the information about the iterator category is represented by C++ types as well: There are tag classes for each of the iterator categories. They are empty structures named input_iterator_tag, output_iterator_tag, and so on.

Since native C++ pointers can be treated as iterators, we need traits for C++ pointers, as well. This is provided as a partial specialization of the iterator_traits template:

template <class T>
struct iterator_traits<T*>
{
typedef ptrdiff_t distance_type;
typedef T value_type;
typedef random_access_iterator_tag iterator_category;
};
Now let us see how iterator traits help to meet the requirements. Here is our ctitious reverse function again:
template <class BidirectionalIterator>
void reverse(BidirectionalIterator first, BidirectionalIterator last)
{
distance_type n = distance(first, last);
--n;
while(n>0)
{ value_type tmp = *first;
*first++ = *--last;
*last = tmp;
n-=2;
}
}
Now that we have iterator traits it is easy to determine the mysterious distance and value type from the iterator type: The distance type is iterator_traits <BidirectionalIterator> ::distance_type, and the value type is iterator_traits <BidirectionalIterator> ::value_type.

Also, we wanted to offer versions of the advance() function that are optimized for each iterator category. Here is how this function dispatch is done using iterator traits:

template <class InputIterator, class Distance>
inline void advance (InputIterator& i, Distance n)
{
__advance(i, n, iterator_traits<InputIterator>::iterator_category());
}
 
template <class RandomAccessIterator, class Distance>
inline void __advance (RandomAccessIterator& i, Distance n, random_access_iterator_tag)
{
i += n;
}
 
template <class BidirectionalIterator, class Distance>
inline void __advance (BidirectionalIterator& i, Distance n, bidirectional_iterator_tag)
{
if (n >= 0)
while (n--)
++i;
else
while (n++)
--i;
}
The global advance() function calls the overloaded versions of an __advance() function, which takes an iterator tag as function argument. The iterator tags are of different types. Each type of iterator tag represents an iterator category. Consequently, there are five types of iterator tags:
  • input_iterator_tag
  • output_iterator_tag
  • forward_iterator
  • bidirectional_iterator_tag
  • random_access_iterator_tag
These are empty types; their only purpose it to allow the above described function overloading.

Note that function overloading is always resolved at compile-time. No runtime overhead is imposed.

In addition, note that there are iterator traits for C++ pointers. Thus, the solution works for C++ pointers as well, as stated in the original requirements we listed above.

COMPARING THE STL TECHNIQUE TO THE STANDARD LIBRARY TECHNIQUE

The solution using iterator traits meets the requirements listed above. So did the solution in STL. What did we gain? Why was it changed? The iterator traits have an edge over the old solution. Let's see why.

Consider the count algorithm:
template<class InputIterator, class T>
iterator_traits<InputIterator>::distance_type
count (InputIterator first, InputIterator last, const T& val)
{
iterator_traits<InputIterator>::distance_type n = 0;
while (first != last)
if (*first++ == val)
++n;
return n;
}
The distance type is used here as the return type of an algorithm. Without iterator traits the count algorithm cannot deduce its return type from its iterator type. This explains why the count algorithm in STL looks like this:
template <class InputIterator, class T, class Distance>
void count (InputIterator first, InputIterator last, const T& value, Distance& n)
{
while (first != last)
if (*first++ == value)
++n;
}
The distance type is explicitly specified by an additional template parameter. As template parameters can only be automatically deduced from the function argument types and never from the function return type, the result is returned via a fourth function argument. This way a call to the count algorithm in STL is error-prone and looks counterintuitive:
list<string> lst;
int i = 0; // you'd better not forget to initialize !!!
count(lst.begin(),lst.end(),"Hello",i)
Compare to the call of the count algorithm in the Standard C++ Library:
list<string> lst;
int i = count(lst.begin(),lst.end(),"Hello")
Conclusion: The most convincing advantage of iterator traits compared to the old technique is that types associated to an iterator type can be deduced from the iterator alone. Hence associated types can be used as return types of function arguments, which improves the count algorithm.*

Of course there is more to iterator traits. They improve the design of iterators in the general. Consider for instance the way the iterator category is retrieved in STL. It is via the global function

iterator_category().For each type of iterator, including C++ pointers, there is an overloaded versions of iterator_category(). There is one version taking C++ pointers, and five function templates taking iterator types that are derived from certain iterator base classes. For sake of brevity, we will spare you the details of those base classes. See our July '96 article for reference [1] .

template <class T, class Distance>
inline input_iterator_tag iterator_category
(const input_iterator<T,Distance>&)
{
return input_iterator_tag();
}
 
// etc. for output_iterator, forward_iterator,
// and bidirectional_iterator
 
template <class T, class Distance>
inline random_access_iterator_tag iterator_category
(const random_access_iterator<T,Distance>&)
{
return random_access_iterator_tag();
}
 
template <class T>
inline random_access_iterator_tag iterator_category
(const T*)
{
return random_access_iterator_tag();
}
The iterator_category() functions are empty. Their only purpose is to allow the selection of the associated iterator tag type, which is then used for function overload resolution.

It is certainly more intuitive to do a compile-time function dispatch via a type, i.e., iterator_traits <Iterator> ::itertor_category, opposed to the return type of a function, iterator_category(). It somehow suggests that the function has "functionality," i.e., performs something at runtime, which in fact it doesn't.

SUMMARY This article is the second in a series of articles on the iterators in STL and the Standard Library. It described iterator traits, which is the technique used to design iterators in the Standard C++ Library. It is a general-purpose technique for association of types to other types; it meets the following requirements:

The technique must allow a C++ compiler to

deduce the associated type only from the type, i.e., without requiring any other information or mechanism,

perform the deduction at compile-time so that it can be used for function overloading, and

associate types with nonclass types.

We compared this technique to the one used in the STL. The iterator traits technique is more elegant and less baroque than the STL technique. Also it allows us to improve the interface of certain algorithms.

Our forthcoming column in C++ Report will demonstrate the use of both techniques in terms of an example-we will illustrate how you can build a safe iterator adaptor.
 
 

THE TRAITS IDIOM

Idioms for the use of templates in C++ are still evolving. One reason might be that, for a long time, templates were only seen as a mechanism to implement C++ containers in a generic way. Coplien, [7] for instance, discusses only this usage. To our recollection, the rst book that contained more advanced template idioms was by Barton and Nackman. [8] , One idiom they describe is the nested type definitions for name commonality idiom. Here is an example for this idiom from the STL:
template <class T>
class vector {
public:
typedef typename allocator_type::types<T>::pointer iterator;
typedef typename allocator_type::types<T>::const_reference 

const_reference;
//
};
This code is a simplified excerpt from the class declaration of vector. Like all other containers, vector establishes common names by using typedefs in its public interface. The type iterator can be used for implementing the algorithms, for instance:
template <class Container>
void delete (Container& c,
Container::iterator first,
Container::iterator last);
An alternative design without the nested type definitions for iterator would require a third template parameter for delete: the type of the container's iterator.

There are several benefits of using this idiom. First, it reduces the number of template parameters. Second, it clearly expresses the design relationship between the template parameter and its associated class(es), which makes the code more understandable and maintainable. Third, it eliminates a whole category of mistakes you could otherwise make inadvertently.

Without the nested type definition for a container's iterator the delete function would look like this:

template <class Container, class Iterator>
void delete (Container& c, Iterator first, Iterator last);
Hence you can do something like this:
void foo(vector<int>& v, int* first, int* last)
{ delete(v,first,last); }
which works fine, because in most implementations of the STL a vector's iterator type is a plain pointer to the vector's value type. Now imagine you would change the type of container from a vector to a list later on:
void foo(list<int>& v, int* first, int* last)
{ delete(v,first,last); }
The catch is that this will compile and crash at runtime. This is different with the interface for the delete function that uses the nested typedef for the container's iterator type; it would already cause a compile time error instead.
Unfortunately the solution is limited: no built-in types can be used as template parameters because it is not possible to put associated types into the scope of a built-in type. For the iostream of the Standard Library Nathan Myers developed an idiom called "traits" which overcomes this limitation. He gave a detailed description of this idiom in the June 1995 C++ Report. [9] ,

Basically it is the introduction of an additional helper class template and its specializations for built-in types which contain the type definitions of the associated types. In the iostream the associated types are the traits of a certain character type, e.g. the type of the end-of-file character. Hence the name traits. Here is an example how we can use this idiom to enhance our example so that it can be used with built-in C++ arrays.

Here is the above mentioned helper class:
template <class Container>
class container_traits {
public:
typedef typename Container::iterator iterator;
};
and the partial specialization for built-in pointers:
template <class T>
class container_traits<T*> {
public:
typedef typename T* iterator;
};
Our algorithm can be improved and would now work for C++ arrays as well:
template <class Container>
void delete (Container& c,
container_traits<Container>::iterator first,
container_traits<Container>::iterator last);
The caveat with the traits technique is that it often requires partial specialization of templates. No compiler we know of supports this language feature at present.
References

1. Kreft, K., and A. Langer. "Iterators in the STL," C++ Report 8(7), July 1996.

2. Vilot, M. J. "An introduction to the Standard Template Library," C++ Report 6(8), Oct. 1994.

3. Nelson, M. C++ Programmer's Guide to the Standard Template Library, IDG Books, Foster City, CA, 1995.

4. Working Paper for Draft Proposed International Standard for Information Systems Programming Language C++, Doc. No. X3J16/95-0185 - WG21/N0785, Sept. 26, 1995.

5. Glass, G., and B. Schuchert. The STL<Primer>, Prentice Hall, Englewood Cliffs, NJ, 1995.

6. Musser, D. R., and A. Saini. STL Tutorial and Reference Guide, Addison-Wesley, Reading, MA, 1996.

7. Coplien, J. O. Advanced C++ Programming Styles and Idioms, Addison-Wesley, Reading, MA, 1992.

8. Barton, J. J., and L. R. Nackman. Scientific and Engineering C++, Addison-Wesley, Reading, MA, 1994.

9. Myers, N. "A new and useful Template technique: 'Traits,' " C++ Report, June 1995.


 
 

If you are interested to hear more about this and related topics you might want to check out the following seminar:
Seminar
 
Effective STL Programming - The Standard Template Library in Depth
4-day seminar (open enrollment and on-site)
IOStreams and Locales - Standard C++ IOStreams and Locales in Depth
5-day seminar (open enrollment and on-site)
 

 
  © Copyright 1995-2003 by Angelika Langer.  All Rights Reserved.    URL: < http://www.AngelikaLanger.com/Articles/C++Report/IteratorsInStdlib/IteratorsInStdlib.html  last update: 22 Oct 2003