The New C++: The Group of Seven — Extensions under Consideration for the C++ Standard Library
Herb Sutter
Copyright © 2002 Herb Sutter
Last time [1], I gave an overview of the past, present, and likely future directions for the C++ Standard, who the major players are, and how they interact and affect you. This time, as promised, I'll give a survey of the first batch of suggested library extensions that were considered at the October 2001 WG21/J16 meeting in Redmond, Washington, USA.
Ground Rules
When you read the proposal summaries in the next section, please remember four important things:
No final decisions are being made right now. This first group of proposals is primarily intended to give the library working group something concrete to chew on. Looking at actual proposals has let us better learn what we want in a proposal, and what kinds of questions we want to be able to ask.
None of these proposals is a shoo-in. Many of these proposals happen to come from Boost [2], but Boost isn't getting played as a favorite here. The door is not closed to alternatives to these same proposals. These are samples that have received initial consideration only, and the library working group knows that most of these proposals have competing designs or implementations. For example, one of the proposals was for Boost's regular expression facility, but Microsoft Research has also recently made available a competing regular expression facility of their own that may be better at some things. In such cases, the library working group may choose to adopt one of the competing proposals, or aspects of both, or possibly even none at all if it decides that the Standard doesn't need to include a given kind of facility. (For example, even if there are multiple proposals for a facility to automatically convert an integer to base 42 modulo the current phase of the moon, I doubt we'd accept any of them.)
Compatibility with standards is important. That means compatibility with C++98, as proposals generally ought to be implementable in current Standard C++. It also means compatibility with other standards, notably C99 (for example, adoption of new C99 facilities such as its header <stdint.h>, mentioned below) and POSIX (for example, in the thread library proposal).
Implementability is important. All of the proposals we look at should have reference implementations that are available for inspection by the library working group members. This follows the tradition of the original HP STL, which in 1995 was provided with a working implementation and in a form the committee could freely add to the Standard.
There are three broad categories of things we know we'd like to add to the C++ Standard library:
C99 compatibility. Wherever possible, we would like to adopt C99 facilities so as to promote better compatibility between the two languages. Again, C99's header <stdint.h> is a prominent example.
Filling in gaps. We would like to add things that fill gaps and omissions in the current C++ Standard library. One example is hash-based containers. Another is a wider choice of smart pointers in addition to the current auto_ptr, which unfortunately seems to be used even in inappropriate places just because it's the only standard smart pointer; we can do better.
Useful facilities. Now, just because a facility is useful doesn't mean it has to be standardized. But some facilities, such as strings, are so widely used that it would be embarrassing to fail to have them in a standard. We do in fact have strings in C++98 (unlike pre-standard C++) for just this reason; what we don't have are things like standard support for regular expression matching and tokenization, both of which are common tasks we want to perform on strings in particular and on iterator ranges and streams in general. In this category, I also include features that facilitate systems programming and generic programming tasks, such as the thread and type traits features discussed below.
As you'll now see, this set of proposals includes representatives from all three of these categories.
The Proposals
Here are the Group of Seven proposals.
Header <cstdint> [3]
The C99 Standard added several new facilities to the C Standard library. In particular, the C99 <stdint.h> header supplies standard integer types of given widths, which are useful for improving code portability across platforms:
Exact-width integers (optional in C99): the types intNN_t and uintNN_t where NN can be 8, 16, 32, or 64 (e.g., int32_t) are signed and unsigned integers of exactly NN bits.
Minimum-width integers: the types int_leastNN_t and uint_leastNN_t where NN can be 8, 16, 32, or 64 (e.g., int_least32_t) are signed and unsigned integers of at least NN bits.
Fastest minimum-width integers: the types int_fastNN_t and uint_fastNN_t where NN can be 8, 16, 32, or 64 (e.g., int_fast32_t) are signed and unsigned integers of at least NN bits that are usually the fastest for most kinds of integer operations.
Greatest-width integers: the types intmax_t and uintmax_t are signed and unsigned integers able to hold any value that can be held in any other signed or unsigned type, respectively.
This is useful because in most commercial C++ projects today that have to target multiple platforms, we already have to define our own versions of these facilities for better portability. You probably have an OUR_INT32 typedef or macro in your project's common system header already. Facilities like these help to keep us from reinventing too many basic wheels and are especially useful as we prepare for the shift to 64-bit computing if we're not there already.
Boost's header <cstdint> provides typedef wrappers for these C99 types and places all names into the Boost namespace. If adopted into the Standard, presumably the names would appear in the standard library's namespace.
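To make the distinction between these families concrete, here is a minimal sketch. It assumes a compiler whose library already supplies <cstdint> with the names in namespace std; with Boost's header the same names would live in namespace boost instead. The functions checksum16 and checksum16_example are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: sum bytes into a 16-bit checksum, wrapping modulo 2^16.
// uint_least16_t always exists; the exact-width uint16_t is optional.
std::uint_least16_t checksum16(const unsigned char* data, std::size_t n) {
    std::uint_fast32_t sum = 0;              // at least 32 bits, chosen for speed
    for (std::size_t i = 0; i < n; ++i) {
        sum = (sum + data[i]) & 0xFFFFu;     // keep only the low 16 bits
    }
    return static_cast<std::uint_least16_t>(sum);
}

// Width guarantees, checked at compile time.
static_assert(sizeof(std::int_least32_t) * CHAR_BIT >= 32, "at least 32 bits");
static_assert(sizeof(std::intmax_t) >= sizeof(long long), "widest signed type");

// Usage sketch: three bytes whose sum is 512 (0x200).
inline void checksum16_example() {
    const unsigned char bytes[] = {0xFF, 0xFF, 0x02};
    assert(checksum16(bytes, 3) == 0x200);
}
```

The point of the design is that the code states its width requirements once, in the type names, instead of relying on a project-specific typedef that has to be re-tuned for each platform.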
Type Traits [4]
The second facility submitted was Boost's Type Traits. If you have any doubt about how important it is to know things about types when doing generic programming, reread Alexandrescu's Modern C++ Design [5]. That book's Loki library includes similar facilities to the Boost facility, although details differ and each has advantages the other does not.
Say you're writing a template that has a template parameter T:
template<typename T>
void f( T t ) { /* whatever */ }
Inside your function template, do you want to know if the type T is really a class (instead of, say, an int or a function)? Just ask is_class<T>. Want to know if it's a member pointer? Just ask is_member_pointer<T>. Is it a floating-point type? is_float<T> will tell you. That and more are available today.
That a type traits facility was among the first submissions to be considered for the next C++ Standard library is an indication of how important such a facility is and how often it is already being handcrafted today. Just as there were a lot of strings before the Standard had its basic_string template, today a lot of people are rolling their own type traits facilities. Regardless of which proposal is eventually accepted, having this kind of facility will provide what we now realize is an essential service for certain kinds of generic programming.
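As a rough sketch of what "just ask" looks like in code, the following assumes a library that provides a <type_traits> header in namespace std, used here only as a stand-in for the Boost facility. Note that in that header the floating-point query is spelled is_floating_point rather than is_float. Widget and describe are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct Widget { int size; };   // hypothetical class type for illustration

// Report which broad category T falls into. Each ::value is a
// compile-time constant, so no object of type T is ever needed.
template<typename T>
const char* describe() {
    if (std::is_class<T>::value)          return "class";
    if (std::is_member_pointer<T>::value) return "member pointer";
    if (std::is_floating_point<T>::value) return "floating point";
    return "something else";
}

// Usage sketch.
inline void describe_example() {
    assert(std::string(describe<Widget>()) == "class");
    assert(std::string(describe<int Widget::*>()) == "member pointer");
    assert(std::string(describe<double>()) == "floating point");
}
```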
Regular Expressions [6]
Regular expression parsing and matching is another one of those things that many projects do every day. Languages like Perl provide this capability right out of the box. Boost's regular expression matching library provides powerful tools that are deliberately similar to and compatible with those in the Perl, POSIX, and other popular regular expression libraries. If you know Perl's tools, you should be able to use Boost's without breaking a sweat.
Here's a simple example from the library's own documentation, showing how to check if a normal C++ string happens to hold a human-readable credit-card number:
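#include <string>
#include <boost/regex.hpp>

// True if s is four groups of four digits separated by '-' or ' '.
bool validate_card_format(const std::string& s)
{
    static const boost::regex e("(\\d{4}[- ]){3}\\d{4}");
    return regex_match(s, e);
}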
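To see what the pattern accepts and rejects, here is a small usage sketch. So that it builds without Boost, it substitutes std::regex from a <regex> header, which is an assumption about your library and not part of the proposal itself. The names validate_card_format_std and card_format_example are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Same pattern as above: four groups of four digits separated by '-' or ' '.
bool validate_card_format_std(const std::string& s) {
    static const std::regex e("(\\d{4}[- ]){3}\\d{4}");
    return std::regex_match(s, e);   // true only if the whole string matches
}

// Usage sketch: separators are required, and so are all sixteen digits.
inline void card_format_example() {
    assert(validate_card_format_std("1234-5678-9012-3456"));
    assert(validate_card_format_std("1234 5678 9012 3456"));
    assert(!validate_card_format_std("1234567890123456"));
    assert(!validate_card_format_std("1234-5678-9012-345"));
}
```

Note that this checks only the format. It says nothing about whether the digits form a valid card number.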
As I've pointed out above, Microsoft Research has also made available a competing regular expression facility that claims significant performance advantages over Boost's. Other competing facilities may also appear. I personally think it's likely that one (or some combination) of them will be adopted into the C++ Standard library, but at this point the field is wide open. If you have a good regex library sitting around that you think is superior to these, let us know by posting to the newsgroup comp.std.c++. Now's the time.
Smart Pointers [7]
As noted above, it's a real shame that auto_ptr is the only standard smart pointer. That didn't need to happen; indeed, during the first round of C++ standardization, Greg Colvin in particular was several times encouraged to submit, and did submit, smart pointer variants — and then the committee accepted only auto_ptr, and that in a, well, er, let us politely say "modified" form. In particular, Colvin's counted_ptr didn't make it into the Standard. But never fear, for it is here, in Boost: counted_ptr is now called boost::shared_ptr, and there's a parallel shared_array. There's also a scoped_ptr, which is arguably what auto_ptr should have been (that is, limited to uses as an "auto" object that deallocates its pointee when it goes out of scope) and a complementary scoped_array.
While we were discussing this proposal, Andrei Alexandrescu was able to attend the meeting and offered comments on his own Loki SmartPtr [5] that uses policy-based design. SmartPtr provides a superset of the functionality of the four Boost pointers. It remains to be seen just which of these or other proposals will finally be adopted, but these alternatives are important.
If you know nothing else about Boost, know about shared_ptr. It's especially valuable if you ever want to have a container of pointers, because you just can't put auto_ptrs into containers (doing that shouldn't and had better not compile, by design, and if it does compile you're left walking naked in a minefield whether you know it or not [8]). What you almost always really want is a container of shared_ptrs. Conveniently, shared_ptr specializes std::less and is otherwise specifically designed for this use.
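Here is a minimal sketch of the container-of-pointers idiom. It uses std::shared_ptr from <memory> as a stand-in for boost::shared_ptr, which assumes a library that already ships such a type. Employee, Staff, and owners_after_copy are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct Employee {                      // hypothetical pointee type
    std::string name;
    explicit Employee(const std::string& n) : name(n) {}
};

typedef std::vector<std::shared_ptr<Employee> > Staff;

// Copying the container copies the pointers, not the Employees. Every copy
// shares ownership, and the last owner to go away deletes the object.
long owners_after_copy() {
    Staff a;
    a.push_back(std::make_shared<Employee>("Ada"));
    Staff b = a;                       // safe: both vectors now co-own "Ada"
    return a[0].use_count();           // counts the owners in a and in b
}

// Usage sketch.
inline void shared_ptr_example() {
    assert(owners_after_copy() == 2);
}
```

Contrast this with auto_ptr, whose "copy" transfers ownership away from the source. That is exactly the behavior the standard containers cannot tolerate.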
Random Numbers [9]
Because of my own interest in cryptography, I have a soft spot in my heart for a good RNG (random number generator). RNGs are used all the time for all sorts of things, from unimportant things like generating die rolls in a board game, to important modeling applications like generating random input for stock market simulations, to vital and crucial and easy-to-get-wrong security applications like creating unguessable input for cryptographically secure secret key generation. Each of those kinds of random number generation has different requirements; for example, some require flat distributions (you generally want your dice to have a 1/6 chance of each result, instead of deliberately loaded dice), and some require non-flat distributions (such as normal or Poisson distributions).
I personally think it's important to have decent RNG facilities in the Standard so that people will be less inclined to roll their own and get them wrong. In particular, because C provides rand in its standard library, people are far too quick to rely on it when they shouldn't, which is most of the time. In C++, we provide rand because we support the C Standard library, and I personally feel we have a responsibility to do better. What's there now is too often misleading and more often than not gives people a false sense of security.
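To illustrate the difference between flat and non-flat distributions, here is a minimal sketch that separates the source of randomness (an engine) from the shape of the results (a distribution). It assumes a library that provides a <random> header, used here as a stand-in for the Boost proposal. The function names are hypothetical, and the engine shown is not suitable for the cryptographic uses mentioned above.

```cpp
#include <cassert>
#include <random>

// Flat distribution: each face 1..6 is equally likely.
int roll_die(std::mt19937& engine) {
    std::uniform_int_distribution<int> face(1, 6);
    return face(engine);
}

// Non-flat distribution: values cluster around `mean`.
double noisy_measurement(std::mt19937& engine, double mean, double stddev) {
    std::normal_distribution<double> noise(mean, stddev);
    return noise(engine);
}

// Usage sketch: a fixed seed makes the sequence repeatable for testing.
inline void random_example() {
    std::mt19937 engine(12345u);
    for (int i = 0; i < 100; ++i) {
        int r = roll_die(engine);
        assert(r >= 1 && r <= 6);
    }
}
```

Compare this with the common rand() % 6 + 1 idiom, which gives a slightly uneven distribution whenever RAND_MAX + 1 is not a multiple of 6.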
Rational Numbers [10]
Fractions, anyone? Here's a standardizable facility that provides capabilities like rational<int>(1,10) to represent the fraction 1/10 exactly (something you can't do in binary floating point), as well as helper facilities like gcd to compute the greatest common divisor. Note that the template argument is the integer type used for the numerator and denominator, not a floating-point type. There's even a rational_cast template to explicitly convert a rational number to a floating-point approximation using natural C++ cast notation, for example rational_cast<double>( r ).
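To show why exact fractions need an integer numerator and denominator plus a gcd, here is a deliberately tiny hand-rolled sketch. It is not Boost's implementation or interface: Fraction, gcd_sketch, and fraction_cast are hypothetical names, and the class ignores overflow and most of the operators a real facility would supply.

```cpp
#include <cassert>

// Greatest common divisor by Euclid's algorithm (result is non-negative).
long gcd_sketch(long a, long b) {
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) { long t = a % b; a = b; b = t; }
    return a;
}

// Hypothetical stand-in for a rational type, always kept in lowest terms.
class Fraction {
public:
    Fraction(long n, long d) : num_(n), den_(d) {
        assert(d != 0);
        if (den_ < 0) { num_ = -num_; den_ = -den_; }   // keep denominator positive
        long g = gcd_sketch(num_, den_);
        if (g != 0) { num_ /= g; den_ /= g; }           // reduce to lowest terms
    }
    long numerator() const { return num_; }
    long denominator() const { return den_; }
private:
    long num_, den_;
};

Fraction operator+(const Fraction& a, const Fraction& b) {
    return Fraction(a.numerator() * b.denominator() + b.numerator() * a.denominator(),
                    a.denominator() * b.denominator());
}

bool operator==(const Fraction& a, const Fraction& b) {
    return a.numerator() == b.numerator() && a.denominator() == b.denominator();
}

// Explicit, possibly lossy conversion in the spirit of rational_cast<double>( r ).
template<typename T>
T fraction_cast(const Fraction& f) {
    return static_cast<T>(f.numerator()) / static_cast<T>(f.denominator());
}

// Usage sketch: 1/10 + 2/10 is exactly 3/10, with no rounding involved.
inline void fraction_example() {
    assert(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10));
    assert(fraction_cast<double>(Fraction(1, 4)) == 0.25);
}
```

Because every value is reduced on construction, equality is a simple member-wise comparison, which is one reason the gcd helper sits so close to the type.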
Threads [11]
"Why doesn't C++ have threads?" is a commonly heard refrain. Many of us write multithreaded C++ programs every day of the week, but it's true that the C++ Standard is silent on the subject of threads, provides no facilities for handling them (including issues like the initialization of static objects in the presence of race conditions when if you're not careful it's easy to accidentally initialize them more than once).
It's virtually a given that the next revision of the C++ Standard will include thread support. What exact form that takes, and how much change is required in the standard library as opposed to in the core language itself, remains to be seen. But the interest in this area and its pent-up demand make the Boost thread library perhaps the most interesting submission of the bunch. This thread library has been implemented using POSIX threads on Unix and Windows and also using the native Win32 threads on Windows.
Summary
No final decisions have been made on any of these facilities, and competing versions of many of them do exist. The committee welcomes those and other future submissions too. In the meantime, we are already seeing concrete and useful proposals in the areas of C99 compatibility (header <cstdint>), filling gaps (smart pointers), and useful facilities especially for systems programming and generic programming (type traits, regular expressions, random numbers, rational numbers, and threads).
Next time, a closer look at one of the above proposed facilities. After that, it will be time to include news from the upcoming April 2002 standards meeting in Curaçao. Stay tuned.
References
[1] H. Sutter. "The New C++," C/C++ Users Journal Experts Forum, February 2002, <www.cuj.com/experts/2002/sutter.htm>.
[2] <www.boost.org>
[3] <www.boost.org/libs/integer/cstdint.htm>
[4] <www.boost.org/libs/type_traits/index.htm>
[5] A. Alexandrescu. Modern C++ Design (Addison-Wesley, 2001).
[6] <www.boost.org/libs/regex/index.htm>
[7] <www.boost.org/libs/smart_ptr/index.htm>
[8] H. Sutter. Exceptional C++, Item 37 (Addison-Wesley, 2000).
[9] <www.boost.org/libs/random/index.html>
[10] <www.boost.org/libs/rational/index.html>
[11] <www.boost.org/libs/thread/doc/index.html>
About the Author
Herb Sutter is an independent consultant and secretary of the ISO/ANSI C++ standards committee. He is also one of the instructors of The C++ Seminar (www.gotw.ca/cpp_seminar). Herb can be reached at hsutter@acm.org.