<!-- begin story summary -->
C++ in 2005: Can It Become A Java Beater? (Technology)
By Carnage4Life
Wed Mar 7th, 2001 at 10:56:16 PM EST
Recently in an interview with Linux World Bjarne Stroustrup described his concerns for the C++ language as well as a wish list of libraries he would like added to the standard when it is next up for review in 2005. The following article is an analysis of Bjarne's wishlist as it relates to C++ regaining the ground it has lost due to Java's emerging popularity on the server. I'll also discuss a few libraries I'd like to see added that were not discussed by Bjarne.
NOTE: Bjarne Stroustrup does not mention Java in this interview. Java vs. C++ is simply the direction in which I have decided to take the results of the interview, so that there is something to compare the wishlist against.
<!-- end story summary --><!-- begin story body -->
Concurrency
Bjarne: I'd like to see a library supporting threads and a related library supporting concurrency without shared memory.
Multithreaded programming is very important because it offers unique advantages, including exploiting parallelism on machines with multiple processors, making programs appear faster by parallelizing disk-bound I/O, and allowing for better modularization of code.
Currently C++ developers who want to use threads in a standards-compliant manner must use the PThreads library, which uses the Mutual Exclusion model to handle concurrency. Mutual Exclusion primarily involves using mutex variables and condition variables to handle thread synchronization and access to shared data. The main problem with using PThreads in C++ is that the POSIX thread standard is not designed for object-oriented programming, and one constantly slams into limitations that hamper design decisions, such as being unable to thread objects directly.
There are a few alternatives to using PThreads if standards compliance is not important. RogueWave Software has what is probably the most popular C++ threads library, called Threads.h++ 2.0. There are also the Adaptive Communication Environment (ACE) and ZThread libraries, which both contain a number of wrappers for both POSIX and Win32 threads and enable object-oriented thread programming in C++. Some people have been known to eschew threads and instead use multiple processes and shared memory to achieve their goals.
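A minimal sketch of that limitation, assuming a POSIX system: pthread_create only accepts a plain function taking a void*, so running an object's member function on a thread needs a static trampoline that casts the argument back. The Counter class and the run_two_workers helper are hypothetical names made up for this illustration.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical class for illustration: work() is what we want to run on a thread.
class Counter {
public:
    Counter() : total_(0) { pthread_mutex_init(&lock_, nullptr); }
    ~Counter() { pthread_mutex_destroy(&lock_); }

    // The member function we would like to hand to pthread_create directly.
    void work() {
        for (int i = 0; i < 1000; ++i) {
            pthread_mutex_lock(&lock_);    // mutual exclusion around shared data
            ++total_;
            pthread_mutex_unlock(&lock_);
        }
    }

    // pthread_create only takes void* (*)(void*), so a static trampoline
    // casts the argument back to the object and forwards the call.
    // (Strictly, an extern "C" function is the portable choice here.)
    static void* trampoline(void* self) {
        static_cast<Counter*>(self)->work();
        return nullptr;
    }

    int total() {
        pthread_mutex_lock(&lock_);
        int t = total_;
        pthread_mutex_unlock(&lock_);
        return t;
    }

private:
    pthread_mutex_t lock_;
    int total_;
};

// Runs work() on two threads against one object and returns the final total.
int run_two_workers() {
    Counter c;
    pthread_t a, b;
    pthread_create(&a, nullptr, &Counter::trampoline, &c);
    pthread_create(&b, nullptr, &Counter::trampoline, &c);
    pthread_join(a, nullptr);
    pthread_join(b, nullptr);
    return c.total();
}
```

Every class that wants its own thread ends up repeating this trampoline, which is the kind of boilerplate the wrapper libraries hide.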
One of the greatest boons of Java is that it provides a concurrency API which uses the Monitor model. In the Monitor model, shared data is accessed only through designated synchronized functions, and the object is locked while a synchronized function is executing, which effectively regulates access to the shared data. In the Monitor model there is no need for explicit mutexes or condition variables; all that is needed are wait(), signal() (spelled notify() in Java) and variations thereof.
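For contrast, here is a rough sketch of what a monitor-style object looks like when built by hand on top of PThreads. The Mailbox class is a hypothetical example, not part of any library: the shared state is private and can only be reached through operations that take the lock, which is what Java expresses with the synchronized keyword together with wait() and notifyAll().

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical monitor-style mailbox holding a single int.
// All shared state is private; callers only see the synchronized operations.
class Mailbox {
public:
    Mailbox() : full_(false), value_(0) {
        pthread_mutex_init(&lock_, nullptr);
        pthread_cond_init(&changed_, nullptr);
    }
    ~Mailbox() {
        pthread_cond_destroy(&changed_);
        pthread_mutex_destroy(&lock_);
    }

    // Blocks while the mailbox is full, then stores the value.
    void put(int v) {
        pthread_mutex_lock(&lock_);
        while (full_) pthread_cond_wait(&changed_, &lock_);
        value_ = v;
        full_ = true;
        pthread_cond_broadcast(&changed_);   // wake any waiting take()
        pthread_mutex_unlock(&lock_);
    }

    // Blocks while the mailbox is empty, then removes and returns the value.
    int take() {
        pthread_mutex_lock(&lock_);
        while (!full_) pthread_cond_wait(&changed_, &lock_);
        full_ = false;
        int v = value_;
        pthread_cond_broadcast(&changed_);   // wake any waiting put()
        pthread_mutex_unlock(&lock_);
        return v;
    }

private:
    Mailbox(const Mailbox&);                 // not copyable
    Mailbox& operator=(const Mailbox&);

    pthread_mutex_t lock_;
    pthread_cond_t changed_;
    bool full_;
    int value_;
};
```

The mutex and condition variable are still there, but they are an implementation detail of the class rather than something every caller has to get right.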
Hopefully a thread library will be added that uses the Monitor model or even the more user-friendly Serializer model which differs from the Monitor model in that signalling of threads is done automatically and shared resources can be accessed from outside the Serializer even though concurrency is maintained.
Reflection
Bjarne: I'd like to see something like that supported through a library defining the interface to extended type information.
Currently both Java and C++ support Run-Time Type Identification (RTTI), which is a mechanism that allows one to determine the specific class of an Object and is rather useful when dealing with collections of various derived classes that are referenced via a pointer to a single base class.
Many feel that RTTI is an essential part of Object Oriented Programming; Doug Lea's Usenix paper on RTTI lists several situations where RTTI provides the best solution to certain problems. On the other hand, there is also a camp of OO purists who argue that using RTTI is usually a sign of bad object oriented design. Scott Meyers is noted as having stated "Anytime you find yourself writing code of the form 'If the object is of type T1 do something and if it is of type T2 do something else,' slap yourself". Meyers is referring to the fact that with the judicious use of polymorphism and encapsulation, the type of an object shouldn't be an issue because each object can perform operations on itself based on an interface specified in a base class. Personally I believe that in the general case Meyers is right, but there are certain situations (especially when modules are being written by different developers or interaction is done across different modules) where RTTI is preferable to using proper OO.
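The two styles Meyers is contrasting can be sketched side by side. The Shape hierarchy and both describe functions below are hypothetical names chosen for the illustration; dynamic_cast is the standard RTTI operator.

```cpp
#include <cassert>
#include <string>

struct Shape {
    virtual ~Shape() {}
    virtual std::string name() const = 0;    // interface specified in the base class
};
struct Circle : Shape { std::string name() const { return "circle"; } };
struct Square : Shape { std::string name() const { return "square"; } };

// RTTI style: ask the object what it is, then branch on the answer.
// Every new derived class means another branch here.
std::string describe_with_rtti(const Shape& s) {
    if (dynamic_cast<const Circle*>(&s)) return "circle";
    if (dynamic_cast<const Square*>(&s)) return "square";
    return "unknown";
}

// Polymorphic style: let the object answer for itself.
std::string describe_with_virtual(const Shape& s) {
    return s.name();
}
```

The RTTI version is the one that remains possible when Shape comes from someone else's module and cannot be given a new virtual function.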
Reflection is the logical next step to RTTI. Reflection enables one to discover the fields, methods and constructors of a class at runtime and manipulate them in various ways including invoking methods dynamically at runtime and creating new instances of these unknown objects. Reflection is primarily useful for developers who create tools such as debuggers, class browsers, interpreters, and a host of others that need to be able to extract information on arbitrary objects and execute code within these objects at runtime. I for one would like to see Reflection added to the C++ standard because it will make writing class browsers in various IDEs a whole lot easier.
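Since C++ has no reflection facility, code that needs to create objects from a class name known only at runtime has to maintain the name-to-class table by hand. The sketch below shows that workaround under stated assumptions: Widget, Button, Slider, registry and create_by_name are all hypothetical names, and std::unique_ptr is a later standard library addition used here only to keep ownership tidy.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct Widget {
    virtual ~Widget() {}
    virtual std::string kind() const = 0;
};
struct Button : Widget { std::string kind() const { return "Button"; } };
struct Slider : Widget { std::string kind() const { return "Slider"; } };

typedef std::unique_ptr<Widget> (*Factory)();

template <class T>
std::unique_ptr<Widget> make() { return std::unique_ptr<Widget>(new T); }

// Hand-maintained table standing in for what a reflection library
// would discover on its own.
const std::map<std::string, Factory>& registry() {
    static std::map<std::string, Factory> table;
    if (table.empty()) {
        table["Button"] = &make<Button>;
        table["Slider"] = &make<Slider>;
    }
    return table;
}

// Creates an instance from a class name known only at runtime;
// returns an empty pointer if the name was never registered.
std::unique_ptr<Widget> create_by_name(const std::string& name) {
    std::map<std::string, Factory>::const_iterator it = registry().find(name);
    if (it == registry().end()) return std::unique_ptr<Widget>();
    return it->second();
}
```

A class browser or debugger would need the same kind of table for fields and methods as well, which is why a standard interface to extended type information is attractive.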
Persistence
Bjarne: I'd like to see some support in the Standard Library, probably in connection with the extended type information, but I don't currently have any concrete suggestions.
Object Persistence, also known as Serialization, is the ability to read and write objects via a stream such as a file or network socket. Object Persistence is useful in situations where the state of an object must be retained across invocations of a program. Usually in such cases simply storing data in a flat file is insufficient, yet using a Database Management System (DBMS) is overkill.
There are many subtleties that make creating an object persistence library a non-trivial problem. Chief among them is the fact that a reflection library or other similar mechanism is needed to be able to dynamically obtain all the fields in a class and write them to or load them from a stream. Secondly, obtaining the fields in an object and persisting them to disk is a relatively easy task when the fields are made up of simple types (int, float, char, etc.) but is problematic once the fields are actually objects which may also contain objects, ad infinitum. Finally, an object persistence format needs to be designed. In this regard I am torn between proposing XML, so as to create a human-readable, extensible and easily validated format, and a binary format to reduce bloat and increase the speed of reads and writes.
Although object persistence is interesting, I'm not sure it needs to be explicitly addressed in the standard; it could instead be left for C++ developers to solve as they see fit.
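Without reflection, each class has to spell out its own fields, and objects that contain objects have to delegate to the inner type. A small sketch of that hand-written approach, using a whitespace-separated text format; Point, Segment, save, load and round_trip are hypothetical names for the illustration.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

struct Point   { int x, y; };
struct Segment { Point from, to; };   // an object whose fields are objects

// Simple types: write each field in a fixed order.
std::ostream& save(std::ostream& out, const Point& p) {
    return out << p.x << ' ' << p.y << ' ';
}
std::istream& load(std::istream& in, Point& p) {
    return in >> p.x >> p.y;
}

// Nested objects: delegate to the overloads for the contained type.
std::ostream& save(std::ostream& out, const Segment& s) {
    save(out, s.from);
    return save(out, s.to);
}
std::istream& load(std::istream& in, Segment& s) {
    load(in, s.from);
    return load(in, s.to);
}

// Writes a Segment to an in-memory stream and reads it back.
Segment round_trip(const Segment& original) {
    std::stringstream stream;
    save(stream, original);
    Segment copy = Segment();
    load(stream, copy);
    return copy;
}
```

The same functions work unchanged on a std::ofstream or std::ifstream, but every new field has to be added in two places, which is exactly the bookkeeping a reflection-based library would remove.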
Hash tables
Bjarne: Of course, some variant of the popular hash_map will be included.
This is a no-brainer. The current C++ standard has a sorted associative container declared in <map>, which is typically implemented as a balanced tree rather than a hash table, but it does not have a facility for developers who want a hash table without the overhead of sorting. SGI's hash_map is commonly used and is expected to make it into the standard at the soonest opportunity.
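The difference shows up in iteration order and lookup cost. The sketch below uses std::map for the sorted case and, as an assumption about the toolchain, std::unordered_map, the hash container that later C++ standards adopted, in place of SGI's hash_map extension. The function names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>

// std::map keeps its keys ordered, so iteration visits them sorted
// regardless of insertion order; lookups cost O(log n).
std::string keys_in_map_order() {
    std::map<std::string, int> ages;
    ages["carol"] = 41;
    ages["alice"] = 30;
    ages["bob"] = 25;
    std::string out;
    for (std::map<std::string, int>::const_iterator it = ages.begin();
         it != ages.end(); ++it) {
        out += it->first + " ";
    }
    return out;
}

// A hash table gives up ordering in exchange for average O(1) lookups.
// Returns -1 when the key is absent.
int lookup_in_hash_table(const std::string& key) {
    std::unordered_map<std::string, int> ages;
    ages["carol"] = 41;
    ages["alice"] = 30;
    ages["bob"] = 25;
    std::unordered_map<std::string, int>::const_iterator it = ages.find(key);
    return it == ages.end() ? -1 : it->second;
}
```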
Constraints for template arguments
Bjarne: This can be simply, generally, and elegantly expressed in C++ as is.
Templates are a C++ language facility that enable generic programming via parametric polymorphism. The principal idea behind generic programming is that many functions and procedures can be abstracted away from the particular data structures on which they operate and thus can operate on any type.
In practice, the fact that templates can work on any type of object can lead to unforeseen and hard to detect errors in a program. It turns out that although most people like the fact that template functions can work on many types without the data having to be related via inheritance (unlike Java), there is a clamor for a way to constrain these functions so that they only accept or deny a certain range of types.
The most common practice for constraining template arguments is to have a constraints() function that tries to assign a pointer to an object of the template argument class to a pointer to a specified base class. If the compilation fails then the template argument did not meet the requirements. Of course, if you are going to do this you might as well forgo templates and simply use pointers to the base class and polymorphism, thus avoiding both the cryptic compiler error messages usually associated with templates and the code bloat that comes with them. Bjarne Stroustrup has proposed adding constraints() to the standard. Here are links to code that shows how to use template argument constraints and how to constrain template arguments regarding built-in types.
It should be noted that although Java™ currently doesn't support generic programming, there are extensions of Java that do, such as Pizza and GenericJava. Also, a proposal to add generic types to Java has been submitted to the Java Community Process, and it seems generic programming is scheduled to be added to the Java standard soon; some people expect it to make it into version 1.4.
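A sketch of the constraints() technique described above. The Derived_from name follows the style Stroustrup uses when presenting the idiom; Shape, Triangle, Banana and count_sides are hypothetical names added for the illustration.

```cpp
#include <cassert>

// Compiles only if a T* converts to a B*, i.e. T is B or derives from B.
template <class T, class B>
struct Derived_from {
    static void constraints(T* p) { B* pb = p; (void)pb; }
    // Taking the address forces the compiler to check constraints()
    // without ever calling it at runtime.
    Derived_from() { void (*check)(T*) = constraints; (void)check; }
};

struct Shape {
    virtual ~Shape() {}
    virtual int sides() const = 0;
};
struct Triangle : Shape { int sides() const { return 3; } };
struct Banana { int sides() const { return 0; } };   // unrelated type

template <class T>
int count_sides(const T& t) {
    Derived_from<T, Shape>();   // constraint: T must derive from Shape
    return t.sides();
}
```

Calling count_sides with a Banana would fail to compile at the pointer assignment inside constraints(), even though Banana happens to have a sides() member, so the error points at the violated requirement rather than somewhere deep inside the template body.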
Assertions
Bjarne: Many of the most useful assertions [a means of code verification and error handling] can be expressed as templates. Some such should be added to the Standard Library.
Assertions are a useful debugging technique where a predicate is evaluated and if false causes the program to terminate while printing the location of the failed assertion and the condition that caused it to fail. Assertions are usually used during development and removed from the code before the software actually ships.
Errors which the programmer never expects to happen (e.g. age < 0) are prime candidates for using assertions. Typical locations for assertions include: internal invariants within a function, which are usually dealt with via nested if statements where the last else is a catchall that handles "can't happen" values; control-flow invariants, such as when the default case in a switch statement should never be reached; function preconditions (e.g. argv != NULL); function postconditions (e.g. after x_squared = x * x, assert x_squared >= 0); or simply verifying that the state of a class is valid (e.g. AVLTree.isbalanced()).
Currently the only way to use assertions in a portable manner in C++ is to use the ANSI C assert macro located in assert.h. There is also the useful static_assert library available at the BOOST site, which is probably what the assertion library that will be proposed to the standards committee will be based on. Assertions are quite useful for debugging and there should be little difficulty in adding a more powerful version of assert to the Standard Library.
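A short sketch of both kinds of assertion: the runtime assert macro guarding preconditions, "can't happen" values and a postcondition, plus a compile-time assertion expressed as a template. CompileTimeAssert and average_age are hypothetical names; this is not the BOOST library's interface.

```cpp
#include <cassert>

// Compile-time assertion as a template: only the <true> specialization
// is defined, so instantiating it with a false condition fails to compile.
template <bool Condition> struct CompileTimeAssert;
template <> struct CompileTimeAssert<true> {};

// Hypothetical helper: integer average of a non-empty array of ages.
int average_age(const int* ages, int count) {
    CompileTimeAssert<(sizeof(long) >= sizeof(int))> sum_type_is_wide_enough;
    (void)sum_type_is_wide_enough;

    assert(ages != nullptr);        // precondition: caller must pass an array
    assert(count > 0);              // precondition: array must not be empty
    long sum = 0;
    for (int i = 0; i < count; ++i) {
        assert(ages[i] >= 0);       // "can't happen" value
        sum += ages[i];
    }
    int average = static_cast<int>(sum / count);
    assert(average >= 0);           // postcondition
    return average;
}
```

Compiling with NDEBUG defined turns every assert into a no-op, which is how the runtime checks are usually removed before shipping; the template check costs nothing at runtime either way.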
Regular expression matching
Bjarne: I'd like to see a pattern-matching library in the standard.
Regular Expressions are a powerful method of describing text patterns and are the major reason that Perl is now the hacker's language of choice for creating programs that search or process text. Until quite recently C++ programmers had to use the C library functions regcmp and regex located in libgen.h on *nix, or the RegExp Object via COM on Windows, if they wanted to do any sophisticated text processing with regular expressions. With the advent of Dr. John Maddock's Regex++ this is no longer the case.
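To give a feel for what pattern matching from C++ looks like, here is a minimal sketch. It assumes a toolchain that ships the regex header from later C++ standards and is not the Regex++ interface; first_number is a hypothetical helper.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the first run of digits found in text, or an empty string.
std::string first_number(const std::string& text) {
    static const std::regex digits("[0-9]+");
    std::smatch match;
    if (std::regex_search(text, match, digits)) {
        return match.str(0);   // sub-match 0 is the whole matched text
    }
    return std::string();
}
```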
As for actually adding regexes to the standard, I think this is a case of unnecessarily bloating the standard. Java has done fine without having regexes in the standard and there are a slew of Java regex libraries including OROMatcher, pat, and GNU Regexp.
Garbage collection
Bjarne: I'd like to see the C++ standard explicitly acknowledge that it is an acceptable implementation technique for C++, specifying that "concealed pointers" can be ignored and what happens to destructors for collected garbage. (See section C.4.1 of The C++ Programming Language for details.)
I have covered this in a previous article on Garbage Collection and C++ and will thus simply provide a pared-down version of that article with minor modifications:
Hans Boehm's site on garbage collection has a well written page that dissects the advantages and disadvantages of garbage collection in C++.
Basically the advantages of Garbage Collection are:
30 to 40 percent faster development time.
Less restrictive interfaces and more reusable code.
Easier implementation of sophisticated data structures.
Eliminates some premature deallocation errors.
Uses equivalent or less CPU-time than a program that uses explicit memory deallocation.
While the disadvantages are:
More interaction with the hard disk (virtual memory/paging) due to examining all pointers in the application.
May not work if programmers use various tricks and hacks while coding (e.g. casting pointers to ints and back)
May utilize more space than a program that uses explicit memory deallocation.
Unpredictable nature of collector runs may create unexpected latency and time lags in the system.
Reference counting, although popular, is not the only garbage collection algorithm and in fact is considered unsatisfactory by language purists since it can't handle circular links. There are many more algorithms including Mark-Sweep Garbage Collection, Mark-Compact Garbage Collection, Copying Garbage Collection, Generational Garbage Collection, and Incremental and Concurrent Garbage Collection. Hans Boehm's site discusses mark-sweep garbage collection and not reference counting. This doesn't mean that Mark-Sweep doesn't have its problems (making two passes across memory is expensive, as is the space taken up by the marks) but these may be remedied by using Generational garbage collection.
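The circular-link problem with reference counting is easy to reproduce. The sketch below assumes std::shared_ptr, a reference-counted smart pointer from later C++ standards; Node and the two helper functions are hypothetical names. The cycle is broken by hand at the end so the demonstration itself does not leak.

```cpp
#include <cassert>
#include <memory>

struct Node {
    std::shared_ptr<Node> next;
    static int alive;               // number of Nodes currently constructed
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// a -> b with no link back: both counts reach zero when the locals go away.
int alive_after_dropping_chain() {
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->next = b;
    }
    return Node::alive;
}

// a -> b -> a: each node keeps the other's count at one, so neither is
// destroyed even though the program can no longer reach them by owner.
int alive_after_dropping_cycle() {
    std::weak_ptr<Node> watch;      // non-owning handle, used only for cleanup
    {
        std::shared_ptr<Node> a = std::make_shared<Node>();
        std::shared_ptr<Node> b = std::make_shared<Node>();
        a->next = b;
        b->next = a;
        watch = a;
    }
    int stranded = Node::alive;
    if (std::shared_ptr<Node> a = watch.lock()) {
        a->next.reset();            // break the cycle by hand
    }
    return stranded;
}
```

A tracing collector such as mark-sweep would reclaim both nodes of the cycle, because it starts from what the program can still reach instead of counting references.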
Bjarne Stroustrup is very keen on having GC added to C++ but also wants to make sure that the principle of "Only pay for it if you use it" which has been the hallmark of C++ for years is preserved. In my opinion, the advantages of using garbage collection in C++ greatly outweigh any disadvantages.
GUI
Bjarne: It would be nice to have a standard GUI framework, but I don't see how that could be politically feasible.
Not worth discussing because it isn't going to happen for a variety of reasons:
The committee will never agree on a UI design model (MVC or UI delegate, which will it be?)
It would require too much work from library writers.
More mature native toolkits will always be ahead of the game.
Platform-independent system facilities
Bjarne: I'd like to see the Standard Library provide a broader range of standard interfaces to common system resources (where available), such as directories and sockets.
Again I must rhapsodize over Java and the way that its Socket classes abstract away completely from the native socket calls. I am all for creating more standard interfaces to system calls beyond file I/O. Then one doesn't need to rewrite code when moving a program from *nix to Windows simply because it opens a socket or reads from a directory.
Conclusion
Wow, that was longer than I expected. I've decided to skip describing the libraries I'd like to see added (I would like an interface keyword) and just go straight to the question.
Do you think that, if these libraries are added to the C++ standard in 2005, Java will begin to lose some of the ground it has gained to C++? Or do you think that this is unlikely, and if so, why?