C++ IOStreams In-Depth
New Features in Standard IOStreams
Comparing Classic and Standard IOStreams

Whitepaper, 1998
Angelika Langer


 
 

Introduction

What is Standard IOStreams? It is the standardised version of the classic IOStreams that has been around since the first days of C++. All C++ programmers use it. Just think of the notorious "Hello-world-program" that basically consists of one statement:

cout << "Hello world" << endl;

Even in the presence of graphical user interfaces, IOStreams has not lost much of its importance. IOStreams’ capabilities go far beyond implementing command-line user interfaces via cin and cout as demonstrated in the "Hello-world-program". File I/O is still an issue in real-life projects. Plus, the file abstraction can be extended to communication channels such as sockets and pipes. Even if we do not work with files or other external devices, IOStreams is still invaluable as a parsing and formatting tool.

In the process of standardisation, IOStreams was formally specified, cleaned up, and enhanced. What are the differences between the traditional and the new standard IOStreams? This question might pop up in the heads of those developers who have existing IOStreams applications and want to migrate to the standard IOStreams. Let's explore the major changes.
 

Templatizing the IOStreams Classes.

When you look at the new IOStreams header files you will immediately notice that most classes that you might know from the traditional IOStreams turned into class templates in the standard IOStreams. The template parameters are the character type, and the character traits type. Here is an example:

class ostream          turned into

template <class charT, class Traits = char_traits<charT> > class basic_ostream.

The character type usually is one of the built-in character types char or wchar_t . However, it can also be any other conceivable, user-defined type. Naturally, a user-defined character type should exhibit the expected behavior of a character type, such as supporting comparisons. The exact requirements for a user-defined character type are not specified though, and depend on the respective implementation of the standard IOStreams.

The character traits type describes the properties of the character type, such as:

  • The end-of-file value. For type char, the end-of-file value is represented by an integral constant called EOF . For type wchar_t , there is a constant called WEOF . For an arbitrary user-defined character type, the associated character traits define what the end-of-file value for this particular character type is. This value can be obtained via the traits’ member function eof().
  • The equality of two characters. For an "exotic" user-defined character type, the equality of two characters might mean something different from just bit-wise equality. You can define it by providing the traits’ member function eq() ; the related member function compare() compares whole character sequences.
The list above is just an excerpt of what character traits provide.
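
To make the role of the traits more concrete, here is a minimal sketch of a traits class that treats upper and lower case as equal. The names ci_char_traits , ci_string and same_ignoring_case are hypothetical and chosen for illustration only. The sketch reuses char_traits<char> for everything it does not redefine, and it uses basic_string rather than a stream because strings take the same two template parameters.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical traits class: characters that differ only in case count as equal.
struct ci_char_traits : public std::char_traits<char> {
    static char fold(char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    // Equality and ordering of single characters.
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }
    // compare() works on sequences and has to agree with eq() and lt().
    static int compare(const char* s1, const char* s2, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(s1[i], s2[i])) return -1;
            if (lt(s2[i], s1[i])) return 1;
        }
        return 0;
    }
};

// A string type that uses the case-insensitive traits.
typedef std::basic_string<char, ci_char_traits> ci_string;

bool same_ignoring_case(const char* a, const char* b) {
    return ci_string(a) == ci_string(b);
}
```

With these definitions same_ignoring_case("Hello", "hELLO") yields true, while the end-of-file value is still the one inherited from char_traits<char> . A complete traits class would probably also redefine find() so that searching agrees with eq().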

There is a standard character traits class template defined in the C++ Standard. Its name is char_traits<class charT> . Specializations of this class template are defined for the built-in character types char and wchar_t . Every standard conforming library implementation has to provide them. Note, however, that char_traits<class charT> is not meant to be instantiated for an arbitrary character type. It just defines the interface that a specialization of this class template is expected to provide. For all iostreams class templates the traits template parameter has a sensible default value. Class basic_ostream for instance is defined as:

template <class charT, class Traits = char_traits<charT> >
class basic_ostream{ ... };
The standards committee decided to turn the traditional IOStreams classes into class templates because templates allow input and output of character types other than char or wchar_t .

For ease of use, and for backward compatibility, the standard defines type definitions for the stream class templates instantiated with the character types char and wchar_t . For type char these are:

typedef basic_istream<char> istream;
typedef basic_ostream<char> ostream;
typedef basic_iostream<char> iostream;
typedef basic_ifstream<char> ifstream;
typedef basic_ofstream<char> ofstream;
typedef basic_fstream<char> fstream;
Note that these typedefs define names identical to the class names in the traditional IOStreams. In other words, there still is an ostream ; the only difference is that it now stands for a basic_ostream<char, char_traits<char> >.
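
The following sketch illustrates what the templatization buys in practice. It checks at compile time that ostream really is basic_ostream<char, char_traits<char> > and then uses one function template for both narrow and wide streams. The helper names write_repeated , narrow_stars and wide_stars are hypothetical.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>

// The familiar name is a typedef for the template instantiated with char.
static_assert(std::is_same<std::ostream,
                           std::basic_ostream<char, std::char_traits<char> > >::value,
              "ostream is basic_ostream<char, char_traits<char> >");

// Hypothetical helper: one template serves narrow and wide streams alike.
template <class charT, class Traits>
void write_repeated(std::basic_ostream<charT, Traits>& os, charT c, int count) {
    for (int i = 0; i < count; ++i)
        os.put(c);
}

std::string narrow_stars(int count) {
    std::ostringstream os;            // basic_ostringstream<char>
    write_repeated(os, '*', count);
    return os.str();
}

std::wstring wide_stars(int count) {
    std::wostringstream os;           // basic_ostringstream<wchar_t>
    write_repeated(os, L'*', count);
    return os.str();
}
```

Code written against the old class name ostream keeps compiling, while new code can be written once against basic_ostream<charT, Traits> .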
 

Splitting Class ios .

In the process of transforming the IOStreams classes into class templates, the base class of all traditional IOStreams classes, class ios, was split into:

  • a common, character type independent part: ios_base , and
  • a character type dependent class template: basic_ios<class charT, class Traits>, having the character type and the character traits type as template parameters.
ios_base is the base class of basic_ios<class charT, class Traits>, which again is the base class template of all remaining stream classes. ios_base contains all the information in a stream that is independent of the stream’s character type, such as the definition of all flags that are used for formatting control, error indication, open modes, stream positioning, plus the type definitions for these flags. Additionally ios_base manages the formatting information as well as the user allocable storage ( iword / pword ). It also handles registration and invocation of callbacks, and imbuing of locales.

One might expect that the error handling would be contained in ios_base because it is character independent. However, error indication is done in basic_ios<class charT, class Traits> . This is because ios_base is also used in the locale section of the standard library, where it serves as an abstraction for passing formatting information to the locale. If ios_base contained the error handling, which in the standard IOStreams includes the indication of errors by throwing exceptions (see subsequent sections for details), then these exceptions could also be raised by the standard locale. This effect was neither intended nor acceptable. Hence, ios_base only contains the definition of all flags for error indication; the raising of exceptions and the indication of error states is located in basic_ios<class charT, class Traits>.
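
The use of ios_base by the locale can be seen in the interface of the standard num_put facet, whose put() function receives an ios_base& that carries the formatting flags. The sketch below calls the facet directly; the function name format_via_facet is hypothetical.

```cpp
#include <cassert>
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: format a number by calling the locale's facet directly.
std::string format_via_facet(long value, std::ios_base::fmtflags base) {
    std::ostringstream os;
    os.setf(base, std::ios_base::basefield);

    // The facet sees only the character-type-independent part of the stream.
    std::ios_base& format_info = os;
    const std::num_put<char>& facet =
        std::use_facet<std::num_put<char> >(os.getloc());
    facet.put(std::ostreambuf_iterator<char>(os), format_info, os.fill(), value);
    return os.str();
}
```

The facet reads the base from the flags stored in ios_base , so format_via_facet(255, ios_base::hex) produces ff . It has no access to the stream state, which lives in basic_ios .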

The advantage of splitting class ios into class ios_base and class template basic_ios<class charT, class Traits> is that all behavior that is independent of the template parameters is factored out into a non-template. This minimizes the binary code size of the library as well as of user programs. For instance, if you write a function that resets the formatting of a stream to the default settings, this function does not have to be a function template; it can be an ordinary function receiving a reference to an ios_base object as parameter:

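void set_default_formatting(ios_base& str)
{
 str.width(0);                                     // no minimum field width
 str.precision(6);                                 // six digits of precision
 str.setf(ios_base::skipws);                       // skip white space on input
 str.setf(ios_base::left, ios_base::adjustfield);  // left-adjust output
 str.setf(ios_base::dec, ios_base::basefield);     // decimal integers
 str.setf(ios_base::fixed, ios_base::floatfield);  // fixed-point notation
}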
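
As a small illustration of this point, the following sketch applies one ordinary, non-template function to a narrow and to a wide stream. The names use_uppercase_hex , narrow_hex and wide_hex are hypothetical.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: not a template, yet usable with every stream type.
void use_uppercase_hex(std::ios_base& str) {
    str.setf(std::ios_base::hex, std::ios_base::basefield);
    str.setf(std::ios_base::uppercase);
}

std::string narrow_hex(int value) {
    std::ostringstream os;
    use_uppercase_hex(os);   // basic_ios<char> derives from ios_base
    os << value;
    return os.str();
}

std::wstring wide_hex(int value) {
    std::wostringstream os;
    use_uppercase_hex(os);   // basic_ios<wchar_t> derives from ios_base as well
    os << value;
    return os.str();
}
```

Both narrow_hex(255) and wide_hex(255) produce FF , and only one copy of use_uppercase_hex ends up in the binary.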
 

Indicating Errors.

In IOStreams each stream maintains a stream state that indicates success or failure of an operation. The stream state can either be good , or any of the following three states when an exceptional condition occurred in a preceding operation:

  • end-of-file ; an input operation reached the end of an input sequence.
  • fail; an input operation failed to read the expected characters, or an output operation failed to generate the desired characters.
  • bad; the stream or the underlying input or output sequence lost its integrity .
In the standard IOStreams the stream’s state can be checked in the same way as it was retrieved in the traditional IOStreams: Each stream offers member functions, e.g. good(), fail(), bad(), to indicate its current state. The table below gives a detailed description of all functions supported.
 
basic_ios member function    Effect
bool good()                  True if no error flag is set.
bool eof()                   True if eofbit is set.
bool fail()                  True if failbit or badbit is set.
bool bad()                   True if badbit is set.
bool operator!()             As fail().
operator void*()             Null pointer if fail() and non-null value otherwise.

Table 1: Stream member functions for error checking
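
The functions in Table 1 can be tried out with a string stream. The sketch below reads one int from a piece of text and reports the resulting state; the names StreamFlags and state_after_reading_int are hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record of the four state queries from Table 1.
struct StreamFlags {
    bool good;
    bool eof;
    bool fail;
    bool bad;
};

// Try to extract one int from the text and report the stream state afterwards.
StreamFlags state_after_reading_int(const std::string& text) {
    std::istringstream in(text);
    int value = 0;
    in >> value;
    StreamFlags flags = { in.good(), in.eof(), in.fail(), in.bad() };
    return flags;
}
```

For the input "abc" only fail() reports true, because no digits could be read. For the input "42" the extraction succeeds but reaches the end of the input, so eof() is true while fail() is not. For an empty input both eof() and fail() are true.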

The following code example demonstrates how to check whether some text is properly written to standard output:

if (!(cout << "Hello World !"))
    handle_error();
One of the advantages of IOStreams is its intuitive use of operator<<(). This is particularly convenient for grouping output operations; for instance, you can put into one line of source code all operations that are needed to produce one line of output. Here is an example:

int value;
// some calculation
...
cout << "The calculated value is: " << value << '\n';
if (!cout)
   handle_error();
As convenient as it may be, it has one drawback: in the example above it is not possible to check the stream state after each output operation. C++ exceptions can help in this situation because they allow a more active error indication. For this reason the standard IOStreams optionally allows error indication via exceptions.

Before we see how the code example changes when we use exceptions, let’s have a more detailed look at exceptions in the standard IOStreams. The classes ios_base and basic_ios<class charT, class Traits> provide the means for enabling IOStreams exceptions: ios_base contains type definitions for a type called iostate along with the following flags of that type:

static const iostate badbit;
static const iostate eofbit;
static const iostate failbit;
static const iostate goodbit;
basic_ios<class charT, class Traits> contains the following two member functions:

void exceptions(iostate except_mask);
iostate exceptions() const;
The first function, exceptions(iostate), sets a mask that determines for which exceptional conditions an exception shall be thrown by the stream. The mask can be set to eofbit , badbit , failbit , or any combination of these flags formed with operator|(). goodbit can be used to deactivate the throwing of exceptions. The second function, exceptions(), returns the current mask setting.

The type of the exception that is thrown by the stream is ios_base::failure . To determine which exceptional condition triggered the throw, you can either use the exception’s member function what(), which returns a descriptive text of type const char* , or check the stream with one of the stream’s member functions shown in Table 1.

Let’s see how exceptions change our example code:

int value;
// some calculation
...
ios_base::iostate old_flags = cout.exceptions();
try
{
  cout.exceptions(ios_base::badbit | ios_base::failbit);
  cout << "The calculated value is: " << value << '\n';
}
catch(ios_base::failure& exc)
{
  cerr << exc.what() << endl;
}
cout.exceptions(old_flags);
In this example the old mask is saved and restored later on. Note that the new flags are set in a try block. This is because setting the exception mask can raise an exception. The stream checks its state when a new mask is set and immediately propagates, via an exception, any exceptional state that matches the newly set mask.
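
Both effects can be reproduced with an input string stream, where failures are easy to provoke. The sketch below is only an illustration; the function names reading_int_throws and setting_mask_late_throws are hypothetical.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Returns true if extracting an int from the text raises ios_base::failure.
bool reading_int_throws(const std::string& text) {
    std::istringstream in(text);
    in.exceptions(std::ios_base::failbit | std::ios_base::badbit);
    try {
        int value = 0;
        in >> value;
        return false;
    } catch (const std::ios_base::failure&) {
        return true;
    }
}

// Returns true if setting the mask on an already failed stream throws at once.
bool setting_mask_late_throws() {
    std::istringstream in("abc");
    int value = 0;
    in >> value;  // fails silently: the exception mask is still goodbit
    try {
        in.exceptions(std::ios_base::failbit);
        return false;
    } catch (const std::ios_base::failure&) {
        return true;
    }
}
```

The second function shows why the call to exceptions() belongs inside the try block: the stream is already in the fail state when the mask is set, so the call itself throws.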

Note that it is not guaranteed that all exceptions will be suppressed after a call to exceptions(ios_base::goodbit), although this call clears all bits in the exception mask. All that is assured is that errors detected by the stream and the stream buffer are not indicated via exceptions. Any other kind of error might still result in an exception being thrown. Think, for instance, of a stream that is instantiated with a user-defined character and traits type. Imagine that an operation of the character or traits type throws exceptions, e.g. bad_alloc . These exceptions will not be caught by the stream and might be propagated into your application.
 

Internationalizing IOStreams.

As already mentioned above, the Standard Library includes a component for internationalization. Internationalization services are bundled into a so-called locale object. The standard IOStreams is internationalized and uses standard locales.

Each stream holds a locale object in its base object ios_base . The stream stores an additional locale object in its stream buffer. When a locale is attached to the stream via basic_ios::imbue(locale loc) the locale received is stored in ios_base and, redundantly, in the stream buffer. A locale is a rather lightweight object. Hence, storing two locale objects in each stream does not impose much space overhead. The advantage is that those classes that eventually need a locale for processing have direct access to the locale object.
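
A short sketch can make the double bookkeeping visible: after imbue() both the stream and its stream buffer report the new locale. The facet marker_numpunct and the function imbue_reaches_stream_buffer are hypothetical names; the facet serves only to build a locale that differs from the default one.

```cpp
#include <cassert>
#include <locale>
#include <sstream>

// Hypothetical facet, used only to create a locale distinct from the default.
struct marker_numpunct : std::numpunct<char> {};

bool imbue_reaches_stream_buffer() {
    std::ostringstream os;
    // The locale takes ownership of the facet and deletes it when unused.
    std::locale custom(std::locale::classic(), new marker_numpunct);

    bool differs_before = (os.rdbuf()->getloc() != custom);
    os.imbue(custom);  // stores the locale in ios_base and in the stream buffer

    return differs_before
        && os.getloc() == custom
        && os.rdbuf()->getloc() == custom;
}
```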

Moreover, the two locale objects are used for different purposes.

The locale in ios_base .

The locale that is held in ios_base is used for the formatting of numeric values. The radix separator, for instance, is no longer hard-coded as a decimal point. Instead, a character that is specified by the attached locale is used. For example, in a German locale the radix separator will be ‘,’ and the output of 0 as a float will not be 0.000000 but 0,000000 .

In the traditional IOStreams the radix separator was hard-coded as a decimal point. In the standard IOStreams the radix separator depends on the locale imbued to the stream. This change might lead to surprising results. Consider a situation where a file, that was written with a traditional output stream, shall now be read with a standard input stream that holds a locale with ‘,’ as the radix separator. If the file contains rational numbers that were written in a decimal notation, the input stream will try to parse these numbers with its different radix separator. It is very likely that the input stream will fail to produce the same rational numbers that were once written to the file.

The problem can easily be solved by imbuing an appropriate locale into the standard input stream, i.e. a locale where the radix separator is a decimal point. The simplest approach is not to imbue a locale at all, in which case a default locale will be used. For reasons of compatibility the default locale in standard IOStreams is the classic "C" locale, which follows US English ASCII conventions, unless the program has replaced the global locale.
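
The effect can be demonstrated without relying on a named German locale being installed. The sketch below builds a locale from the classic locale plus a hypothetical facet comma_numpunct whose radix separator is a comma; the helper names comma_locale , format_fixed and parse_number are assumptions made for this illustration as well.

```cpp
#include <cassert>
#include <ios>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: like the classic numpunct, but with ',' as radix separator.
struct comma_numpunct : std::numpunct<char> {
protected:
    char do_decimal_point() const { return ','; }
};

std::locale comma_locale() {
    return std::locale(std::locale::classic(), new comma_numpunct);
}

// Format a value in fixed notation using the given locale.
std::string format_fixed(const std::locale& loc, double value) {
    std::ostringstream os;
    os.imbue(loc);
    os << std::fixed << value;
    return os.str();
}

// Parse a value using the given locale; parsing stops at the first
// character that does not belong to a number in that locale.
double parse_number(const std::locale& loc, const std::string& text) {
    std::istringstream in(text);
    in.imbue(loc);
    double value = 0.0;
    in >> value;
    return value;
}
```

With the comma locale, 0 is written as 0,000000 . Text written with a decimal point, such as 3.5 , is read back as 3 by a stream using the comma locale, whereas a stream using the classic locale reads 3.5 as expected.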

The locale in ios_base is not only used for formatting of numbers. It is also used to determine which characters of the character set are to be treated as white space characters. This information is needed when an input stream parses input data and has to skip white space during this process. There is a subtle difference between the traditional and the standard IOStreams: In the traditional IOStreams the recognition of white space characters depends on the active C locale, because the functionality is based on the C standard function isspace(), which is internationalized using the C locale. In the standard IOStreams the recognition of white space characters depends on the C++ locale imbued to each stream. However, it is safe to assume that for the same locale the behavior of the standard IOStreams is compatible with the behavior of the traditional IOStreams.

The locale in streambuf .

The locale that is held in the stream buffer is used for file i/o when code conversion between the internal and external character set is required. The traditional IOStreams did not perform any code conversions. Code conversion is a new feature in the standard IOStreams. Let’s see what it is and why it is needed.

Some cultures, such as Japan or China, have large alphabets with tens of thousands of characters. Characters of such a huge alphabet cannot be encoded in just one byte. Instead there are encodings that use two or more bytes for representing a character. Some of these encodings mix characters of different size (multibyte character encodings); in other encodings all characters are of the same size (wide character encodings). It is common practice to use wide character encodings inside the program and multibyte character encodings outside on the external device.

  • The internal character set inside the program has to allow fast and arbitrary access to each character in a sequence. This is a functionality that comes with wide character encodings.
  • The external character set is used for storing text data in a file or any other kind of external device. The main purpose is to keep the file size small. This is a functionality typically provided by multibyte character encodings.
With each input or output operation the program has to translate between the internal and the external representation of a character sequence. A typical example is computer software for the Japanese market. A "Japanese" program might want to handle multibyte text files, encoded in JIS (= Japanese Industrial Standard) for instance. The program would internally use a wide character encoding, such as Unicode. Hence the program would need to convert between the Unicode and the JIS encoding whenever it performs an input or output operation.

This chapter briefly sketched some aspects of internationalization that are related to IOStreams. A more detailed description of the internationalization support in the Standard C++ Library will be given in our next column.
 

Removing _withassign Classes.

In the traditional IOStreams the classes istream , ostream , and iostream had a private copy constructor and assignment operator. They were private in order to prevent copy and assignment for objects of these classes because they contained a stream buffer by reference. (To be precise, their common base class ios held a pointer to the stream buffer.) The crucial point is that there is no ‘right’ semantics for copying or assigning a stream with respect to its stream buffer. There are different possibilities, e.g. sharing the stream buffer after the assignment, or flushing the stream buffer during the assignment and then providing both streams with entirely independent buffers, and so on. None of these possibilities is intuitively right, though. Consequently, copying and assigning was prohibited.

However, there is a need for assigning streams on the other hand. The most convincing example is the wish to redirect standard output (or any of the other standard i/o objects) by assigning a valid stream object to cout . In order to satisfy this requirement, the classes istream_withassign , ostream_withassign , and iostream_withassign were introduced. They implemented a public copy constructor and assignment operator, which let both streams share the stream buffer after the copying or assignment. One might expect that the references to the shared stream buffer would be counted. However, we don’t know of any traditional IOStreams implementation that counted the references to shared stream buffers. Instead, the shared stream buffer was deleted when the stream object that was constructed with the stream buffer went out of scope. In other words, the responsibility for the buffer stayed with the stream object that had initially created the stream buffer. Naturally, this imposed dependencies between the lifetimes of the two stream objects used in the copy constructor or assignment operator. In sum, the correct use of the _withassign classes was rather complicated.

This is the reason why in the Standard IOStreams the classes istream_withassign , ostream_withassign , and iostream_withassign do not exist anymore. To perform operations equivalent to the copy constructor and the assignment operator of the old _withassign classes, the user of the standard streams has to explicitly implement this functionality. Standard streams have the following member functions defined in basic_ios<class charT, class Traits>, that can be used for this purpose:

  • iostate rdstate(), which retrieves the stream state,
  • void clear(iostate state = goodbit), which sets the stream state,
  • basic_streambuf<class charT, class Traits>* rdbuf() and basic_streambuf<class charT, class Traits>* rdbuf(basic_streambuf<class charT, class Traits>* sb), which retrieve and set the stream buffer, and
  • basic_ios<class charT, class Traits>& copyfmt(const basic_ios<class charT, class Traits>& rhs) , which copies all other data members from rhs .
The following function template shows the use of these functions in an example:

template<class Stream>
void streamcpy(Stream& dest, const Stream& src)
{
  // rdbuf(sb) is provided by basic_ios; derived classes such as the file streams hide it.
  typedef basic_ios<typename Stream::char_type, typename Stream::traits_type> StreamBase;
  dest.copyfmt(src);
  // Share the stream buffer first, because rdbuf(sb) resets the stream state to goodbit.
  static_cast<StreamBase&>(dest).rdbuf(static_cast<const StreamBase&>(src).rdbuf());
  dest.clear(src.rdstate());
}
Please note that the stream classes in the standard library do not prevent copying and assigning streams, which they could easily prohibit by declaring the respective operations private. However, the use of this functionality is hazardous because its semantics is not defined by the standard. The functionality of copy constructor and assignment operator of streams is completely up to the library vendor. Even if the respective standard library’s reference manual guarantees the functionality you desire, the use of the copy constructor and assignment operator would still not be portable.
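
The redirection of cout , which motivated the _withassign classes, can be sketched with rdbuf() alone. The class redirect_guard and the function capture_cout below are hypothetical names. The guard restores the original stream buffer in its destructor, so that the lifetime dependency between the two streams stays confined to one scope.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical guard: redirects a stream on construction and
// restores the original stream buffer on destruction.
class redirect_guard {
public:
    redirect_guard(std::ostream& stream, std::streambuf* target)
        : stream_(stream), saved_(stream.rdbuf(target)) {}
    ~redirect_guard() { stream_.rdbuf(saved_); }
private:
    redirect_guard(const redirect_guard&);             // not copyable
    redirect_guard& operator=(const redirect_guard&);  // not assignable
    std::ostream& stream_;
    std::streambuf* saved_;
};

// Writes the message to cout while cout is redirected into a string stream.
std::string capture_cout(const std::string& message) {
    std::ostringstream capture;
    {
        redirect_guard guard(std::cout, capture.rdbuf());
        std::cout << message;
    }   // cout uses its original stream buffer again from here on
    return capture.str();
}
```

The string stream that owns the buffer outlives the redirection, which avoids the dangling buffer problem described above.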
 

Removing File Descriptors.

In the traditional IOStreams all file streams offered a member function fd(). It returned the file descriptor of the file that was associated with the file stream. This feature was helpful when some functionality of the underlying file system was needed that was not available in IOStreams. For example, the function int ftruncate(int fd, off_t length) is available on some UNIX platforms and sets a file to a specified length. This non-portable functionality was not provided by the traditional IOStreams itself.

The fd() function is omitted from the C++ Standard. The simple reason is that the C++ standard does not want to exclude operating systems that do not have file descriptors from providing a standard conforming IOStreams library.

On the other hand, vendors of the Standard C++ Library are free to extend the library, as long as these extensions do not conflict with the standard. Hence it is quite possible that a functionality like fd() will be included as a non standard extension in some library implementations.
 

String Streams: Replacing strstream by stringstream .

The string stream classes in the traditional IOStreams, class strstream , istrstream , ostrstream , and strstreambuf, are deprecated features in the standard IOStreams. This means that they are still provided by implementations of the standard IOStreams, but will be omitted in the future. The purpose of string streams is to facilitate text input and output to memory locations. The deprecated strstream classes allow input and output to and from character arrays of type char*. In the standard IOStreams they are replaced by corresponding stringstream classes that allow input and output to and from strings of type basic_string<charT>, charT being char , wchar_t , or any user-defined character type. The most obvious difference is that instead of providing character arrays to a strstream you now provide string objects to a stringstream . As you can convert character arrays into string objects and vice versa, there are no major restrictions regarding the functionality of string streams. However, there are subtle differences.

String streams are dynamic, which means that the internal character buffer is resized and reallocated once it is full. String streams also allow retrieving the content of the internal character buffer by calling the member function str().

In the traditional IOStreams str() returns a pointer to the internal character buffer . After such a call to str() the string stream is frozen, i.e. the buffer is not resized any longer. This is very sensible since every reallocation would invalidate the buffer pointer.

In the standard IOStreams string streams are always dynamic; they do not freeze. A call to str() provides a string object that is a copy of the internal buffer, but does not allow access to the buffer itself.

A similar difference occurs regarding the construction of string streams. There are constructors taking a character array or a string for use as the internal character buffer. In the traditional IOStreams this character array was actually used as the internal buffer, and the string stream constructed this way was frozen. In the standard IOStreams the string is not used as internal buffer; only its content is copied into an independent internal buffer area. Again, the internal buffer is not accessible from outside the string stream and freezing is not necessary.
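
The copy semantics of the standard string streams can be observed directly. In the sketch below the function names snapshot_then_append and parse_after_source_changes are hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// str() hands out a copy; the stream keeps growing afterwards.
std::string snapshot_then_append() {
    std::ostringstream os;
    os << "abc";
    std::string snapshot = os.str();  // copy of the internal buffer
    os << "def";                      // the stream is not frozen
    snapshot += "!";                  // changes the copy only
    return snapshot + "|" + os.str();
}

// The constructor copies the string; later changes to it are not seen.
int parse_after_source_changes() {
    std::string source = "42 apples";
    std::istringstream in(source);
    source = "overwritten";
    int count = 0;
    in >> count;
    return count;
}
```

The first function returns abc!|abcdef and the second returns 42, which shows that neither str() nor the constructor shares storage with the caller.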
 

Minor changes.

In addition to the differences explained above, there are a couple of minor deviations from the traditional IOStreams. Some items are renamed, for instance: the type io_state from the traditional IOStreams is now named iostate , and open_mode and seek_dir are now openmode and seekdir , among others.
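
For illustration, the following sketch uses the three standard type names in one small function. The function last_chars is a hypothetical helper that returns the trailing characters of a text by seeking from the end of a string stream.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: returns the last `count` characters of the text,
// or an empty string if the text is shorter than that.
std::string last_chars(const std::string& text, std::streamoff count) {
    std::ios_base::openmode mode = std::ios_base::in;     // formerly open_mode
    std::istringstream in(text, mode);

    std::ios_base::seekdir origin = std::ios_base::end;   // formerly seek_dir
    in.seekg(-count, origin);

    std::ios_base::iostate state = in.rdstate();          // formerly io_state
    if (state != std::ios_base::goodbit)
        return std::string();

    std::string rest;
    std::getline(in, rest);
    return rest;
}
```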

A standard IOStreams implementation is allowed to support the old names and interfaces for the sake of compatibility with the traditional IOStreams. The C++ standards document contains a list of these compatibility features.
 

Summary.

The standard IOStreams are modeled after the traditional IOStreams. However, there are a couple of substantial differences:

  • The standard IOStreams are templates taking the character type as a parameter.
  • The base class ios is split into a character type dependent and a character type independent portion.
  • Standard IOStreams may throw exceptions.
  • Standard IOStreams are internationalized.
  • The _withassign classes are removed; streams can no longer be copied or assigned portably.
  • File descriptors are not supported any longer.
  • The character array based string streams are replaced by string based string streams.


  © Copyright 1995-2003 by Angelika Langer.  All Rights Reserved.    URL: < http://www.AngelikaLanger.com/Articles/Papers/IOStreams/IOStreams.htm  last update: 22 Nov 2003