10 October 2012

Storing a type on the example of a simple messenger


Hi there. I have been thinking for a long time about storing a type in C++, and now I want to share my insights with you. As we all know, a type defines the amount of space its objects need and the behavior of those objects. So what if we need to store a type in order to reuse it later? This is the main problem considered in this article.
Unlike some other languages, C++ does not treat types as objects, so they cannot be stored directly the way objects can. We may, however, think of an indirect way. For example, we can indirectly get the type of an object. To achieve this we need just one simple type and one function, which are presented below:

template <typename T>
struct TypeRepresentation
{
       typedef T type;
};

template <typename T>
TypeRepresentation<T> get_type(const T&)
{
       return TypeRepresentation<T>();
}

The problem is that although we get an appropriate object, we cannot use its member type ‘type’, because we only have an object and not its type:

int a = 5;
get_type(a).type // error

Member types (introduced with the typedef keyword) are accessible only through the name of the containing type, not through an object. So to access ‘type’ we need its qualified name:

TypeRepresentation<int>::type

But to write that name we need the containing type, which we assumed we do not know. And if we do write it, we already know that ‘type’ is int, so this kind of use is pointless. Suppose we could somehow use the type while having only an object of type TypeRepresentation<>. We could then store such objects in a container, say std::vector<boost::any>. But as soon as we wanted to use the objects, i.e. the member ‘type’, we would have to convert each object back to its original type with any_cast, and … we do not know the original type; if we did, we again would not need the ‘type’ member at all.

The basic idea is this: however we try to wrap the type up, we always come back to it somehow, and since it is not an object, we cannot store it as one. There are still cases when we need to know about types and to store that information somehow. To make this clearer, let us write a simple messenger class. We will provide a registration mechanism to register for messages from objects of concrete types, and also functions to send messages. The most interesting part here is the registration for messages from a specified type of sender, because we need to somehow store the type that the registered object wants to listen to.

Before C++11 the only type that described a type was type_info from the <typeinfo> header. It is easily obtained for any object or type with the typeid operator. However, it was meant for informational purposes only: we can neither construct it ourselves nor copy it to store somewhere. So to find out whether some object is of the wanted type, we compare the results of typeid applied to the object and to the wanted type, for example:

if (typeid(obj) == typeid(Sender))
{
       // Got it, now do what you wanted to do
}
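To make the comparison concrete, here is a minimal sketch. The Sender and Timer types and the is_exactly() helper are made up for this illustration. Note that when typeid is applied to a reference to a polymorphic type, it reports the dynamic type of the object, not the type of the reference:

```cpp
#include <typeinfo>

// Hypothetical sender hierarchy, used only for illustration.
struct Sender { virtual ~Sender() {} };   // polymorphic: has a virtual member
struct Timer : Sender {};

// Hypothetical helper: true when the dynamic type of obj is exactly Wanted.
template <typename Wanted, typename T>
bool is_exactly(const T& obj)
{
    // For polymorphic T, typeid(obj) looks at the most derived object.
    return typeid(obj) == typeid(Wanted);
}
```

So for a Timer object seen through a const Sender&, is_exactly<Timer>() is true while is_exactly<Sender>() is false: the comparison is exact and does not take inheritance into account.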

But to register a receiver for later we still need to store the type_info object, which is impossible by value (we could keep a pointer to it, but then we would have to supply the comparison and ordering ourselves). Fortunately, C++11 introduced a new type, type_index, in the <typeindex> header. It is a wrapper around type_info, but it is both CopyConstructible and CopyAssignable, and it can serve as a key in associative containers. So now we have a chance to store it, and with std::type_index we can handle any type-specific registration. The code for the Messenger class follows:

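#include <algorithm>
#include <list>
#include <map>
#include <typeindex>
#include <typeinfo>

// IMessage and IReceiver (shown later in this post) must be declared before this class
class Messenger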
{
private:
       // Prohibit explicit construction and copying
       // My compiler does not support 'delete' and 'default' keywords
       Messenger() {}
       Messenger(const Messenger&);
       Messenger& operator=(const Messenger&);

       std::map<std::type_index, std::list<IReceiver *>> typeToReceivers;
       std::list<IReceiver *> allReceivers;

public:
       static Messenger& Get()
       {
              static Messenger messenger;

              return messenger;
       }

       template <class Sender>
       void RegisterForMessagesFromType(IReceiver *receiver)
       {
              std::map<std::type_index, std::list<IReceiver *>>::
              iterator seeker = this->typeToReceivers.find(
                                std::type_index(typeid(Sender)));
              if (seeker != this->typeToReceivers.end())
              {
                     seeker->second.push_back(receiver);
              }
              else
              {
                     std::list<IReceiver *> newList;
                     newList.push_back(receiver);
                     typeToReceivers.insert(std::make_pair(
                         std::type_index(typeid(Sender)), newList));
              }
       }

       void RegisterForMessages(IReceiver *receiver)
       {
              std::list<IReceiver *>::const_iterator
              seeker = std::find(this->allReceivers.begin(),
                              this->allReceivers.end(), receiver);
              if (seeker == this->allReceivers.end())
              {
                     this->allReceivers.push_back(receiver);
              }
       }

       template <class Sender>
       void UnregisterForMessagesFromType(IReceiver *receiver)
       {
              std::map<std::type_index, std::list<IReceiver *>>::
              iterator seeker = this->typeToReceivers.find(
                                std::type_index(typeid(Sender)));
              if (seeker == this->typeToReceivers.end())
              {
                     return;
              }
              std::list<IReceiver *>::iterator iter =
                  std::find(seeker->second.begin(),
                            seeker->second.end(), receiver);
              if (iter != seeker->second.end())
              {
                     seeker->second.erase(iter);
              }
       }

       template <class Sender>
       void SendMessageFrom(const IMessage& msg, const Sender& from)
       {
              std::map<std::type_index, std::list<IReceiver *>>::
              const_iterator seeker = this->typeToReceivers.find(
                                   std::type_index(typeid(from)));
              if (seeker != this->typeToReceivers.end())
              {
                     std::for_each(seeker->second.begin(),
                                   seeker->second.end(),
                                   [&msg] (IReceiver *rcvr)
                                   { rcvr->Receive(msg); });
              }
       }

       template <class Receiver>
       void SendMessagesTo(const IMessage& msg)
       {
              std::for_each(allReceivers.begin(), allReceivers.end(),
                            [&msg] (IReceiver *rcvr)
                            { if (typeid(*rcvr) == typeid(Receiver))
                                           { rcvr->Receive(msg); } });
       }
};

This class works fine according to my tests. I have left out only the IMessage and IReceiver interfaces; here is the code for them:

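#include <string>

class IMessage
{
public:
       // Virtual destructor: safe deletion through a base pointer
       virtual ~IMessage() {}
       virtual std::string Get() const = 0;
       virtual void Set(const std::string&) = 0;
};

class IReceiver
{
public:
       // Virtual destructor: safe deletion through a base pointer
       virtual ~IReceiver() {}
       virtual void Receive(const IMessage&) = 0;
};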
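To show the registration idea in isolation, here is a condensed sketch. It is not the Messenger class above: MiniMessenger, Recorder, Keyboard and Mouse are names invented for this example, and messages are plain strings to keep it short. The point is only that std::type_index works as a map key:

```cpp
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <vector>

// Hypothetical receiver that simply records what it was sent.
struct Recorder
{
    std::vector<std::string> received;
    void Receive(const std::string& text) { received.push_back(text); }
};

// Condensed registry: sender type -> receivers interested in that type.
class MiniMessenger
{
    std::map<std::type_index, std::vector<Recorder*>> typeToReceivers;

public:
    template <class Sender>
    void Register(Recorder* receiver)
    {
        // operator[] creates the empty vector on first use,
        // so no find/insert branching is needed.
        typeToReceivers[std::type_index(typeid(Sender))].push_back(receiver);
    }

    template <class Sender>
    void SendFrom(const std::string& text, const Sender& from)
    {
        auto it = typeToReceivers.find(std::type_index(typeid(from)));
        if (it == typeToReceivers.end())
            return;                       // nobody listens to this sender type
        for (Recorder* r : it->second)
            r->Receive(text);
    }
};

// Hypothetical sender types.
struct Keyboard {};
struct Mouse {};
```

A receiver registered with Register<Keyboard>() gets only what is sent through SendFrom() with a Keyboard object; a message from a Mouse never reaches it.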

Actually, instead of requiring the receiver to derive from IReceiver, we could register a member function of the receiver class with a predefined signature, but that would complicate the code. You can customize the class and add new members to support filtering by message type rather than by sender, or sending to receivers of specified types.

C++11 introduced one more keyword, decltype, which yields the type of an expression or object. It has really nothing to do with storing types, though: we would have to store the object whose type we want to know, and that in turn imposes limitations on the object’s type.
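For completeness, here is a small sketch of what decltype does give us. It reuses TypeRepresentation and get_type() from the beginning of the post; the doubled() function is made up for the example. The type is recovered entirely at compile time, from an expression the compiler can see, so nothing is stored for later:

```cpp
#include <type_traits>

template <typename T>
struct TypeRepresentation
{
    typedef T type;
};

template <typename T>
TypeRepresentation<T> get_type(const T&)
{
    return TypeRepresentation<T>();
}

// Hypothetical function: recovers the type of its argument via decltype.
int doubled(int a)
{
    typedef decltype(get_type(a)) Rep;   // TypeRepresentation<int>
    typedef Rep::type Recovered;         // int
    static_assert(std::is_same<Recovered, int>::value,
                  "decltype recovers the member type at compile time");
    Recovered result = a * 2;
    return result;
}
```

This works only because the expression get_type(a) is visible at the point of use, which is exactly the limitation described above.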

I hope someone has learnt something new from this article. If you have any idea of how to really store a type, I would be happy to hear about it. Thanks for your time.

11 August 2012

RAII: Resource acquisition is initialization

Hello, dear reader. After reviewing my previous post about exception handling, many people advised me to write about RAII as well. As RAII is not specific to C++ and is not closely related to exception handling, I decided to write about it in a separate post. So what is RAII?

RAII stands for Resource Acquisition Is Initialization. It is a programming technique invented by Bjarne Stroustrup and intended to make the use of resources safer in terms of their allocation and deallocation. In C++, after an exception is thrown and a handler for it is found, the only code the standard guarantees to run on the way to the handler is the destructors of automatic objects on the stack. (Destructors of objects with static storage duration are also called, but that happens in std::exit(), which also calls user-defined functions registered with std::atexit(); neither is relevant to RAII.) So if we want to allocate a resource and use it, RAII means allocating it in a class constructor and deallocating it in the destructor. Thus, if anything goes wrong, we can be sure that the resource will be released properly. Let's try this on a simple example.

The most common example in the literature is the file resource, e.g. a handle to an opened file stream. We can use the same example here, as it does not change the essence. Suppose we need to open an XML file, read some structured data in portions (3 high-level items at a time), process it (e.g. convert it to another format, or just deserialize an object from the XML file), and finally close the file. Here is a code snippet for a function which covers all the steps:

MyClass LoadFromXMLFile(const std::string& filePath)
{
       // Create an empty object
       MyClass obj;

       // Open a stream to read from xml file
       std::ifstream in(filePath, std::ios_base::in);

       // Read data in small chunks and fill the object
       FillObjectDataFromStream(obj, in);

       // Close the stream
       in.close();

       // Return the object
       return obj;
}

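The dangerous part of this function is the call to FillObjectDataFromStream(). If it throws an exception, the explicit in.close() call is never reached. (Strictly speaking, std::ifstream closes its file in its own destructor, so nothing leaks in this particular snippet. But imagine the stream were a raw handle, such as a FILE* obtained from std::fopen(): that handle would never be closed, short of process termination.) Not releasing a resource that is no longer used is very bad practice.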

Now if we want to take advantage of the RAII technique we need a helper class, which will allocate the resource in the constructor and release it in the destructor. Let's name this class FileStreamOpener:

class FileStreamOpener
{
private:
       std::ifstream _in;

public:
       FileStreamOpener(const std::string& filePath)
       {
              _in.open(filePath.c_str(), std::ios_base::in);
       }

       ~FileStreamOpener()
       {
              _in.close();
       }

       std::ifstream& GetStream()
       {
              return _in;
       }
};
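To see the guarantee in isolation, here is a minimal sketch that replaces the file with a plain counter, so it can be run anywhere. HandleGuard, UseHandle() and openHandles are names invented for this illustration and are not part of the example above:

```cpp
#include <stdexcept>

// Hypothetical stand-in for a real resource: the number of handles currently open.
int openHandles = 0;

class HandleGuard
{
public:
    HandleGuard()  { ++openHandles; }   // acquire in the constructor
    ~HandleGuard() { --openHandles; }   // release in the destructor

private:
    // Non-copyable, so the resource is never released twice
    HandleGuard(const HandleGuard&);
    HandleGuard& operator=(const HandleGuard&);
};

void UseHandle(bool fail)
{
    HandleGuard guard;                  // resource is held from here on
    if (fail)
        throw std::runtime_error("processing failed");
    // guard is destroyed on both the normal and the exceptional path
}

// Returns the number of handles still open after a failed call.
int HandlesLeftAfterFailure()
{
    try {
        UseHandle(true);
    }
    catch (const std::runtime_error&) {
        // the guard's destructor has already run by the time we get here
    }
    return openHandles;
}
```

HandlesLeftAfterFailure() returns 0: the destructor ran during stack unwinding, even though UseHandle() never reached its end.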

Now we need to modify the LoadFromXMLFile() function. Here we go:

MyClass LoadFromXMLFile(const std::string& filePath)
{
       // Create an object
       MyClass obj;

       // Implicitly open a stream through FileStreamOpener
       FileStreamOpener fso(filePath);

       // Read data in small chunks and fill the object
       FillObjectDataFromStream(obj, fso.GetStream());

       // We don't even need to close the stream
       // as fso will be destructed when the flow
       // goes out of this function's body scope
       // and the stream will be closed.
       return obj;
}

The details are given in the comments. Whenever the destructor of the 'fso' object is called, that is, during stack unwinding if an exception is thrown or when execution leaves the function body normally, the file stream is closed, i.e. the resource is properly released. This is the whole idea of the RAII idiom. Thanks for your time.

28 July 2012

Exception handling

Hi all, today I am going to write about a well-known topic: exception handling. An exception is a means of indicating a deviation from the normal execution flow of the program; such a deviation can nevertheless be anticipated and handled appropriately.
To indicate such a deviation we need to generate an exception. The throw keyword is used for this purpose. It takes an operand, an object which will represent the exception. This can be an object of either a built-in or a user-defined type:

throw int();
throw MyClass();

As soon as a throw statement is executed, execution leaves the current scope and starts to look for a “corresponding” handler. A handler is introduced with the catch keyword, which also needs the type of the exception to be handled:

catch (int)
{
       // .. handle the exceptional situation
}

catch (MyClass)
{
       // .. handle the exceptional situation
}

To handle an exception with catch clauses, the dangerous code (a piece of code that may throw an exception) must be placed in a try block, which the handlers follow immediately:

try {
       // dangerous code with possible exception
}

So the final scenario is as follows:

try {
       // .. some code
       throw objectOfMyClass;
       // .. maybe some more code
}
catch (MyClass)
{
       // .. handle the exceptional situation
}

All of this is to give a first idea of what goes on. Now let’s go deeper and look at everything in more detail. The try block usually does not contain the throw statement directly. Instead, it contains a function call, and that function includes either a throw statement or another function call with a throw statement. That is, the throw statement can be nested any number of levels deep.

As soon as an exception is thrown and a handler for it is found, something known as stack unwinding happens: all the local objects (not static and not extern) constructed on the stack between the try block and the throw point are destroyed, i.e. their destructors are called. If the throw statement is directly in the try block, only the objects declared in that block are destroyed; otherwise, if throw is in a function body, the destructors of all local objects in that function body are called first, and only then those of the objects in the try block. The exception object itself is a copy of the thrown operand; it is passed to the handler, and execution continues from the “corresponding” handler. Where the copy is kept is unspecified. If no handler is found for the exception, the std::terminate() function from the <exception> header is called. The same happens if an exception escapes a destructor while the stack is being unwound. That is why throwing from destructors is not recommended: if a local object of such a type exists, its destructor is called during stack unwinding, and a second exception thrown there terminates the program. We will get back to std::terminate() soon, but now let’s see what a “corresponding” handler is.
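Before moving on, the unwinding order described above can be observed with a small sketch. The Tracer class and the functions inner() and run() are invented for this illustration; each Tracer appends its letter to a global string when it is destroyed:

```cpp
#include <stdexcept>
#include <string>

// Records the order of events.
std::string trace;

// Hypothetical helper: writes its id to the trace on destruction.
struct Tracer
{
    char id;
    explicit Tracer(char c) : id(c) {}
    ~Tracer() { trace += id; }
};

void inner()
{
    Tracer c('c');                     // local to the throwing function
    throw std::runtime_error("boom");
}

std::string run()
{
    trace.clear();
    try {
        Tracer a('a');                 // locals of the try block
        Tracer b('b');
        inner();
    }
    catch (const std::runtime_error&) {
        trace += '!';                  // handler runs after unwinding
    }
    return trace;
}
```

run() returns "cba!": first the local object of inner() is destroyed, then the objects of the try block in reverse order of construction, and only after that does the handler body execute.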

Usually, an exception handler not only specifies the type of the exception to be caught, but also introduces a variable which is initialized with the exception object. This is done because exceptions as a rule carry information about what went wrong and why, or any other user-defined information; after all, exceptions are usually class objects, so anyone can define their own exception classes. So the handler looks like this:

catch (MyClass ex) {
       // handle the exception or
       // inform about failure in a friendly manner
}

Actually, it is not good practice to catch by value, as that copies the exception object into the handler's variable (and slices it if an object of a derived type was thrown). To avoid this overhead, we can catch the exception by reference, just as we pass arguments to functions by reference. Here we go:

catch (const MyClass& ex) {
       // handle the exception or
       // inform about failure in a friendly manner
}

The const keyword may be omitted, but one rarely wants to modify the thrown exception object. To catch an exception one need not specify the exact type of the thrown object. The same exception can be caught by a handler that expects an object of a base type, for example:

try {
       throw Derived();
}
catch (const Base& ex) {
       // ... handle here
}

Note that this is possible only with public inheritance. If Base is a protected or private base of Derived, the handler will not catch the exception. Now, if we expect several types of exceptions from one hierarchy, we need to handle the most specific (derived) ones first:

catch (const Derived& ex) {
       // exception of type Derived is handled here
}
catch (const Base& ex) {
       // exception of type Base is handled here
}

If we swap the handlers, the one expecting exceptions of type Derived comes second and will never be reached, as the handler for Base catches exceptions of both Base and Derived types. Modern compilers should at least warn about this.
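Here is a runnable sketch of the handler ordering. The Base and Derived classes are empty placeholders and classify() is a helper invented for this example; it throws an exception of the requested type and reports which handler caught it:

```cpp
#include <string>

// Placeholder hierarchy for the illustration.
struct Base { virtual ~Base() {} };
struct Derived : Base {};

// Hypothetical helper: throws an E and names the handler that caught it.
template <class E>
std::string classify()
{
    std::string caughtBy = "none";
    try {
        throw E();
    }
    catch (const Derived&) {       // most specific type first
        caughtBy = "Derived";
    }
    catch (const Base&) {          // also matches anything publicly derived from Base
        caughtBy = "Base";
    }
    return caughtBy;
}
```

classify<Derived>() yields "Derived" and classify<Base>() yields "Base". Handlers are tried in the order they are written, which is why the more specific one has to come first.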

As you can see, the number of handlers is not limited. We already know that handlers for the exact type or for a base type are “corresponding”. To learn about the other possible matches, see section 15.3, paragraph 3 of the C++ standard.

Exception handlers can also be nested. For example, if a handler catches an exception of type T1 and throws another one of type T2, there can be a try block with a handler for T2 inside the first handler, as follows:

catch (const T1& ex) {
       // try to handle T1
       // if could not throw T2
       try {
              throw T2();
       } catch (const T2& ex) {
              // handle T2 here
       }
}

If we want to handle exceptions that may arise in a constructor body, we can add try-catch blocks there. But if we want to handle exceptions coming from the constructor's initializer list, the syntax changes a little (this form is called a function-try-block). Here is what it looks like:


class MyClass : public Base
{
private:
       SomeType mem;

public:
       MyClass(const SomeType& arg)
              try : Base(), mem(arg)
       {
              // Any exception thrown from initialization list
              // or from c'tor body can be handled below.
       }
       catch (...)
       {
              // A handler should follow the c'tor body.
       }
};
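One detail is worth a sketch of its own: when control reaches the end of a handler of a constructor's function-try-block, the exception is rethrown automatically, so the caller still sees it. The Member and Holder classes, the build() function and the events string below are invented for this illustration:

```cpp
#include <stdexcept>
#include <string>

// Records which handlers ran.
std::string events;

// Hypothetical member whose constructor may throw.
struct Member
{
    explicit Member(int v)
    {
        if (v < 0)
            throw std::invalid_argument("negative value");
    }
};

class Holder
{
    Member mem;

public:
    explicit Holder(int v)
    try : mem(v)
    {
        // constructor body
    }
    catch (const std::invalid_argument&)
    {
        events += "ctor-handler;";
        // No explicit throw here: the exception is rethrown
        // implicitly when the handler ends.
    }
};

std::string build(int v)
{
    events.clear();
    try {
        Holder h(v);
        events += "built;";
    }
    catch (const std::invalid_argument&) {
        events += "caller-handler;";
    }
    return events;
}
```

build(1) gives "built;", while build(-1) gives "ctor-handler;caller-handler;". The handler in the constructor can log or translate the exception, but it cannot swallow it, because the object was never fully constructed.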

It is possible that a piece of code tries to handle the exception but, if it does not manage to, re-throws it as if this piece of code never existed. Other handlers in different parts of the system may then try to handle the exception, and so on. To achieve this we again use the throw keyword, but this time without an operand:

catch (const T1& ex) {
       // ... try to handle the exception
       // if unable just rethrow
       throw;
}

You might ask why we don't re-throw by simply throwing the caught exception object (“throw ex;”). Well, “throw;” re-throws the originally thrown object, which means:
  • it does not copy the exception object one more time, as opposed to “throw ex;”
  • it does not change the type of the exception. What we caught is not necessarily what was thrown, so if we throw what we caught, the type of the exception may be different. See the example below:

try {
       throw Derived();
}
catch (const Base& ex) {
       // The exception rethrown below is copied
       // and it has type Base, not Derived
       throw ex;
}

If we replace the last line in handler body with “throw;” the original exception of type Derived will be re-thrown.
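The difference can be checked with a short sketch. Here Base and Derived get a virtual name() function, added only for this illustration, so that the outer handler can tell which type actually arrived:

```cpp
#include <string>

// Illustrative hierarchy with a virtual function to reveal the dynamic type.
struct Base
{
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base
{
    virtual std::string name() const { return "Derived"; }
};

// Re-throws with "throw;": the original Derived object travels on.
std::string rethrow_original()
{
    std::string seen;
    try {
        try { throw Derived(); }
        catch (const Base&) { throw; }
    }
    catch (const Base& ex) { seen = ex.name(); }
    return seen;
}

// Re-throws with "throw ex;": a new Base object is copied from ex (slicing).
std::string rethrow_copy()
{
    std::string seen;
    try {
        try { throw Derived(); }
        catch (const Base& ex) { throw ex; }
    }
    catch (const Base& ex) { seen = ex.name(); }
    return seen;
}
```

rethrow_original() returns "Derived" and rethrow_copy() returns "Base", because a throw expression uses the static type of its operand.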

The throw keyword has another application as well: the exception specification. This is a way to specify what types of exceptions a function may throw:

void f1() throw(); // does not throw at all
void f2() throw(int); // may throw only int
void f3() throw(Base, int); // may throw Base and int

If the function throws an exception of a type that is not listed, std::unexpected() is called, which by default calls std::terminate(). More about this a little later, as I promised. We will not go deep into throw specifications, as they are deprecated in C++11. For a similar purpose C++11 introduced the noexcept keyword, which can be used either as an operator or as a specifier. You can find the details on the web, but here is a brief explanation.
  • The noexcept specifier states whether the function may throw exceptions. It can be used with or without a constant-expression argument. If the expression evaluates to true, the function must not throw exceptions; otherwise, it may. noexcept without an argument is equivalent to noexcept(true). If an exception escapes a function marked noexcept, std::terminate() is called.
  • The noexcept operator takes an arbitrary expression, which is not evaluated, and yields true if that expression is known not to throw, for instance because every function it calls is specified as noexcept. Note that this is a compile-time check.


The specifier and the operator are often used together in generic code, for example:

template <class Container>
int my_func(const Container& c, int index) noexcept(noexcept(c[index]))
{
       return c[index];
}

If the Container’s operator[] is marked noexcept, the noexcept operator yields true and my_func is noexcept as well; otherwise it is not.
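The following sketch checks this at compile time. SafeArray and CheckedArray are two toy containers invented for the example, and element_at() plays the role of my_func. The operand of the noexcept operator is the expression c[index], which is inspected but never evaluated:

```cpp
// Hypothetical container whose subscript operator promises not to throw.
struct SafeArray
{
    int data[4];
    int operator[](int i) const noexcept { return data[i]; }
};

// Hypothetical container whose subscript operator may throw.
struct CheckedArray
{
    int data[4];
    int operator[](int i) const
    {
        if (i < 0 || i >= 4)
            throw i;               // out of range
        return data[i];
    }
};

// noexcept exactly when the container's own subscript expression is.
template <class Container>
int element_at(const Container& c, int index) noexcept(noexcept(c[index]))
{
    return c[index];
}

static_assert(noexcept(element_at(SafeArray(), 0)),
              "element_at is noexcept for SafeArray");
static_assert(!noexcept(element_at(CheckedArray(), 0)),
              "element_at may throw for CheckedArray");
```

Both static_asserts pass, so the exception guarantee of element_at() follows that of the container it is instantiated with.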

The time has come to talk about std::terminate() and std::unexpected(). As stated above, std::unexpected() is called when a function throws an exception inconsistent with its throw specification (remember, this is deprecated in C++11). By default std::unexpected() in turn calls std::terminate(). And std::terminate() by default invokes std::abort(), which stops the program without calling destructors for any constructed object. Nevertheless, we are free to replace the handlers these functions call with our own. There are two typedefs in the <exception> header:

typedef void (*unexpected_handler)();
typedef void (*terminate_handler)();

These are just types of pointers to functions accepting no arguments and returning nothing. And there are two functions which make it possible to replace what std::unexpected() and std::terminate() call with our own functions:

unexpected_handler set_unexpected(unexpected_handler f) throw();
terminate_handler set_terminate (terminate_handler f) throw();

In C++11 both functions are marked with the noexcept specifier instead of an empty exception specification. Both functions return the previously set handler. This is useful for keeping the current handler, temporarily substituting our own, and then restoring the old one. In C++11 it is also possible simply to call std::get_unexpected() or std::get_terminate() to retrieve the current handlers.
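Here is a minimal sketch of the substitute-and-restore pattern for the terminate handler. The names quiet_terminate() and install_and_restore() are invented for this example, and the handler is never actually triggered here:

```cpp
#include <cstdlib>
#include <exception>

// Hypothetical replacement handler. A terminate handler must not return,
// so it ends the program itself.
void quiet_terminate()
{
    std::abort();
}

// Installs quiet_terminate, checks that it became the current handler,
// then puts the previous handler back.
bool install_and_restore()
{
    std::terminate_handler previous = std::set_terminate(quiet_terminate);
    bool installed = (std::get_terminate() == quiet_terminate);
    std::set_terminate(previous);      // restore whatever was set before
    return installed;
}
```

install_and_restore() returns true, and afterwards the program behaves as if the handler had never been changed.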

There are several exception classes in the standard library. The most general one is std::exception, declared in the <exception> header, and all the others derive from it. Besides the destructor, it has one public virtual function, which returns a human-readable message about the exception:

virtual const char* what() const noexcept; // throw() before C++11

So you can call what() on any exception from the standard library. There is also a set of predefined exception classes in the <stdexcept> header.
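As a short sketch of both points, the functions below (their names are made up for this example) catch standard exceptions through a reference to std::exception. The first reads the message back with what(); the second shows that std::vector::at() reports a bad index with std::out_of_range from <stdexcept>:

```cpp
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Throws a std::runtime_error and returns its message via the base class.
std::string message_of_runtime_error()
{
    std::string text;
    try {
        throw std::runtime_error("disk full");
    }
    catch (const std::exception& ex) {
        text = ex.what();              // virtual call reaches runtime_error
    }
    return text;
}

// Returns true if accessing a bad index throws std::out_of_range.
bool at_reports_out_of_range()
{
    std::vector<int> v(3);
    try {
        v.at(10);                      // index 10 does not exist
    }
    catch (const std::out_of_range&) {
        return true;
    }
    catch (const std::exception&) {
        return false;                  // some other standard exception
    }
    return false;                      // nothing was thrown
}
```

For std::runtime_error the message is the string passed to the constructor; for exceptions thrown by the library itself, such as the one from at(), the exact wording depends on the implementation.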

That’s it. There are many things that you’ll learn experimenting with exceptions. I cannot remember and write here all the things, but I hope this post gives a good understanding of how the exception handling mechanism works and how you can modify it. Thanks for your precious time.