19 February 2012

Again templates: getting rid of unnecessary temporaries and copies

    When did you read “The C++ programming language” by Bjarne Stroustrup? In chapter 22 a situation is discussed where one may want to avoid extra temporaries, copying and loops (22.4.7).  So now I want to present a little more complicated version of such situation.
Consider we have some class, which has an operator+ member, which takes as an argument another instance of the same class, thus we can add as many terms as we want. For built-in types, or some simple user-defined types, this kind of expressions will create temporaries for each pair of terms, i.e. we can imagine the expression a1 + a2 + a3 + a4 + … as ((…(a1 + a2) + a3) + a4) + …, so for each pair of parentheses a temporary will be created during the evaluation of the whole expression. For complex types, whose operator+ consists of many calls of other functions, maybe loops, the evaluation of this expression will cost really much. And both the number of temporaries and the time taken depends on the number of terms added. So in current article I want to describe a technique this problem can be solved with.
The main type (class) I will consider throughout the article is MySet, which is a template class and represents a set of elements of template type. The ‘+’ operator should unite given sets.

template <typename T>
class MySet
{
public:
       MySet() {}

       MySet(const T& elem)
       {
              m_elements.insert(elem);
       }

       template <typename Iterator>
       MySet(const Iterator& begin,
                const Iterator& end)
       {
              m_elements.insert(begin, end);
       }

       void addElement(const T& elem)
       {
              m_elements.insert(elem);
       }
       void removeElement(const T& elem)
       {
              m_elements.erase(elem);
       }

private:
       std::set<T> m_elements;
};

To be able to add objects of type MySet<T> we need the operator+ to be defined:

template <typename T>
const MySet<T> operator+(const MySet<T>& first, const MySet<T>& second);

Now we’ve got the abovementioned problem which we need to avoid. We can create an auxiliary class which will just keep the references as described in “The C++ programming language” - 22.4.7. Whoever is not familiar with the Stroustrup’s suggestion I will tell briefly. A class is created which has two reference members, which refer to two terms of an expression, and has a simple constructor which just initializes the references:

template <class T>
struct Temp
{
       const MySet<T>& t1;
       const MySet<T>& t2;

       Temp(const MySet<T>& arg1, const MySet<T>& arg2)
              : t1(arg1), t2(arg2)
       { // nothing to do here    }
};

But how do we know how many terms will contain the final expression. We do not, indeed. So we need a way to generalize the number of terms. For that purpose we can consider the sum of terms as a sum of all the terms but the rightmost and a separate rightmost set, so any expression contains only two terms in this comprehension. So in our case the reference keeper should be a template either:

template <typename LeftTerm, typename RightTerm>
struct TempUnion
{                          
       const LeftTerm& l_;
       const RightTerm& r_;

       TempUnion(const LeftTerm& left,
                     const RightTerm& right)
              : l_(left), r_(right)
       {
       }
};

Now we can represent the sum of two MySet’s as TempUnion<MySet<T>, MySet<T> >, sum of three terms as TempUnion<TempUnion<MySet<T>, MySet<T> >, MySet<T> >, and so on. As to operator+ for MySet’s, it should just construct a TempUnion instance and return. But for this we need a common interface/approach to both MySet and TempUnion of any “depth”. Here we take advantage of CRTP pattern to support static (compile-time) polymorphism:

template <typename T>
struct Wrapper
{
       const T& get() const
       {
              return static_cast<const T&>(*this);
       }
};

We should not forget to inherit both MySet and TempUnion from Wrapper with the derived type as template argument:

template <typename T>
class MySet : public Wrapper<MySet<T> >
{ ...

template <typename LeftTerm, typename RightTerm>
struct TempUnion : public Wrapper<TempUnion<LeftTerm, RightTerm> >
{ ...

MySet should also be able to be constructed from the TempUnion, so that the outermost TempUnion is converted to MySet when assigned to the latter. Thus, we need a constructor, preliminary of this prototype:

template <typename LeftTerm, typename RightTerm>
MySet(const TempUnion<LeftTerm, RightTerm>& un);

Seems everything is already done, but… how do we know how many sets do we have to unite, when all we have got is one object of TempUnion<…> type? So we need a mechanism to count the sets and also find out references to real objects to be able to collect their elements (m_elements). Another helper class is meant to work this out:

template <typename LeftTerm, typename RightTerm>
struct SetOfSets<TempUnion<LeftTerm, RightTerm> >
{
       // recursive template instantiantion
       static const int size = 1 + SetOfSets<LeftTerm>::size;
};

We know that the right term is always a single set, so we can calculate the number of sets just adding 1 for the rightmost set and using recursive instantiation for the rest of the sum as a LeftTerm. We need a terminating condition, and it is easy to guess - for a single set the number should be 1:

template <typename Term>
struct SetOfSets
{
       static const int size = 1;
};

Actually this is more general (not specialized template) than the previous specialization, so this should come before. Now we can calculate the number of sets in an expression, in our terms, in a TempUnion<…> object. We can collect the pointers of sets using the same recursive instantiation approach: (using the same struct)

template <typename Term>
struct SetOfSets
{
       static const int size = 1;

       template <typename T>
       inline static void getSets(const MySet<T>** sets, const Term& t)
       {
              *sets = static_cast<const MySet<T>*>(&t);
       }
};

template <typename LeftTerm, typename RightTerm>
struct SetOfSets<TempUnion<LeftTerm, RightTerm> >
{
       static const int size = 1 + SetOfSets<LeftTerm>::size;

       template <typename T>
       inline static void getSets(const MySet<T>** sets,
                                                    const TempUnion<LeftTerm, RightTerm>& un)
       {
              SetOfSets<LeftTerm>::getSets(sets, un.l_);
              *(sets + size - 1) = static_cast<const MySet<T>*>(&un.r_);
       }
};

Almost done! We have to write the operator= and the constructor for MySet, taking a TempUnion. So the final code for all this stuff follows:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <set>

template <typename Term>
struct SetOfSets;

template <typename T>
struct Wrapper
{
       const T& get() const
       {
              return static_cast<const T&>(*this);
       }
};

template <typename LeftTerm, typename RightTerm>
struct TempUnion;

template <typename T>
class MySet : public Wrapper<MySet<T> >
{
public:
       MySet() {}

       MySet(const T& elem)
       {
              m_elements.insert(elem);
       }

       template <typename Iterator>
       MySet(const Iterator& begin,
                const Iterator& end)
       {
              m_elements.insert(begin, end);
       }

       void addElement(const T& elem)
       {
              m_elements.insert(elem);
       }
       void removeElement(const T& elem)
       {
              m_elements.erase(elem);
       }

       template <typename LeftTerm, typename RightTerm>
       MySet(const TempUnion<LeftTerm, RightTerm>& un)
       {
              operator=(un);
       }

       template <typename LeftTerm, typename RightTerm>
       const MySet& operator=(const TempUnion<LeftTerm, RightTerm>& un);

       void dump()
       {
              std::copy(m_elements.begin(), m_elements.end(),
                        std::ostream_iterator<T>(std::cout, " "));
              std::cout << std::endl;
       }

private:
       std::set<T> m_elements;
};

template <typename T>
template<typename LeftTerm, typename RightTerm>
const MySet<T>& MySet<T>::operator=(const TempUnion<LeftTerm, RightTerm>& un)
{
       const int count = SetOfSets<TempUnion<LeftTerm, RightTerm> >::size;

       const MySet<T>* sets[count];
       SetOfSets<TempUnion<LeftTerm, RightTerm> >::getSets(sets, un);
       m_elements.clear();
       for (int i = 0; i < count; ++i)
       {
              const std::set<T>& elems = sets[i]->m_elements;
              m_elements.insert(elems.begin(), elems.end());
       }
       return *this;
}

template <typename LeftTerm, typename RightTerm>
struct TempUnion : public Wrapper<TempUnion<LeftTerm, RightTerm> >
{
       const LeftTerm& l_;
       const RightTerm& r_;

       TempUnion(const LeftTerm& left,
                     const RightTerm& right)
              : l_(left), r_(right)
       {
       }
};

template <typename LeftTerm, typename RightTerm>
inline const TempUnion<LeftTerm, RightTerm>
operator+(const Wrapper<LeftTerm>& s1, const Wrapper<RightTerm>& s2)
{
       return TempUnion<LeftTerm, RightTerm>(s1.get(), s2.get());
}

template <typename Term>
struct SetOfSets
{
       static const int size = 1;

       template <typename T>
       inline static void getSets(const MySet<T>** sets, const Term& t)
       {
              *sets = static_cast<const MySet<T>*>(&t);
       }
};

template <typename LeftTerm, typename RightTerm>
struct SetOfSets<TempUnion<LeftTerm, RightTerm> >
{
       static const int size = 1 + SetOfSets<LeftTerm>::size;

       template <typename T>
       inline static void getSets(const MySet<T>** sets,
                                  const TempUnion<LeftTerm, RightTerm>& un)
       {
              SetOfSets<LeftTerm>::getSets(sets, un.l_);
              *(sets + size - 1) = static_cast<const MySet<T>*>(&un.r_);
       }
};

int main()
{
       MySet<int> s1(1), s2(2), s3(3), s4(4), s5(5), s6(6), s7(7), s8;

       s8 = s1 + s2 + s3 + s4 + s5 + s6 + s7;
       s8.dump();

       std::cout << "the end" << std::endl;
       return 0;
}

Thus, s8 is constructed from an object of type TempUnion<TempUnion<…4 more times…<MySet<T>, MySet<T> >, MySet<T> > and the latter is “constructed” at compile time. Here we did not get rid of loops, as we need to run through all elements of all sets, which, generally, are of different sizes.
I hope this post will be useful for some people, I myself admired a lot when learned this kind of stuff first time. Please feel free to comment or ask questions. Thanks for your time.

5 February 2012

Common behaviour for a hierarchy of classes: the SFINAE technique


This article is devoted to another solution of the problem mentioned in the previous post with a similar title. The main purpose is to have a function which executes one code for objects of some ‘Base’ class and also of any other class derived from ‘Base’, and another flow for objects of all the other classes. This time the solution is implemented using the so called ‘SFINAE’ technique.

‘SFINAE’ stands for ‘Substitution Failure Is Not An Error’ and refers to the situation when the template parameter’s substitution brings to an invalid code, but the compiler does not complain, it just ignores the specialization, more on this just a moment later. The SFINAE programming techniques were first introduced by David Vandevoorde.

To give an example let us refer to ‘enable_if’ structure from boost library. The code is as follows:

template <bool B, class T = void>
struct enable_if {
  typedef T type;
};

template <class T>
struct enable_if<false, T> {};

As you can see, there is not anything unclear here. For false value the template is explicitly specialized not to contain any typedef. For true the structure has a member typedef, namely enable_if::type, which is void by default. So now any piece of code instantiating the enable_if structure and using its type member will be invalid, if the value of template parameter B evaluates to false. If you are still hazy about the details, just wait a little more.

Now we need a compile-time predicate to pass to the enable_if as the first template argument. What we actually want to know is whether the type T is ‘Base’ or derived from ‘Base’. We can use the checker function from the previous article, but I prefer to write an updated version of this checker:

template <typename T>
struct is_descendant
{
       struct true_type { char t[2]; };
       typedef char false_type;
       template <typename U>
       static true_type check(const Base&);
       template <typename U>
       static false_type check(...);

       static const bool value = (sizeof(check<T>(T())) == sizeof(true_type));
       // the same result can be achieved with the following line
       // enum { value = (sizeof(check<T>(T())) == sizeof(true_type)) };
};


The result of our check should be a compile-time constant, otherwise we won’t be able to pass it as a value of another template parameter. Again the sizes of return types are compared to ensure that type T passes the check. Now we have a static predicate which we can pass to enable_if. So let’s move to the final step.

As our support point is class ‘Base’,  we should have a version of function for objects of type ‘Base’:

void func(const Base&)
{
       // call for objects of Base or derived from Base types
}

How do we achieve that this very function is called for all the descendants either? I’ll tell you in a minute. In any case, we need a template function func, too, to be able to call it for any type of object.

template <typename T>
void func(const T& obj)
{
       // call this one for objects of all other classes
}

But this template will be specialized for any type derived from ‘Base’. Now we can take advantage of enable_if and is_descendant. To catch the meaning of what we are trying to do, we can use the names of the structures as descriptors, so we have to enable the template func for all types that are not descendants of ‘Base’. So we have to update the template to something like this one:

template <typename T>
void func(const T& obj,
          typename enable_if<!is_descendant<T>::value >::type* = 0)
{
       // call this one for objects of all other classes
}

Now everything will work as we expect. Namely,
  • if type T is not ‘Base’ neither derived from Base, then is_descendant<T>::value is being evaluated as false, and the negation of it as true, and so the typedef type exists in enable_if, and the code is valid (by default the second argument of func is ‘void*’). Thus the template version is called in this case. 
  • If type T is ‘Base’ or derived from ‘Base’, then is_descendant<T>::value is being evaluated as true, and the negation of it as false, and there is not any typedef in enable_if, including type, so the code of the specialization for type T is invalid, but the compiler does not give an error and just ignores the specialization. The call to func is still valid, as the non-template version of func is an exact (T is ‘Base’) or a good (T is derived from ‘Base’) match.

I want to thank Vasily Milanich for making me aware of such techniques and features as SFINAE and boost’s ‘enable_if’ library. If not his comments and suggestions on my previous article I wouldn’t take time and write this one. Thank you.

4 February 2012

Using templates: Common behaviour for hierarchy of classes


This article presents a solution for a small problem, which is not often met in practice, at least in mine. But the solution itself can be interesting and even useful from the point of view of templates and type conversions. So here is the problem statement:

We have a bunch of classes Base, Base2, ..., BaseN. Each of them represents some logical object and has a public interface to make use of that object. Also some of the classes have a hierarchy of subclasses. Let class ‘Base’ inherits classes ‘Sub1’, ‘Sub2’, …, ‘SubN’. ‘Sub1’ inherits ‘Sub11’, ‘Sub12’, …, ‘Sub1M’; ‘Sub2’ inherits ‘Sub21’, ‘Sub22’, …’Sub2L’, and so on. Consider such a hierarchical tree of any size and shape derived from class ‘Base’ and similar hierarchies for each of the rest ‘BaseK’, (K changes from 2 to N) classes. Now we need a function which will do:
  • one thing for several concrete classes derived (not directly in general) from ‘Base’,
  • another thing for the whole hierarchy of ‘Base’ excluding the first group,
  • some other thing for all the remaining classes.


The first thing that comes to head is that the final function should be a template, as it should work for different types. Not affecting the generality of the problem let’s consider that the function only reads information from the objects, ie we can take a const reference as an argument. So for now we think about a function of this prototype:

template <typename T>
void do_something(const T& t);

The most general group is the third one, as it can contain classes of any relationship, so this should be the template function itself:

template <typename T>
void do_something(const T& t)
{
    // do something with t
}

Now a call to ‘do_something’ with an object of any type will do the same thing. To provide a different behavior for the ‘Base’ class, we must write an explicit specialization for the template:

template <>
void do_something(const Base&)
{
    // do another thing for Base type
}

 So now if we pass an object of type ‘Base’, a different code will be executed. But as soon as we pass an object of some other type from second group, ie a class not necessarily directly derived from ‘Base’, again the general template will be called. This is not what we want. So we need to provide a way to call the specialization for ‘Base’ for all its descendants.

In general any class type publicly derived from ‘Base’ is also a ‘Base’. So if we have a function which takes an object of type ‘Base’, it will also take an object of any publicly derived type. eg:

bool is_derived(const Base&)
{
       return true;
}

But we need to call this function for any other type either, so we need to overload it for any type, but we can’t make it a template as all the descendants of ‘Base’ will make their own implicit specializations at the point of instantiation. So the only reasonable way is overloading:

bool is_derived(...)
{
       return false;
}

Now we have a way to check whether the passed in object is of type ‘Base’ or derived from ‘Base’, but how can we make use of it? We still need some way to convert the derived types to the base to pass to the final template function. On top of that, as we need to call a template function, we should know the type of the object at compile time, so the ‘is_derived’ function is not of much use.

Fortunately, there is a way to perform this check at compile-time. It is the sizeof operator, which returns the size of the passed in object. But we don’t know the sizes of all classes, and, moreover, some of them may be equal, which is not convenient in this case. So we again can make use of the same mechanism as ‘is_derived’ used. But we need different return types, more precisely, types with different object sizes.

struct my_type {
       char word[2];
};

char checker(const Base&)
{
       return '0';
}

my_type checker(...)
{
       return my_type();
}

Now the ‘checker’ will return a char for all objects of type ‘Base’ or derived from ‘Base’, and my_type for all the other objects. The size of char is never equal to size of my_type, as the former is always 1, and hence the latter is at least 2. Besides that, the value of the expression sizeof(checker(<some_type>)) is known at compile-time. So we can use it as a template argument. All we have to do is provide a way to treat the objects of types derived from ‘Base’ as ‘Base’ objects. One possible solution is a template class with two arguments - one the object type, and the second the size of the object returned by ‘checker’:

template <typename T, size_t S>
class Executor;

We need two explicit specializations for argument ‘S’, exactly
  • S == 1: when checker return char, ie the object is ‘Base’ or derived from ‘Base’,
  • S == sizeof(my_type): when checker return my_type, ie the object is of other type.


template <typename T>
class Executor<T, sizeof(char)>
{
public:
       Executor(const T& obj) :
              m_obj(static_cast<const Base&>(obj))
       {
       }

       void operator()() {
              do_something(m_obj);
       }

private:
       const Base& m_obj;
};

template <typename T>
class Executor<T, sizeof(my_type)>
{
public:
       Executor(const T& obj) :
              m_obj(obj)
       {
       }

       void operator()() {
              // do something for not-related object types
       }

private:
       const T& m_obj;
};

Now we can redefine the general template ‘do_something’ the following way:

template <typename T>
void do_something(const T& t)
{
       Executor<T, sizeof(checker(t))> e(t);
       e();
}

 And the specialization for ‘Base’ is unchanged:

template <>
void do_something(const Base&)
{
    // do another thing for Base and derived from Base types
}

For the first group - several classes derived from ‘Base’ with different behaviour than ‘Base’, we have to write explicit specializations for each type.

This is just a solution for an intermediate problem, but I hope it is useful for many people who uses templates and C++ in general. Feel free to correct me if something is wrong, or ask questions if something is unclear. Thanks.