Terje Sandstrom

------ Visual Studio ALM MVP ----- (also see new blog at http://hermit.no)

By Terje Sandstrøm and Syver Enstad

This article was originally written in 2003 and used internally within our company and for clients of our company.  The text is as it was at that time, and – given the general nature of this topic - should still be useful.

Introduction

There has been a lot of talk about design patterns, but not so much about code patterns, also referred to as idioms. A code pattern describes typical general code for common operations. This article focuses on operator patterns in C++. Not all classes need their own operators, but small classes, which often represent some kind of value, will very often need a set of operators. Further, classes that use dynamic memory (heap allocation) must implement a copy constructor and an assignment operator (in addition to a destructor) if they are going to be copyable (see the Big Three Law, B3L, in the C++ FAQ).

In this article, we will make a catalogue of the typical operator patterns. Many of the patterns are due to other authors, the references can be found at the end. We have put all of them together and also elaborated on some of their implications.

The patterns are shown as combined declarations and implementations, as both are equally important. In many cases the operators are so small that they can safely be implemented in the header file, thus one can make them inline.

The challenge with operators is that they should be implemented as efficiently as possible, concerning both design time and runtime. Bugs caused by errors in operators can be harder to find than normal bugs, because when you use the classes in a program, you don’t see the difference between calling a built-in operator and a user-defined operator, so you tend to overlook them. We often find these bugs when single stepping through the program and then, very much to our surprise, we step into an operator implementation whose existence had been forgotten.

It is also important to note that custom class operators should conform to the interface used by the built-in types. In some cases, the compiler will transfer your definition to one compatible with the internal built-in types. Be aware of this. See for example the operator== below.

A piece of good advice: use operator overloading for syntactic sugar, and please do not stray from the accepted semantics of an operator.

A general pattern observation

Several operators do nearly the same thing, only slightly different. As each piece of code can introduce a bug, it is good advice to only write an “algorithm” once, and then reuse it. It is similar with operator implementation – implement one operator in terms of another. This can in fact go quite a long way, as we show in this article, and it is a powerful technique. You get one more function call, but if speed optimization is highly demanded, you should make the operator methods inline, and the compiler will “remove” your function call anyway. Further, the golden rule we follow is to first make it work, then make it right, and at last make it fast.

As is the case with the Boolean and arithmetic operators below it’s often useful to implement all similar operators in terms of another member function. The Boolean operators are implemented in terms of a compare member function and the arithmetic functions in terms of their corresponding @= (+=, -=, /=) and so on.

Operator patterns table

Table 1 shows the operators discussed, and which of these is implemented directly and which one is implemented in terms of another.

| Operator name | Must be member function | Operator short signature | Implementation |
|---|---|---|---|
| Assignment operator | Yes | Op= | Directly |
| Copy constructor[1] | Yes | | Maybe in terms of assignment operator[2] |
| operator += | No, but it probably should | Op+= | Directly |
| Prefix increment | No, but it probably should | Op++() | In terms of op+= |
| Postfix increment | No, but it probably should | Op++(int) | In terms of prefix increment |
| Addition operator | No, cannot be | Op+ | In terms of Op+= |
| Subscript operators | Yes | Op[] | Directly, or member func |
| Stream operators | No, cannot be | Op<< and Op>> | Directly, or member func in a class hierarchy |
| Equal operator | No, cannot be | Op== | Directly |
| Non-equal operator | No, cannot be | Op!= | In terms of op== or member func |
| Less than | No, cannot be | op< | In terms of member function |
| Less than or equal | No, cannot be | op<= | In terms of member function |
| Greater than or equal | No, cannot be | op>= | In terms of member function |
| Greater than | No, cannot be | op> | In terms of member function |
| dereferencing operator | Yes | op* | Directly |
| dereferencing member selection operator | Yes | op-> | Directly |
| Conversion operator | Yes | operator T | Directly |
| Function call operator | Yes | operator() | Directly |

The operator patterns

Introduction

All patterns are described with pseudo code. The term C denotes the Class name of your class. The term X may denote a secondary class, which also may be equal to C. Member variables are just denoted as m_x. You should substitute your own members wherever you see this. We have kept the explanations rather short, so that this is more like a reference guide than a tutorial. For further discussions, you should look up the references.

We describe the patterns with the non-dependent operator first, and then follow with the operators that depend on the former.

Member or non member

Generally, operators that don’t modify any of their arguments should be implemented as free functions instead of member functions. Operators that modify the first argument often make for shorter and more readable code if implemented as member functions, and in some cases, like operator=, it’s not possible to implement them as non-members. Some operators have to be implemented as non-members. An example is the streaming operators, where it’s the second parameter that decides which operator to use, and you generally don’t want to modify the iostream classes each time you want a new implementation of the streaming operators, which is the only alternative if you’re going to implement them as member functions. All operators that should work symmetrically, that is, where a op b can also be written as b op a with a built-in value on the left, must be implemented as non-members. This holds for many of the arithmetic operators and for the comparison operators. You must be able to say bool b = 5>a; This is valid, and should be allowed, but demands a non-member operator>.

Difference in signature between member and non-member versions:

The non-member version of an operator will have an extra left-most parameter since it is not connected to any object.

Two typical binary operator signatures:

Member binary operator:

RetType C::operatorX(const ParamType& other);

Non-member binary operator:

RetType operatorX(const ParamType& lhs, const ParamType& rhs);
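As a rough illustration of why symmetric operators end up as non-members, the sketch below uses a hypothetical class Meters (our own name, not part of the patterns) with a non-explicit constructor from int. Because operator> is a free function, the compiler may convert either operand, so both a > 5 and 5 > a compile.

```cpp
// Hypothetical value class, used only for illustration.
class Meters
{
public:
    Meters(int value) : m_value(value) {}   // non-explicit: allows int -> Meters
    int value() const { return m_value; }
private:
    int m_value;
};

// Non-member: both operands are ordinary parameters, so either one
// may be produced by the implicit int -> Meters conversion.
inline bool operator>(const Meters& lhs, const Meters& rhs)
{
    return lhs.value() > rhs.value();
}

// Helper that puts the built-in value on the left-hand side.
inline bool fiveIsGreaterThan(const Meters& a)
{
    return 5 > a;   // 5 becomes Meters(5); this needs the non-member form
}
```

Had operator> been a member of Meters, the expression 5 > a would not compile, since no conversion is applied to the object a member function is called on.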

Operators for value types

The following operators generally make most sense for value types, or concrete types as they are also called (Str97). These types are generally very close to the built-in types of the language, and are implemented as classes where the concept of identity is generally irrelevant. Inheritance and virtual functions are seldom used in implementing such classes. This is in contrast to reference types, whose objects are generally heap allocated and operated on through virtual functions.

The assignment operator, Op=

The assignment operator pattern is first shown for a single class, that is:

No inheritance
C& C::operator=(const C& other)
{
   if (this!=&other)
   {
       // copy members
       x = other.x; // and so on
   }
   return *this;
}

The assignment operator should always return a reference. That way it is possible to write code like a=b=c;

The first check verifies whether we’re doing a=a, which is perfectly legal and which in fact happens when programmers use arrays, typedefs and other mechanisms that obscure the simple fact that they are doing an a=a operation. In that case, we skip the rest of the assignment procedure.

Also, note that if the class contains dynamic memory, we have some options regarding deep or shallow copying of those blocks, which is not discussed in this article.

Inheritance

If class C inherits from class Base, then the pattern is as follows:

C& C::operator=(const C& other)
{
   if (this!=&other)
   { 
      // copy base
      Base::operator=(other);
      // copy members
      x = other.x; // and so on
   }
   return *this;
}

Note that the only change is the addition of the call to the base class assignment operator just before we start copying our own member variables. Note also that in this case we use the operator= function name of the assignment operator for the base class.
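The two assignment patterns can be tried out with a small sketch like the one below. The classes Base and Derived and their members m_id and m_name are hypothetical and only serve the illustration.

```cpp
#include <string>

// Hypothetical base class following the "no inheritance" pattern.
class Base
{
public:
    Base() : m_id(0) {}
    explicit Base(int id) : m_id(id) {}
    Base& operator=(const Base& other)
    {
        if (this != &other)
        {
            m_id = other.m_id;
        }
        return *this;
    }
    int id() const { return m_id; }
private:
    int m_id;
};

// Hypothetical derived class following the "inheritance" pattern.
class Derived : public Base
{
public:
    Derived() {}
    Derived(int id, const std::string& name) : Base(id), m_name(name) {}
    Derived& operator=(const Derived& other)
    {
        if (this != &other)
        {
            Base::operator=(other);   // copy the base part first
            m_name = other.m_name;    // then our own members
        }
        return *this;
    }
    const std::string& name() const { return m_name; }
private:
    std::string m_name;
};
```

Since both operators return a reference to the object assigned to, chained assignments such as c = b = a work, and assigning an object to itself through an alias leaves it unchanged.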

Diversion

Observe that the parameter is declared const. This has a special effect on any class that aggregates an instance of your class.

Assume you declare a class A like this:

class A
{
   private:
   C c;
   public:
   // a lot of methods ......
};

In addition, in your code elsewhere you have something like

A a1,a2;

.... lots of code

a1 = a2;

Now when the compiler sees the assignment of a2 to a1 it will generate a default assignment operator for class A. Then comes the trap. When doing so, it will look at the member variables, and look at their assignment operators. If they ALL have assignment operators following the pattern above, with a const parameter, it will generate a similar assignment operator. If, however, one of the members (e.g. the member c of your class C) has an assignment operator where the parameter is NOT declared const, the assignment operator for class A will have a non-const parameter too.

If you think this is no big deal, just try to make a vector of A, like in

std::vector<A> va;

And look at all the nice error messages popping up from deep inside the vector template code. The reason is that the vector template copies its elements from const references, so it demands that the element type can be assigned from a const object. The vector must own its content.
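The trap can be made visible without wading through template error messages. The sketch below assumes a C++11 compiler and uses the type traits from <type_traits>, which are newer than this article; the classes CFixed and AFixed are hypothetical counterparts that follow the recommended pattern.

```cpp
#include <type_traits>

// C's assignment operator takes a NON-const reference (the trap).
class C
{
public:
    C() : x(0) {}
    C& operator=(C& other) { x = other.x; return *this; }
private:
    int x;
};

// A aggregates C and relies on the compiler-generated assignment operator.
class A
{
private:
    C c;
};

// Hypothetical counterparts where the member uses a const parameter.
class CFixed
{
public:
    CFixed& operator=(const CFixed&) { return *this; }
};

class AFixed
{
private:
    CFixed c;
};

// The generated A::operator= takes A&, so a const A cannot be assigned from.
static_assert(!std::is_copy_assignable<A>::value, "A is not assignable from const A");
static_assert(std::is_copy_assignable<AFixed>::value, "AFixed is assignable from const AFixed");
```

Assignment between two non-const A objects still compiles, which is why the problem tends to stay hidden until the class is used with a container or through a const reference.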

Copy Constructor

Although this is not an operator, we have included this special constructor here, because it is part of the B3L (Big Three Law), and with some caveats, it may be implemented in terms of the assignment operator. Let’s start with what a copy constructor should do. What we need for the implementation of the copy constructor is first default construction and then assignment. This is illustrated by the following example:

void Func(const Obj& b)
{
   Obj a;
   a = b;
}

should mean the same as:

void Func(const Obj& b)
{
   Obj a(b);
}

What is crucial here is that the precondition for calling the assignment operator is a fully constructed object, like one created by the default constructor. Following this line of thought, the ideal thing would be to implement the copy ctor something like this (pseudo code):

Obj(const Obj& other)
{
   CallDefaultCtor();  
   *this = other;
}

It is not possible to call the default ctor like this in C++, so we will have to settle for less. The first idea might be to put all initialization code into a member function, but functions cannot be called from the initializer list so this will not work for members who have to be initialized there (members with no default ctor). The general solution is therefore to duplicate the code from the default ctor to ensure that the object is in a valid state before calling the assignment operator. The class below illustrates this case:

 

class C
{
   private:
   int X;       
   const int Y;
   public:
   C() : X(0), Y(0) {};
   C(int Y) : X(0), Y(Y) {};
   C& operator=(const C& other)
   {
     if (this!=&other)
     {
        X = other.X;
     }
     return *this;
   }
   C(const C& other) : X(0), Y(other.Y) // same initial state as the default ctor
   { 
      *this=other; 
   }
};

In many cases the benefit gained from sharing implementation between operator= and copy ctor will be so little, and/or efficiency constraints will preclude it, so that it may be better to just implement the copy ctor directly.

NB: No matter how you define your copy ctor, remember this: It should always have the same result as default construction followed by assignment.

Arithmetic operators

The arithmetic operators are all implemented in terms of each other. The patterns are similar for the addition, subtraction, multiplication and division operators, with the same dependencies. For the sake of saving some space, we only show the addition operator patterns here. For the subtraction patterns, for example, just replace each + with a - and you should be doing fine.

Operator +=

This operator is the workhorse when it comes to arithmetic operators, and should always be implemented first when you need to do arithmetic on a class.

The pattern is:

C& C::operator+=(const X& other)
{
   x += other.x;
   return *this;
}

Addition operator

The addition operator should be implemented in terms of the op+=. The pattern for addition of an element is shown below.

const C operator+(const C& a, const X& b)
{
   C result = a; // copy-initialization: calls the copy constructor
   result += b;
   return result; // Note return value optimization[3]
}

The pattern is declared as a free function rather than a member of C. This is necessary in order to do operations with constants on the left-hand side, for example c = 4+a; where the number 4 can in no way have a method called operator+.

The const return value stops you from writing code like:

(a+b)=c;

Also, observe that the internal construction of the result object will have reduced negative effect due to the return value optimisation C++ compilers do. The compiler will optimise away this construction while looking at the calling code (see Scott Meyers, More Effective C++, item 20). If your code is

C a = 4;

C b = 6;

C x = a+b;

Then the compiler will combine the internal construction in the operator+ and the construction in the calling code, of x, leaving you with only one construction.

Also, note that the declaration C result = a; is an initialization, not an assignment. Despite the = sign, the compiler calls the copy constructor, exactly as it does for C result(a); and no default construction followed by assignment takes place.
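Putting operator+= and operator+ together, a minimal sketch could look like this. The class Money and its member m_cents are hypothetical names chosen for the example; the constructor is deliberately non-explicit so that 4 + a works.

```cpp
// Hypothetical value class, used only for illustration.
class Money
{
public:
    Money(int cents = 0) : m_cents(cents) {}   // non-explicit on purpose
    Money& operator+=(const Money& other)      // the workhorse
    {
        m_cents += other.m_cents;
        return *this;
    }
    int cents() const { return m_cents; }
private:
    int m_cents;
};

// Non-member, written in terms of operator+=.
inline const Money operator+(const Money& a, const Money& b)
{
    Money result(a);   // copy construction
    result += b;
    return result;
}
```

With this in place, Money b = 4 + a; compiles because the literal 4 is converted to Money before the free function is called, and neither operand is modified.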

Prefix increment operator

There are two increment operators, one for prefix operations and another for postfix operations. We start with the prefix operator pattern, which is implemented in terms of the op+=. The postfix operator is implemented in terms of the prefix operator.

C& C::operator++()
{
   *this += 1; // Implemented in terms of op+=() 
   return *this;
}

It is natural to implement this in terms of the op+= because it is only a special case of that, which increments by one. The op+= increments by any value given.

Postfix increment operator

The postfix operator pattern has a strange signature. The pattern is:

C C::operator++(int)
{
   C old = *this; // copy of the old value (copy constructor)
   ++(*this); // Inc in terms of prefix oper++
   return old;
}

Note that a parameter with only the type int is used. This is a speciality just for telling the compiler that this signature is a postfix operator, and not a prefix operator. Without the int, there would be no difference between the two signatures. One can wonder why the standards committee chose this rather than an explicit keyword that would be clearer to read, like

C operator++() postfix

But I assume some old C guys are still present (pun indented!).

Also, observe that the postfix increment operator will return a copy of the old value. This is the reason why you should opt for the prefix increment operator, and only use the postfix if you need this behaviour. If you use the postfix inadvertently, you will have a useless object construction on your conscience.

A typical efficiency “error” many people make, especially old C-coders who are more used to the postfix than the prefix is in loops:

for (C i=0; i<N; i++)
{
   …..
}

Now for each iteration of the loop, a copy of ‘i’ is created and thrown away. The better way is thus:

for (C i=0; i<N; ++i)
{
   …..
}

Actually, this only holds true for user-defined types. For built-in types, postfix is as efficient as prefix. But it’s a good rule anyway.
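The cost of the postfix form can be observed with a class that counts its own copies. The sketch below is only an illustration; the class Counter and its static member copies are hypothetical.

```cpp
// Hypothetical counter that records how often it is copied.
class Counter
{
public:
    static int copies;   // number of copy constructions so far

    Counter(int value = 0) : m_value(value) {}
    Counter(const Counter& other) : m_value(other.m_value) { ++copies; }

    Counter& operator+=(int n) { m_value += n; return *this; }

    Counter& operator++()          // prefix, in terms of op+=
    {
        *this += 1;
        return *this;
    }
    Counter operator++(int)        // postfix, in terms of prefix
    {
        Counter old(*this);        // the extra copy
        ++(*this);
        return old;
    }
    int value() const { return m_value; }
private:
    int m_value;
};

int Counter::copies = 0;
```

Incrementing with ++i leaves Counter::copies untouched, while every i++ adds at least one copy. Whether the return statement adds a second copy depends on whether the compiler applies the return value optimisation.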

Equal operator

The equal operator has a binary and symmetrical form, and is thus defined as a free function, or as a friend function if it needs access to the private representation. Both operands should preferably, or normally, be of the same type. You may of course define operator== with different types, which has been done in the standard library for the string class, where three overloaded operator==’s exist. If you do that, implement two of those in terms of the first. If you also have an operator=, it is important that the result of applying operator= to an object makes operator== return true for the two objects afterwards, or it will get people very confused.

class C
{
   public:
   friend bool operator==(const C& c1,const C& c2);
};

and implemented as

bool operator==( const C& c1,const C& c2)
{
   return c1.x==c2.x;
}

Equal operator for STL vector of pointers

If you have made yourself a vector of pointers, and you would like to use the find template algorithm on that vector, you will need a special operator== in your class.

class C
{
   public:
   friend bool operator==(const C* pc1,const C& c2);
};

and implemented as

bool operator==( const C* pc1,const C& c2)
{
   return pc1->x==c2.x;
}

and you call the find template as

vector<C*> v;
// code to fill the vector
C c; // object to look for
vector<C*>::iterator iter = find(v.begin(),v.end(),c);

Note that the find call uses the object to find as a reference argument, not a pointer as one could assume. The call is in fact equal to the call used for vector of objects (vector<C>), but a vector of objects uses only the standard operator==.
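A complete, compilable variant of this pattern might look as follows. The constructor taking an int and the helper indexOf are assumptions added so the sketch can be run on its own.

```cpp
#include <algorithm>
#include <vector>

class C
{
public:
    explicit C(int value) : x(value) {}
    // Compares a pointer-to-C (the vector element) with a C (the search value).
    friend bool operator==(const C* pc1, const C& c2)
    {
        return pc1->x == c2.x;
    }
private:
    int x;
};

// Hypothetical helper: index of the first element equal to 'wanted', or -1.
inline int indexOf(const std::vector<C*>& v, const C& wanted)
{
    std::vector<C*>::const_iterator iter = std::find(v.begin(), v.end(), wanted);
    return iter == v.end() ? -1 : static_cast<int>(iter - v.begin());
}
```

std::find compares each dereferenced iterator, here a C*, with the value passed in, which is what selects the mixed pointer/object operator==.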

Non-equal operator

The non-equal operator follows the same rule as the equal operator, and is also implemented in terms of that operator.

class C
{
   public:
   friend bool operator!=(const C& c1,const C& c2);
};

and should be implemented as

bool operator!=( const C& c1,const C& c2)
{
   return !(c1==c2);
}

When you have only one member, this may seem a waste, but with multiple members, you save a lot of typing, and also a lot of possibilities for making errors.

A compare method and all the Boolean operators

If you need to implement all the Boolean operators it is often best to implement them in terms of a function compare that returns 0 for equality, < 0 for less and > 0 for greater.

class C {
   public:
   // Compare probably needs access to the representation of C
   // const, because the operators call it on const references
   virtual int compare(const C& other) const;
   .....
};

The reason for the compare method to be virtual is that in case of derivations, you won’t need to make new Boolean operators for the derived classes. All you need to do is to implement the new compare method in the derived class.
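What the body of such a compare method could look like is sketched here for a hypothetical class Date with two members, ordered by year and then by month. Only two of the six operators are included to keep the sketch short; the others follow the same form.

```cpp
// Hypothetical class with two members; compare orders by year, then month.
class Date
{
public:
    Date(int year, int month) : m_year(year), m_month(month) {}
    virtual ~Date() {}

    // Returns <0, 0 or >0. Declared const so that it can be called
    // on the const references the operators receive.
    virtual int compare(const Date& other) const
    {
        if (m_year != other.m_year)
            return m_year < other.m_year ? -1 : 1;
        if (m_month != other.m_month)
            return m_month < other.m_month ? -1 : 1;
        return 0;
    }
private:
    int m_year;
    int m_month;
};

inline bool operator<(const Date& lhs, const Date& rhs)
{
    return lhs.compare(rhs) < 0;
}

inline bool operator==(const Date& lhs, const Date& rhs)
{
    return lhs.compare(rhs) == 0;
}
```

The member-by-member comparison is written once, in compare, so adding a member later means touching one function instead of six operators.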

bool operator==(const C& lhs, const C& rhs)
{
   return lhs.compare(rhs) == 0;
}

bool operator!=(const C& lhs, const C& rhs)
{
   return lhs.compare(rhs) != 0;
}

bool operator<(const C& lhs, const C& rhs)
{
   return lhs.compare(rhs) < 0;
}

bool operator>(const C& lhs, const C& rhs)
{
   return lhs.compare(rhs) > 0;
}

bool operator<=(const C& lhs, const C& rhs)
{
   return lhs.compare(rhs) <= 0;
}

bool operator>=(const C& lhs, const C& rhs)
{
   return lhs.compare(rhs) >= 0;
}

Array operators

The array operator, or subscript operator, should always be defined as both a const and a non-const version. This operator can be found at both the left and the right side of an assignment.

When you need code for:

C c;
c[5] = 8;

The following operator should be implemented

X& C::operator[](int position) // X is the element type
{
   return m_data[position];
}

However, when you only need to read out the value,

C c;
X x = c[5];

The following operator pattern should be used:

const X& C::operator[](int position) const
{
   return m_data[position];
}
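Both versions side by side, in a sketch that can be compiled as it stands. The class IntArray, its fixed size of 10 and the helper readAt are hypothetical; like the pattern itself, the sketch does no bounds checking.

```cpp
#include <cstddef>

// Hypothetical fixed-size container of int.
class IntArray
{
public:
    static const std::size_t Size = 10;

    IntArray()
    {
        for (std::size_t i = 0; i < Size; ++i)
            m_data[i] = 0;
    }
    int& operator[](std::size_t position)               // for writing: a[5] = 8;
    {
        return m_data[position];
    }
    const int& operator[](std::size_t position) const   // for const objects
    {
        return m_data[position];
    }
private:
    int m_data[Size];
};

// Reads through a const reference, so the const overload is selected.
inline int readAt(const IntArray& a, std::size_t position)
{
    return a[position];
}
```

Without the const overload, readAt would not compile, because a non-const member function cannot be called on a const IntArray.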

Conversion operators

class Rational
{ 
   public:
   Rational(int num=0,int denom=1) : num(num), denom(denom) {}
   operator double() const
   {
      return double(num)/double(denom);
   }
   private:
   int num;
   int denom;
};

Conversion operators are very often a problem, because they are often called implicitly, and often when you don’t want them to be called. They are especially deadly in combination with a one-argument non-explicit constructor for the same type. The result of implementing conversion operators is often that the programmer must explicitly cast and jump through hoops to get the wanted behaviour, so most times it’s better to just make a member function that returns the required representation. Note also that the example above doesn’t check for a possible zero value of denom. How a zero denom is handled depends on the platform; because the division is done in floating point, many implementations produce an infinity or NaN instead of raising an exception.

Tip: Observe that the conversion operator does not declare a return type, much like a constructor.[SE1]
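The member function alternative mentioned above could be sketched as follows. The name toDouble and the zero-denominator check in the constructor are assumptions made for this example, not part of the original Rational class.

```cpp
#include <stdexcept>

class Rational
{
public:
    Rational(int num = 0, int denom = 1) : m_num(num), m_denom(denom)
    {
        // Assumed policy for this sketch: reject a zero denominator up front.
        if (denom == 0)
            throw std::invalid_argument("Rational: zero denominator");
    }
    // Named conversion: never invoked implicitly by the compiler.
    double toDouble() const
    {
        return static_cast<double>(m_num) / static_cast<double>(m_denom);
    }
private:
    int m_num;
    int m_denom;
};
```

The caller now has to write r.toDouble() explicitly, so the conversion cannot sneak into an expression unnoticed.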

A special case for the conversion operator is when you want the object to signal whether it is in a good or bad state, for use in, for example, Boolean expressions. Implementing this as an operator bool conversion would seem the obvious choice, but it should rather be implemented as follows:

class C {
   public:
   operator const void*() const
   {
      return this->valid() ? this : 0;
   }
   ....
};

If this had been implemented as a bool conversion operator, the result could be used in other constructs than checking the object’s state. The code above can only be used for checking the validity of the object by comparing the result to 0. [4]

Miscellaneous operators

Streaming operators

You should at least implement the put to operator (“<<”) for your classes, as it at the least is very helpful for printing debug messages.

Stream out
friend ostream& operator<<(ostream& out, const C& c)
{
   out << c.m_n1 << ' ' << c.m_n2; // separator, so operator>> can read the values back
   return out;
}

Stream in
friend istream& operator>>(istream& in, C& c)
{
   in >> c.m_n1 >> c.m_n2;
   return in;
}

These two operators should be declared friends of the class they work on if they need access to the representation of the object (parameter c above).
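A round trip through both operators can be sketched like this. The constructor, the operator== and the helper roundTrip are assumptions added to make the example testable; a space is written between the two members so that they can be read back.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

class C
{
public:
    explicit C(int n1 = 0, int n2 = 0) : m_n1(n1), m_n2(n2) {}

    friend std::ostream& operator<<(std::ostream& out, const C& c)
    {
        out << c.m_n1 << ' ' << c.m_n2;   // separator keeps the values apart
        return out;
    }
    friend std::istream& operator>>(std::istream& in, C& c)
    {
        in >> c.m_n1 >> c.m_n2;
        return in;
    }
    friend bool operator==(const C& a, const C& b)
    {
        return a.m_n1 == b.m_n1 && a.m_n2 == b.m_n2;
    }
private:
    int m_n1;
    int m_n2;
};

// Hypothetical helper: writes 'original' to a string and reads it back.
inline C roundTrip(const C& original)
{
    std::ostringstream out;
    out << original;
    std::istringstream in(out.str());
    C copy;
    in >> copy;
    return copy;
}
```

Because both operators return the stream they were given, they can be chained in the usual way, as in out << c1 << c2.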

Streaming a class hierarchy:

Define a virtual function print in the base class of the hierarchy:

class Base {
   public:
   virtual ostream& print(ostream& os) const = 0;
   ....
};

Implement the ostream operator in terms of the print member function

ostream& operator<<(ostream& os, const Base& object)
{
   return object.print(os);
}

It does not seem particularly useful to implement a similar scheme for the get from operator.
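The scheme for the put to operator can be tried out with a small hierarchy. The derived classes Circle and Square and the helper toText are hypothetical; note that print is declared const so that it can be called through the const reference the operator receives.

```cpp
#include <ostream>
#include <sstream>
#include <string>

class Base
{
public:
    virtual ~Base() {}
    virtual std::ostream& print(std::ostream& os) const = 0;
};

// One operator<< for the whole hierarchy.
inline std::ostream& operator<<(std::ostream& os, const Base& object)
{
    return object.print(os);   // virtual dispatch picks the derived version
}

// Two hypothetical derived classes.
class Circle : public Base
{
public:
    std::ostream& print(std::ostream& os) const { return os << "Circle"; }
};

class Square : public Base
{
public:
    std::ostream& print(std::ostream& os) const { return os << "Square"; }
};

// Hypothetical helper: formats any object of the hierarchy as text.
inline std::string toText(const Base& object)
{
    std::ostringstream out;
    out << object;
    return out.str();
}
```

A new derived class only has to override print; the operator itself never changes.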

Dereference and member selection operator

The dereference and member selection operators are typically used to implement classes whose instances behave like pointers.

class Obj {
   public:
   void doIt() const; // const, so it can also be called through a const PtrToObj
   .....
};

class PtrToObj 
{
   private:
   Obj* m_pObj;
   public:
   ........
   Obj& operator*()
   {
      return *m_pObj;
   }
   Obj* operator->()
   {
      return m_pObj;
   }
}; 

This enables clients to use an instance of PtrToObj like it actually was an Obj pointer.

Obj* pObj = new ......

PtrToObj p(pObj);

p->doIt();

(*p).doIt();

Be aware that there are many ways to implement so-called smart pointers. Some are designed to be passed by value (value objects), where you write the copy constructor, assignment operator and destructor for the smart pointer to cater for features such as ownership transfer and reference counting. Another way is to disallow copying, and rely on passing by reference down the stack. See the article by B. Milewski in the references for details of the design and use of smart pointers.

You need to implement the const versions if you are going to pass the PtrToObj object by const reference. Like this:

void func(const PtrToObj& ptr)
{
   // Compile error if the const version of PtrToObj::operator-> is not defined.
   // doIt() must also be a const member, since that version returns a const Obj*.
   ptr->doIt();
}

Here are the implementations of the const versions of the dereferencing operators:

class PtrToObj
{
   ……..
   const Obj& operator*() const
   {
      return *m_pObj;
   }
   const Obj* operator->() const
   {
      return m_pObj;
   }
   ……
};
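A self-contained sketch of the non-copyable flavour, with both the const and the non-const operators, is given here. The names Widget, WidgetPtr and callsSeenThrough are hypothetical, and the ownership policy (the pointer deletes its object) is an assumption made for the example.

```cpp
// Hypothetical pointee with one mutating and one const member function.
class Widget
{
public:
    Widget() : m_calls(0) {}
    void doIt() { ++m_calls; }
    int calls() const { return m_calls; }
private:
    int m_calls;
};

// Owns the Widget it points to; copying is disallowed, so pass it by reference.
class WidgetPtr
{
public:
    explicit WidgetPtr(Widget* p) : m_p(p) {}
    ~WidgetPtr() { delete m_p; }

    Widget& operator*() { return *m_p; }
    Widget* operator->() { return m_p; }
    const Widget& operator*() const { return *m_p; }
    const Widget* operator->() const { return m_p; }
private:
    WidgetPtr(const WidgetPtr&);              // declared, not implemented
    WidgetPtr& operator=(const WidgetPtr&);   // declared, not implemented
    Widget* m_p;
};

// Through a const WidgetPtr only const members of Widget are reachable.
inline int callsSeenThrough(const WidgetPtr& ptr)
{
    return ptr->calls();
}
```

With this design, constness of the smart pointer carries over to the object it points to, which is stricter than a raw pointer, where a const pointer may still point to a modifiable object.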

AddressOf operator

Also sometimes used with a smart pointer, implemented like this:

class PtrToObj 
{
   public:
   Obj** operator&()
   {
      return &m_pObj;
   }
   .....
}; 

Be aware that overloading the address operator breaks the identity check in the assignment operator (operator=). This may or may not be a problem.

Overriding the AddressOf operator is generally not seen as a good idea.

FunctionCall operator

This operator becomes very useful when extending or using the STL, because the STL algorithms often take a function-like object as a parameter. If you implement your custom function-like object as a class (a so-called functor) instead of a function, you can save and/or accumulate state throughout the algorithm.

class C 
{
   public:
   void operator()(int, int);
   .....
}; 
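As an illustration of accumulating state, the sketch below passes a functor to std::for_each. The class Accumulator and the helper summarize are hypothetical names.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical functor that accumulates a sum and a count while it is applied.
class Accumulator
{
public:
    Accumulator() : m_sum(0), m_count(0) {}
    void operator()(int value)
    {
        m_sum += value;
        ++m_count;
    }
    int sum() const { return m_sum; }
    int count() const { return m_count; }
private:
    int m_sum;
    int m_count;
};

// std::for_each takes the functor by value and returns it after the loop,
// which is how the accumulated state gets back to the caller.
inline Accumulator summarize(const std::vector<int>& values)
{
    return std::for_each(values.begin(), values.end(), Accumulator());
}
```

A plain function could not carry m_sum and m_count from one call to the next without resorting to global or static variables.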

Operators &&, ||, and ,

We have NO patterns for these operators since you should not overload them. See Scott Meyers, More Effective C++, item 7, for an explanation.

Conclusion

This article has laid out a series of operator patterns. By using these you can eliminate some of the most common bugs occurring with operators, and perhaps more important, you have one place to look for the patterns! Which is the reason we wrote it in the first place.

References:

The C++ Programming Language, 3rd ed., Bjarne Stroustrup, 1997, Addison-Wesley

Exceptional C++, Herb Sutter, 2000, Addison-Wesley

Effective C++, 2nd ed., Scott Meyers, 1997, Addison-Wesley

More Effective C++, Scott Meyers, 1996, Addison-Wesley

Online: C++ FAQs, or book: M.P. Cline and G.A. Lomow, 1998, Addison-Wesley

The ANSI/ISO C++ Professional Programmer's Handbook, Danny Kalev, 1999, QUE

Large-Scale C++ Software Design, John Lakos, 1996, Addison-Wesley

Resource Management, Bartosz Milewski, http://www.relisoft.com/resource/resmain.html

Terje Sandstrøm is Senior Software Architect at Osiris Data AS. He has an M.Sc. in physics from the University of Oslo, and has been working with programming and program design since 1980, in a variety of languages, but C++ is still the favourite.

(2013 : Currently at Inmeta Consulting AS ,  for updated information on Terje , see http://about.me/Terjes)

Syver Enstad is a Software Developer at Osiris Data AS. He has a B.Sc. in Computer Science from the University of Trondheim, and has been working with programming since 1998. He enjoys both C++ and C#, and he also enjoys programming in Python and Smalltalk.

(2013 : Currently programmer at Tandberg ASA)


[1] Ok, ok, not exactly an operator, but it is part of the B3L, and is implemented in terms of op=, thus we add it here.

[2] As long as the assignment operator is implemented directly

[3] Before 1996 a named object would not be eligible for return value optimisation but after that date, the standard declares that both named and unnamed objects are eligible and newer compilers should do this optimisation.

[4] Lakos, page 649-650.


[SE1] This is true under Win32, but not necessarily for C++ as such.

posted on Monday, January 13, 2014 5:16 PM