C#/.NET Little Wonders: The ReferenceEquals() method

Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here.

Today we’re going to look at a very small, and sometimes helpful static method of the object class. Of course, we know most of the key methods of the object class by heart, especially the ones we tend to override often such as Equals(), GetHashCode(), and ToString().

But there’s one static method which doesn’t tend to get the respect it deserves very often. It’s really quite trivial, yet it can help boost readability and maintainability of code. It’s a very simple method which makes it easy to tell if two references refer to the same object.

Now, this may sound like a trivial feat to you and not one very worthy of attention, but read on and see if you still think so after this discussion.

Overview: the == and != operators

As you probably know, the == and != operators are defined between reference types to check to see if two references refer to the same instance of an object.

Note: For value types (primitives and struct types), == and != operators are disallowed unless they are defined explicitly for the value types being used as arguments. This means that they can be used for all numeric primitives, and for struct types that explicitly overload the operators directly.

So let’s concentrate on == and != between reference types by starting with a quick example. Let’s say you have a simple class representing a Point:

    public class Point
    {
        public int X { get; set; }
        public int Y { get; set; }
    }

And we initialize three references to two Point instances as follows:

    // s1 and s2 refer to two separate instances of "equivalent" objects
    var s1 = new Point { X = 5, Y = 13 };
    var s2 = new Point { X = 5, Y = 13 };

    // s3 and s2 refer to the exact same object
    var s3 = s2;

So given that, we’d expect the following code to output the values indicated in the comments.

    // false, s1 and s2 refer to different instances of Point
    Console.WriteLine(s1 == s2);

    // true, s2 and s3 refer to same instance of Point
    Console.WriteLine(s2 == s3);

And it works as expected! Since s1 and s2 refer to different instances, the result is false, whereas s2 and s3 refer to the exact same instance of a Point so the result of the == operator is true.

This is how the operator == (and, conversely, its opposite operator !=) works by default for reference types. Since all types are derived from object, this means that unless you specify otherwise, this is the behavior you will get for any new class you create.

The wrinkle: operators == and != can be overloaded

As you know, you can overload many operators in C# for your class or struct type. Bear in mind, as always, that operators are overloaded, not overridden. In particular, for this post we are most interested in overloading == and != and its effects.

For example, in the string type, the operators == and != have been overloaded to compare two strings to see if they have the same value, even if they point to different instances:

    var s1 = "There";
    var s2 = "here";

    // T + here = There, but we are avoiding string interning here to prove
    // a point.
    var s3 = 'T' + s2;

    // This is true, because s1 and s3 both hold the value "There"
    Console.WriteLine(s1 == s3);

Now, I did a trick here to make sure that s1 and s3 didn’t refer to the same instance of a string. If I had set s3 directly to “There” as well, the two string literals would have been collapsed into one by the compiler and we’d have two references to the exact same object, which wouldn’t have illustrated the point. Building s3 at runtime prevented the string from being interned, so we have two separate string instances, both with the value “There”.

Note that even though those two string references refer to different instances, they have the same value, and the == operator is overloaded for string operands to compare the string values instead of the references.

The problem: how to compare references if == is overloaded

So what happens if you want to actually compare the references to see if they are the same instance, but the type has overloaded operators == and != so that you can’t use them directly?

Why would we ever want to do this? Well, let’s look at an example with our friend the Point class we implemented at the start of this post. Let’s say that we want to define == and != operators for Point so that it compares the X and Y values instead of the references. To do this we might try coding something like this:

    public static bool operator ==(Point first, Point second)
    {
        // return true if both first and second are same reference, or both null
        if (first == second) return true;

        // if either (but not both due to first check) is null, return false
        if (first == null || second == null) return false;

        // both not null, compare values
        return first.X == second.X && first.Y == second.Y;
    }

Remember that == should return true if both arguments are null, so we first attempt to check first and second to see if they are the same instance (in which case there’s no sense checking values, must be same by definition!) or both null. If that test returns false, then either only one is null, or they are both not null but refer to different instances. We then check to see if either is null. If one is null and one is not, the result of == should be false by definition. Finally, now that we know both are not null and not the same reference, we can compare the values.

Because operator == and != must be defined in pairs for a given type, we must also provide operator !=, but we can easily do this by negating the result from operator == as follows:

    public static bool operator !=(Point first, Point second)
    {
        return !(first == second);
    }

Okay, now we have our pair of operators and we can try them out, so we exercise them in our program as follows:

    var p1 = new Point { X = 5, Y = 13 };
    var p2 = new Point { X = 5, Y = 13 };

    // p1 and p2 are separate instances, but with equivalent values
    Console.WriteLine(p1 == p2);

It all compiles fine, we then run it and after we see our screen hanging for a few moments, we get a StackOverflowException. What happened? Well, generally speaking a stack overflow happens when the call stack gets too deep and the stack runs out of memory. Typically, this is the result of runaway recursion.

Recursion? Where did we do recursion? Well, if you notice we did it in three spots:

    public static bool operator ==(Point first, Point second)
    {
        // *** THIS RECURSIVELY CALLS THIS OPERATOR OVERLOAD FOREVER ***
        if (first == second) return true;

        // *** SO DO THESE TWO ***
        if (first == null || second == null) return false;

        return first.X == second.X && first.Y == second.Y;
    }

Note that because we are using the == operator between two Point references, this will resolve to the operator == overloaded for Point, which is what we are currently defining! This means that for it to resolve operator == it must call operator == which must call operator == which must call operator == and so on for infinity or stack overflow, whichever happens first.

So maybe you get crafty and say, well, I can just use != instead and negate the result! But you’d then end up with the same issue because != would just call == which would call != and so on. So it’s clear that we need a way to get back to the original definition of == and != between object so that it does a strict reference comparison. So how do we do this?

Little Wonder: ReferenceEquals()

The object class has a nice static method on it called ReferenceEquals() which makes it trivial to compare two instances of a reference type to see if they refer to the same object, or if both are null. We can use this to make our operator overload more readable and correct as follows:

    public static bool operator ==(Point first, Point second)
    {
        // return true if both first and second are same reference, or both null
        if (ReferenceEquals(first, second)) return true;

        // if either (but not both due to first check) is null, return false
        if (ReferenceEquals(first, null) || ReferenceEquals(second, null)) return false;

        // both not null, compare values
        return first.X == second.X && first.Y == second.Y;
    }

The call to ReferenceEquals() prevents us from making a recursive call to our own operator == overload on Point and allows us to check the references directly!

Notice, we don’t have to say object.ReferenceEquals() because all types derive from object, hence ReferenceEquals() is always available inside any class without a need to prepend the object type qualifier.
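To see the pieces working together, here is a minimal, self-contained sketch of the Point class with both operators in place. The Equals() and GetHashCode() overrides and the Program class are my additions for illustration and are not part of the original listing; they are included because the compiler warns when == is overloaded without them, and the hash formula is just one arbitrary choice.

```csharp
using System;

public class Point
{
    public int X { get; set; }
    public int Y { get; set; }

    public static bool operator ==(Point first, Point second)
    {
        // Same instance, or both null.
        if (ReferenceEquals(first, second)) return true;

        // Exactly one of the two is null.
        if (ReferenceEquals(first, null) || ReferenceEquals(second, null)) return false;

        // Both non-null and distinct instances: compare values.
        return first.X == second.X && first.Y == second.Y;
    }

    public static bool operator !=(Point first, Point second)
    {
        return !(first == second);
    }

    // Assumption (not in the original post): keep Equals() consistent with ==.
    public override bool Equals(object obj)
    {
        return this == (obj as Point);
    }

    // Assumption (not in the original post): an arbitrary hash over X and Y.
    public override int GetHashCode()
    {
        return (X * 397) ^ Y;
    }
}

public static class Program
{
    public static void Main()
    {
        var p1 = new Point { X = 5, Y = 13 };
        var p2 = new Point { X = 5, Y = 13 };
        Point none = null;

        Console.WriteLine(p1 == p2);                       // True: same values
        Console.WriteLine(object.ReferenceEquals(p1, p2)); // False: different instances
        Console.WriteLine(p1 == none);                     // False: only one is null
        Console.WriteLine(none == null);                   // True: both null
    }
}
```

With this version the comparison of p1 and p2 returns instead of overflowing the stack, and the null cases behave as described above.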

Now, those of you who are crafty (or have access to a decompiler) will notice that ReferenceEquals() is actually defined as something like this:

    public static bool ReferenceEquals(object objA, object objB)
    {
        return objA == objB;
    }

Wait a minute! It’s just invoking operator ==, so why does this work and not our original definition?

Well, the answer is that the two parameters objA and objB are both references to object! Remember that operators are overloaded, not overridden! This means that the == operator used depends on the type of the reference operands, not the type of the objects being referred to. So since objA and objB are both object, it uses the == defined between object which is a reference comparison.
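A small sketch may make this compile-time resolution easier to see. It reuses the string trick from earlier in the post; the variable names o1 and o3 and the Program class are only there for illustration.

```csharp
using System;

public static class Program
{
    public static void Main()
    {
        string s1 = "There";
        string s2 = "here";
        string s3 = 'T' + s2; // built at run time, so a separate instance from s1

        // Both operands are typed as string: string's overloaded == is chosen,
        // so the values are compared.
        Console.WriteLine(s1 == s3);                       // True

        // Same two objects, but both operands are now typed as object:
        // the == defined between object is chosen, a reference comparison.
        object o1 = s1;
        object o3 = s3;
        Console.WriteLine(o1 == o3);                       // False

        // ReferenceEquals() takes object parameters, so it behaves the same way.
        Console.WriteLine(object.ReferenceEquals(s1, s3)); // False
    }
}
```

The objects never change between the three comparisons; only the declared types of the operands do, and that alone decides which == runs.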

Now, of course this means we could have just done this instead:

    public static bool operator ==(Point first, Point second)
    {
        // casting both to object forces the == to be a reference comparison
        if ((object)first == second) return true;

        if ((object)first == null || (object)second == null) return false;

        return first.X == second.X && first.Y == second.Y;
    }

Personally, though, I much prefer ReferenceEquals() because its intention is quite clear. If you perform the object cast, obviously it works. But if you use == and forget the cast, or if a developer who doesn’t understand the nuances of operator overloading comes along, thinks the cast is redundant, and removes it, this can lead to runtime errors (in the case of unintended recursion) or incorrect results (in the case of a value comparison where a reference comparison was intended).

Summary

So remember, when you are defining operator == and != overloads, consider using ReferenceEquals() to make sure the reference comparisons are performed correctly and you are not accidentally generating an infinitely recursive call. Even if you are not defining operator overloads, it can be good to be explicit and use ReferenceEquals() for reference comparisons, so that if an operator == is ever defined on your type at a later point, your reference comparison still performs as intended.
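As a rough illustration of that second point, here is a sketch of a guard clause that wants identity, not equality. The Account and Transfers types and the Transfer() method are hypothetical, invented purely for this example.

```csharp
using System;

// Hypothetical type for illustration; it does not overload == today.
public class Account
{
    public string Id { get; set; }
    public decimal Balance { get; set; }
}

public static class Transfers
{
    // Moves funds between two accounts; returns false for a self-transfer.
    public static bool Transfer(Account from, Account to, decimal amount)
    {
        // Explicit identity check: this stays a reference comparison even if
        // Account later gains a value-based operator ==.
        if (object.ReferenceEquals(from, to)) return false;

        from.Balance -= amount;
        to.Balance += amount;
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        var a = new Account { Id = "A", Balance = 100m };
        var b = new Account { Id = "B", Balance = 50m };

        Console.WriteLine(Transfers.Transfer(a, a, 10m)); // False: same instance
        Console.WriteLine(Transfers.Transfer(a, b, 10m)); // True: distinct instances
    }
}
```

Had the guard been written as from == to, adding a value-based operator == to Account later would silently change which transfers get rejected.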

Is the ReferenceEquals() method revolutionary? Probably not, but it was important enough for the .NET designers to make it a part of object, and thus it deserves consideration. After all, it can make your code safer, and easier to read by making the intention of the comparison very clear (comparing references not values) and thus, it’s a Little Wonder in my book.

This article is part of the GWB Archives. Original Author: James Michael Hare
