Tonight I was made aware of a bug in a library I wrote that implements the cryptographic authentication sequences used by Blizzard Entertainment's Battle.net gaming service. The user reported that his code simply stopped executing; it never occurred to me that he might just be swallowing an exception (particularly if his code was running on a secondary thread). But when I reviewed my code, I saw that all of my loops were deterministic, and although I had a couple of lock { } blocks throughout the code path, none of them were in places where I could end up blocking.
Frustrated, I pulled out the code and wrote a test project using his input data to call the function. I got a NullReferenceException when I called the function he labeled suspect, so then I integrated the actual project into my solution. I found the source of the problem immediately: it was a third-party BigInteger class I'd gotten from CodeProject. This wasn't a *major* surprise - I'd had a few issues with this BigInteger class on another project that dealt with cryptographic calculations. The bigger surprise was where the error was being raised: inside the class's overloaded operator ==.
At this point I'm somewhat puzzled, until I check the values of bi1 and bi2 in the debugger: they're both null! I step back one level in the call stack and find the comparison that invoked the operator.
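The failure is easy to reproduce in isolation. What follows is only a minimal sketch, not the CodeProject code: FakeBigInteger, Repro, and NullComparisonThrows are hypothetical names, and the operator body is an assumption about the general shape of the bug (an overloaded operator == that dereferences bi1 without a null check).

```csharp
using System;

// Hypothetical stand-in for the third-party BigInteger class (not the real code).
public sealed class FakeBigInteger
{
    private readonly long value;

    public FakeBigInteger(long value) { this.value = value; }

    // Assumed shape of the bug: the operator forwards to an instance method,
    // so it dereferences bi1 without checking it for null first.
    public static bool operator ==(FakeBigInteger bi1, FakeBigInteger bi2)
    {
        return bi1.Equals(bi2); // throws NullReferenceException when bi1 is null
    }

    public static bool operator !=(FakeBigInteger bi1, FakeBigInteger bi2)
    {
        return !(bi1 == bi2);
    }

    public override bool Equals(object obj)
    {
        FakeBigInteger other = obj as FakeBigInteger;
        // ReferenceEquals is used here so Equals never re-enters operator ==.
        return !object.ReferenceEquals(other, null) && other.value == value;
    }

    public override int GetHashCode() { return value.GetHashCode(); }
}

public static class Repro
{
    // Returns true if an ordinary-looking null check throws.
    public static bool NullComparisonThrows()
    {
        FakeBigInteger key = null;
        try
        {
            // Reads like a plain null check, but the compiler binds it to the
            // overloaded operator, called with bi1 == null and bi2 == null.
            bool isNull = key == null;
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

The point of the sketch is that the calling code looks completely innocent; the exception comes from the operator the compiler silently substitutes for the built-in reference comparison.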
At this point I'm not entirely sure what I can do. If I modify operator == to compare bi1 and bi2 against null using ==, won't it just call itself and recurse forever? Then it occurs to me that I should modify the calling code to use System.Object.ReferenceEquals instead.
System.Object.ReferenceEquals(objA, objB) returns true if the object references are identical, or if both object references are null.
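Here is a sketch of both ways ReferenceEquals can be applied, under the same caveat as before: SafeBigInteger, Caller, and IsUnset are hypothetical names for illustration, not part of the real library. The caller-side check works without touching the third-party class, and the same call also makes a null-safe operator == possible without the recursion problem, because ReferenceEquals never invokes the overloaded operator.

```csharp
using System;

// Hypothetical stand-in for the third-party class, with a null-safe operator.
public sealed class SafeBigInteger
{
    private readonly long value;

    public SafeBigInteger(long value) { this.value = value; }

    public static bool operator ==(SafeBigInteger bi1, SafeBigInteger bi2)
    {
        // Same object, or both null.
        if (object.ReferenceEquals(bi1, bi2)) return true;
        // Exactly one side is null.
        if (object.ReferenceEquals(bi1, null) || object.ReferenceEquals(bi2, null)) return false;
        // Both non-null: compare by value.
        return bi1.value == bi2.value;
    }

    public static bool operator !=(SafeBigInteger bi1, SafeBigInteger bi2)
    {
        return !(bi1 == bi2);
    }

    public override bool Equals(object obj)
    {
        return this == (obj as SafeBigInteger);
    }

    public override int GetHashCode() { return value.GetHashCode(); }
}

public static class Caller
{
    // Caller-side fix: bypasses any overloaded operator == entirely,
    // so it is safe even against an unpatched class.
    public static bool IsUnset(SafeBigInteger candidate)
    {
        return object.ReferenceEquals(candidate, null);
    }
}
```

Writing `(object)candidate == null` would have the same effect as the call in IsUnset, since casting to object forces the built-in reference comparison, but ReferenceEquals states the intent more plainly.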
This is a good lesson for me: don't overload operator == on reference types.