Justin a.k.a. The Code Monkey

Code Monkey [kohd muhng'-kee] n : 1. Creature known for its ability to transform caffeine into code. 2. Justin Jones

Sunday, December 21, 2008

To var or not to var, that is the question

I started this blog back in September with a particular purpose in mind.  Every yahoo and his brother has a blog these days, and by far the majority of them are absolute trash, but every so often there's a gem.  As developers, we seem to mostly agree on which ones are the gems.  Non-developers most likely have different lists, depending on their focus.  There's a long list of blogs I love to read, and I couldn't possibly hope to be counted among them, but one day I realized something.  There are a lot of things I know that many others don't.  Sure, somebody somewhere knows that particular little piece of knowledge, but they may not have a blog of their own.  Fairly often when googling a problem, I can't find just the answer I'm looking for.  Somebody somewhere knows the answer, but they didn't blog it.  That's when I saw the value of starting my own blog, so I created The Code Monkey (inside joke) for just that purpose.  It's a place to post those little tidbits of knowledge I've acquired that not everybody else knows.  Then life caught up with me and I got busy.  That's why I haven't posted since September.

There's been this little voice in the back of my head bugging me for weeks now.  I'm pretty sure it's Jeff Atwood.  I have a lot of respect for Jeff, and he talked about how difficult it is to maintain a blog in a DotNetRocks interview last year. In it he talked about how he made a decision to write X many times per week.  While that's very admirable, I think I'm going to have to set my sights a little lower to start with.  I'll aim for one post every two weeks or so.  Why am I telling you this?  So that you'll mock me if I fail to keep it up. 

Peer pressure, you can make it work for you too.

Just to top it off, a friend of mine challenged me and another friend to write two posts in two weeks.  Whoever fails has to buy the beer.  That's pretty motivating.  If you blog and have friends who blog, you should try it out.  Unfortunately I think I've already missed the two-week deadline, but I can still beat the other guy and get free beer.  Helping out your fellow developer and getting free beer at the same time, you can't beat that.

So began my search for a blog topic.  Turns out it found me.  A couple of months back I was working for Magenic in Minnesota.  Check out the link, I was excited because I got to help a friend debug some of the Silverlight on the front page.  Anyway, Rocky Lhotka, creator of CSLA and Microsoft Software Legend for those who don't know, posted a question on the internal forum server which got me thinking.  I don't remember if I put my $.02 in or not, but out of curiosity I asked the same question on the forum server at my new job.  The results surprised me a bit, but they probably shouldn't have. 

The question was "Should var be used for variable declarations outside of using LINQ?"  This is a surprisingly polarizing question, and it's shown up in a few other places recently.  One of the managers at ReSharper, Ilya Ryzhenkov, had this to say:

"Some cases where it seems just fine to suggest var are:

  1. New object creation expression: var dictionary = new Dictionary<int, string>();
  2. Cast expression: var element = (IElement)obj;
  3. Safe Cast expression: var element = obj as IElement;
  4. Generic method call with explicit type arguments, when return type is generic: var manager = serviceProvider.GetService<IManager>()
  5. Generic static method call or property with explicit type arguments, when return type is generic: var manager = Singleton<Manager>.Instance;"

I have to admit, ReSharper was a big influence in overcoming my aversion to the "var" keyword.  It's no secret that I'm a fan of ReSharper and think that it helps standardize and improve code.  It doesn't hurt that each subsequent version of Visual Studio implements a lot of the functionality that ReSharper already provided, forcing them to work hard to stay ahead of the curve to justify purchasing it. 

The main argument against the var keyword is that it reduces code readability.  Looking at the five listed scenarios above, I don't see how it reduces readability.  Take example one.  Is it really necessary to say "Dictionary<int, string>" twice?  I don't think so.  The same is true for 2-5.  Each example already states the type quite clearly, and var allows you to say it once. 
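To check that nothing is lost in the first few scenarios, here is a small sketch.  IElement, Element and TypeProbe are hypothetical names made up for this illustration.  The StaticTypeOf helper relies on generic type inference to report the compile-time type of a variable, which is what var actually fixes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types, defined only so the sketch compiles.
interface IElement { }
class Element : IElement { }

static class TypeProbe
{
    // T is inferred from the declared (compile-time) type of the argument,
    // not from the runtime type of the object it refers to.
    public static Type StaticTypeOf<T>(T value) { return typeof(T); }
}

static class Program
{
    static void Main()
    {
        object obj = new Element();

        // Explicit declaration: the type is written on both sides.
        Dictionary<int, string> explicitDictionary = new Dictionary<int, string>();

        // With var the type is still spelled out once, on the right-hand side.
        var dictionary = new Dictionary<int, string>();
        var element = (IElement)obj;
        var safeElement = obj as IElement;

        // Each line prints True: var infers exactly the type named on the right.
        Console.WriteLine(TypeProbe.StaticTypeOf(dictionary) == TypeProbe.StaticTypeOf(explicitDictionary));
        Console.WriteLine(TypeProbe.StaticTypeOf(element) == typeof(IElement));
        Console.WriteLine(TypeProbe.StaticTypeOf(safeElement) == typeof(IElement));
    }
}
```

Note that element is inferred as IElement, not Element, because the cast expression is what determines the type.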

However, take this example.

var rdr = DataWrapperClass.ExecuteReader();

It's not immediately clear what type of variable rdr is.  I still think var is a good choice here, and I have reasons to back that up.  This is taken from an actual scenario I ran into in my current job.  In this case we had a wrapper class around all of our data access code, and it provided some useful functionality.  We use CSLA in our shop, and this method returned an instance of CSLA's SafeDataReader, which is itself a wrapper around the IDataReader interface.  However, in the current project it fell a little short.  We're upgrading a rather dated codebase that was originally written in C++.  Much of the code dates back years.  Having started out in C++ myself, I remember when C++ didn't have a native boolean type.  What you used was integers.  0 was false, non-zero was true.  The database we were working with followed this same convention.  Many boolean values were stored as integers.  To help out with this, I extended SafeDataReader to allow reading an integer column as a boolean.  Next, I changed the return type of ExecuteReader to return my new class.  All existing calls to ExecuteReader explicitly declared the result as SafeDataReader.  There are a number of ways this can go.  The particular method I overrode happened to be virtual, so it turns out the code would work as is, but seeing this in the code

SafeDataReader rdr = DataWrapperClass.ExecuteReader();

can be even more confusing than var, because while you think you're working with SafeDataReader, you're not.  But what if the method hadn't been virtual, and I'd had to hide it with "new" instead of overriding it?  The code would still compile, but when you tried to read an integer column as a boolean, the code would blow up.  Why?  Because even though the actual instantiated type is my custom derived class, the variable is declared as SafeDataReader, so the call binds to SafeDataReader's GetBoolean, which doesn't expect integers.  That becomes a runtime error, my friend, not a compile time error.
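Here's a minimal sketch of that non-virtual case.  BaseReader, IntBoolReader and DataWrapper are hypothetical stand-ins I made up for the illustration; they are not CSLA's actual classes, and the real SafeDataReader reads from a database rather than taking the raw value as a parameter.

```csharp
using System;

// Hypothetical stand-in for the original reader; GetBoolean is NOT virtual.
class BaseReader
{
    // Unboxing fails with InvalidCastException when raw holds an int.
    public bool GetBoolean(object raw) { return (bool)raw; }
}

// Hypothetical derived reader that also accepts integer columns.
class IntBoolReader : BaseReader
{
    // 'new' hides the base method; it does not override it.
    public new bool GetBoolean(object raw) { return Convert.ToInt32(raw) != 0; }
}

static class DataWrapper
{
    // The return type has been changed from BaseReader to IntBoolReader.
    public static IntBoolReader ExecuteReader() { return new IntBoolReader(); }
}

static class Program
{
    static void Main()
    {
        object column = 1; // a boolean stored as an integer

        // var infers IntBoolReader, so the call binds to the new method.
        var inferred = DataWrapper.ExecuteReader();
        Console.WriteLine(inferred.GetBoolean(column)); // True

        // Declared as the base type: same object, but the call binds to
        // BaseReader.GetBoolean at compile time and fails at runtime.
        BaseReader declared = DataWrapper.ExecuteReader();
        try
        {
            declared.GetBoolean(column);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException");
        }
    }
}
```

Both variables point at an IntBoolReader.  The only difference is the declared type, and that alone decides which GetBoolean gets called.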

Here's another example.  In this same class I implemented a new reading function that would translate "O" and "P" to true/false.  Don't ask, but it actually happened.  This method is not accessible from SafeDataReader, it's only accessible if you use my derived class. 

Both of these examples can be shrugged off, with a "We'll fix them as we find them" attitude.  On the other hand, what if I had needed to implement my own class instead of inheriting from SafeDataReader, even though the interface is similar?  I now have to fix hundreds of lines of code before I can recompile.  If they had been declared with var instead of SafeDataReader, I could have done the same thing in a couple of lines of code. 
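A rough sketch of that last scenario, with made-up names: LegacyReader plays the role of the original type and CustomReader is an unrelated replacement that happens to expose the same member.  The column name "IsActive" is also just an assumption for the example.

```csharp
using System;

// Hypothetical original reader type.
class LegacyReader
{
    public bool GetBoolean(string column) { return false; }
}

// Hypothetical replacement: same member name, no inheritance relationship.
class CustomReader
{
    public bool GetBoolean(string column) { return true; }
}

static class DataWrapper
{
    // Changing this return type from LegacyReader to CustomReader
    // is the entire refactoring.
    public static CustomReader ExecuteReader() { return new CustomReader(); }
}

static class Program
{
    static void Main()
    {
        // Compiles before and after the change: var follows the return type.
        var rdr = DataWrapper.ExecuteReader();
        Console.WriteLine(rdr.GetBoolean("IsActive")); // True

        // An explicit declaration stops compiling after the change
        // (error CS0029, no implicit conversion between the two types):
        // LegacyReader oldRdr = DataWrapper.ExecuteReader();
    }
}
```

Every call site written like the commented-out line has to be edited by hand.  Every call site written with var just recompiles.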

Dare Obasanjo responded to the ReSharper post with examples of how the codebase for RSS Bandit had become less legible as a result of ReSharper's suggested use of var.  I like Dare, but I have to take exception to some of his responses.  The ReSharper team is not in any way suggesting we should go back to Hungarian naming conventions, and I would sooner write Cobol than see that.  They're suggesting better names like "currentElement" instead of "e" or "element", which we should all be doing anyway.  He even invokes the holy Microsoft C# Language Reference to make his point.

Jared Parsons also chimed in in response, and quite observantly grouped people into three camps on this topic. 

"There appear to be three groups in this debate.

  1. Whenever it's possible
  2. Only when it's absolutely clear what the type is
  3. Never, type inference is evil "

I'm still surprised at how many people, even in my own office, fall into category 3.  I'm somewhere between 1 and 2, but I lean heavily towards 1.  I'm still not convinced there's any value to

var someValue = 5;

But I can't quite articulate why that one bothers me and the others don't.  The main reason I like var, however, was listed on Jared's page rather than among the ReSharper team's reasons.

"Makes refactoring easier.  I re-factor, a lot.  I constantly split up or rename types.  Often in such a way that refactoring tools don't fixup all of the problems.  With var declarations I don't have to worry because they just properly infer their new type and happily chug along.  For explicit type cases I have to manually update all of the type names. "

Yep, as noted above, I ran into that one myself. 

I understand the readability remark, if you're using Notepad to write your code.  It's not a problem for me because, apparently unlike the denizens of Camp #3, I use Visual Studio for my day-to-day work.  For those who can't afford VS, there's a free version as well.  It's got this really cool feature that Notepad doesn't give you.  If you move your mouse over the "var" keyword, you see this:

[Image: Visual Studio tooltip showing the type inferred for var]

Yep, that's the compiler inferred type right there.  If you're not sure what type rdr is, all you have to do is move your mouse. 

I probably sound a little snide with this example, but really people!  We've moved on.  Code is too complex to be done in Notepad these days.  IDEs handle most of these kinds of problems out of necessity. It's even been suggested that the complexity of modern code could be mitigated by storing source in a custom format rather than text files.  We've been using text for decades and it might be time to upgrade.  That's a topic for another post, however.

As usual, I'm the black sheep in my company on this topic, and have taken a fair amount of ribbing about it.  My parting thought to the citizens of camp #3: Check out C# 4.0.  It introduces the dynamic keyword.  It's coming, people.  Get used to it or become the next Cobol programmers.

(My apologies to any Cobol programmers reading this.)

Posted On Sunday, December 21, 2008 9:16 PM | Comments (1) | Filed Under [ .Net 3.5 ]
