James Michael Hare

Welcome to my blog! I'm a Sr. Software Development Engineer in the Seattle area, who has been performing C++/C#/Java development for over 20 years, but have definitely learned that there is always more to learn!

All thoughts and opinions expressed in my blog and my comments are my own and do not represent the thoughts of my employer.

C#/.NET Five More Little Wonders That Make Code Better (2 of 3)

So last week I began my series with a post (here) on those little wonders in .NET/C# -- those small tips and tricks that make code either more concise, maintainable, or performant. 

This is the second of my three-part series, though there are so many things that make .NET (and in particular C#) a great development platform that I'm sure I could carry this blog on ad infinitum.  Once again, many of these are ones you may already know, but hopefully some of you will find something new or be reminded of an old friend waiting to be used again.

Update: Part 3 is now available here.

1. string.IsNullOrEmpty() and string.IsNullOrWhiteSpace()

It's always amazing how many people don't know about these two static helper methods that hang gracefully off the string class.  string.IsNullOrEmpty() has been around since the 2.0 framework, but the 4.0 framework has given us another gem: string.IsNullOrWhiteSpace().

The function of these methods should be apparent from their names: the first checks whether the string reference is null or refers to an empty string (Length == 0), and the second checks whether the string is null, empty, or made up entirely of whitespace characters.

So let's look at some code that does the same type of checks, with and without the static check methods.  First, let's look at checking for a null or empty string without these methods:

    public string GetFileName(string fullPathFileName)
    {
        // we can either check Length to see if empty or compare to string.Empty
        if (fullPathFileName == null || fullPathFileName.Length == 0)
        {
            // bad, must have a path specified!
            // (pass the parameter's name, not its value, to the exception)
            throw new ArgumentNullException("fullPathFileName");
        }

        // ...
    }

That's not awful, but it looks so much more precise to say:

    public string GetFileName(string fullPathFileName)
    {
        // one call covers both the null check and the empty check
        if (string.IsNullOrEmpty(fullPathFileName))
        {
            // bad, must have a path specified!
            // (pass the parameter's name, not its value, to the exception)
            throw new ArgumentNullException("fullPathFileName");
        }

        // ...
    }

Now it's all in one method call, which reduces the chance someone will accidentally code the wrong logical operator or use != instead of ==.  Is it really a huge improvement?  Probably not, but it is a nice, modest improvement that makes the code more concise and less error prone, which is always a good thing.

So what about if it's null or whitespace?  Let's say you are concatenating first, middle, and last names, but don't want to have a double-space if the middle name is empty:

    public string GetFullName(string firstName, string middleName, string lastName)
    {
        if (middleName == null || middleName.Trim().Length == 0)
        {
            return string.Format("{0} {1}", firstName, lastName);
        }

        return string.Format("{0} {1} {2}", firstName, middleName, lastName);
    }

Notice we did a Trim() to remove the whitespace and then checked the Length property.  While this seems nice and concise, Trim() can allocate a new string object on the heap that must later be garbage collected.  Now, an odd string here or there won't kill your performance, but in a program with high performance requirements you want to keep garbage to a minimum, especially if there are other, just as maintainable options!

So enter the new .NET 4.0 string.IsNullOrWhiteSpace().  This method checks whether a string is null or contains only whitespace characters:

    public string GetFullName(string firstName, string middleName, string lastName)
    {
        if (string.IsNullOrWhiteSpace(middleName))
        {
            return string.Format("{0} {1}", firstName, lastName);
        }

        return string.Format("{0} {1} {2}", firstName, middleName, lastName);
    }

It's more concise, we don't have to worry about the right logical operators, and it doesn't create any extra string objects that need to be garbage collected! 
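To see how the two checks differ, here is a minimal sketch that runs both against a few inputs.  The StringCheckDemo class and its Describe() helper are names I made up for illustration; only the two string methods come from the framework.

```csharp
using System;

public static class StringCheckDemo
{
    // Hypothetical helper: reports what both framework checks say about a value.
    public static string Describe(string value)
    {
        return string.Format("IsNullOrEmpty={0}, IsNullOrWhiteSpace={1}",
            string.IsNullOrEmpty(value), string.IsNullOrWhiteSpace(value));
    }

    public static void Main()
    {
        Console.WriteLine(Describe(null));      // IsNullOrEmpty=True, IsNullOrWhiteSpace=True
        Console.WriteLine(Describe(""));        // IsNullOrEmpty=True, IsNullOrWhiteSpace=True
        Console.WriteLine(Describe(" \t"));     // IsNullOrEmpty=False, IsNullOrWhiteSpace=True
        Console.WriteLine(Describe("Michael")); // IsNullOrEmpty=False, IsNullOrWhiteSpace=False
    }
}
```

The third case is the interesting one: a string of spaces and tabs is not empty, so only IsNullOrWhiteSpace() catches it.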

2. string.Equals()

The string.Equals() method set is probably a lot more varied than you expect.  There are a lot of options for using these methods, and some of them unfortunately get overlooked.

First of all, did you know that there is a static string.Equals() method?  Why would we care?  Well, what if it's possible either string in the comparison may be null?  Let's look:

    public Order CreateOrder(string orderType, string product, int quantity, double price)
    {
        if (orderType.Equals("equity"))
        {
            // ...
        }

        // ...
    }

What happens if orderType is null?  Obviously, this will throw a NullReferenceException.  Now, of course we could check for that beforehand, but sometimes you are comparing strings that you have little control over, and if you have reasonable doubt that one of the strings may be null, instead of typing:

    if (orderType != null && orderType.Equals("equity"))

You can use the static string.Equals() method that is safe to call even if one or both of the arguments are null:

    if (string.Equals(orderType, "equity"))

So that's one tool to keep in mind for your toolbox, here's another.  How many times have you seen people check for case-insensitive string equality by doing this:

    if (orderType.ToUpper().Equals("EQUITY"))

True, it works, and it is not functionally incorrect, but once again you're creating a new string (returned from ToUpper()) which then has to be cleaned up by the GC later!  If this is really a high performance order processor, that extra garbage may not kill you, but it certainly won't make you faster either!  There's an often overlooked overload on string.Equals() -- both instance and static -- that lets you specify a case-insensitive compare:

    if (orderType.Equals("equity", StringComparison.InvariantCultureIgnoreCase))

or, if you think it may be null, you can do the same thing using the static string Equals():

    if (string.Equals(orderType, "equity", StringComparison.InvariantCultureIgnoreCase))

Yes, it's a bit longer (I really wish MS had an EqualsIgnoreCase for conciseness -- though you can create your own extension!) but it is very explicit what it does, and it doesn't create extra temporary strings that just need clean up later.
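For the curious, that extension could look something like the sketch below.  EqualsIgnoreCase() and the StringExtensions class are my own hypothetical names, not anything that ships with the framework; the helper simply forwards to the static string.Equals() shown above, so it is also safe to call on a null reference.

```csharp
using System;

public static class StringExtensions
{
    // Hypothetical helper (not part of the framework): wraps the static,
    // null-safe string.Equals() with a case-insensitive comparison.
    public static bool EqualsIgnoreCase(this string left, string right)
    {
        return string.Equals(left, right, StringComparison.InvariantCultureIgnoreCase);
    }
}

public static class EqualsIgnoreCaseDemo
{
    public static void Main()
    {
        string orderType = null;

        Console.WriteLine("EQUITY".EqualsIgnoreCase("equity"));  // True
        Console.WriteLine("option".EqualsIgnoreCase("equity"));  // False

        // Extension methods compile to static calls, so a null receiver
        // reaches string.Equals() instead of throwing.
        Console.WriteLine(orderType.EqualsIgnoreCase("equity")); // False
    }
}
```

Whether InvariantCultureIgnoreCase or one of the other StringComparison values is the right choice depends on your data, so treat the comparison mode here as an assumption carried over from the example above.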

3. using Statements

Hopefully everyone knows about the using statement (no, not the using directives at the top of your C# files, but the using statement) that will clean up IDisposable instances when they go out of scope by calling their Dispose() method.  Let's look at a piece of code that doesn't use the using statement:

    public IEnumerable<Order> GetOrders()
    {
        var orders = new List<Order>();

        var con = new SqlConnection("some connection string");
        var cmd = new SqlCommand("select * from orders", con);
        var rs = cmd.ExecuteReader();

        while (rs.Read())
        {
            // ...
        }

        rs.Dispose();
        cmd.Dispose();
        con.Dispose();

        return orders;
    }

Wow, number one it's kinda ugly, and number two if you have an exception anywhere between the first creation and the last Dispose(), you run the risk of not calling Dispose() on the other resources which may lead to connections that aren't properly freed immediately.  Yes, they will EVENTUALLY get garbage collected, but until then you are holding a valuable external resource open!

Well, we could guard against this with a finally block:

    public IEnumerable<Order> GetOrders()
    {
        SqlConnection con = null;
        SqlCommand cmd = null;
        SqlDataReader rs = null;

        var orders = new List<Order>();

        try
        {
            con = new SqlConnection("some connection string");
            cmd = new SqlCommand("select * from orders", con);
            rs = cmd.ExecuteReader();

            while (rs.Read())
            {
                // ...
            }
        }

        finally
        {
            rs.Dispose();
            cmd.Dispose();
            con.Dispose();
        }

        return orders;
    }

But even this has issues!  What if the SqlCommand constructor throws?  At that point the reader is still null, so rs.Dispose() in the finally block throws a NullReferenceException and the connection never gets disposed.  Now, of course we could wrap each Dispose() call in a null check, but dang it, the using statement makes it so easy:

    public IEnumerable<Order> GetOrders()
    {
        var orders = new List<Order>();

        using (var con = new SqlConnection("some connection string"))
        {
            using (var cmd = new SqlCommand("select * from orders", con))
            {
                using (var rs = cmd.ExecuteReader())
                {
                    while (rs.Read())
                    {
                        // ...
                    }
                }
            }
        }

        return orders;
    }

Ahhhh, so much easier!  The using statement will call Dispose() on the instance as soon as scope is left, whether by reaching the end of the block or by an exception causing the block to exit prematurely!  Notice we don't have to make messy null checks, and we don't need a big, ugly try/finally block with references pre-declared as null so they'll exist in the finally block.  So much cleaner!

What's that you say?  You don't like the heavy indentation?  Well, keep in mind that the using statement can be used in simple or compound form.  That is, if you don't put curly-brackets after the using statement, it assumes the scope encapsulates the next statement.  So you could stack them like this:

    public IEnumerable<Order> GetOrders()
    {
        var orders = new List<Order>();

        using (var con = new SqlConnection("some connection string"))
        using (var cmd = new SqlCommand("select * from orders", con))
        using (var rs = cmd.ExecuteReader())
        {
            while (rs.Read())
            {
                // ...
            }
        }

        return orders;
    }

The first using declares con (and yes, you can use var in usings to make them very concise), and con is scoped to (and will thus be disposed after) the next statement, which is another using block, and so on!  There's no heavy indentation, and it looks nice, crisp, and concise!
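If you want to watch the stacked form do its job without a database handy, here is a small sketch.  TrackedResource is a hypothetical stand-in for the connection, command, and reader; all it does is record its name when Dispose() is called, so we can see the order of cleanup with and without an exception.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a connection, command, or reader:
// it just logs its name when it is disposed.
public sealed class TrackedResource : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;

    public TrackedResource(string name, List<string> log)
    {
        _name = name;
        _log = log;
    }

    public void Dispose()
    {
        _log.Add(_name);
    }
}

public static class UsingDemo
{
    // Returns the names of the resources in the order they were disposed.
    public static List<string> Run(bool fail)
    {
        var disposed = new List<string>();

        try
        {
            using (var con = new TrackedResource("con", disposed))
            using (var cmd = new TrackedResource("cmd", disposed))
            using (var rs = new TrackedResource("rs", disposed))
            {
                if (fail)
                {
                    throw new InvalidOperationException("simulated failure");
                }
            }
        }
        catch (InvalidOperationException)
        {
            // swallowed only so the demo can inspect the log afterwards
        }

        return disposed;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run(false))); // rs, cmd, con
        Console.WriteLine(string.Join(", ", Run(true)));  // rs, cmd, con
    }
}
```

Either way, every resource is disposed, innermost first, which is exactly the order the hand-written version was trying to get right.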

4. static Class Modifier

Many people go along creating classes for their programs and either don't know about or don't use the static class modifier.  This modifier can help make your code a little safer by restricting how your class is used and modified by other developers.

Let's say you're writing an XmlUtility class, and the goal of this class is to be able to serialize an object to a string of XML without having to do the encoding and serializing each time.  You may come up with something like this:

    public class XmlUtility
    {
        public string ToXml(object input)
        {
            var xs = new XmlSerializer(input.GetType());

            using (var memoryStream = new MemoryStream())
            using (var xmlTextWriter = new XmlTextWriter(memoryStream, new UTF8Encoding()))
            {
                xs.Serialize(xmlTextWriter, input);
                return Encoding.UTF8.GetString(memoryStream.ToArray());
            }
        }
    }

This is just typical XML serialization code.  The problem is, we have to create this class to use it:

    var xmlUtil = new XmlUtility();
    string result = xmlUtil.ToXml(someObject);

That's not very elegant usage since the class instance has no state.  Of course you could avoid this by making the method static and adding a private constructor so that the class can't be created:

    public class XmlUtility
    {
        // create private constructor so this class cannot be created or inherited
        private XmlUtility()
        {
        }

        public static string ToXml(object input)
        {
            var xs = new XmlSerializer(input.GetType());

            using (var memoryStream = new MemoryStream())
            using (var xmlTextWriter = new XmlTextWriter(memoryStream, new UTF8Encoding()))
            {
                xs.Serialize(xmlTextWriter, input);
                return Encoding.UTF8.GetString(memoryStream.ToArray());
            }
        }
    }

Well, that prevents someone from incorrectly instantiating or inheriting our class, which is good.  But that empty private constructor is kind of ugly and forced, and there's nothing that prevents a later maintainer from adding a non-static method by mistake:

    public T FromXml<T>(string xml) { ... }

Since this was not declared static and the constructor is not visible, this method can never be called from outside the class.  Enter the static class modifier.  If you put the word static before the class keyword, it tells the compiler that the class must contain only static members and cannot be instantiated or inherited.

    public static class XmlUtility
    {
        public static string ToXml(object input)
        {
            var xs = new XmlSerializer(input.GetType());

            using (var memoryStream = new MemoryStream())
            using (var xmlTextWriter = new XmlTextWriter(memoryStream, new UTF8Encoding()))
            {
                xs.Serialize(xmlTextWriter, input);
                return Encoding.UTF8.GetString(memoryStream.ToArray());
            }
        }
    }

So much shorter!  All we did was add a static keyword on the class itself, and now it cannot be instantiated and no one can come in later and accidentally add an instance method, property, or constructor!

Remember, anytime you can promote a potential logical error to a potential compiler error, you will get so much more of your sanity back!
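If you're wondering how the compiler enforces this, a quick peek with reflection helps.  The sketch below uses a hypothetical SampleUtility class (not the XmlUtility above, to keep it free of XML dependencies) and shows that a static class is emitted as both abstract and sealed, with no public instance constructor at all.

```csharp
using System;

// Hypothetical static class used only for this demonstration.
public static class SampleUtility
{
    public static string Echo(string input)
    {
        return input;
    }
}

public static class StaticClassDemo
{
    public static void Main()
    {
        Type type = typeof(SampleUtility);

        Console.WriteLine(type.IsAbstract);               // True: cannot be instantiated
        Console.WriteLine(type.IsSealed);                 // True: cannot be inherited
        Console.WriteLine(type.GetConstructors().Length); // 0: no public instance constructors

        // Uncommenting either line below is a compile-time error, which is the point:
        // var util = new SampleUtility();
        // public class Derived : SampleUtility { }
    }
}
```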

5. Object and Collection Initializers

I know a lot of folks who tend to avoid the initializers as some sort of oddity in the C# world.  In truth, though, it can make some of your initialization code much more elegant, and in one case in particular, can actually make it more performant!

Now that last tidbit on performance may surprise you a bit, since basically the initializer syntax is mostly syntactical candy.  For those of you who don't know, the initializer syntax allows you to specify values for accessible fields and properties at the time of construction.  Let's look first at object-initializers with a simple example. 

Let's assume a typical Point struct:

    public struct Point
    {
        public int X { get; set; }
        public int Y { get; set; }
    }

And now we'll create and initialize one:

    var startingPoint = new Point();
    startingPoint.X = 5;
    startingPoint.Y = 13;

Looks like a typical create-and-assign, right?  Well, with object initializers we can one-line this:

    var startingPoint = new Point() { X = 5, Y = 13 };

So we've reduced the 3 lines down to one, which is nice, and possibly eliminated some typing.  Notice the curly brackets after the constructor call where we create the Point.  The position of the curlies is a matter of taste, and technically when you are using Point's default constructor you can omit the empty parentheses.

This syntax is available to any type you create as long as it has accessible fields or properties and has an accessible constructor.

But now let's look at collection initializers by assuming we want to create and load a list with 4 integers:

    var list = new List<int>();
    list.Add(1);
    list.Add(7);
    list.Add(13);
    list.Add(42);

Using the collection initializer syntax, we could change this to:

    var list = new List<int> { 1, 7, 13, 42 };

Much more concise once again!  Also note that, as before, the constructor is called first, and then Add() is called once for each of the four items.  Interestingly enough, you don't have to use the default constructor.  For example, since you know there are four items in the list, you could set the initial capacity to avoid potential resizing as items are added:

    var list = new List<int>(4) { 1, 7, 13, 42 };

So the constructor on List<T> that takes an int for capacity is called, and then Add() is called four times.  You can use this syntax with your own collections, too: all you need to do is implement IEnumerable (IEnumerable<T> qualifies) and supply an accessible Add() method.
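As a rough illustration of rolling your own, here is a sketch of a tiny collection that works with the initializer syntax.  CountingBag and its AddCalls property are hypothetical names invented for this example; the counter is only there to make it visible that the initializer really does call Add() once per element.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection: implementing IEnumerable<int> and exposing Add()
// is what makes the collection initializer syntax available.
public class CountingBag : IEnumerable<int>
{
    private readonly List<int> _items = new List<int>();

    // Number of times Add() has been invoked.
    public int AddCalls { get; private set; }

    public void Add(int value)
    {
        AddCalls++;
        _items.Add(value);
    }

    public IEnumerator<int> GetEnumerator()
    {
        return _items.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class CollectionInitializerDemo
{
    public static void Main()
    {
        var bag = new CountingBag { 1, 7, 13, 42 };

        Console.WriteLine(bag.AddCalls);           // 4
        Console.WriteLine(string.Join(", ", bag)); // 1, 7, 13, 42
    }
}
```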

You can also combine object and collection initializers, compare the following:

    var list = new List<Point>();

    var point = new Point();
    point.X = 5;
    point.Y = 13;
    list.Add(point);
    point = new Point();
    point.X = 42;
    point.Y = 111;
    list.Add(point);
    point = new Point();
    point.X = 7;
    point.Y = 9;
    list.Add(point);

Versus:

    var list = new List<Point>
        {
            new Point { X = 5, Y = 13 },
            new Point { X = 42, Y = 111 },
            new Point { X = 7, Y = 9 }
        };

Which looks cleaner and more concise to you?  Personally, I like the initializer syntax.  Even when spread out over multiple lines, it creates a very readable flow of code that is much less "dense" to the eye.

Now, I didn't forget that I teased you with a hint of a slight performance improvement.  Well, it's not in all cases, but look at the following two classes:

    public class BeforeFieldInit
    {
        public static List<int> ThisList = new List<int>() { 1, 2, 3, 4, 5 };
    }

    public class NotBeforeFieldInit
    {
        public static List<int> ThisList;

        static NotBeforeFieldInit()
        {
            ThisList = new List<int>();
            ThisList.Add(1);
            ThisList.Add(2);
            ThisList.Add(3);
            ThisList.Add(4);
            ThisList.Add(5);
        }
    }

Logically, these do the same thing: both create a static field that contains the numbers 1 through 5.  The difference is that one of them has an explicit static constructor and the other does not.  For those of you who know C# in depth, you'll know that classes without an explicit static constructor are marked with beforefieldinit, which gives the runtime more freedom in deciding when to run the static field initializers.

Let's look at a bit of the IL for each:

    .class public auto ansi beforefieldinit BeforeFieldInit
           extends [mscorlib]System.Object
    {
    } // end of class BeforeFieldInit

    .class public auto ansi NotBeforeFieldInit
           extends [mscorlib]System.Object
    {
    } // end of class NotBeforeFieldInit

Notice that if a class has an explicit static constructor, then C# does not mark the class with beforefieldinit in the IL, and this means that before any static field is accessed, it must make a quick check to see if the static constructor has already been called (for more details just search on beforefieldinit).  This can add up to a minor performance hit. 

Now, you'd think that since the initialization syntax is not just a simple constructor call but also calls Add() that it would generate a static constructor behind the scenes to load the list.  And it does!  But because it's not an explicit static constructor, it can still be marked beforefieldinit and avoid the extra check. 

This can also clean up your constructor code (both static and instance) because you can initialize the collections at declaration (if the values are known) instead of having to load your constructors with a lot of mundane logic.
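If you'd rather not dig through IL, you can check the flag with reflection instead.  This is just a sketch: it repeats the two classes from above (with the static constructor shortened to an initializer, which doesn't change the result) and adds a hypothetical HasBeforeFieldInit() helper that tests the TypeAttributes.BeforeFieldInit bit.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class BeforeFieldInit
{
    public static List<int> ThisList = new List<int>() { 1, 2, 3, 4, 5 };
}

public class NotBeforeFieldInit
{
    public static List<int> ThisList;

    // The explicit static constructor is what removes the beforefieldinit flag.
    static NotBeforeFieldInit()
    {
        ThisList = new List<int> { 1, 2, 3, 4, 5 };
    }
}

public static class BeforeFieldInitDemo
{
    // Hypothetical helper: true if the type was compiled with beforefieldinit.
    public static bool HasBeforeFieldInit(Type type)
    {
        return (type.Attributes & TypeAttributes.BeforeFieldInit) != 0;
    }

    public static void Main()
    {
        Console.WriteLine(HasBeforeFieldInit(typeof(BeforeFieldInit)));    // True
        Console.WriteLine(HasBeforeFieldInit(typeof(NotBeforeFieldInit))); // False
    }
}
```

This only confirms how the classes are marked; how much time the flag actually saves depends on the runtime, so measure before counting on it.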

Summary

Well, that's five more little wonders, I've got enough I think for one more blog entry next week!  Hope you enjoyed them and learned something new or are able to pass it on to someone who does!  Thanks so much for all the positive feedback on the previous 5 wonders!

Posted on Thursday, September 2, 2010 6:20 PM | Filed Under [ My Blog C# Software .NET Fundamentals Little Wonders ]
