James Michael Hare

...hare-brained ideas from the realm of software development...

Welcome to my blog! I'm a solutions architect at Scottrade Inc. in Saint Louis, MO. I've been doing C++/C# development for over 18 years, but have definitely learned that there is always more to learn!

All thoughts and opinions expressed in my blog and my comments are my own and do not represent the thoughts of my employer.

C#: String Concatenation vs Format vs StringBuilder

I was looking through my group’s C# coding standards the other day, and a couple of legacy items caught my eye.  They had been passed down from committee to committee so many times that no one had thought to second-guess them or test them in a long time.  It’s yet another example of how micro-optimizations can get the best of us and lead us to write code that is less maintainable than it could be, all for the sake of squeezing an extra ounce of performance out of our software.

So the two standards in question were these, in paraphrase:

  • Prefer StringBuilder or string.Format() to string concatenation.
  • Prefer string.Equals() with case-insensitive option to string.ToUpper().Equals().
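
For reference, here is a minimal sketch of the two forms the second standard contrasts. The class and method names are made up for illustration, and StringComparison.OrdinalIgnoreCase is an assumption about which case-insensitive option the standard had in mind.

```csharp
using System;

public static class CaseInsensitiveCompare
{
    // The form the standard discourages: may allocate upper-cased copies of both strings.
    public static bool ViaToUpper(string a, string b)
    {
        return a.ToUpper().Equals(b.ToUpper());
    }

    // The form the standard prefers: compares the strings as they are.
    // OrdinalIgnoreCase is an assumed choice; other StringComparison values exist.
    public static bool ViaEquals(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(ViaToUpper("Hello", "HELLO")); // prints True
        Console.WriteLine(ViaEquals("Hello", "HELLO"));  // prints True
    }
}
```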

Now some of you may already know what my results are going to show, as these items have been compared before on many blogs, but I think it’s always worth repeating and trying these yourself.  So let’s dig in.

The first test was a pretty standard one.  When concatenating strings, what is the best choice: StringBuilder, string concatenation, or string.Format()?

Before we begin, I read a number of iterations and a length for each string from the console.  Then I generate that many random strings of the given length, plus an array to hold the results.  Why am I so keen to keep the results?  Because I want to snapshot the memory and don’t want the garbage collector to reclaim the strings, hence the array to hold on to them.  I also didn’t want the random strings to count as part of the measured allocation, so I allocate them and the array up front, before the snapshot.  So in the code snippets below:

  • num – Number of iterations.
  • strings – Array of randomly generated strings.
  • results – Array to hold the results of the concatenation tests.
  • timer – A System.Diagnostics.Stopwatch instance to time code execution.
  • start – Beginning memory size.
  • stop – Ending memory size.
  • after – Memory size after final GC.
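
The post does not show the setup code, so the following is only a sketch of what it might look like. RandomString, the fixed seed, and the hard-coded num and length values are assumptions made for illustration; the original reads both values from the console.

```csharp
using System;
using System.Text;

public static class BenchmarkSetup
{
    // Hypothetical helper: builds one random string of upper-case letters.
    public static string RandomString(Random random, int length)
    {
        var builder = new StringBuilder(length);
        for (int i = 0; i < length; i++)
        {
            builder.Append((char)('A' + random.Next(26)));
        }
        return builder.ToString();
    }

    public static void Main()
    {
        int num = 1000;   // number of iterations (assumed value)
        int length = 20;  // length of each random string (assumed value)

        // Allocate the inputs and the results array up front,
        // so they are not counted in the later memory snapshot.
        var random = new Random(42);
        var strings = new string[num];
        for (int i = 0; i < num; i++)
        {
            strings[i] = RandomString(random, length);
        }
        var results = new string[num];

        Console.WriteLine(strings.Length + " inputs, " + results.Length + " result slots");
    }
}
```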

So first, let’s look at the concatenation loop:

    // build num strings using concatenation.
    for (int i = 0; i < num; i++)
    {
        results[i] = "This is test #" + i + " with a result of " + strings[i];
    }

Pretty standard, right?  Next for string.Format():

    // build strings using string.Format()
    for (int i = 0; i < num; i++)
    {
        results[i] = string.Format("This is test #{0} with a result of {1}", i, strings[i]);
    }

 

Finally, StringBuilder:

    // build strings using StringBuilder
    for (int i = 0; i < num; i++)
    {
        var builder = new StringBuilder();
        builder.Append("This is test #");
        builder.Append(i);
        builder.Append(" with a result of ");
        builder.Append(strings[i]);
        results[i] = builder.ToString();
    }

So I take each of these loops, and time them by using a block like this:

    // get the total amount of memory used, true tells it to run GC first.
    start = System.GC.GetTotalMemory(true);

    // restart the timer
    timer.Reset();
    timer.Start();

    // *** code to time and measure goes here. ***

    // get the current amount of memory, stop the timer, then get memory after GC.
    stop = System.GC.GetTotalMemory(false);
    timer.Stop();
    after = System.GC.GetTotalMemory(true);
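
One way to reuse that block for every test is to wrap it in a helper that takes the code to measure as a delegate. This is a sketch, not the harness used for the numbers in this post: the Measure class, the Run method, and the output format are invented for illustration.

```csharp
using System;
using System.Diagnostics;

public static class Measure
{
    // Hypothetical helper: runs body once, prints time and memory figures,
    // and returns the elapsed milliseconds.
    public static long Run(string label, Action body)
    {
        // Memory in use before the test; true forces a collection first.
        long start = GC.GetTotalMemory(true);
        var timer = Stopwatch.StartNew();

        body();

        // Memory right after the test, then the time, then memory after a final GC.
        long stop = GC.GetTotalMemory(false);
        timer.Stop();
        long after = GC.GetTotalMemory(true);

        Console.WriteLine("{0}: {1} ms, grew by {2} bytes, {3} bytes left after GC",
            label, timer.ElapsedMilliseconds, stop - start, after - start);
        return timer.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int num = 100000;
        var results = new string[num];

        Run("Operator +", () =>
        {
            for (int i = 0; i < num; i++)
            {
                results[i] = "This is test #" + i;
            }
        });

        // Keep the results reachable until the measurement is finished.
        GC.KeepAlive(results);
    }
}
```

The numbers it prints will vary from machine to machine and run to run, so treat them as rough indications only.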

So let’s look at what happens when I run each of these blocks through the timer and memory check at 500,000 iterations:

    Operator + - Time: 547, Memory: 56104540/55595960 - 500000
    string.Format() - Time: 749, Memory: 57295812/55595960 - 500000
    StringBuilder - Time: 608, Memory: 55312888/55595960 - 500000

 

Egad!  string.Format brings up the rear and + triumphs, well, at least in terms of speed.  The concat burns more memory than StringBuilder but less than string.Format(). 

This shows two main things:

  • StringBuilder is not always the panacea many think it is.
  • The difference between any of the three is minuscule!

The second point is extremely important!  You will often hear people who will grasp at results and say, “look, operator + is 10% faster than StringBuilder so always use operator +.”  Statements like this are a disservice and often misleading.  For example, if I had a good guess at what the size of the string would be, I could have preallocated my StringBuilder like so:

 

    for (int i = 0; i < num; i++)
    {
        // pre-declare StringBuilder to have 100 char buffer.
        var builder = new StringBuilder(100);
        builder.Append("This is test #");
        builder.Append(i);
        builder.Append(" with a result of ");
        builder.Append(strings[i]);
        results[i] = builder.ToString();
    }

 

Now let’s look at the times:

    Operator + - Time: 551, Memory: 56104412/55595960 - 500000
    string.Format() - Time: 753, Memory: 57296484/55595960 - 500000
    StringBuilder - Time: 525, Memory: 59779156/55595960 - 500000

 

Whoa!  All of a sudden StringBuilder is back on top again!  But notice, it takes more memory now, most likely because every builder reserves a 100-character buffer whether or not the result needs it.  The speed of + makes sense if you examine the IL behind the scenes.  Whenever you write a chain of string concatenations (+) in your code, the compiler turns it into a single call to String.Concat(), which adds up the lengths of the arguments and allocates a result string of exactly the right size for you.

But even IF we know the approximate size of our StringBuilder, look how much less readable it is!  That’s why I feel you should always take into account both readability and performance.  After all, consider that all these timings are over 500,000 iterations.  The biggest gap, roughly 200 ms between + and string.Format(), works out to about 0.0004 ms per call, which is negligible.
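
To make the IL point concrete: as far as I know, the C# compiler lowers a chain of + operators into a single String.Concat call rather than a series of intermediate strings. The sketch below shows the two forms side by side; the exact overload the compiler picks depends on the compiler version, and the explicit i.ToString() is my approximation of what it does with the int.

```csharp
using System;

public static class ConcatDemo
{
    // What we write.
    public static string WithOperator(int i, string s)
    {
        return "This is test #" + i + " with a result of " + s;
    }

    // Roughly what it becomes: one call that knows every piece up front,
    // so the result can be allocated once at its final size.
    public static string WithConcat(int i, string s)
    {
        return string.Concat("This is test #", i.ToString(), " with a result of ", s);
    }

    public static void Main()
    {
        string a = WithOperator(7, "ABC");
        string b = WithConcat(7, "ABC");
        Console.WriteLine(a);      // prints This is test #7 with a result of ABC
        Console.WriteLine(a == b); // prints True
    }
}
```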

The key is to pick the best tool for the job.  What do I mean?  Consider these awesome words of wisdom:

  • Concatenate (+) is best at concatenating.
  • StringBuilder is best at building.
  • Format is best at formatting.

Totally Earth-shattering, right?  But if you consider it carefully, it actually has a lot of beauty in its simplicity.  Remember, there is no magic bullet.  If one of these always beat the others, we’d only have one choice and not three.

The fact is, the concatenation operator (+) has been optimized for speed and looks the cleanest for joining together a known set of strings in the simplest manner possible.

StringBuilder, on the other hand, excels when you need to build a string of indeterminate length.  Use it in those times when you are looping until you hit a stop condition and building a result, and it won’t steer you wrong.

String.Format seems to be the loser from the stats, but consider which of these is more readable.  (Yes, ignore the fact that you could do this with ToString() on a DateTime.)
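
Here is a small sketch of that kind of loop. The "STOP" sentinel and the JoinUntilStop name are invented for the example; the point is only that the number of appends is not known when the loop starts.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class BuilderLoopDemo
{
    // Joins words with spaces until the hypothetical "STOP" sentinel is seen.
    public static string JoinUntilStop(IEnumerable<string> words)
    {
        var builder = new StringBuilder();
        foreach (var word in words)
        {
            if (word == "STOP") break;

            // Add a separator before every word except the first.
            if (builder.Length > 0) builder.Append(' ');
            builder.Append(word);
        }
        return builder.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(JoinUntilStop(new[] { "a", "b", "STOP", "c" })); // prints a b
    }
}
```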

    // build a date via concatenation
    var date1 = (month < 10 ? "0" : string.Empty) + month + '/'
        + (day < 10 ? "0" : string.Empty) + day + '/' + year;

    // build a date via string builder
    var builder = new StringBuilder(10);
    if (month < 10) builder.Append('0');
    builder.Append(month);
    builder.Append('/');
    if (day < 10) builder.Append('0');
    builder.Append(day);
    builder.Append('/');
    builder.Append(year);
    var date2 = builder.ToString();

    // build a date via string.Format
    var date3 = string.Format("{0:00}/{1:00}/{2:0000}", month, day, year);

So the strength in string.Format is that it makes constructing a formatted string easy to read.  Yes, it’s slower, but look at how much more elegant it is to do zero-padding and anything else string.Format does.
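
For completeness, here is a sketch of the DateTime.ToString() route mentioned earlier, next to the string.Format version. The class and method names are made up, and passing CultureInfo.InvariantCulture is my addition so that the separator stays a plain slash regardless of the machine’s regional settings.

```csharp
using System;
using System.Globalization;

public static class DateFormatDemo
{
    // Zero-padded date from separate numbers using string.Format.
    public static string ViaFormat(int month, int day, int year)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "{0:00}/{1:00}/{2:0000}", month, day, year);
    }

    // The same text from a DateTime and a custom format string.
    public static string ViaDateTime(int month, int day, int year)
    {
        return new DateTime(year, month, day)
            .ToString("MM/dd/yyyy", CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Console.WriteLine(ViaFormat(5, 9, 2010));   // prints 05/09/2010
        Console.WriteLine(ViaDateTime(5, 9, 2010)); // prints 05/09/2010
    }
}
```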

So my lesson is, don’t look for the silver bullet!  Choose the best tool.  Micro-optimization almost always bites you in the end because you’re sacrificing readability for performance, which is the wrong choice about 90% of the time.

I love the rules of optimization.  They’ve been stated before in many forms, but here’s how I always remember them:

  1. For Beginners: Do not optimize.
  2. For Experts: Do not optimize yet.

It’s so true.  Most of the time on today’s hardware, a microsecond optimization at the expense of readability will net you nothing because it won’t be your bottleneck.  Code for readability, and choose the best tool for the job, which will usually be the most readable and maintainable as well.  Then, and only then, if you need that extra performance boost after profiling your code and exhausting all other options… then you can start to think about optimizing.

Posted on Monday, May 10, 2010 9:59 PM | Filed Under [ My Blog C# ]

Feedback

# re: C#: String Concatenation vs Format vs StringBuilder

Hi, I read your article and I have a question:

did you try the StringBuilder test with the ".AppendFormat" or not?

5/11/2010 10:28 AM | innovatel

# re: C#: String Concatenation vs Format vs StringBuilder

The benchmark is very artificial since it doesn't measure the impact on the heap and garbage collection. Concatenation creates a lot of small objects, which fragment the heap and can cause more garbage collection cycles. In a real-world multi-threaded application, string concatenations in a loop can have a measurable or even severe impact.
5/11/2010 10:55 AM | Eddie Velasquez

# re: C#: String Concatenation vs Format vs StringBuilder

Everyone has always known that appending short strings is fast. The real test is continually appending to the same string, as the time for immutable string operations grows with the 4th power (I think, possibly 3rd) of the length of the string.

If you are doing something like building an HTML or XML file up that ends up being several 10k in total size, string builder will win by a massive amount.
5/11/2010 11:49 AM | Jason Coyne

# re: C#: String Concatenation vs Format vs StringBuilder

@innovatel: No, I did not, that would be an interesting next step.

@Eddie: Actually it does take that into account. If you notice I print the memory each of the tests created and they create the SAME amount of memory on a single concatenation. Now as for looping, that's why I say in choosing the best tool for the job that StringBuilder is best for building strings in a loop when the ultimate size is unknown.

@Jason: Exactly, once again I agree StringBuilder is the best choice for building a string with multiple concatenations.

Essentially string concat is best for single-line concatenations when you know all parts, string.Format is best for creating a formatted string (more readable), and StringBuilder is best when you need to construct a string through multiple, separate appends, usually in a loop.
5/12/2010 8:38 AM | James Michael Hare

# re: C#: String Concatenation vs Format vs StringBuilder

There are two messages in this post, and unfortunately the first typically gets ignored, with the second used as justification for ignoring it:
1. "The key is to pick the best tool for the job."
2. Do not overoptimise

I see a lot of code these days where developers just blindly use string.Format, when Concat or StringBuilder would be a more appropriate tool for the job, and when you question this the response is "I don't want to play the premature optimisation game".

Like you say "don’t look for the silver bullet! Choose the best tool."
8/26/2010 7:24 PM | Paul