James Michael Hare

...hare-brained ideas from the realm of software development...

Welcome to my blog! I'm a Sr. Software Development Engineer in the Seattle area, who has been performing C++/C#/Java development for over 20 years, but have definitely learned that there is always more to learn!

All thoughts and opinions expressed in my blog and my comments are my own and do not represent the thoughts of my employer.

C# Fundamentals: String Concat() vs. Format() vs. StringBuilder

I was looking through my group’s C# coding standards the other day and there were a couple of legacy items in there that caught my eye.  They had been passed down from committee to committee so many times that, for a long time, no one even thought to second-guess them or try them out.  It’s yet another example of how micro-optimizations can often get the best of us and cause us to write code that is not as maintainable as it could be for the sake of squeezing an extra ounce of performance out of our software.

So the two standards in question were these, in paraphrase:

  • Prefer StringBuilder or string.Format() to string concatenation.
  • Prefer string.Equals() with case-insensitive option to string.ToUpper().Equals().
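
To make the second standard concrete, here is a minimal sketch of the two styles it contrasts.  The class and method names are hypothetical, and the choice of StringComparison.OrdinalIgnoreCase is an assumption; the standard only says "case-insensitive option".

```csharp
using System;

public static class CaseInsensitiveCompare
{
    // Creates upper-cased copies of both strings before comparing them.
    public static bool EqualsViaToUpper(string a, string b)
    {
        return a.ToUpper().Equals(b.ToUpper());
    }

    // Compares the original strings directly; no temporary copies are needed.
    // OrdinalIgnoreCase is an assumed choice of comparison for this sketch.
    public static bool EqualsIgnoreCase(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(EqualsViaToUpper("Hello", "hELLO"));
        Console.WriteLine(EqualsIgnoreCase("Hello", "hELLO"));
    }
}
```

The static string.Equals() form also accepts null arguments, whereas calling ToUpper() on a null reference throws.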

Now some of you may already know what my results are going to show, as these items have been compared before on many blogs, but I think it’s always worth repeating and trying these yourself.  So let’s dig in.

The first test was a pretty standard one.  When concatenating strings, what is the best choice: StringBuilder, string.Concat(), or string.Format()?

So before we begin, I read in a number of iterations from the console and a length of each string to generate.  Then I generate that many random strings of the given length and an array to hold the results.  Why am I so keen to keep the results?  Because I want to be able to snapshot the memory and don’t want garbage collection to collect the strings, hence the array to keep hold of them.  I also didn’t want the random strings to be part of the measured allocation, so I allocate them and the array up front, before the snapshot.  So in the code snippets below:

  • num – Number of iterations.
  • strings – Array of randomly generated strings.
  • results – Array to hold the results of the concatenation tests.
  • timer – A System.Diagnostics.Stopwatch() instance to time code execution.
  • start – Beginning memory size.
  • stop – Ending memory size.
  • after – Memory size after final GC.
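
The setup code itself is not shown in the post, so here is a minimal sketch of what it might look like.  BenchmarkSetup and MakeRandomStrings are hypothetical names, the fixed seed is an assumption, and the two values the post reads from the console are hard-coded.

```csharp
using System;
using System.Diagnostics;

public static class BenchmarkSetup
{
    // Hypothetical helper: builds `count` random strings of `length` upper-case letters.
    public static string[] MakeRandomStrings(int count, int length, int seed)
    {
        var random = new Random(seed);
        var strings = new string[count];
        var buffer = new char[length];

        for (int i = 0; i < count; i++)
        {
            for (int j = 0; j < length; j++)
            {
                buffer[j] = (char)('A' + random.Next(26));
            }
            strings[i] = new string(buffer);
        }
        return strings;
    }

    public static void Main()
    {
        // The post reads these two values from the console; they are fixed here.
        int num = 500000;
        int length = 10;

        // Allocate the inputs and the results array before any memory snapshot.
        string[] strings = MakeRandomStrings(num, length, 42);
        string[] results = new string[num];
        var timer = new Stopwatch();

        Console.WriteLine("Prepared {0} inputs, {1} result slots, timer running: {2}",
            strings.Length, results.Length, timer.IsRunning);
    }
}
```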

So first, let’s look at the concatenation loop:

    // build num strings using concatenation.
    for (int i = 0; i < num; i++)
    {
        results[i] = "This is test #" + i + " with a result of " + strings[i];
    }

 

Pretty standard, right?  Next for string.Format():

    // build strings using string.Format()
    for (int i = 0; i < num; i++)
    {
        results[i] = string.Format("This is test #{0} with a result of {1}", i, strings[i]);
    }

 

Finally, StringBuilder:

    // build strings using StringBuilder
    for (int i = 0; i < num; i++)
    {
        var builder = new StringBuilder();
        builder.Append("This is test #");
        builder.Append(i);
        builder.Append(" with a result of ");
        builder.Append(strings[i]);
        results[i] = builder.ToString();
    }

 

So I take each of these loops, and time them by using a block like this:

    // get the total amount of memory used, true tells it to run GC first.
    start = System.GC.GetTotalMemory(true);

    // restart the timer
    timer.Reset();
    timer.Start();

    // *** code to time and measure goes here. ***

    // get the current amount of memory, stop the timer, then get memory after GC.
    stop = System.GC.GetTotalMemory(false);
    timer.Stop();
    after = System.GC.GetTotalMemory(true);
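
One way to avoid repeating that block for every loop is to wrap it in a helper.  This is only a sketch, not the code used for the numbers in this post: MemoryTimer, Measurement and Measure are hypothetical names, and since the post does not say exactly how its two memory figures are computed, printing stop - start and after - start is an assumption.

```csharp
using System;
using System.Diagnostics;

public static class MemoryTimer
{
    // Hypothetical holder for the timer/start/stop/after values described above.
    public sealed class Measurement
    {
        public long ElapsedMs;
        public long Start;
        public long Stop;
        public long After;
    }

    public static Measurement Measure(Action body)
    {
        var result = new Measurement();
        var timer = new Stopwatch();

        result.Start = GC.GetTotalMemory(true);  // true: run GC before measuring
        timer.Reset();
        timer.Start();

        body();

        result.Stop = GC.GetTotalMemory(false);  // false: measure without collecting
        timer.Stop();
        result.After = GC.GetTotalMemory(true);  // collect, then measure what survives
        result.ElapsedMs = timer.ElapsedMilliseconds;
        return result;
    }

    public static void Main()
    {
        const int num = 100000;
        var results = new string[num];

        Measurement m = Measure(() =>
        {
            for (int i = 0; i < num; i++)
            {
                results[i] = "This is test #" + i;
            }
        });

        Console.WriteLine("Operator + - Time: {0}, Memory: {1}/{2} - {3}",
            m.ElapsedMs, m.Stop - m.Start, m.After - m.Start, results.Length);
    }
}
```

Passing the loop as a delegate adds a small call and closure overhead of its own, so an inline block like the one above is the safer choice when the differences being measured are tiny.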

 

So let’s look at what happens when I run each of these blocks through the timer and memory check at 500,000 iterations:

    Operator + - Time: 547, Memory: 56104540/55595960 - 500000
    string.Format() - Time: 749, Memory: 57295812/55595960 - 500000
    StringBuilder - Time: 608, Memory: 55312888/55595960 - 500000

 

Egad!  string.Format brings up the rear and + triumphs, well, at least in terms of speed.  The Concat() burns more memory than StringBuilder but less than string.Format().

This shows two main things:

  • StringBuilder is not always the panacea many think it is.
  • The difference between any of the three (in the context of creating a string in a single statement) is minuscule!

The second point is extremely important!  You will often hear people grasp at results like these and say, “look, operator + is 10% faster than StringBuilder so always use operator +.”  Statements like this are a disservice and often misleading.  For example, if I had a good guess at what the size of the string would be, I could have pre-allocated my StringBuilder like so:

    for (int i = 0; i < num; i++)
    {
        // pre-declare StringBuilder to have 100 char buffer.
        var builder = new StringBuilder(100);
        builder.Append("This is test #");
        builder.Append(i);
        builder.Append(" with a result of ");
        builder.Append(strings[i]);
        results[i] = builder.ToString();
    }

 

Now let’s look at the times:

    Operator + - Time: 551, Memory: 56104412/55595960 - 500000
    string.Format() - Time: 753, Memory: 57296484/55595960 - 500000
    StringBuilder - Time: 525, Memory: 59779156/55595960 - 500000

 

Whoa!  All of a sudden StringBuilder is back on top again for this example code!  But notice, it takes more memory now.  This makes sense once you look at what happens behind the scenes.  The compiler turns operator + into a call to string.Concat(), which adds up the lengths of its arguments and allocates a result of exactly the right size for you, whereas a StringBuilder pre-sized to 100 characters can reserve more room than the final string needs.
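
A small sketch to illustrate the memory point.  The class and method names are hypothetical, and the sample input is made up.  It shows that the + expression and an explicit string.Concat() call produce the same string, and that a StringBuilder created with a capacity of 100 keeps that capacity even when the text it holds is shorter.

```csharp
using System;
using System.Text;

public static class ConcatVersusBuilder
{
    public static string WithOperator(int i, string s)
    {
        return "This is test #" + i + " with a result of " + s;
    }

    public static string WithConcat(int i, string s)
    {
        // Explicit call to the four-string overload of string.Concat().
        return string.Concat("This is test #", i.ToString(), " with a result of ", s);
    }

    public static void Main()
    {
        string a = WithOperator(7, "ABCDEFGHIJ");
        string b = WithConcat(7, "ABCDEFGHIJ");
        Console.WriteLine("Same result: {0}", a == b);

        // The builder reserves its requested capacity regardless of the final length.
        var builder = new StringBuilder(100);
        builder.Append(a);
        Console.WriteLine("Length: {0}, Capacity: {1}", builder.Length, builder.Capacity);
    }
}
```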

But even IF we know the approximate size of our StringBuilder, look how much less readable it is!  That’s why I feel you should always take into account both readability and performance.  After all, consider that all these timings are over 500,000 iterations.  Even the largest gap (about 200 ms between string.Format() and operator +) works out to roughly 0.0004 ms per call, which is negligible.
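
The per-call figure is just the gap between two totals divided by the iteration count.  Here is a quick sketch of that arithmetic using the string.Format() and operator + totals from the first run (749 ms and 547 ms); PerCallCost and its method are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class PerCallCost
{
    // Difference between two total timings, spread evenly across the iterations.
    public static double PerCallDifferenceMs(long slowerTotalMs, long fasterTotalMs, int iterations)
    {
        return (slowerTotalMs - fasterTotalMs) / (double)iterations;
    }

    public static void Main()
    {
        double diff = PerCallDifferenceMs(749, 547, 500000);

        // prints "0.000404 ms per call"
        Console.WriteLine(diff.ToString("0.000000", CultureInfo.InvariantCulture) + " ms per call");
    }
}
```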

The key is to pick the best tool for the job you are trying to do.  What do I mean?  Consider these words of wisdom:

  • Concatenate (+) is great at concatenating several strings in one single statement. 
  • StringBuilder is great when you need to build a string across multiple statements or a loop.
  • Format is great at performing formatting of strings in ways that concatenation cannot.

Just remember, there is no magic bullet.  If one of these always beat the others, we’d only have one choice available to us and not three.  Each has its purpose and each has times when it outshines the others.  So do not take this as “string concat is always faster” because that is not true, nor take this as “a sized StringBuilder is always faster” because again that is not always true!  The salient point, which I can’t stress enough, is that each performs a certain job well and the key is to know which tool does which job best.

So, in general, string.Concat() is clean and often optimal for joining together a known set of strings in a single statement.

StringBuilder, on the other hand, excels when you need to build a string across multiple statements or in a loop.  Use it in those times when you are looping till you hit a stop condition and building a result and it won’t steer you wrong.
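
As a rough illustration of the loop case, the sketch below builds the same comma-separated string two ways.  The names are hypothetical and the example is not benchmarked here.  The point is only that the += version creates a new, longer string on every pass, while the StringBuilder version appends into one growing buffer.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class LoopBuilding
{
    // Joins items as "a, b, c" using one StringBuilder for the whole loop.
    public static string JoinWithBuilder(IEnumerable<string> items)
    {
        var builder = new StringBuilder();
        foreach (string item in items)
        {
            if (builder.Length > 0)
            {
                builder.Append(", ");
            }
            builder.Append(item);
        }
        return builder.ToString();
    }

    // Same result with +=, but each pass allocates a new string holding everything so far.
    public static string JoinWithConcat(IEnumerable<string> items)
    {
        string result = string.Empty;
        foreach (string item in items)
        {
            if (result.Length > 0)
            {
                result += ", ";
            }
            result += item;
        }
        return result;
    }

    public static void Main()
    {
        var items = new[] { "alpha", "beta", "gamma" };
        Console.WriteLine(JoinWithBuilder(items)); // prints "alpha, beta, gamma"
        Console.WriteLine(JoinWithConcat(items));  // prints "alpha, beta, gamma"
    }
}
```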

Finally, string.Format() seems to be the loser from the stats, but consider which of these is more readable:

    // build a date via concatenation
    var date1 = (month < 10 ? "0" : string.Empty) + month + '/'
        + (day < 10 ? "0" : string.Empty) + day + '/' + year;

    // build a date via string builder
    var builder = new StringBuilder(10);
    if (month < 10) builder.Append('0');
    builder.Append(month);
    builder.Append('/');
    if (day < 10) builder.Append('0');
    builder.Append(day);
    builder.Append('/');
    builder.Append(year);
    var date2 = builder.ToString();

    // build a date via string.Format
    var date3 = string.Format("{0:00}/{1:00}/{2:0000}", month, day, year);

So the strength in string.Format() is that it makes constructing a formatted string easy to read.  Yes, it’s slower, but look at how much more elegant it is to do zero-padding and anything else string.Format() does.
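
Zero-padding is only one of the things a format string can express.  The sketch below adds column alignment and numeric formatting as a second example.  FormatExamples and its methods are hypothetical names, and passing CultureInfo.InvariantCulture is an assumption made so that the output does not depend on the machine’s regional settings.

```csharp
using System;
using System.Globalization;

public static class FormatExamples
{
    // Zero-padded date, same format string as the date example in the post.
    public static string FormatDate(int month, int day, int year)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "{0:00}/{1:00}/{2:0000}", month, day, year);
    }

    // {0,-8} left-aligns in 8 columns; {1,10:N2} right-aligns in 10 columns
    // with two decimals and group separators.
    public static string FormatRow(string name, decimal amount)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "{0,-8}|{1,10:N2}", name, amount);
    }

    public static void Main()
    {
        Console.WriteLine(FormatDate(5, 9, 2010));       // prints "05/09/2010"
        Console.WriteLine(FormatRow("Widget", 1234.5m)); // prints "Widget  |  1,234.50"
    }
}
```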

So my lesson is, don’t look for the silver bullet!  Choose the best tool.  Micro-optimization can often bite you in the end if you sacrifice more readable code for the sake of a performance gain that may or may not exist.  This is not to say feel free to write ill-performing code.  You should still understand the complexity of the code you are writing, and of course prefer linear algorithms to quadratic ones and so on, but make sure before you optimize code that you understand what gains (if any) you are going to get.

I love the rules of optimization.  They’ve been stated before in many forms, but here’s how I always remember them:

  1. For Beginners: Do not optimize.
  2. For Experts: Do not optimize yet.

Much of the time on today’s modern hardware, a microsecond-level optimization at the expense of readability will net you little because it won’t be your biggest bottleneck.  Code for readability, and choose the best tool for the job, which will usually be the most readable and maintainable as well.  Then, if you need that extra performance boost after profiling your code and finding the true bottleneck, you can optimize away.

Posted on Monday, May 10, 2010 9:59 PM | Filed Under [ My Blog C# Software .NET Fundamentals ]

Feedback


# re: C#: String Concatenation vs Format vs StringBuilder

Hi, I read your article and I have a question:

did you try the StringBuilder test with the ".AppendFormat" or not?

5/11/2010 10:28 AM | innovatel

# re: C#: String Concatenation vs Format vs StringBuilder

The benchmark is very artificial since it doesn't measure the impact on the heap and garbage collection. Concatenation creates a lot of small objects which fragment the heap and can cause more garbage collection cycles. In a real-world multi-threaded application, string concatenations in a loop can have a measurable or even severe impact.
5/11/2010 10:55 AM | Eddie Velasquez

# re: C#: String Concatenation vs Format vs StringBuilder

Everyone has always known that appending short strings is fast. The real test is continually appending to the same string, as the time taken by immutable string operations grows with the 4th power (I think, possibly 3rd) of the length of the string.

If you are doing something like building an HTML or XML file up that ends up being several 10k in total size, string builder will win by a massive amount.
5/11/2010 11:49 AM | Jason Coyne

# re: C#: String Concatenation vs Format vs StringBuilder

@innovatel: No, I did not, that would be an interesting next step.

@Eddie: Actually it does take that into account. If you notice I print the memory each of the tests created and they create the SAME amount of memory on a single concatenation. Now as for looping, that's why I say in choosing the best tool for the job that StringBuilder is best for building strings in a loop when the ultimate size is unknown.

@Jason: Exactly, once again I agree StringBuilder is the best choice for building a string with multiple concatenations.

Essentially string concat is best for single-line concatenations when you know all parts, string.Format is best for creating a formatted string (more readable), and StringBuilder is best when you need to construct a string through multiple, separate appends, usually in a loop.
5/12/2010 8:38 AM | James Michael Hare

# re: C#: String Concatenation vs Format vs StringBuilder

There are two messages in this post, and unfortunately the first typically gets ignored with the second used as justification for ignoring it:
1. "The key is to pick the best tool for the job."
2. Do not overoptimise

I see a lot of code these days where developers just blindly use string.Format, when Concat or StringBuilder would be a more appropriate tool for the job, and when you question this the response is "I don't want to play the premature optimisation game".

Like you say "don’t look for the silver bullet! Choose the best tool."
8/26/2010 7:24 PM | Paul

# re: C# Fundamentals: String Concat() vs. Format() vs. StringBuilder

**UPDATE**

Addressed a concern where I didn't emphasize enough that string.Concat() is most appropriate when constructing a string in a single statement. I had thought I had brought that up, but later I made a blanket statement that muddied the issue.

So, it's all cleared up now, sorry for the confusion!
8/17/2012 9:06 PM | James Michael Hare