The C#/.NET Fundamentals series is geared towards examining fundamental concepts in using C# (and .NET in general) to produce effective solutions.
There are times when we are writing a method that returns a sequence of items, that it occasionally becomes necessary in base-class, interface implementation, error, or default conditions to return a sequence of only one or even zero items. There are many ways to do this, of course, which begs the question of which way is best, in terms of readability, maintainability, and performance.
The Situation
For our situation, we are talking about returning a sequence from a method that returns IEnumerable<T>. If your method directly returns List<T> or T[] then your choice on how to do that will be a bit obvious, but in the case of just returning a sequence of, IEnumerable<T> we have a lot of flexibility.
There are many reasons we may want to do this, perhaps we have a method that returns IEnumerable<T> in a virtual method, and perhaps in some sub-class implementations we want it to return an empty (or 1-length) sequence.
Similarly, we could be implementing an interface that has a method that returns a sequence, and our implementation needs to return only a default empty (or 1-length) sequence.
Alternatively, perhaps we have a method that returns a sequence of data from a database, but that it’s possible that the database may be offline or the data in the table is optional, in which case we don’t want to throw, but want to return an empty (or 1-length) sequence.
Regardless of the use case, there are times when a zero or one-length sequence is desirable. For the purposes of the examples below and in the interest of brevity, let’s assume that we are dealing with the former case where we want to provide an empty (or 1-length) sequence for an interface implementation, such as the one below:
1: public interface IParser
2: {
3: // a sequence of items that can start a comment
4: IEnumerable<string> CommentStarters { get; }
5:
6: ...
7: }
In the interface above, perhaps there is a property called CommentStarters that returns a sequence of characters that can tell the parser to ignore the line. For the purposes of our example, we’ll hard code the sequences, but obviously the sequences of one item could be determined at run-time as necessary.
Returning an Empty Sequence
So, for whatever reason in our design we’ve decided that if certain situations exist in our code, we want to return an empty sequence instead of an error, etc. Of course, one could argue for returning null, though sometimes this option isn’t always desirable because it adds a level of checking that must be done every time you wish to query the contents.
Array:
So how can we create empty collections to represent an IEnumerable<T>? Perhaps one of the easiest ways is to simply return a zero-length array:
1: public class Parser
2: {
3: // a sequence of items that can start a comment
4: public IEnumerable<string> CommentStarters
5: {
6: get
7: {
8: return new string[0];
9: }
10: }
11:
12: ...
13: }
The nice thing about returning zero-length arrays is that it’s very easy to do, and relatively performant. In addition, a zero-length array is immutable by nature, it can not have its size or contents changed.
List<T>:
Of course, we could just as easily return a List<T> of zero length:
1: public class Parser
2: {
3: // a sequence of items that can start a comment
4: public IEnumerable<string> CommentStarters
5: {
6: get
7: {
8: return new List<string>();
9: }
10: }
11:
12: ...
13: }
This works as well, but List<T> is not immutable and can be changed if the caller happens to know it’s a List<T> and casts it back:
1: // if we know it returns a List<string>, we can cast it back...
2: var commentStarters = (List<string>)p.CommentStarters;
3:
4: // which lets us mutate, this could be a problem...
5: commentStarters.Add("///");
Also, the List<T> is a heavier construct, which would add more overhead than needed.
Iterator:
Another way to return zero items in a sequence is to use an iterator. Thus a simple method that returns no items would be a simple yield break:
1: public class Parser
2: {
3: // a sequence of items that can start a comment
4: public IEnumerable<string> CommentStarters
5: {
6: get
7: {
8: yield break;
9: }
10: }
11:
12: ...
13: }
This is very clean, and is immutable, but it does create an iterator, which can be a bit heavier. In addition, if you are wanting to return zero item sequence inside a method that also returns a sequence directly, this will not work since you can’t mix iterator returns and non iterator returns in a method.
Enumerable.Empty<T:
Finally, there is a handy static method off of the Enumerable class called Empty<T>() which returns a singleton stand-in for an empty enumerable sequence of the given type. This is nice in that it will not allocate memory every time you need to return an empty sequence:
1: public class Parser
2: {
3: // a sequence of items that can start a comment
4: public IEnumerable<string> CommentStarters
5: {
6: get
7: {
8: return Enumerable.Empty<string>();
9: }
10: }
11: }
This has the benefit of being immutable, and being a singleton. In other words, the caller can’t modify what you return, and it will not allocate any new memory every time this is called (it creates the empty singleton for the sequence of type T once).
Results:
So how do these measure up, performance wise? Let’s perform all of these over 1 billion iterations:
Array | List | Iterator | Empty |
6,654 ms | 11,107 ms | 9,522 ms | 4,926 ms |
So as you can see, the fastest solution is to use the Enumerable.Empty<T, which also has the most benefits:
- It’s the fastest.
- It doesn’t generate as much memory overhead.
- It is extremely readable in it’s intention.
As always, keep in mind that these time measurements are over 1 billion iterations, and thus the actual time for any single one of these calls is somewhat negligible. However, they can be useful for comparing the relative performance. In the end, I’d say stick with Enumerable.Empty<T not just because it is the fastest, but also because it returns an immutable sequence and is easy to read and understand.
Returning a Sequence of One
Now, what if our method is to return a sequence that contains only one item? Unfortunately, there’s no implicit way to just convert a single item into a sequence of one item, but there are several options, very similar to producing empty sequences…
Array:
Just as we can easily create an empty array, we can also just as easily create an array of a single item:
1: public class Parser
2: {
3: // a sequence of items that can start a comment
4: public IEnumerable<string> CommentStarters
5: {
6: get
7: {
8: return new[] { "//" };
9: }
10: }
11:
12: ...
13: }
We don’t have to explicitly type the array, since it can tell from the initialization list that the type is string. This is a very simple way to create a single-item sequence, however it does have an issue in that the contents of the first item are mutable if cast back to an array:
1: var p = new Parser();
2:
3: // if we know it returns a string[], we can cast it back...
4: var commentStarters = (string[])p.CommentStarters;
5:
6: // which lets us mutate, this could be a problem...
7: commentStarters[0] = "/*";
This mutability of the array contents could be an issue, so let’s explore other options…
List<T>:
Of course, we could just as easily return a List<T> of one item:
1: public class Parser
2: {
3: // a sequence of items that can start a comment
4: public IEnumerable<string> CommentStarters
5: {
6: get
7: {
8: return new List<string>(1) { "//" };
9: }
10: }
11:
12: ...
13: }
Note that for efficiency, we pre-size the List<string> to 1 in the constructor to avoid unnecessary over-allocation. The main problems with List<T> for this purpose is that it’s heavier than T[] and a little less safe, because once again if they can cast it back to List<T> then they can not only change the contents at an index, but they can add or remove items:
1: // if we know it returns a List<string>, we can cast it back...
2: var commentStarters = (List<string>)p.CommentStarters;
3:
4: // which lets us mutate, this could be a problem...
5: commentStarters[0] = "/*";
6: commentStarters.Add("///");
Iterator:
Just as you can use an iterator to return no items in the sequence, you can also use it to return a sequence of only one item by performing a yield return on the item directly:
1: public class Parser
2: {
3: // a sequence of items that can start a comment
4: public IEnumerable<string> CommentStarters
5: {
6: get
7: {
8: yield return "//";
9: }
10: }
11:
12: ...
13: }
This is very clean, and is completely immutable, but it does create an iterator, which can be a bit heavier. In addition, as mentioned before you cannot mix yield return and return in a method, so if you wanted to use this trick in a method that also has a return you’d need to create a helper method to return a singleton sequence, like:
1: public static class EnumerableExtensions
2: {
3: public static IEnumerable<T> AsEnumerable<T>(this T value)
4: {
5: yield return value;
6: }
7: }
With this utility extension method, you could take advantage of this solution even in methods that use return.
Enumerable.Repeat<T>():
While there isn’t a direct way in the Enumerable class to generate a single-item sequence, the Repeat<T>() method comes very close when you pass a repeat value of 1:
1: public class Parser
2: {
3: // a sequence of items that can start a comment
4: public IEnumerable<string> CommentStarters
5: {
6: get
7: {
8: return Enumerable.Repeat("//", 1);
9: }
10: }
11: }
Once again, since this uses iterators under the covers, it is a bit heavier than the arrays, but it is immutable. As a side note, though, using Repeat() with a count of 1 is probably less readable and could be seen as more of a “trick” than an improvement.
Results:
So how do these measure up, performance wise? Let’s perform all of these over 1 billion iterations:
Array | List | Iterator | Repeat |
7,093 ms | 14,502 ms | 9,848 ms | 10,847 ms |
So here, the fastest solution is the T[] array, however, it is also mutable which is a concern. Given that I would probably hedge towards using the very close second and third place iterator or Enumerable.Repeat() to gain the safety of immutability. Of all of these, it seems like the iterator is the most readable, especially if used with something like the AsEnumerable() extension method proposed.
What About Creating a Singleton?
Now, you could create a singleton for a single-item sequence. The concern with doing this is that you really have to make the singleton immutable or altering it becomes even worse, as it allows you to alter what every subsequent call produces. You could make the singleton immutable by returning it by iterator (via Skip(0)), but this is slower than just using the iterator directly.
You could also wrap the singleton in a ReadOnlyCollection<T> class, but this starts to get less maintainable, though it would prevent future allocations and thus would probably have an overall smaller memory footprint and be very fast (timed at 3,153 ms over 1 billion iterations).
In either case, this does not help in the case where you want to return single-item sequences that can vary at run-time. That is, all of the examples above used a fixed, hard-coded sequence for simplicity, however it’s possible to have cases where the single-item sequence to return may vary based on the call or other logic.
Summary
So what does this all mean? Really I was just experimenting to find out which way would be best, and really there’s a ton of options. It seems quite clear the Enumerable.Empty<T is the best choice for zero-item sequences in terms of readability, performance, etc.
However it’s less clear-cut on one-item sequences. In that case I’d probably say create an extension method like AsEnumerable() that takes advantage of the iterator method (or invoke yield return directly), though you can certainly keep a singleton if the value of the one-item list will always be the same.
Whether or not mutability of the returned sequence is a concern to you is purely a matter for your design consideration. It’s much more a consideration if you are caching singletons for these sequences, but if you are generating the sequences at every call, it may not be as much of a concern.
TweetTechnorati Tags: C#, CSharp, .NET, Fundamentals, IEnumerable<T>, sequence, List<T>, array, Enumerable, Repeat, Empty, yield return, yield break