Works on my machine

Get rid of deep null checks Mar 26

Have you ever had to write code like this?

var store = GetStore();
string postCode = null;

if (store != null && store.Address != null && store.Address.PostCode != null)
     postCode = store.Address.PostCode.ToString();

I’m sure you have. We’ve all been there.
The goal is to retrieve (or compute) a value, but in order to do that, we have to get through many intermediate objects, any of which could be in a default state. The aforementioned example is pretty easy. Sometimes, along the way, there are additional method calls, “as” conversions or collections. We have to handle each of them properly, which further bloats the code and makes it less readable. There has to be a better way.

 

Conditional extensions

Many developers have found a solution in conditional extensions. If you google “c# deep null check”, you will find several implementations. Their names vary, but the idea behind them is the same:

public static TResult IfNotNull<TResult, TSource>(
    this TSource source,
    Func<TSource, TResult> onNotDefault)
    where TSource : class
{
    if (onNotDefault == null) throw new ArgumentNullException("onNotDefault");

    return source == null ? default(TResult) : onNotDefault(source);
}

As you can see, the extension returns a value of type TResult. The source object is required to obtain the result. If the former is null, we are unable to do that, so the method returns the default value of type TResult. Otherwise, we can safely invoke the onNotDefault delegate.
After adding the IfNotNull extension to our project, the first example can be rewritten in the following way:

var postCode =
    GetStore()
        .IfNotNull(x => x.Address)
        .IfNotNull(x => x.PostCode)
        .IfNotNull(x => x.ToString());

It really pays off in more complex scenarios, e.g. when we have to call some methods along the way and store the results in temporary variables.

The IfNotNull extension can be improved in several ways. I would like to focus on two of them:

  • The aforementioned extension deals only with reference types. We can amend it to work with value types too.
  • Have you thought about the string type? It’s often not enough to ensure that a string variable is not null. We want to work with non-empty (or even non-whitespace-only) strings.
    The same applies to collections. It’s not enough that a collection is not null. It has to have at least one element in order to compute an average (or to get the biggest element).

To address these issues we can transform the IfNotNull extension into the following IfNotDefault extension:

public static TResult IfNotDefault<TResult, TSource>(
    this TSource source,
    Func<TSource, TResult> onNotDefault,
    Predicate<TSource> isNotDefault = null)
{
    if (onNotDefault == null) throw new ArgumentNullException("onNotDefault");

    var isDefault = isNotDefault == null
        ? EqualityComparer<TSource>.Default.Equals(source, default(TSource))
        : !isNotDefault(source);

    return isDefault ? default(TResult) : onNotDefault(source);
}

It’s not much different from the original implementation. I got rid of the where TSource : class constraint to support structs. After doing that, I could no longer use a simple null equality comparison (because structs aren’t nullable), so I had to ask EqualityComparer<TSource>.Default to do the job.
There is also an optional predicate parameter, in case we’re not happy with the default comparison.

Let me show you a couple of usage examples:

1. Performing some operation on string if it’s not empty:

return person
        .IfNotDefault(x => x.Name)
        .IfNotDefault(SomeOperation, x => !string.IsNullOrEmpty(x));

 

2. Computing average. This nicely works as a part of LINQ chain:

var avg = students
        .Where(IsNotAGraduate)
        .FirstOrDefault()
        .IfNotDefault(s => s.Grades) // let’s assume that Grades property returns int array 
        .IfNotDefault(g => g.Average(), g => g != null && g.Length > 0);

Notice one thing: the Average method returns double. If it wasn’t possible to compute the average, the avg variable will be set to 0. Sometimes that’s the desired behavior, other times it’s not.
If it’s not, we can fix this easily with a nullable cast:

        …
        .IfNotDefault(g => (double?)g.Average(), g => g != null && g.Length > 0);

 

3. Value types

Sometimes we can take advantage of the default struct values.
For example, the default value of double (and other numeric types) is 0.
We can use that fact to implement a division method:

public static double? Div(double dividend, double divisor)
{
    return divisor.IfNotDefault(_ => (double?)dividend / divisor);
}

The same thing is possible with booleans. Instead of writing:

return TrueCondition
    ? ComputeSomething()
    : null; 

we can write:

return TrueCondition.IfNotDefault(_ => ComputeSomething());

It’s up to you to decide if it’s overcomplicated or not. For me it’s just a useful tool that suits my needs in some cases.
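To make the behavior concrete, here is a self-contained sketch that exercises IfNotDefault in all three flavors (the sample values and variable names are mine, just for illustration):

```csharp
using System;
using System.Collections.Generic;

static class Extensions
{
    public static TResult IfNotDefault<TResult, TSource>(
        this TSource source,
        Func<TSource, TResult> onNotDefault,
        Predicate<TSource> isNotDefault = null)
    {
        if (onNotDefault == null) throw new ArgumentNullException("onNotDefault");

        var isDefault = isNotDefault == null
            ? EqualityComparer<TSource>.Default.Equals(source, default(TSource))
            : !isNotDefault(source);

        return isDefault ? default(TResult) : onNotDefault(source);
    }
}

class Program
{
    static void Main()
    {
        // Reference type: null short-circuits the chain.
        string name = null;
        Console.WriteLine(name.IfNotDefault(n => n.Length)); // 0 (default(int))

        // Value type: 0 is the default double, so the delegate is skipped.
        double divisor = 0;
        Console.WriteLine(divisor.IfNotDefault(_ => (double?)(10 / divisor)) == null); // True

        // Custom predicate: an empty string is treated as "default".
        Console.WriteLine("".IfNotDefault(s => s.ToUpper(), s => !string.IsNullOrEmpty(s)) == null); // True
        Console.WriteLine("abc".IfNotDefault(s => s.ToUpper(), s => !string.IsNullOrEmpty(s))); // ABC
    }
}
```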

 

Better future

There are good odds that in the future the IfNotNull extension will no longer be needed.
It may be replaced by a new safe navigation operator, which is supposed to work in the following way:

//member access
obj?.Property; //instead of obj.IfNotNull(x => x.Property);

//method invocation
obj?.Method(); //instead of obj.IfNotNull(x => x.Method());

//indexing
obj?[0]; //instead of obj.IfNotNull(x => x[0]);

You can read more about it on the Visual Studio User Voice, where Mads Torgersen (the C# Language PM) said: “We are seriously considering this feature for C# and VB, and will be prototyping it in the coming months.”

 

Regards,
Jakub Niemyjski

Proxy Swarm Mar 08

From time to time people ask me to vote. They send me a link and say: "vote for my school", "vote for my recipe in a cooking contest", "vote on my child's drawing" etc. I'm sure it happens to you too. Personally, I don’t care, but it led me to an interesting thought – since it is possible to bypass IP restrictions through proxy gates, it should also be possible to cast votes automatically. I only need a list of proxy servers. Luckily, I found a pretty big one under this link.


The goal

The goal was to create an application that performs as many asynchronous proxy connections as possible while keeping the UI responsive. Of course, the first part isn’t universal – it mainly depends on what the actual task does and what the network bandwidth is, but the point was to write an efficient app without bottlenecks.


Testing

After a couple of days of coding, my app was ready. The only thing left was to pick a proper site to test it on. Google told me to try pollcode.com. It was a perfect choice for me, because it imposes an IP restriction, so that only one vote can be cast per IP address (per day/week/month/year – depending on the configuration). I created my own poll and here you can see the test result:



Implementation

Proxy Swarm is a WPF application built in the spirit of the MVVM architectural pattern. All HTTP connections are made by the HttpClient class, which has been available since .NET 4.5. Each connection is handled as a separate task and all these tasks run in a dedicated AppDomain.

The following external libraries were used:

  • Fody/PropertyChanged - injects INotifyPropertyChanged code into properties at compile time.
  • TPL Dataflow - provides dataflow components to help increase the robustness of concurrency-enabled applications. Here is a great introduction to Task-based Asynchronous Pattern, which also describes Dataflow library.
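To give a flavor of how Dataflow helps here, below is a minimal sketch of my own (not the actual Proxy Swarm code) in which an ActionBlock throttles the number of simultaneous "vote" tasks; in the real app the body would issue an HTTP request through a proxy instead of the Task.Delay placeholder:

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // NuGet: System.Threading.Tasks.Dataflow

class Program
{
    static void Main()
    {
        int processed = 0;

        // At most 16 "votes" are in flight at any moment.
        var voteBlock = new ActionBlock<string>(
            async proxy =>
            {
                await Task.Delay(10); // placeholder for the real HTTP call through the proxy
                Interlocked.Increment(ref processed);
            },
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 16 });

        foreach (var proxy in Enumerable.Range(0, 100).Select(i => "proxy-" + i))
            voteBlock.Post(proxy);

        voteBlock.Complete();        // no more input
        voteBlock.Completion.Wait(); // wait until all posted items are done

        Console.WriteLine(processed); // 100
    }
}
```

The block queues everything you Post to it and drains the queue with the requested degree of parallelism, so the UI thread never blocks.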

Download

Project website and source code: https://github.com/nabuk/ProxySwarm
Readme: https://github.com/nabuk/ProxySwarm#readme

If you have any questions, please feel free to leave a comment or drop me an email (you can find it on my GitHub profile).

Regards,
Jakub Niemyjski

NSlice v. 1.2 Jan 26

A new version of NSlice has been released. What’s new?

  • SliceDelete extension method for enumerable types.

(You can read what SliceDelete does in the previous post)

Like the Slice extension for enumerable types, SliceDelete works in a fully lazy fashion, buffers the minimum number of elements required to perform the requested operation and disposes the source enumerator as soon as possible.

 

What’s next?

If you look at http://docs.python.org/2.3/whatsnew/section-slices.html you will notice that there is one thing that hasn’t been implemented yet – assigning a collection to a slice:

>>> a = range(3)
>>> a
[0, 1, 2]
>>> a[1:3] = [4, 5, 6]
>>> a
[0, 4, 5, 6]

But there’s a problem: “Extended slices aren't this flexible. When assigning to an extended slice, the list on the right hand side of the statement must contain the same number of items as the slice it is replacing:”

>>> a = range(4)
>>> a
[0, 1, 2, 3]
>>> a[::2]
[0, 2]
>>> a[::2] = [0, -1]
>>> a
[0, 1, -1, 3]
>>> a[::2] = [0,1,2]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: attempt to assign sequence of size 3 to extended slice of size 2

So the implementation would have to throw exceptions, but I don’t want to do that. The whole idea behind lazy slicing was to allow a developer to slice without prior knowledge of the collection size.
At the moment I don’t know how to tackle this issue.

 

P. S.

Thank you for starring the NSlice project on GitHub. It means a lot to me. If you have any questions, suggestions or want to discuss something, drop me an email (you can find it on my GitHub profile).

 

Download

Available on NuGet – type “nslice”.
Project website and source code: https://github.com/nabuk/NSlice
Wiki: https://github.com/nabuk/NSlice/wiki



Code review Jan 05

I’m going to review three C# code snippets I have recently come across. At first glance they may all seem fine - they do their job and they can be understood easily. So what is wrong with them? I will try to answer this question thoroughly, but before I do, I would like to ask you to try as well. Don’t rush into reading the review; think about what you would change and how you would implement it. Ideally, write down your solution, then compare.

 

1. String concatenation

This is a pretty common scenario. You have a list of strings and you want to aggregate them into one. You also want to separate the original strings with some delimiter like a comma, colon, new line or something else.

Original solution

string result = ""; 
foreach(var item in stringArray) 
    result += item + ","; 
result = result.TrimEnd(',');

Review

First of all, strings are immutable, so whenever you concatenate two strings with a plus sign, a new string is created. Generally you want to use StringBuilder class for this kind of job:

var sb = new StringBuilder(); 
if (stringArray.Length > 0) 
    sb.Append(stringArray[0]); 
foreach (var item in stringArray.Skip(1)) 
{ 
    sb.Append(","); 
    sb.Append(item); 
} 
var result = sb.ToString();

Notice that I didn’t write sb.Append("," + item) because that would create an additional intermediate string for each item. The same rule applies to formatted strings – don’t write sb.Append(string.Format( … )), use sb.AppendFormat( … ) instead.

In fact, using StringBuilder for this job is unnecessary. This scenario is so common that the framework provides a special method for it: string.Join. So our original code could have been written in just one line:

var result = string.Join(",", stringArray);

 

2. Enumerating files

The task was to enumerate all view files in a Views directory and all its subdirectories (MVC project).

Original solution

protected virtual IEnumerable GetAllCompilableFiles(string directory, bool recurse) 
{ 
    foreach (string str in 
        Enumerable.Union<string>(Enumerable.Union<string>(
            (IEnumerable<string>)Directory.GetFiles(directory, "*.cshtml"), 
                (IEnumerable<string>)Directory.GetFiles(directory, "*.master")), 
                    (IEnumerable<string>)Directory.GetFiles(directory, "*.ascx"))) 
                        yield return (object)str; 
    if (recurse) 
    { 
        foreach (string directory1 in Directory.GetDirectories(directory)) 
        { 
            foreach (object obj in this.GetAllCompilableFiles(directory1, true)) 
                yield return obj; 
        } 
    } 
}

Review

Firstly, let’s clear those explicit IEnumerable<string> casts because they are unnecessary.
Then let’s use extension method syntax instead of calling the methods explicitly, so that Enumerable.Union<string>(first, second) becomes first.Union(second).
Since we are dealing with generic IEnumerables, let’s also change the return type to the generic one, so that the calling methods won’t have to cast it back.
And I will also get rid of those braces – I am more into the indent style than braces.
Our rewritten code now looks like this:

protected virtual IEnumerable<string> GetAllCompilableFiles(string directory, bool recurse) 
{ 
    foreach (string str in Directory.GetFiles(directory, "*.cshtml") 
                            .Union(Directory.GetFiles(directory, "*.master")) 
                            .Union(Directory.GetFiles(directory, "*.ascx"))) 
        yield return str; 
    if (recurse) 
        foreach (string directory1 in Directory.GetDirectories(directory)) 
            foreach (string str in this.GetAllCompilableFiles(directory1, true)) 
                yield return str; 
}

I have to give the code’s author credit for using recursion. I have seen too many code snippets whose authors must have thought: “Well, it is not going to be more than 3 levels deep, so I’ll just write 3 nested loops”.
The very first thing that caught my attention in this code were the unions. This is a common mistake – a developer wants to combine two collections and uses the Union extension, when in fact he or she should use the Concat extension. The difference between those two is that Union returns only the distinct values, which in turn incurs time and memory penalties. It’s redundant here, because the author has already used different file extension filters, so we won’t get duplicates.

But we don’t even have to concatenate results. Since we are enumerating files in a Views folder, it’s expected that the majority of them (if not all) will be views, right? So we can call Directory.GetFiles once and then filter out all non-view files:

var extensions = new[] { ".cshtml", ".master", ".ascx" }; 
Directory.GetFiles(directory).Where(file => extensions.Any(file.EndsWith));

Two things still remain to fix.
First – notice that Directory.GetFiles has an overload which takes a SearchOption enum, telling the method whether we want to search for files recursively or at the top directory level only.
Now we know that all that recursive code was never necessary. There is a great lesson in that – when you do something trivial, google it, or check for overloads when you have already picked a proper framework method but still need to do some work around it.
Second – because our method returns files in a lazy fashion, we shouldn’t use Directory.GetFiles (which works eagerly) but Directory.EnumerateFiles (which works lazily and has been available since .NET 4.0).

Finally, our code should look like the following:

protected virtual IEnumerable<string> GetAllCompilableFiles(string directory, bool recurse) 
{ 
    var extensions = new[] { ".cshtml", ".master", ".ascx" }; 
    var searchOption =
        recurse ? SearchOption.AllDirectories : SearchOption.TopDirectoryOnly; 
    return Directory.EnumerateFiles(directory, "*", searchOption) 
        .Where(file => extensions.Any(file.EndsWith)); 
}

Notice one thing – I could have written:

.Where(file => new[] { ".cshtml", ".master", ".ascx" }.Any(file.EndsWith));

but that code would create a new string array for each enumerated file path, so I chose to use a captured variable instead.

 

3. Displaying products

The following Razor view was created to display all products, together with information about whether each was selected by the user.
(If you are not familiar with Razor, don’t worry. Just read the code as you normally would, keeping in mind that @ is a special character telling the compiler that C# code follows.)

Original solution

@{ 
    IEnumerable<IProduct> products = new List<IProduct>(); 
    if (Model.ProductRepository != null) 
    { 
        products = Model.ProductRepository.GetProducts().Cast<IProduct>(); 
    } 
    if (products != null && products.Any()) 
    {
        for (var i = 0; i < products.Count(); i++) 
        { 
            <label class="checkbox"> 
                <input 
                    type="checkbox" 
                    name="SelectedProductsIDs" 
                    value="@products.ElementAt(i).Id" 
                    @(Model.SelectedProductsIDs.Contains(products.ElementAt(i).Id) ? "checked" : "") /> 
                @products.ElementAt(i).Name 
            </label> 
        } 
    } 
}

Review

Before I get to the crucial things, I want to make this code more consistent.
Since we are dealing with IEnumerable<>, we should stick to it and change the first line to:

var products = Enumerable.Empty<IProduct>();

Let’s also get rid of the second if check. It is not necessary. The products variable will never be null, and checking whether there is at least one item is redundant because, if there isn’t, the for loop simply won’t run.

Ok, let’s cut to the chase. I don’t like two things about this code.
First – the for loop has expected O(n²) complexity, whereas it should have O(n). To express that in numbers: 100 products turn into roughly 10 thousand operations (not exactly that number, it’s just an order of magnitude).
Second – the products variable is evaluated more than once.
If Model.ProductRepository.GetProducts() returns some long-running operation wrapped in an IEnumerable<> (reading the products from an HDD, a database or a web service), then given 100 products, the above code would perform that long-running operation 401 times – 101 for Count() and 3×100 for ElementAt().
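You can verify the 401 figure yourself with a small counting wrapper (a made-up helper, just for measurement) that increments a counter every time the sequence is re-enumerated:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Counts how many times GetEnumerator is called, i.e. how many times
// the underlying "long-running operation" would execute.
class CountingEnumerable<T> : IEnumerable<T>
{
    private readonly IEnumerable<T> source;
    public int Enumerations { get; private set; }

    public CountingEnumerable(IEnumerable<T> source) { this.source = source; }

    public IEnumerator<T> GetEnumerator()
    {
        Enumerations++;
        return source.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

class Program
{
    static void Main()
    {
        var products = new CountingEnumerable<int>(Enumerable.Range(0, 100).ToList());

        // Mimics the view: Count() in the loop condition plus
        // three ElementAt calls per iteration.
        for (var i = 0; i < products.Count(); i++)
        {
            var a = products.ElementAt(i);
            var b = products.ElementAt(i);
            var c = products.ElementAt(i);
        }

        Console.WriteLine(products.Enumerations); // 401 = 101 + 3 * 100
    }
}
```

The wrapper deliberately implements only IEnumerable<T>, so LINQ cannot take the ICollection<T>/IList<T> shortcuts for Count() and ElementAt() and has to enumerate every time, just like a lazily produced sequence would.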

How can we correct that code? All we have to do is change the for loop into a foreach loop, and then we can safely get rid of those ElementAt extensions:

@{ 
    var products = Enumerable.Empty<IProduct>(); 
    if (Model.ProductRepository != null) 
    { 
        products = Model.ProductRepository.GetProducts().Cast<IProduct>(); 
    } 
    var selectedProductsIDs = new HashSet<int>(Model.SelectedProductsIDs); 
    foreach (var p in products) 
    { 
        <label class="checkbox"> 
            <input 
                type="checkbox" 
                name="SelectedProductsIDs" 
                value="@p.Id" 
                @(selectedProductsIDs.Contains(p.Id) ? "checked" : "") /> 
            @p.Name 
        </label> 
    } 
}

I also replaced the Model’s SelectedProductsIDs property with a HashSet variable. The reason is the Contains method (or LINQ’s extension), which normally has expected O(n) complexity (for arrays, lists and enumerables), whereas HashSet’s Contains method has O(1) complexity. If I hadn’t done that, the loop would still have O(n²) complexity (I assumed that SelectedProductsIDs is an array, a list or an enumerable).
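As a quick illustration (the IDs below are made up), both approaches answer the same question, but the HashSet answers it without scanning:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main()
    {
        // Stand-in for Model.SelectedProductsIDs.
        int[] selectedIds = { 3, 14, 159, 2653 };

        // LINQ's Contains walks the array on every call: O(n) per lookup.
        bool viaLinq = selectedIds.Contains(159);

        // HashSet resolves membership by hashing: O(1) per lookup on average.
        var set = new HashSet<int>(selectedIds);
        bool viaSet = set.Contains(159);

        Console.WriteLine(viaLinq && viaSet); // True
    }
}
```

Building the set is a one-time O(n) cost, which pays for itself as soon as the loop performs more than a handful of lookups.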

NSlice v. 1.1 Dec 23

A new version of NSlice has been released. What’s new?

  • SliceDelete extension method for indexed and string types.

SliceDelete returns exactly the opposite result to Slice. The result set is ordered in the same way as the original one. Look at the following example, which returns the first 2 and last 2 elements:

var collection = Enumerable.Range(0, 10).ToList();
collection.SliceDelete(2, -2); 
//or 
collection.SliceDelete(-3, 1, -1);

Result: { 0, 1, 8, 9 }

You can read more about NSlice in one of the previous posts: Slicing for .NET


Download

Available on NuGet – type “nslice”.
Project website and source code: https://github.com/nabuk/NSlice
Wiki: https://github.com/nabuk/NSlice/wiki

Slicing for .NET Dec 01

One year ago I had a little crush on Python. Although I consider myself a hardcore C# coder, I think it is a good thing to try other languages. Even if we do not plan to change our main language, it is nice to grasp other techniques and practice new ways of thinking, which can later be incorporated into our favorite language.

The first odd thing I read about Python was that a developer can use negative array indices. My reaction was: why would I even want to get an IndexOutOfRangeException? That is just silly. But then I read what they actually do. They are just like the normal ones. The only difference is that they index the array backwards, for example: -1 means the last element, -2 means the one before last, and so on. That is really handy. I cannot remember how many times I have written count-1 or count-i.

After that I discovered that Python has an even cooler feature called array slicing. It is something like a compact version of a for loop. Let me show you an example:

array = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print array[1:9:2]

The above code prints [1,3,5,7].

So, formally speaking, slicing has the following form (each argument can be negative, and all arguments are optional):

[ start_index : exclusive_boundary : iteration_step ]

This is a really powerful yet succinct syntax. Look at the following two examples:

array[::-1] #reverses the array
array[1:-1] #skips first and last element

Then I started to think about those features – how could I port them to C#?
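A naive first attempt at the Python semantics for indexed collections might look like the sketch below. To be clear, this is a simplified, hypothetical version of my own, not the NSlice implementation, and it does not reproduce every out-of-range corner case Python handles:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class NaiveSlicing
{
    // Simplified port of Python's [start:stop:step] for IList<T>.
    public static IEnumerable<T> Slice<T>(
        this IList<T> source, int? start = null, int? stop = null, int step = 1)
    {
        if (step == 0) throw new ArgumentException("step cannot be zero", "step");
        int n = source.Count;

        if (step > 0)
        {
            int from = Clamp(start ?? 0, n);
            int to = Clamp(stop ?? n, n);
            for (int i = from; i < to; i += step)
                yield return source[i];
        }
        else
        {
            int from = start.HasValue ? Math.Min(Clamp(start.Value, n), n - 1) : n - 1;
            int to = stop.HasValue ? Math.Max(stop.Value < 0 ? stop.Value + n : stop.Value, -1) : -1;
            for (int i = from; i > to; i += step)
                yield return source[i];
        }
    }

    // Translates a negative index into one counted from the end
    // and clamps the result to [0, n].
    private static int Clamp(int index, int n)
    {
        if (index < 0) index += n;
        return Math.Max(0, Math.Min(index, n));
    }
}

class Program
{
    static void Main()
    {
        var array = Enumerable.Range(0, 10).ToList();
        Console.WriteLine(string.Join(",", array.Slice(1, 9, 2)));  // 1,3,5,7
        Console.WriteLine(string.Join(",", array.Slice(step: -1))); // 9,8,7,6,5,4,3,2,1,0
        Console.WriteLine(string.Join(",", array.Slice(1, -1)));    // 1,2,3,4,5,6,7,8
    }
}
```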

NSlice

I decided to give it a try and created a .NET library. At the moment there are 3 extension methods:

  • Slice – performs slice on a passed collection.
  • At - works like an ElementAt extension. The only difference is that the index argument passed to an At extension can be negative.
  • AtOrDefault - works like an ElementAtOrDefault extension, but it also accepts negative indices.
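For indexed collections, the core of such a negative-index lookup is tiny. Here is a simplified sketch of my own (not the actual NSlice code, which also supports plain enumerables):

```csharp
using System;
using System.Collections.Generic;

static class NegativeIndex
{
    // -1 means the last element, -2 the one before last, and so on.
    public static T At<T>(this IList<T> source, int index)
    {
        return source[index < 0 ? source.Count + index : index];
    }
}

class Program
{
    static void Main()
    {
        var items = new List<int> { 10, 20, 30 };
        Console.WriteLine(items.At(-1)); // 30
        Console.WriteLine(items.At(0));  // 10
    }
}
```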

But there is more to it. NSlice was written to allow slicing 3 types of .NET collections in the most efficient manner:

  • Indexed collections (IList<> implementations) - slicing is performed eagerly and instantly. It does not matter whether the collection has 10 elements or 10 million. This is because the result is not created by copying elements, but by creating a transparent view over the source collection.
  • Enumerables (IEnumerable<> implementations) - slicing is performed lazily. Each possible slice scenario was implemented separately to achieve the best speed and the smallest memory footprint. It fits nicely into the LINQ model and could even be used to slice a stream, if the latter were wrapped in an IEnumerable<> implementation.
  • Strings - slicing is performed eagerly and a new string is returned as the result.


Download

Available on NuGet – type “nslice”.
Project website and source code: https://github.com/nabuk/NSlice
Wiki: https://github.com/nabuk/NSlice/wiki

Overuse of “as” keyword Dec 22

Recently I realized that I have been overusing the "as" keyword. It is probably because its syntax feels more fluent to me than an ordinary cast. I see this overuse even in Microsoft examples. Let the following code show you what I mean:

object x = "1.0";
Version v = x as Version;
Console.WriteLine(v.Major);

and, just for reference, an ordinary cast:

object x = "1.0";
Version v = (Version)x;
Console.WriteLine(v.Major);

When you run the first snippet, it will crash on the third line, throwing a NullReferenceException.
The second snippet, on the other hand, will crash on the second line, throwing an InvalidCastException.

Now tell me, which gives you more info?

For me, the real problem with the first one is that it can mislead in more complex scenarios. Imagine a method that returns some reference type and whose last action is a cast. What would be your first thought when your consuming code throws a NullReferenceException? That was my scenario, and frankly I lost too much time analyzing good code. If only I had gotten an InvalidCastException at the proper line…
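None of this means "as" is useless. It is the right tool when a failed conversion is an expected case and you check the result immediately:

```csharp
using System;

class Program
{
    static void Main()
    {
        object x = "1.0";

        // "as" is appropriate when null is a legitimate, handled outcome.
        var v = x as Version;
        if (v == null)
            Console.WriteLine("not a Version"); // the expected path here
        else
            Console.WriteLine(v.Major);
    }
}
```

If you would not know what to do with the null, that is the sign you wanted an ordinary cast and its InvalidCastException instead.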

Beware the IEnumerable<T> Oct 31

If you see IEnumerable<T> as a read-only collection’s crippled brother, this post is for you.

Many times I have found myself losing a couple of hours looking for a bug everywhere but the place where it actually was. From time to time such a bug is caused by my simplified perception of IEnumerable. The thing is, if we do not know the mechanism that serves the elements of a specific IEnumerable, we cannot be sure that each iteration will return an equal collection with the same objects.

It is not a big deal if it hosts immutable objects. With mutable ones, however, we must be careful.

Let’s analyze the following properties:

public static IEnumerable<StringBuilder> ByArray
{
    get
    {
        return new StringBuilder[] { new StringBuilder("foo") };
    }
}

public static IEnumerable<StringBuilder> ByYield
{
    get
    {
        yield return new StringBuilder("foo");
    }
}

and code that operates on them:

var enumerable = ByArray;
enumerable.First()[0] = 'b';
Console.WriteLine(enumerable.First());

enumerable = ByYield;
enumerable.First()[0] = 'b';
Console.WriteLine(enumerable.First());

This will write 2 lines. Can you predict each of them? If not, I strongly advise you to run the code.

But that is not all. Analyze these two properties:

public static IEnumerable<StringBuilder> ByDynamicArray
{
    get
    {
        return new StringBuilder[] { new StringBuilder("foo") };
    }
}

public static IEnumerable<StringBuilder> ByStaticArray
{
    get
    {
        return fooArray;
    }
}

private static StringBuilder[] fooArray =
    new StringBuilder[] { new StringBuilder("foo") };

and their usage:

ByDynamicArray.First()[0] = 'b';
Console.WriteLine(ByDynamicArray.First());

ByStaticArray.First()[0] = 'b';
Console.WriteLine(ByStaticArray.First());

Notice that now we are operating directly on the properties, not on a local variable as in the previous example. This is also important and can cause different output.

So be careful and think twice before writing code that operates on IEnumerable<T>.

LINQ’s Zip for adjacent items Oct 30

I found that LINQ’s Zip is really great for adjacent-item computations.

For example, let’s have a collection of dates:

var dates = new DateTime[]
{
    new DateTime(2000,1,1),
    new DateTime(2000,1,2),
    new DateTime(2000,1,5)
};

How would you compute the time difference between adjacent items?

I like to use Zip for this kind of job:

dates.Zip(dates.Skip(1), (d1, d2) => d2 - d1);

As you might expect, the result will be: { 1 day, 3 days }
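One caveat: the Zip trick enumerates the source twice (once for dates and once for dates.Skip(1)), which matters if the sequence is expensive to produce or can only be read once. A single-pass alternative is a small Pairwise helper. This is my own addition, not something built into LINQ:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class AdjacentExtensions
{
    // Projects each pair of adjacent elements, enumerating the source only once.
    public static IEnumerable<TResult> Pairwise<T, TResult>(
        this IEnumerable<T> source, Func<T, T, TResult> selector)
    {
        using (var e = source.GetEnumerator())
        {
            if (!e.MoveNext()) yield break;
            var previous = e.Current;
            while (e.MoveNext())
            {
                yield return selector(previous, e.Current);
                previous = e.Current;
            }
        }
    }
}

class Program
{
    static void Main()
    {
        var dates = new[]
        {
            new DateTime(2000, 1, 1),
            new DateTime(2000, 1, 2),
            new DateTime(2000, 1, 5)
        };

        foreach (var diff in dates.Pairwise((d1, d2) => d2 - d1))
            Console.WriteLine(diff.TotalDays); // 1, then 3
    }
}
```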