Geeks With Blogs
.NET Nomad What I've learned along the way

LINQ Overview, part zero

LINQ Overview, part one (Extension Methods)

 

NOTE: This article is dedicated to Keith Elder...even if he never sent me a bologna sandwich.

Apparently, two months is my definition of "very soon".  Let's continue.

Since .NET 1.0 we've had the concept of delegates.  They are the constructs that allow us to call a method through a reference to it, such as:

delegate int AddFunc(int x, int y);
public static class MathOps
{

   public static int Add(int x, int y)
   {

      return x + y;

   }
} 
class Program
{
   static void Main(string[] args)
   {
           
      AddFunc f = new AddFunc(MathOps.Add);

      Console.WriteLine("Delegate: 2 + 2 = {0}", f(2, 2));
      Console.ReadLine();

   }
}

There is nothing new and exciting about delegates; calling a function via pointer has been around for a very long time.  In fact, delegates are actually somewhat annoying in terms of syntax.  The delegate type must be declared separately, the target must be a named method, you must wrap that method in a delegate object, etc.  Why can't we have a simpler syntax? After all, most of the time delegates are used to respond to relatively simple events or act as part of a strategy pattern (e.g. in a sort).
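To make the "strategy pattern in a sort" remark concrete, here is a minimal sketch in the same old-school delegate style.  The names SortDemo, Descending and SortDescending are made up for illustration; Comparison<int> and Array.Sort are the framework's own.

```csharp
using System;

// Hypothetical example: a named method used as the comparison strategy for a sort.
public static class SortDemo
{
    // Orders integers from largest to smallest.
    public static int Descending(int a, int b)
    {
        return b.CompareTo(a);
    }

    public static int[] SortDescending(int[] values)
    {
        // Work on a copy so the caller's array is left alone.
        int[] copy = (int[])values.Clone();

        // Comparison<int> is the delegate type Array.Sort accepts as its strategy.
        Array.Sort(copy, new Comparison<int>(Descending));
        return copy;
    }

    public static void Main()
    {
        int[] sorted = SortDescending(new int[] { 3, 1, 2 });
        Console.WriteLine("{0},{1},{2}", sorted[0], sorted[1], sorted[2]); // prints 3,2,1
    }
}
```

Notice how much ceremony there is for a one-line comparison: a named method plus an explicit delegate wrapper.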

Anonymous Methods

In honor of Bill Gates, .NET 2.0 decided to give us a kinder and gentler delegate syntax.  The main method above could easily be rewritten as:

class Program
{
   static void Main(string[] args)
   {
           
      AddFunc f = delegate(int x, int y) { return x + y; };

      Console.WriteLine("Anonymous Method: 2 + 2 = {0}", f(2, 2));
      Console.ReadLine();

   }
}

As is common on the .NET platform, the delegate keyword was overloaded to give it additional meaning.  Now one could assign to a delegate variable directly, in the current scope.  The new anonymous method syntax was similar to a method declaration.  The differences are pretty obvious, but I'll list the major ones.  Firstly, anonymous methods don't require identifiers, hence the term anonymous methods.  Secondly, anonymous methods do not need to specify a return type.  This is due to some rudimentary type inference built into the compiler.  In essence, if we already know that we are assigning to a delegate of type "AddFunc" whose return type is "int", it should be obvious to the compiler that as long as the return statements in the delegate's body return an "int" then our anonymous delegate matches the signature of "AddFunc".  The counterintuitive aspect of this is that we still have to specify the types of the anonymous method's arguments.  After all, shouldn't the compiler be smart enough to also assume the types of our "x" and "y" based on the delegate type we are assigning to?  It should be, but unfortunately it is not.  (The only shortcut is to leave the parameter list off entirely, as in delegate { ... }, which works when the body never touches the parameters.)

There is something else I want to say about anonymous methods before moving on. This is something I come across all the time and some developers just don't get: anonymous methods allow for lexical closures. 

 

Lexical Closures

There is a lot of bickering on the net about "does .NET support 'true' closures?"  Well, based on my understanding and in my opinion, it supports lexical closures or at least something close enough that for most practical purposes it doesn't matter.  I'll leave the 100% correct definition to the language lawyers and just give a quick example and some reasons why a lot of developers get caught in the lexical closure trap.

delegate int Increment();
static void Main(string[] args)
{
   
   Increment AddOne = AnonInc(0, 1);
   Increment SubOne = AnonInc(10, -1);

   for (int i = 0; i < 10; ++i)
   {

      Console.WriteLine("{0},{1}", AddOne(), SubOne());

   }
   Console.ReadLine();
}
static Increment AnonInc(int start, int by)
{

   return delegate { return start = start + by; };

}

The output of the above code should be:

1,9
2,8
3,7
4,6
5,5
6,4
7,3
8,2
9,1
10,0

First, take a look at our delegate "Increment".  It takes no arguments and returns an "int".  The idea is that delegates will somehow increment "a value" and return the next value in the sequence. 

Next, look at the method "AnonInc".  Does it return a delegate? That's crazy!  Further, it returns a delegate that makes use of something commonly referred to as "up values" or "outer variables" depending on the person/system/said person's mood.  An outer variable is simply a variable that exists in the scope that contains the delegate.  In this case, our delegate's scope is the "AnonInc" method in which the "start" and "by" arguments are implicitly defined local variables. 

Now, based on the definition of the delegate returned by "AnonInc" and the output of the program we can tell something interesting is going on here.  The question you should be asking right now is, "How is it that we are modifying the value of a local variable inside a delegate and it is keeping track of the change?"

If you recall, delegates, and therefore anonymous methods, are represented by objects.  These objects are instances of classes that are automatically generated for you at compile time.  They have funny, mangled names and you cannot really do too much with them.  The thing that one needs to know is that any outer variables used by an anonymous delegate become fields of this auto-generated class.  So, in our case if we look at the assembly generated by the above program using a tool like Reflector we should find a class like:

[CompilerGenerated]
private sealed class <>c__DisplayClass7
{
    // Fields
    public int by;
    public int start;

    // Methods
    public int <AnonInc>b__6()
    {
        return (this.start += this.by);
    }
}

As you can see, the above class has two fields with the same names as our outer variables and a method that accesses them.  Looking at the code this way kind of takes the magic out of anonymous methods and we begin to realize that it is sort of like what I said about extension methods: it is just syntactic sugar.  Handy, but not magical.
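If you want to convince yourself, you can write the sugar-free version by hand.  The sketch below is my own approximation, not the compiler's literal output: CounterClosure and ManualInc are made-up names standing in for the mangled generated class and for AnonInc.

```csharp
using System;

public delegate int Increment();

// Hand-written stand-in for the compiler-generated closure class.
// The name CounterClosure is invented; the real generated name is mangled.
public sealed class CounterClosure
{
    public int start;
    public int by;

    public int Next()
    {
        return (start += by);
    }
}

public static class ClosureDemo
{
    // Behaves like AnonInc, but the captured variables are explicit fields.
    public static Increment ManualInc(int start, int by)
    {
        CounterClosure closure = new CounterClosure();
        closure.start = start;
        closure.by = by;

        // The delegate holds a reference to the closure object, keeping its state alive.
        return new Increment(closure.Next);
    }

    public static void Main()
    {
        Increment addOne = ManualInc(0, 1);
        Console.WriteLine("{0},{1},{2}", addOne(), addOne(), addOne()); // prints 1,2,3
    }
}
```

The state survives between calls for a boring reason: the delegate references the closure object, and the closure object owns the fields.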

So, what is this trap I was talking about?  Well, it has to do with the garbage collector.  As we all know, in .NET an object lives in memory until the garbage collector decides nothing can reach it anymore.  We often describe that loosely as "goes out of scope", but it is best thought of as "until no other live object holds a reference to it".  With lexical closures happening more or less behind the scenes it is very easy to create a memory leak such as the following:

public class ResourceWrapper
{

    public void OpenOnClick(Button btnOpen, string resourcePath)
    {

        SomeResource res = new SomeResource(resourcePath);

        btnOpen.Click += delegate(object sender, EventArgs e) { res.Access(); };

    }
    
}

public class SomeResource
{

    public SomeResource(string path) { }

    public void Access() { }

}

Granted, this example is contrived, but you see similar things all the time.  So, what's going on here? Basically if we look at "OpenOnClick" we can see that an anonymous method is being registered as a handler for a button's Click event.  Further, the anonymous method is using an outer variable "res".  This means that the following class gets generated for us:

[CompilerGenerated]
private sealed class <>c__DisplayClass1
{
    // Fields
    public SomeResource res;

    // Methods
    public void <OpenOnClick>b__0(object sender, EventArgs e)
    {
        this.res.Access();
    }
}

Normally, we'd just assume that since "res" is a local variable in the "OpenOnClick" method that it'd die as soon as it ran out of scope, i.e. at the end of the method.  However, since our anonymous delegate is holding a reference to it, the object "res" is referencing will live until the anonymous delegate itself is no longer referenced, and the button's Click event keeps referencing it for as long as the button is alive.  One can easily see how this kind of situation can go bad quickly.  To avoid this situation, be careful to unregister your anonymous methods when you use them as event handlers!  In practice that means keeping the delegate in a variable, because -= needs the same delegate that was passed to +=.
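Here is a rough sketch of what unregistering an anonymous handler looks like.  FakeButton is a made-up stand-in for a real Button so the example runs without a UI framework; the pattern with a real control's Click event should be the same.

```csharp
using System;

// Hypothetical stand-in for a Button so the sketch runs without a UI framework.
public class FakeButton
{
    public event EventHandler Click;

    public void RaiseClick()
    {
        EventHandler handler = Click;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

public static class UnsubscribeDemo
{
    public static int Run()
    {
        FakeButton button = new FakeButton();
        int clicks = 0;

        // Keep the delegate in a variable; without it there is nothing to pass to -=.
        EventHandler onClick = delegate(object sender, EventArgs e) { clicks++; };

        button.Click += onClick;
        button.RaiseClick();      // handler runs, clicks is now 1

        button.Click -= onClick;  // the button no longer references the closure
        button.RaiseClick();      // no handler attached, clicks stays at 1

        return clicks;
    }

    public static void Main()
    {
        Console.WriteLine("Clicks handled: {0}", Run()); // prints Clicks handled: 1
    }
}
```

Writing the anonymous method inline a second time after -= would not work, since that creates a brand new delegate rather than the one that was registered.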

Alright, so why did I get into all of this anonymous method stuff if the post is supposed to be about Lambda Expressions? Well, because Lambda Expressions in C# are just an evolutionary step beyond anonymous methods.  Let's chip away at some of the sugar...

 

Our first Lambda

It is difficult to describe the syntax of a lambda expression in one go since it is very flexible and several parts of it are optional.  With that in mind let's look at a quick example:

AddFunc f = (x, y) => x + y;

The above snippet declares a new AddFunc delegate and assigns a lambda expression to it.  Everything to the right of the = operator is the lambda definition. 

Some questions:

  1. Where is the return type?
  2. Where is the identifier?
  3. Does (x, y) denote the parameter list?
  4. What does the => do?
  5. Why isn't there a return statement?

Some answers:

  1. Lambda expressions do not need an explicit return type.  Just like with anonymous methods the compiler is smart enough to infer the return type based on the type of delegate it is being assigned to. In this case AddFunc returns an int, and so the lambda implicitly returns an int.  Obviously it is a compiler error if the lambda does not.
  2. Lambda expressions are by definition anonymous.  They do not have identifiers.
  3. Yes.  Further, you should note that lambda parameters do not need to explicitly state their type.  This, like the return type, is inferred by the compiler by matching each parameter, in order, against the delegate's parameter list. You can, however, state the types explicitly.  (int x, int y) is a valid lambda expression parameter list.
  4. The new => operator is the start of the expression's body.  Everything after => defines what the lambda expression does.
  5. Lambda expressions don't require an explicit return statement.  When a return isn't provided the return value is assumed to be whatever the lambda expression evaluates to.

So, let's take a look at a few other valid ways to write lambda expressions:

(int x, int y) => { return x + y; };

The above is the most explicit way.  We've specified types for the parameters and a real return statement.  Notice how when we use an actual return statement we have to use the { } braces? This same syntax allows us to create multi-line lambdas and lambdas that declare local variables.

(x, y) => { return x + y; };

This one keeps the return statement and just drops the optional types in the parameter list.

() => x + y;

In the above, we've specified a lambda with an empty parameter list.  In this case we are assuming the existence of x and y as outer variables (yes, lambda expressions support lexical closures just like anonymous methods).
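The forms above are fragments, so here is a small sketch that assigns each one to a delegate and calls it.  AddFunc is the delegate from earlier in the post; NoArgFunc, LambdaForms and the locals a and b are names I made up so the parameterless form has something to close over.

```csharp
using System;

public delegate int AddFunc(int x, int y);
public delegate int NoArgFunc(); // hypothetical delegate for the parameterless form

public static class LambdaForms
{
    public static int[] Run()
    {
        // Terse form: inferred parameter types, expression body.
        AddFunc terse = (x, y) => x + y;

        // Most explicit form: typed parameters, block body with a return statement.
        AddFunc typed = (int x, int y) => { return x + y; };

        // Block body with inferred parameter types.
        AddFunc untypedBlock = (x, y) => { return x + y; };

        // Empty parameter list; a and b are outer variables captured by the lambda.
        int a = 2, b = 2;
        NoArgFunc closed = () => a + b;

        return new int[] { terse(2, 2), typed(2, 2), untypedBlock(2, 2), closed() };
    }

    public static void Main()
    {
        foreach (int result in Run())
            Console.WriteLine(result); // prints 4 four times
    }
}
```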

 

A Lambda is what you assign it to

So far we've seen that lambda expressions are compatible with delegates in the sense that you can assign a lambda directly to a delegate, but there are other interesting uses. Take a second and think about writing a program in a text editor.  To the text editor, or for that matter to the compiler, the lines of code you write are just data.  The compiler doesn't execute your program, it simply translates data from one format to another.  It is natural then to ask, "If I can store a program as data, can I load a program as data at run time and then execute it?" With lambda expressions the answer is yes.

If we assign a lambda expression to a delegate it becomes a delegate of that type.

If we assign a lambda expression to an appropriately typed Expression Tree, the compiler instead emits code that builds the equivalent tree of Expression objects.

For example:

Expression<Func<int, int, int>> exp = (x, y) => x + y;

This statement simply says, "Convert this lambda expression into an expression tree equivalent to a method that takes two integer parameters and returns the sum as an integer".

There is no resulting compilation of this tree and no execution of code as a result of this statement.  If at runtime we need to execute the function the tree represents, we must say:

Expression<Func<int, int, int>> exp = (x, y) => x + y;
var func = exp.Compile();
Console.WriteLine("{0}", func(1, 1));

Now, it isn't inherently obvious why this is cool so I'll spell it out: If the compiler can represent executable code using Expression objects, so can we.  In fact, we will do exactly that by the end of this series.
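As a small taste, here is a sketch that builds the same (x, y) => x + y tree by hand with the factory methods in System.Linq.Expressions.  ExpressionDemo and BuildAdd are names I picked for the example; this is roughly the kind of code the compiler writes for you, not a copy of its output.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionDemo
{
    public static Func<int, int, int> BuildAdd()
    {
        // The two parameters of the lambda.
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        ParameterExpression y = Expression.Parameter(typeof(int), "y");

        // The body: x + y.
        BinaryExpression body = Expression.Add(x, y);

        // Wrap the body and parameters into a typed lambda expression tree.
        Expression<Func<int, int, int>> tree =
            Expression.Lambda<Func<int, int, int>>(body, x, y);

        // Up to this point everything is data; Compile turns it into a callable delegate.
        return tree.Compile();
    }

    public static void Main()
    {
        Func<int, int, int> add = BuildAdd();
        Console.WriteLine("{0}", add(1, 1)); // prints 2
    }
}
```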

As funny as it may sound, this is all you really need to know about lambda expressions.   You can use them in place of anonymous delegates (and you should), they forced the .NET team to provide C# with something approaching real type inference, and they allow us to represent code as data in a statically type checked way.

 

LINQ Tie In

Awesome. How are Lambda Expressions useful in LINQ?  Well, by now you've read the basic LINQ syntax somewhere else as I asked so I'll just show a couple of quick examples:

static void UseLINQ()
{

    var names = new List<GenderedName> { 
        new GenderedName { Name="Bob", Gender=Gender.Boy }
        , new GenderedName { Name="Sally", Gender=Gender.Girl }
        , new GenderedName { Name="Jack", Gender=Gender.Boy }
        , new GenderedName { Name="Sarah", Gender=Gender.Girl }
        , new GenderedName { Name="Philbert", Gender=Gender.Boy }            
    };

    var boyNames = names.Where((n) => n.Gender == Gender.Boy).Select((n) => new { n.Name });

    foreach (var name in boyNames)
        Console.WriteLine("{0}", name.Name);

}

The above function queries a list of names for those that are traditionally used for boys.  In order to make use of the actual lambda expression syntax I used the method based approach to querying with LINQ.  In fact, there are two lambdas in our code:

(n) => n.Gender == Gender.Boy

This lambda is our selection criteria and simply checks whether the given name, n, is one used for boys.

(n) => new { n.Name }

In this expression we are returning a new anonymous type that just contains the Name property of the GenderedName that has passed our selection criteria.
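The samples here never show GenderedName or Gender, so if you want to run them you will need something like the following.  These definitions are my guess at the simplest shapes that fit the code, and BoyNamesDemo is a made-up wrapper; I also select the plain Name string instead of an anonymous type so the result can be returned from a method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shapes; the post does not show these definitions.
public enum Gender { Boy, Girl }

public class GenderedName
{
    public string Name { get; set; }
    public Gender Gender { get; set; }
}

public static class BoyNamesDemo
{
    public static List<string> BoyNames(List<GenderedName> names)
    {
        // Same Where criteria as above; Select projects to the name string.
        return names.Where(n => n.Gender == Gender.Boy)
                    .Select(n => n.Name)
                    .ToList();
    }

    public static void Main()
    {
        var names = new List<GenderedName> {
            new GenderedName { Name = "Bob", Gender = Gender.Boy },
            new GenderedName { Name = "Sally", Gender = Gender.Girl },
            new GenderedName { Name = "Jack", Gender = Gender.Boy }
        };

        foreach (string name in BoyNames(names))
            Console.WriteLine("{0}", name); // prints Bob, then Jack
    }
}
```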

We can simplify, or rather pretty up, this method by using the new LINQ keywords like so:

static void UseLINQ()
{

    var names = new List<GenderedName> { 
        new GenderedName { Name="Bob", Gender=Gender.Boy }
        , new GenderedName { Name="Sally", Gender=Gender.Girl }
        , new GenderedName { Name="Jack", Gender=Gender.Boy }
        , new GenderedName { Name="Sarah", Gender=Gender.Girl }
        , new GenderedName { Name="Philbert", Gender=Gender.Boy }            
    };

    var boyNames = from n in names
                   where n.Gender == Gender.Boy
                   select new { n.Name };

    foreach (var name in boyNames)
        Console.WriteLine("{0}", name.Name);

}

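It doesn't look like we are using lambda expressions here, but we really are.  The compiler turns our pretty code into the same method calls that we just used, with the where and select clauses becoming lambdas.  Because names is a List<T>, those lambdas end up as delegates; had the source been an IQueryable<T> (as in LINQ to SQL), they would have become Expression Trees for later translation and execution.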

I just want to be very explicit here and point out something.  When we are using LINQ against in-memory collections like this one, we use lambda expressions as delegates.  We know this because the Where method for IEnumerable<T> accepts arguments of the Func<T> variety.  The Func series of generic types are actually generic delegates.  For example, MSDN has the following definition for Func<T, TResult>:

public delegate TResult Func<T, TResult>(
    T arg
)

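This usage of delegates and expression trees is what allows LINQ to support Lazy Evaluation: a query only holds on to the delegates (or expression trees) it was given, and nothing actually runs until the results are enumerated.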
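A quick way to see the laziness for yourself is to change the source after defining the query.  This is only a sketch for the in-memory (delegate) case; DeferredDemo and its numbers list are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static int Run()
    {
        List<int> numbers = new List<int> { 1, 2, 3 };

        // Nothing is filtered yet; Where only remembers the source and the predicate.
        IEnumerable<int> evens = numbers.Where(n => n % 2 == 0);

        numbers.Add(4); // added after the query was defined

        // The predicate runs now, during enumeration, so the 4 is seen as well.
        return evens.Count();
    }

    public static void Main()
    {
        Console.WriteLine("Even numbers found: {0}", Run()); // prints Even numbers found: 2
    }
}
```

If Where had run eagerly, the count would have been 1, since only the 2 was in the list when the query was written.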

Posted on Tuesday, January 29, 2008 9:41 AM General .NET , LINQ


Comments on this post: LINQ Overview, part two (Lambda Expressions)

# re: LINQ Overview, part two (Lambda Expressions)
Article is interesting but horrible style (black background, white text with some blue text)
Left by Reader on Jan 29, 2008 12:48 PM

# re: LINQ Overview, part two (Lambda Expressions)
seriously its great. no words to say thank you so much
Left by dinesh kumar on Jun 19, 2009 9:00 AM

# re: LINQ Overview, part two (Lambda Expressions)
This was a great introduction, as well as a peek behind the smoke and mirrors. Thanks for all the useful information!
Left by Coop on Jul 17, 2009 10:50 AM

# re: LINQ Overview, part two (Lambda Expressions)
Very nice work, amazing fluency in explaining difficult topics! Which planet are you?
Left by Noble on Sep 03, 2009 3:39 AM

# I know its a blog but...
Hi,
Just a suggestion. It would be very nice to have a section on every article named related Articale. For example this page is Part 2 but I do not have the link to Part 1. I can search it though but it would still be nice to have a direct link.
Great Article by the way.
Left by Anks on Oct 26, 2009 12:06 AM

# re: LINQ Overview, part two (Lambda Expressions)
@Ank

As requested, I added some forward/backward links throughout this series.

@Noble

Planet Earth for now, but my realestate agent just offered me a nice plot on Jupiter ;)

@everyone

Thanks for the feedback, I've let this blog go over the last year and all this positive feedback is making me regret that!
Left by Newman on Oct 26, 2009 5:19 PM

# re: LINQ Overview, part two (Lambda Expressions)
Fantastic article. Its really worth reading this. Thank you very much for explaining us LINQ. The article was very well organised.
Left by Supreetha on Nov 05, 2009 1:50 AM

# re: LINQ Overview, part two (Lambda Expressions)
A really good, and inspiring, article. Thank you very much. It's demystified a lot - tied a lot of loose ends. Thanks.
Oh! Don't stop!
Left by debo on Dec 11, 2009 10:49 AM

# re: LINQ Overview, part two (Lambda Expressions)
Thank you so much... This is a pretty good document on Lambda - LinQ I have seen so far.

I got just what I wanted.

Kudos.
Left by Firoz Ozman on Apr 07, 2010 9:07 PM

# re: LINQ Overview, part two (Lambda Expressions)
Its fentastic, excellent, superb and all. There is no word to tell you how much help you gave me. May God bless you always.
Left by AbdulAleem on May 07, 2010 4:04 PM

# re: LINQ Overview, part two (Lambda Expressions)
I am totally new to this lambda and extensions and your article made it so simple that it takes just 15 mins to get a clear understanding! tons of thanks!
Left by Meens on May 26, 2010 3:13 PM

# re: LINQ Overview, part two (Lambda Expressions)
Incredible work! you have explained everything except the last new {n.Name} who throws out a new IEnumerable<Anonimous Type> that leaves me with the need to read more about it somewhere :P

thanks for all
Left by clasificado on Jul 09, 2010 2:16 PM

# re: LINQ Overview, part two (Lambda Expressions)
The Best.
Left by Sudeep Srivastava on Jul 23, 2010 10:14 AM

# re: LINQ Overview, part two (Lambda Expressions)
one of the best article on the above topic...thanks mate..
Left by Karan on Apr 07, 2011 1:09 PM

# re: LINQ Overview, part two (Lambda Expressions)
good for begginers who dont know what lambda expression is.thanks for the articla
Left by namratha on May 27, 2011 6:18 AM

# re: LINQ Overview, part two (Lambda Expressions)
superb stuff. Thanks.
Left by tomtom on Jan 12, 2012 11:31 AM

# re: LINQ Overview, part two (Lambda Expressions)
Thanks! easiest way to understand lambda expression linq.
Left by Sanket Joshi on Jan 12, 2012 6:28 PM



Copyright © newman | Powered by: GeeksWithBlogs.net