.NET Nomad

What I've learned along the way


Wednesday, February 13, 2008 #

Wow. Turns out this thing might be useful to more than just me.  Anyway, I've added P/Invoke calls for the WinPCap functions that are specific to Windows, as well as the packet filtering functions, along with a port of the packet filtering example.  I can't say I've tested all the calls yet, but at this point I am going to start moving up to a higher level.  I think we now have access to almost all the calls available in native WinPCap, but feel free to leave a comment if you find one that is missing.

Download Solution - Nomad.Net.PacketCapture.zip


Thursday, January 31, 2008 #

As I stated in my last post, I am currently geeking out on packet capture software.  The WireShark network analysis tool is pretty awesome and is built using the WinPCap library which is itself a port of libpcap to the Win32 environment.  The unfortunate part, or at least for us .NET developers, is that there is no (IMHO of course) good .NET binding to WinPCap.  Well, I've gone ahead and spent some time getting an initial wrapper done using P/Invoke.

I've taken a different initial strategy from the other project I found and I think it makes more sense in the long run.  The other project attempts to provide a more elegant binding by wrapping the native functions of WinPCap in a couple of simple objects.  This isn't necessarily bad; however, the author decided to hide the actual P/Invoke calls, which means if you (like me) don't like his class structure, you can't simply use the P/Invokes directly to build your own.  So, I went ahead and wrote my own set of P/Invoke calls to bind to WinPCap.

I tried to stick to the following rules:

  1. All P/Invoke method names exactly match the native function names
  2. All managed type names used in marshalling exactly match the native struct names
  3. All method parameters in the P/Invokes have the same name as in their native functions
  4. Marshal parameters as closely as possible (i.e. don't just use IntPtr for everything)

Of course, I couldn't stick to the rules 100% due to some technical limitations, but in most cases it is obvious where I deviated.

We should see the following advantages from this:

  1. All documentation already in existence for WinPCap is still 100% relevant.
  2. All C/C++ code samples using the native WinPCap can be ported (more or less) directly into C# code.
  3. Developers now have the freedom to build their own tools using WinPCap in .NET using whatever class structure they'd like

The Library

Aside from the supporting structs that are used to marshal the unmanaged types, there are only two classes you need to be concerned with in the library.

WinPCapConstants - Holds the constants translated from the #defines of the pcap.h header file

WinPCapDriver - Static class that holds all P/Invoke declarations used to bind to WinPCap.  All functions except those that are listed in the WinPCap docs as Windows specific or dealing with packet filtering are currently available.  The Windows specific functions and packet filtering functions will be added as the project progresses.
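
As a rough illustration of the rules listed earlier, a declaration in this style might look like the sketch below.  It is not copied from the library: the class names WinPCapDriverSketch and WinPCapConstantsSketch are hypothetical, and the marshalling choices (ref IntPtr, StringBuilder, LPStr strings, the cdecl calling convention) are assumptions based on the prototypes in pcap.h.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical sketch: names mirror pcap.h, not necessarily the declarations
// that ship in Nomad.Net.PacketCapture.
public static class WinPCapConstantsSketch
{
    // #define PCAP_ERRBUF_SIZE 256 in pcap.h
    public const int PCAP_ERRBUF_SIZE = 256;
}

// Managed mirror of the native "struct pcap_if" (same name as the native struct).
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
public struct pcap_if
{
    public IntPtr next;          // struct pcap_if *next
    [MarshalAs(UnmanagedType.LPStr)]
    public string name;          // char *name
    [MarshalAs(UnmanagedType.LPStr)]
    public string description;   // char *description
    public IntPtr addresses;     // struct pcap_addr *addresses
    public uint flags;           // bpf_u_int32 flags
}

public static class WinPCapDriverSketch
{
    // int pcap_findalldevs(pcap_if_t **alldevsp, char *errbuf)
    [DllImport("wpcap.dll", CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
    public static extern int pcap_findalldevs(ref IntPtr alldevsp, StringBuilder errbuf);

    // void pcap_freealldevs(pcap_if_t *alldevs)
    [DllImport("wpcap.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern void pcap_freealldevs(IntPtr alldevs);
}
```

Nothing here touches wpcap.dll until one of the methods is actually called, so the declarations compile on a machine without WinPCap installed; calling them does require it.  The error buffer would be created as new StringBuilder(WinPCapConstantsSketch.PCAP_ERRBUF_SIZE).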

 

There is also one handy extension method that I added to help with marshalling the various data structures used by WinPCap.  It is defined as follows:

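using System;
using System.Runtime.InteropServices;

namespace Nomad.Net.PacketCapture.Interop
{
    public static class IntPtrExtensions
    {

        // Copies the unmanaged structure that ptr points to into a new managed TStruct
        public static TStruct AsStruct<TStruct>(this IntPtr ptr)
        {

            return (TStruct)Marshal.PtrToStructure(ptr, typeof(TStruct));

        }

    }
}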

Normally, we'd have to write something like the following to marshal a pcap_if structure from unmanaged to managed code:

pcap_if nic = (pcap_if)Marshal.PtrToStructure(nicPointer, typeof(pcap_if));

This is pretty heinous...In C we would have been able to just cast or dereference the pointer.  With the extension method above, the code becomes more readable:

pcap_if nic = nicPointer.AsStruct<pcap_if>();

It isn't all that much shorter in terms of length, but it reads much more like English.
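
Where the extension method pays off most is when walking a native linked list, which is how WinPCap hands back its device list.  The sketch below shows the pattern without needing WinPCap at all: NativeNode is a made-up stand-in for a struct such as pcap_if, and the list is built by hand in unmanaged memory purely so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical stand-in for a native struct that links to the next element.
[StructLayout(LayoutKind.Sequential)]
public struct NativeNode
{
    public IntPtr next;
    public int value;
}

public static class IntPtrExtensions
{
    public static TStruct AsStruct<TStruct>(this IntPtr ptr)
    {
        return (TStruct)Marshal.PtrToStructure(ptr, typeof(TStruct));
    }
}

public static class LinkedListWalker
{
    // Follow the "next" pointers until we hit NULL, collecting each value.
    public static List<int> ReadValues(IntPtr head)
    {
        List<int> values = new List<int>();
        IntPtr current = head;
        while (current != IntPtr.Zero)
        {
            NativeNode node = current.AsStruct<NativeNode>();
            values.Add(node.value);
            current = node.next;
        }
        return values;
    }

    // Builds a two-element list in unmanaged memory and reads it back.
    public static List<int> Demo()
    {
        int size = Marshal.SizeOf(typeof(NativeNode));
        IntPtr first = Marshal.AllocHGlobal(size);
        IntPtr second = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.StructureToPtr(new NativeNode { next = IntPtr.Zero, value = 2 }, second, false);
            Marshal.StructureToPtr(new NativeNode { next = second, value = 1 }, first, false);
            return ReadValues(first);
        }
        finally
        {
            Marshal.FreeHGlobal(first);
            Marshal.FreeHGlobal(second);
        }
    }
}
```

With a real device list the head pointer would come from the native library instead of AllocHGlobal, and the native library's own free function would release it.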

 

Going Forward

My next step is to continue porting the original WinPCap examples from C to C#.  This will help me test the P/Invoke calls and learn more about WinPCap and how it is supposed to operate.  After that I will create my own higher-level wrapper using C# to abstract the details.  Again, the best part of all this is that if you don't like my high level code then you should still at least be able to access WinPCap via .NET, albeit in a more raw form.

The following code should work, but you first need to install WinPCap.  I also recommend you download the developer pack so you can compare the native C examples with the couple of .NET ones I've done so far. The differences are relatively minor, but important since they mostly have to do with features/limitations of mixing C and .NET.  Also, the HTML based documentation for WinPCap is still valid, and I have yet to put much in the way of comments in my code. 

Consider the following downloads as Alpha quality at best:

Download Solution - Nomad.Net.PacketCapture.zip


Wednesday, January 30, 2008 #

In my copious amount of free time I've been messing around with network analysis and security.  I've always been generally interested in networking technology, but have never really had much practical exposure to it.  Sometimes, however, it is nice to be able to analyze a network and see what kind of information is actually coming across the wire.  In my last article I mentioned a tool called WireShark which is a free, open source network analyzer aka packet sniffer.

WireShark is a great tool and has its own set of extension points, but I wanted lower level access to the packets being captured.  My understanding of the politics and genesis is lacking, but it seems like the WinPCap library is the Windows version of the libpcap packet capture library from the *NIX world.  Naturally, WinPCap is coded in C and even though I have some background in it, the tool I am looking to develop requires a lot of UI work.  Instead of stepping back into the land of MFC/Win32, I tried to locate a Managed version of WinPCap.  The closest thing I could find was this Ancient Project on CodeProject.com.  It hasn't been updated since 2003 and isn't a "fully" managed wrapper (also, the source code in the download is just to the example, not the wrapper).

I figured, "If this guy can do PInvoke, so can I".  Thus, I downloaded the WinPCap developer pack and attempted to open the example solution in Visual Studio 2008.  Visual Studio 2008 alerted me to the fact that I had to upgrade the project (which was actually a VS 6.0 .dsw file) and I happily agreed.  The upgrade went smoothly, so I attempted to compile the solution, but received the following error:

"error C3163: '_vsnprintf': attributes inconsistent with previous declaration    c:\program files\Microsoft visual studio 9.0\vc\include\stdio.h    358    savedump"

Crap. Apparently this is a common problem when compiling older C++ code with the Visual Studio 2008 C++ compiler.  Now, I didn't find a solution for this on the net specific to WinPCap, but several forum posts across other projects led me to the following solution.

First, find the pcap-stdinc.h file on your system. It should be located in: "...\WpdPack_4_0_2\WpdPack\Include"

Next, locate the following code near the bottom of your header:

#define snprintf _snprintf
#define vsnprintf _vsnprintf
#define inline __inline

The problem, as we can tell from the compiler error, is that the "#define vsnprintf _vsnprintf" causes some incompatibilities with what is already in stdio.h.  Modify your code to the following and save the header:

#define snprintf _snprintf

#if !defined( __MINGW32__ )
# if _MSC_VER < 1500
    #define vsnprintf _vsnprintf
# endif
#endif

#define inline __inline

You should now be able to compile all the examples in the solution!

All that we've done is check the version of the compiler at compile time.  If _MSC_VER is below 1500, i.e. the compiler is older than Visual C++ 9.0 (the one that ships with Visual Studio 2008), then we go ahead and do the #define.  Otherwise, we don't do the #define and rely on what is in stdio.h.

This solution is general in nature, i.e. any code that #defines vsnprintf as _vsnprintf may exhibit this issue, but specific in the sense that the exact location of the code to modify will vary by project.  In the case of WinPCap, everything is groovy at this point.  Now I just need to learn everything I can about PInvoke : )

 

Tuesday, January 29, 2008 #

Download Solution - OfflineHtml.zip

So, one of the cool controls available to us in WinForms is System.Windows.Forms.WebBrowser.

The WebBrowser control is essentially a managed wrapper around some COM interfaces that bind to Internet Explorer and provides us with several interesting capabilities.  First of all, one can use WebBrowser to easily display a web page in a WinForms application.  All you have to do is set the WebBrowser.Url property and the control takes care of getting the assets from across the wire and rendered on the screen.

WebBrowser also exposes some interesting events that allow a programmer to react when a document is loaded, navigation is performed, etc.  There are probably a ton of places, including MSDN, where you can get that kind of information so I won't go over it here.  Instead, I am going to show something that isn't immediately obvious, but that I believe I found a clean solution to.

 

The Task

What I want to do is load an HTML page that is on my local computer without causing any network traffic, e.g. it won't load images on the page.  This is similar to, say, loading a web archive in Internet Explorer.  For our purposes let's use the Google Home Page as an example.

 

The First Attempt

I immediately set upon this task thinking it would be pretty easy.  From what I had gathered on MSDN, after loading a page the WebBrowser control's Document property is populated with an HtmlDocument object.  Similar to System.Xml.XmlDocument, HtmlDocument is a tree like representation of the web page's HTML DOM and it exposes some handy properties for manipulating the HTML elements rendered by the WebBrowser control.  For example, the following code demonstrates setting all of the "src" attributes of the HtmlDocument's img tags to the empty string:

public HtmlDocument StripImageLoading(HtmlDocument document)
{

    foreach (HtmlElement image in document.Images)
       image.SetAttribute("src", string.Empty);
            
    return document;

}

Iterating over the various HtmlElementCollection objects exposed through HtmlDocument's properties allows one to alter, and even add, HTML elements. 

This is great, but how do we actually get the WebBrowser control to load an HtmlDocument for us?  There are three primary methods, each of which I'll demonstrate with a code snippet.

 

Setting WebBrowser.Url:

public void LoadPage()
{

    WebBrowser browser = new WebBrowser();
    browser.Url = new Uri("http://www.google.com");

}

The "primary" way to load a page is to set WebBrowser.Url to a valid Uri object.  When this is done the WebBrowser will get all required data for the page via HTTP and render the results into our HtmlDocument (accessible via the WebBrowser.Document property).

 

Setting WebBrowser.DocumentText:

public void LoadPage()
{

    WebBrowser browser = new WebBrowser();
    browser.DocumentText = @"<html><img src=""http://www.domain.com/someimage.gif"" /></html>";
}

This is the first method that would enable us to achieve our offline viewing goal.  We simply set WebBrowser.DocumentText with a string of HTML and the control uses that to render the page.  The issue with this method is that any HREF or SRC attributes will be resolved by the WebBrowser control.  In other words, in the above example the image file referenced in our <img> tag will actually be downloaded and rendered into the page on the screen.  This is clearly not what we want.

 

Setting WebBrowser.DocumentStream:

public void LoadPage()
{

    WebBrowser browser = new WebBrowser();
    FileStream source = new FileStream(@"C:\page.html", FileMode.Open, FileAccess.Read);

    browser.DocumentStream = source;

}

This method allows us to access our page as a Stream.  The WebBrowser control will load the data from the Stream and again, render it into an HtmlDocument object.  Like the DocumentText property, however, it will resolve any HREF or SRC attributes and get the resources from the web.

 

The Hurdle

What we need to do at this point should be clear: we need to somehow modify the HtmlDocument prior to the WebBrowser control rendering it on the screen.  I figured there would be an event exposed for this, seemingly obvious, desire.  I looked into the following events, hoping for a quick solution:

WebBrowser.DocumentCompleted - This event is fired AFTER the page is fully rendered, so it unfortunately doesn't help us.  We can still modify the HtmlDocument at this point, but since any referenced resources have already been downloaded, it is of little value in our situation.

WebBrowser.ProgressChanged - This event is fired as the page and its resources are being gathered.  It is fired asynchronously, so be very careful when using it.  That being said, I figured initially that I could wait for progress to be 100% and then I'd modify the document.  Unfortunately, this too did not work.

WebBrowser.FileDownload - Aside from DocumentCompleted, this seemed the most promising.  After all, perhaps I can check to see if the file being downloaded is an image, and if so, simply cancel the download.  No, that won't work because the FileDownload event simply takes an "EventArgs" parameter and therefore gives us no meaningful state on which to operate.

 

So, at this point we have no way of using events to accomplish our task.  We have to find another way.  As most developers do, I scanned the net to find out if this problem had already been cracked.  I didn't find an exact solution, but I did find something that helped at least spark my imagination.  I point you now to the blog of Jim Holmes.  I kind of know Jim a little from when I lived in Ohio and went to a few Dayton .NET Users Group meetings (of which Jim was/is the President).  Now, Jim is a very smart guy (in fact he has a great O'Reilly Book out right now) so I'm not sure what happened, but in his article I think he makes a few mistakes about how the WebBrowser control works and I will point those out when we come to them.  Like  I said though, his article at least sparked something in my mind: How do I get an empty HtmlDocument without going through the WebBrowser control?

 

The Solution

What we want is to load an HTML page from the local system without causing any actual network traffic.  To keep our example simple let's just say we don't want images to load at all.  My solution, for .NET 3.0/3.5 at least, is to introduce an Extension Method for the WebBrowser control that allows us to arbitrarily "filter" the HtmlDocument prior to loading it.  The entire solution is available for download at the beginning of this article, so I've chunked it up a bit for display purposes:

public static class WebBrowserExtensions
{

    /// <summary>
    /// Load an HTML document from a Stream and pass the text through a filter before the page is
    /// rendered in the WebBrowser control.
    /// </summary>
    /// <param name="browser">control that renders the filtered HTML</param>
    /// <param name="source">Stream containing the content to filter and render</param>
    /// <param name="filter">Delegate used to filter the source Stream</param>
    public static void ProcessRequest(this WebBrowser browser, Stream source, Func<HtmlDocument, HtmlDocument> filter)

 

As we know, Extension Methods must be defined in static classes, as static methods.  You can see the prototype for the ProcessRequest extension above.  Besides the WebBrowser being extended, it takes two parameters: a Stream object that contains the "source" of the page and a delegate that takes an HtmlDocument and returns a modified HtmlDocument.

    using (WebBrowser tempBrowser = new WebBrowser())
    {

        //all data from the source as a string
        string sourceText = string.Empty;

        try
        {

            //read all the data from the source Stream
            using (StreamReader sourceReader = new StreamReader(source))
            {

                sourceText = sourceReader.ReadToEnd();

            }

        }
        catch (IOException ex)
        {

            throw new Exception("Could not read data from source stream", ex);

        }

It is important to note that the WebBrowser control is an absolute resource hog, so please use a using statement or other disposal pattern to properly clean it up.  Also, we could have performed all of the operations in this method using the WebBrowser control we were given, but the drawback to that is the control would fire any registered event handlers.  We want our manipulation of the HtmlDocument to be as seamless as possible, and thus we operate on a temporary WebBrowser control. 

The above chunk of code also performs the mundane task of reading the entire Stream into a string and propagating any exceptions up the stack.

        //process any text we read from the source Stream
        if (!string.IsNullOrEmpty(sourceText))
        {

            HtmlDocument tempDocument = null;
            HtmlElement htmlRoot = null;            
            
            //navigate to "about:blank" to initialize an empty document
            tempBrowser.Navigate("about:blank");

Now, the above code contains something that Jim tells us to do, which is navigate our browser to "about:blank".  As Jim states, correctly, this causes the HtmlDocument object to be created and initially empty.  Exactly what we want, in fact.  However, Jim also seems to imply that this step is always necessary prior to setting either the WebBrowser.DocumentText or WebBrowser.DocumentStream properties.  It isn't: as the MSDN Documentation for DocumentText points out, WebBrowser will automatically navigate to "about:blank" each and every time either of these properties is set.

The reason that we are doing this is that we don't WANT to set DocumentText.  Remember, that will cause all of our resources to be loaded!  All we are trying to do is get an empty HtmlDocument object!

            //load the sourceText into the document.
            tempBrowser.Document.Write(sourceText);

Now that we have navigated to "about:blank", we can use the WebBrowser.Document property to access an empty HtmlDocument.  Further, we can use the HtmlDocument.Write method to populate the document with our HTML.  This is looking pretty nice so far!

            //now filter the document if a filter was specified
            if(filter != null)
                tempDocument = filter(tempBrowser.Document);

            //if the filter did not return a document, or no filter was specified, use the original document
            if (tempDocument == null)
                tempDocument = tempBrowser.Document;

The code from here on out is pretty standard.  We are applying any filter we've been given and keeping track of our temporary HtmlDocument object as it is being modified.

 

            //find the root HTML element, there can be only one!
            var htmlElements = tempDocument.GetElementsByTagName("html");

            if (htmlElements != null && htmlElements.Count > 0)
                htmlRoot = htmlElements[0];

            //now, extract the text and set it on the actual browser
            //(skip this if no <html> element was found, rather than dereference null)
            if (htmlRoot != null)
                browser.DocumentText = htmlRoot.OuterHtml;

To wrap this method up, we get the root <html> tag and then set the WebBrowser.DocumentText property of the WebBrowser control we were given to the <html> tag's OuterHtml (i.e. everything in the document including HTML tags and content).

By setting the DocumentText property, we are forcing the WebBrowser control to load our modified document.  We have accomplished our goal.  We can now modify the HtmlDocument BEFORE it gets rendered.

The Final Bits

For the sake of completeness, let's use the StripImageLoading method we created earlier to modify a "local" page:

public partial class MainForm : Form
{
    public MainForm()
    {

        InitializeComponent();

        //get google's home page
        FileStream source = new FileStream(@"C:\Development\VS2008\OfflineHtml\google.html", FileMode.Open, FileAccess.Read);

        //process the request
        mainBrowser.ProcessRequest(source, StripImageLoading);

    }

    public HtmlDocument StripImageLoading(HtmlDocument document)
    {

        foreach (HtmlElement image in document.Images)
            image.SetAttribute("src", string.Empty);
        
        return document;

    }

}

The above class opens a saved HTML file that contains the source HTML of the Google home page.  It then uses our ProcessRequest extension method to filter the HtmlDocument using the StripImageLoading method as its delegate.  The result when you run the code should be a missing image on the page. If you want to, go download a network analyzer like WireShark to confirm that no HTTP requests are being made as a result of rendering the page.
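
Since the filter is just a Func<HtmlDocument, HtmlDocument>, nothing stops us from chaining several of them.  The helper below is a small sketch of that idea and is not part of the download; FilterChain and Compose are hypothetical names.  It is written generically so it can be tried out without a WebBrowser on hand.

```csharp
using System;

public static class FilterChain
{
    // Combine several filters into one; each filter receives the result of the one before it.
    public static Func<T, T> Compose<T>(params Func<T, T>[] filters)
    {
        return delegate(T input)
        {
            T current = input;
            foreach (Func<T, T> filter in filters)
            {
                // A null entry is simply skipped
                if (filter != null)
                    current = filter(current);
            }
            return current;
        };
    }
}
```

Assuming a second filter method existed, say a hypothetical StripScripts with the same signature as StripImageLoading, the call in MainForm would become mainBrowser.ProcessRequest(source, FilterChain.Compose<HtmlDocument>(StripImageLoading, StripScripts)).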

 

Summary

The WebBrowser control is pretty cool.  It has a lot of useful features out of the box and is quite extensible.  In this article you've seen its basic usage and a slightly more advanced scenario for which new .NET 3.5 capabilities provide an extremely clean solution.  In fact, it is probably the first time I've really gotten an "Oh yeah! This feels right" when using extension methods outside of LINQ.  Of course, pretty much the same code will work in a .NET 2.0 environment; you'll just have to remove the "this" modifier in front of the first parameter and call ProcessRequest as a vanilla static method, i.e. WebBrowserExtensions.ProcessRequest(mainBrowser, source, StripImageLoading).  You'll also need to declare your own delegate type in place of Func<HtmlDocument, HtmlDocument>, since Func lives in the .NET 3.5 System.Core assembly, and swap the var for an explicit HtmlElementCollection if you are on the C# 2.0 compiler.


A couple of readers (at least one of which was thankfully vocal) complained about the blog's style.  I agree it is/was/will be pretty lame.  I am using the templates provided by geekswithblogs and I don't really have the time to create my own yet.  Let's see how things progress with this new updated style.

-Newman


NOTE: This article is dedicated to Keith Elder...even if he never sent me a bologna sandwich.

Apparently, two months is my definition of "very soon".  Let's continue.

Since .NET 1.0 we've had the concept of delegates.  They are the constructs that allow us to call a method through a reference to it, such as:

using System;

delegate int AddFunc(int x, int y);
public static class MathOps
{

   public static int Add(int x, int y)
   {

      return x + y;

   }
} 
class Program
{
   static void Main(string[] args)
   {
           
      AddFunc f = new AddFunc(MathOps.Add);

      Console.WriteLine("Delegate: 2 + 2 = {0}", f(2, 2));
      Console.ReadLine();

   }
}

There is nothing new and exciting about delegates as calling a function via pointer has been around for a very long time.  In fact, delegates are actually somewhat annoying in terms of syntax.  Each one must be declared as its own named type, you must wrap the target method in a delegate object, etc.  Why can't we have a simpler syntax? After all, most of the time delegates are used to respond to relatively simple events or act as part of a strategy pattern (e.g. in a sort).
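
To make the "strategy in a sort" case concrete, here is a minimal sketch using the same explicit wrapping syntax.  LengthSorter and its methods are made-up names; Comparison<T> and the Array.Sort overload that accepts it are framework types that arrived with .NET 2.0, so treat this as an illustration of the pattern rather than 1.x-era code.

```csharp
using System;

public static class LengthSorter
{
    // The comparison strategy: shorter strings sort first.
    public static int CompareByLength(string left, string right)
    {
        return left.Length.CompareTo(right.Length);
    }

    public static string[] SortByLength(string[] words)
    {
        // Sort a copy so the caller's array is left untouched
        string[] copy = (string[])words.Clone();

        // Explicitly wrap the method in a delegate object, as in the AddFunc example
        Array.Sort(copy, new Comparison<string>(CompareByLength));

        return copy;
    }
}
```

Swapping in a different strategy means passing a different method to the Comparison<string> constructor; the sort itself never changes.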

Anonymous Methods

In honor of Bill Gates, .NET 2.0 decided to give us a kinder and gentler delegate syntax.  The main method above could easily be rewritten as:

class Program
{
   static void Main(string[] args)
   {
           
      AddFunc f = delegate(int x, int y) { return x + y; };

      Console.WriteLine("Anonymous Method: 2 + 2 = {0}", f(2, 2));
      Console.ReadLine();

   }
}

As is common on the .NET platform the delegate keyword was overloaded to give it additional meaning.  Now one could assign to a delegate variable directly, in the current scope.  The new anonymous method syntax was similar to a method declaration.  The differences are pretty obvious, but I'll list the major ones.  Firstly, anonymous methods don't require identifiers, hence the term "anonymous method".  Secondly, anonymous methods do not need to specify a return type.  This is due to some rudimentary type inference built into the compiler.  In essence, if we already know that we are assigning to a delegate of type "AddFunc" whose return type is "int", it should be obvious to the compiler that as long as the return statements in the delegate's body return an "int" then our anonymous delegate matches the signature of "AddFunc".  The counterintuitive aspect of this is that we still have to specify the types of the anonymous method's arguments.  After all, shouldn't the compiler be smart enough to also assume the types of our "x" and "y" based on the delegate type we are assigning to?  It should be, but unfortunately it is not.
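
There is one shortcut worth knowing: if the body never touches its arguments, the parameter list can be left off entirely.  The sketch below puts the two forms side by side; AnonymousForms and its method names are invented for the example.

```csharp
using System;

public delegate int AddFunc(int x, int y);

public static class AnonymousForms
{
    public static AddFunc Explicit()
    {
        // Parameter types must be spelled out when the parameters are used
        return delegate(int x, int y) { return x + y; };
    }

    public static AddFunc IgnoresArguments()
    {
        // With no parameter list at all, the anonymous method is compatible with
        // AddFunc, but it has no way to read the two arguments it is called with
        return delegate { return 42; };
    }
}
```

The parameterless form is mostly handy for event handlers that care neither about the sender nor the event arguments.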

There is something else I want to say about anonymous methods before moving on. This is something I come across all the time and some developers just don't get: anonymous methods allow for lexical closures. 

 

Lexical Closures

There is a lot of bickering on the net about "does .NET support 'true' closures?"  Well, based on my understanding and in my opinion, they support lexical closures or at least something close enough that for most practical purposes it doesn't matter.  I'll leave the 100% correct definition to the language lawyers and just give a quick example and some reasons why a lot of developers get caught in the lexical closure trap.

using System;

delegate int Increment();

class Program
{
   static void Main(string[] args)
   {

      Increment AddOne = AnonInc(0, 1);
      Increment SubOne = AnonInc(10, -1);

      for (int i = 0; i < 10; ++i)
      {

         Console.WriteLine("{0},{1}", AddOne(), SubOne());

      }
      Console.ReadLine();
   }

   static Increment AnonInc(int start, int by)
   {

      //start and by are captured by the anonymous method and outlive this call
      return delegate { return start = start + by; };

   }
}

The output of the above code should be:

1,9
2,8
3,7
4,6
5,5
6,4
7,3
8,2
9,1
10,0

First, take a look at our delegate "Increment".  It takes no arguments and returns an "int".  The idea is that delegates will somehow increment "a value" and return the next value in the sequence. 

Next, look at the method "AnonInc".  Does it return a delegate? That's crazy!  Further, it returns a delegate that makes use of something commonly referred to as "up values" or "outer variables" depending on the person/system/said person's mood.  An outer variable is simply a variable that exists in the scope that contains the delegate.  In this case, our delegate's scope is the "AnonInc" method in which the "start" and "by" arguments are implicitly defined local variables. 

Now, based on the definition of the delegate returned by "AnonInc" and the output of the program we can tell something interesting is going on here.  The question you should be asking right now is, "How is it that we are modifying the value of a local variable inside a delegate and it is keeping track of the change?"

If you recall, delegates, and therefore anonymous methods, are represented by objects.  When an anonymous method uses outer variables, the compiler also generates a class at compile time to hold them, and the delegate points at a method on an instance of that class.  These classes have funny, mangled names and you can not really do too much with them.  The thing that one needs to know is that any outer variables used by an anonymous delegate become fields of this auto-generated class.  So, in our case if we look at the assembly generated by the above program using a tool like Reflector we should find a class like:

[CompilerGenerated]
private sealed class <>c__DisplayClass7
{
    // Fields
    public int by;
    public int start;

    // Methods
    public int <AnonInc>b__6()
    {
        return (this.start += this.by);
    }
}

As you can see, the above class has two fields with the same names as our outer variables and a method that accesses them.  Looking at the code this way kind of takes the magic out of anonymous methods and we begin to realize that it is sort of like what I said about extension methods, it is just syntactic sugar.  Handy, but not magical.
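
To drive the point home, here is roughly what we would have to write by hand to get the same behaviour without an anonymous method.  This is only a sketch of the idea; IncrementState and ManualClosure are invented names and the real compiler output differs in its details.

```csharp
using System;

public delegate int Increment();

// Hand-written stand-in for the compiler-generated class
public sealed class IncrementState
{
    public int start;
    public int by;

    public int Next()
    {
        return (this.start += this.by);
    }
}

public static class ManualClosure
{
    public static Increment AnonInc(int start, int by)
    {
        // Copy the "outer variables" into fields of a heap object
        IncrementState state = new IncrementState();
        state.start = start;
        state.by = by;

        // The delegate references state, so state lives as long as the delegate does
        return new Increment(state.Next);
    }
}
```

Each call to AnonInc creates its own IncrementState, which is why AddOne and SubOne in the earlier program count independently.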

So, what is this trap I was talking about?  Well, it has to do with the garbage collector.  As we all know, in .NET an object stays in memory for as long as something still holds a reference to it; only once it is unreachable can the garbage collector reclaim it.  A local variable "going out of scope" normally removes such a reference.  With lexical closures happening more or less behind the scenes it is very easy to keep an object alive far longer than intended, which is effectively a memory leak, such as the following:

public class ResourceWrapper
{

    public void OpenOnClick(Button btnOpen, string resourcePath)
    {

        SomeResource res = new SomeResource(resourcePath);

        btnOpen.Click += delegate(object sender, EventArgs e) { res.Access(); };

    }
    
}

public class SomeResource
{

    public SomeResource(string path) { }

    public void Access() { }

}

Granted, this example is contrived, but you see similar things all the time.  So, what's going on here? Basically if we look at "OpenOnClick" we can see that an anonymous method is being registered as the Click event handler for a button.  Further, the anonymous method is using an outer variable "res".  This means that the following class gets generated for us:

[CompilerGenerated]
private sealed class <>c__DisplayClass1
{
    // Fields
    public SomeResource res;

    // Methods
    public void <OpenOnClick>b__0(object sender, EventArgs e)
    {
        this.res.Access();
    }
}

Normally, we'd just assume that since "res" is a local variable in the "OpenOnClick" method that it'd die as soon as it ran out of scope, i.e. at the end of the method.  However, since our anonymous delegate is holding a reference to it, the object "res" is referencing will live until the anonymous delegate itself is no longer reachable, and the button's Click event keeps the delegate reachable for as long as the button is alive.  One can easily see how this kind of situation can go bad quickly.  To avoid this situation, be careful to unregister your anonymous methods when you use them as event handlers!  Note that this means holding on to the delegate instance, since -= needs the same delegate that was passed to +=.
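
A minimal sketch of that pattern follows.  Publisher is a hypothetical stand-in for a control such as Button, and Subscriber plays the role of ResourceWrapper; the only important detail is the handler field that remembers the delegate so it can be removed later.

```csharp
using System;

// Hypothetical stand-in for a control such as Button
public class Publisher
{
    public event EventHandler Click;

    public void RaiseClick()
    {
        EventHandler handler = Click;
        if (handler != null)
            handler(this, EventArgs.Empty);
    }
}

public class Subscriber
{
    private readonly Publisher publisher;
    private EventHandler handler;   // kept so the same delegate instance can be removed later
    public int Calls;

    public Subscriber(Publisher publisher)
    {
        this.publisher = publisher;
    }

    public void Attach()
    {
        Detach();   // never leave an older handler registered
        handler = delegate(object sender, EventArgs e) { Calls++; };
        publisher.Click += handler;
    }

    public void Detach()
    {
        if (handler != null)
        {
            publisher.Click -= handler;
            handler = null;
        }
    }
}
```

Once Detach has run, the publisher no longer references the delegate, so neither the delegate nor anything it captured is kept alive by the event.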

Alright, so why did I get into all of this anonymous method stuff if the post is supposed to be about Lambda Expressions? Well, because Lambda Expressions in C# are just an evolutionary step beyond anonymous methods.  Let's chip away at some of the sugar...

 

Our first Lambda

It is difficult to describe the syntax of a lambda expression in one go since it comes in several forms, and which one you use depends on multiple factors.  With that in mind let's look at a quick example:

AddFunc f = (x, y) => x + y;

The above snippet declares a new AddFunc delegate and assigns a lambda expression to it.  Everything to the right of the = operator is the lambda definition. 

Some questions:

  1. Where is the return type?
  2. Where is the identifier?
  3. Does (x, y) denote the parameter list?
  4. What does the => do?
  5. Why isn't there a return statement?

Some answers:

  1. Lambda expressions do not need an explicit return type.  Just like with anonymous methods the compiler is smart enough to infer the return type based on the type of delegate it is being assigned to. In this case AddFunc returns an int, and so the lambda implicitly returns an int.  Obviously it is a compiler error if the lambda does not.
  2. Lambda expressions are by definition anonymous.  They do not have identifiers.
  3. Yes.  Further, you should note that lambda parameters do not need to explicitly state their type.  This, like the return type, is inferred by the compiler based on their order compared to the delegate's parameters list. You can, however, state the types explicitly.  (int x, int y) is a valid lambda expression parameter list.
  4. The new => operator is the start of the expression's body.  Everything after => defines what the lambda expression does.
  5. Lambda expressions don't require an explicit return statement.  When a return isn't provided the return value is assumed to be whatever the lambda expression evaluates to.

So, let's take a look at a few other valid ways to write lambda expressions:

(int x, int y) => { return x + y; };

The above is the most explicit way.  We've specified types for the parameters and a real return statement.  Notice how when we use an actual return statement we have to wrap the body in { } braces? This same syntax allows us to create multi-line lambdas and lambdas that declare local variables.

(x, y) => { return x + y; };

This one keeps the return statement and just drops the optional types in the parameter list.

() => x + y;

In the above, we've specified a lambda with an empty parameter list.  In this case we are assuming the existence of x and y as outer variables (yes, lambda expressions support lexical closures just like anonymous methods).
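To see these forms side by side, here is a minimal sketch that assigns each one to a delegate and calls it.  The AddFunc declaration is an assumption (the post only tells us it takes two ints and returns an int), and the outer variables are named a and b rather than x and y so they don't collide with the lambda parameters declared in the same method.

```csharp
using System;

// Assumed declaration: the post only says AddFunc takes two ints and returns an int.
public delegate int AddFunc(int x, int y);

public static class LambdaForms
{
    public static int[] Evaluate()
    {
        // C# 2.0 anonymous method, for comparison.
        AddFunc viaAnonymousMethod = delegate(int x, int y) { return x + y; };

        // Implicitly typed parameters, expression body.
        AddFunc terse = (x, y) => x + y;

        // Explicitly typed parameters, statement body with a return.
        AddFunc explicitForm = (int x, int y) => { return x + y; };

        // Implicitly typed parameters, statement body with a return.
        AddFunc mixed = (x, y) => { return x + y; };

        // Empty parameter list: a and b are captured outer variables.
        int a = 2, b = 3;
        Func<int> closure = () => a + b;

        return new int[]
        {
            viaAnonymousMethod(2, 3), terse(2, 3), explicitForm(2, 3), mixed(2, 3), closure()
        };
    }

    public static void Main()
    {
        foreach (int result in Evaluate())
            Console.WriteLine(result);   // each line prints 5
    }
}
```

All five delegates compute the same sum; the only difference is how much of the signature and body we spell out ourselves.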

 

A Lambda is what you assign it to

So far we've seen that lambda expressions are compatible with delegates in the sense that you can assign a lambda directly to a delegate, but there are other interesting uses. Take a second and think about writing a program in a text editor.  To the text editor, or for that matter to the compiler, the lines of code you write are just data.  The compiler doesn't execute your program, it simply translates data from one format to another.  It is natural then to ask, "If I can store a program as data, can I load a program as data at run time and then execute it?" With lambda expressions the answer is yes.

If we assign a lambda expression to a delegate it becomes a delegate of that type.

If we assign a lambda expression to an appropriately typed Expression Tree it gets converted at compile time to equivalent Expression objects.

For example:

Expression<Func<int, int, int>> exp = (x, y) => x + y;

This statement simply says, "Convert this lambda expression into an expression tree equivalent to a method that takes two integer parameters and returns the sum as an integer".

There is no resulting compilation of this tree and no execution of code as a result of this statement.  If at runtime we need to execute the function the tree represents, we must say:

Expression<Func<int, int, int>> exp = (x, y) => x + y;
var func = exp.Compile();
Console.WriteLine("{0}", func(1, 1));

Now, it isn't inherently obvious why this is cool so I'll spell it out: If the compiler can represent executable code using Expression objects, so can we.  In fact, we will do exactly that by the end of this series.
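As a small taste of what that looks like, here is a sketch that builds the same (x, y) => x + y tree by hand with the factory methods on System.Linq.Expressions.Expression and then compiles it.  The class and method names (ManualTree, BuildAdder) are made up for illustration; the series may end up structuring this differently.

```csharp
using System;
using System.Linq.Expressions;

public static class ManualTree
{
    public static Func<int, int, int> BuildAdder()
    {
        // The two parameters, equivalent to the (x, y) in the lambda.
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        ParameterExpression y = Expression.Parameter(typeof(int), "y");

        // The body: x + y.
        BinaryExpression body = Expression.Add(x, y);

        // Wrap body and parameters into a typed lambda expression.
        Expression<Func<int, int, int>> tree =
            Expression.Lambda<Func<int, int, int>>(body, x, y);

        // Up to here the tree is only data; Compile() turns it into a delegate.
        return tree.Compile();
    }

    public static void Main()
    {
        Func<int, int, int> add = BuildAdder();
        Console.WriteLine(add(1, 1));   // prints 2
    }
}
```

Since the tree is assembled at run time, the operator or the operands could just as easily come from user input or a configuration file.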

As funny as it may sound, this is all you really need to know about lambda expressions.   You can use them in place of anonymous delegates (and you should), they forced the .NET team to provide C# with something approaching real type inference, and they allow us to represent code as data in a statically type checked way.

 

LINQ Tie In

Awesome. How are Lambda Expressions useful in LINQ?  Well, by now you've read the basic LINQ syntax somewhere else as I asked so I'll just show a couple of quick examples:

static void UseLINQ()
{

    var names = new List<GenderedName> { 
        new GenderedName { Name="Bob", Gender=Gender.Boy }
        , new GenderedName { Name="Sally", Gender=Gender.Girl }
        , new GenderedName { Name="Jack", Gender=Gender.Boy }
        , new GenderedName { Name="Sarah", Gender=Gender.Girl }
        , new GenderedName { Name="Philbert", Gender=Gender.Boy }            
    };

    var boyNames = names.Where((n) => n.Gender == Gender.Boy).Select((n) => new { n.Name });

    foreach (var name in boyNames)
        Console.WriteLine("{0}", name.Name);

}

This above function queries a list of names for those that are traditionally used for boys.  In order to make use of the actual lambda expression syntax I used the method based approach to querying with LINQ.  In fact, there are two lambdas in our code:

(n) => n.Gender == Gender.Boy

This lambda is for our selection criteria and simply compares the given name, n, to see if it is used for boys. 

(n) => new { n.Name }

In this expression we are returning a new anonymous type that just contains the Name property of the GenderedName that has passed our selection criteria.

We can simplify, or rather pretty up, this method by using the new LINQ keywords as so:

static void UseLINQ()
{

    var names = new List<GenderedName> { 
        new GenderedName { Name="Bob", Gender=Gender.Boy }
        , new GenderedName { Name="Sally", Gender=Gender.Girl }
        , new GenderedName { Name="Jack", Gender=Gender.Boy }
        , new GenderedName { Name="Sarah", Gender=Gender.Girl }
        , new GenderedName { Name="Philbert", Gender=Gender.Boy }            
    };

    var boyNames = from n in names
                   where n.Gender == Gender.Boy
                   select new { n.Name };

    foreach (var name in boyNames)
        Console.WriteLine("{0}", name.Name);

}

It doesn't look like we are using lambda expressions here, but we really are.  The compiler simply turns our pretty code into the same method calls that we just used, lambdas included.  What those lambdas become depends on the data source: for an in-memory collection like our List they are compiled to delegates, while for a provider built on IQueryable<T> they become an Expression Tree for later translation and execution.

I just want to be very explicit here and point out something.  When we are using LINQ against in-memory collections we use lambda expressions as delegates.  We know this because the Where method accepts an argument of the Func variety, in this case a Func<TSource, bool>.  The Func series of generic types are actually generic delegates.  For example, MSDN has the following definition for Func<T, TResult>:

public delegate TResult Func<T, TResult>(
    T arg
)

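This usage of delegates and expression trees is part of what allows LINQ to support Lazy Evaluation: the query only holds on to the lambda, and nothing calls it until the results are actually enumerated.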
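A quick way to convince yourself of the lazy part is to count how often the predicate runs.  The sketch below uses a plain list of ints rather than the GenderedName type so that it stands on its own; the DeferredDemo class and its method name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static int[] CountPredicateCalls()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5 };
        int calls = 0;

        // Building the query stores the lambda but does not invoke it.
        IEnumerable<int> evens = numbers.Where(n => { calls++; return n % 2 == 0; });
        int callsAfterBuild = calls;

        // Enumerating the query invokes the lambda once per source element.
        List<int> materialized = evens.ToList();
        int callsAfterEnumerate = calls;

        return new int[] { callsAfterBuild, callsAfterEnumerate, materialized.Count };
    }

    public static void Main()
    {
        int[] counts = CountPredicateCalls();
        Console.WriteLine("{0} {1} {2}", counts[0], counts[1], counts[2]);   // prints 0 5 2
    }
}
```

The predicate is not called at all when Where returns; it runs five times during ToList, which is also when the two even numbers are collected.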


Monday, November 26, 2007 #

I was getting used to the four hour round trip commute from Jersey every day so I decided to change things up a bit and move to New York.  We found a place on Long Island that fits our needs to a T.  Three bedrooms total with two upstairs and one downstairs.  The upstairs has the kitchen, main living room, a full bath, and a den type area.  Downstairs is another bedroom, living area, full bath, and utility room with washer & dryer.  The downstairs also has hookups available for a second kitchen if we wanted, but I don't think we are going to invest in a second set of major appliances at this point.  The place is newly renovated with hardwood floors all throughout the upstairs and wall to wall carpet downstairs.  The bathrooms have new tiles and fixtures.  Not much else we could really ask for at this point.  Lauren and I were both amazed it was still available, but I guess sometimes things just work out.

Anyway, my new commute is much shorter than my previous one.  I can make it from Massapequa to Penn Station in about thirty five to forty minutes.  I still have to take the subway to get to a lot of clients, but at the end of the day life is looking up.  I may actually get some time to spend with my family in the evenings, which is entirely the point of this shift in my nomadic pattern.

Regardless.  You don't care. Why would you? I wouldn't if I were in your position.  I owe you an article about Lambda Expressions/Anonymous Methods as part of my ongoing series on LINQ.  You'll get it.  You'll get it soon.


Monday, November 12, 2007 #

In part zero I stated my intentions, now it is time to act.

If you've ever programmed in C (no, I didn't forget the #) you may have had a function prototype lying around similar to:

int Deposit(struct account *acct, double amnt);

If one were to rewrite this today in C# you'd probably have a class to represent accounts and your method definition would just be:

public void Deposit(double amount)

You would have dropped the int because you can throw an exception if there is an issue and you no longer need to pass in a pointer (i.e. reference) to the account because our account object will contain all the state for us.  In reality what is the difference here? Well, it more or less comes down to syntax.  To execute this code in C I would have to say:

Deposit(&acct, 350.75);

In C# I'd write:

acct.Deposit(350.75);

The great thing about the C# version in terms of syntax is that there are no funky operators to deal with and it is more English like in terms of reading left to right.  In terms of flexibility, however, I have to give the edge to the C version.  Why? Well, because if I need to add a new operation on the account data type in C I can do it anywhere that I want.  All I need to have access to is the prototype of the account struct and I can introduce the following:

int Withdraw(struct account *acct, double amnt);

In C#, I simply can't add a "third party" method to a class that I don't have the source code to.  Sure, partial classes allow me to add methods, but they must be in the same namespace and assembly in order to work since they are a compile time construct.  I could also just take the C approach and say something like:

static void Withdraw(ref Account acct, double amnt)

That however, doesn't clean up the syntax issue from the caller's perspective as they'd now have to pass in the reference to an Account object like so:

Helper.Withdraw(ref acct, 350.75);

Crap! It is actually more to type than the C version.

Enter extension methods.  An extension method is a static method, declared in a static class, whose first parameter is decorated with the overloaded 'this' keyword.  For example:

public static void Withdraw(this Account acct, double amnt)

Alright, still not much shorter for me as the extension method's author, but how does it look when a piece of client code calls it?

acct.Withdraw(350.75);

As you can see, the call syntax is exactly the same as with the C# version of Deposit.  Under the hood all that is going on is the compiler is seeing our use of 'this' in front of the first parameter and saying, "OK, I know now that I can allow this to be called on Account objects".  Further, if Account was a base class of another type, e.g. BusinessBankAccount, then the extension method would also work on BusinessBankAccount objects.  Do I need to mention that the same holds true for extensions defined to operate on interfaces?  Heck, you can even make an extension method that makes use of Generics!

Another cool aspect of extension methods is that in the Visual Studio 2008 (and even Visual Studio 2005 if you have the .NET 3.0 CTP) environment they are fully supported by intellisense.

There are a couple of restrictions that should be obvious, but I'll list them here anyway.  First, since the extension method is still technically a member of a different class it will only have access to the public members exposed by the class being extended.  In other words, our Withdraw extension method cannot access the private and protected members of the Account class.  This restriction places a definite limitation on what can be achieved through extension methods, but is necessary to avoid violating encapsulation.  Second, a consumer of the extension method needs to reference the assembly in which the extension lives, and bring its namespace into scope with a using directive, or the compiler won't see it.
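Putting the pieces together, a minimal sketch might look like the following.  The body of Account is an assumption (the post never shows it), and Withdraw is implemented naively as a negative Deposit purely to show that the extension only touches public members.

```csharp
using System;

// Hypothetical Account class; only Deposit(double) appears in the post.
public class Account
{
    private double balance;                       // not reachable from extension methods
    public double Balance { get { return balance; } }

    public void Deposit(double amount) { balance += amount; }
}

public class BusinessBankAccount : Account { }

// Extension methods have to live in a non-nested, non-generic static class.
public static class AccountExtensions
{
    public static void Withdraw(this Account acct, double amnt)
    {
        // Only the public surface of Account (Balance, Deposit) is used here.
        if (amnt > acct.Balance)
            throw new InvalidOperationException("Insufficient funds.");
        acct.Deposit(-amnt);
    }
}

public static class ExtensionDemo
{
    public static double Run()
    {
        var acct = new BusinessBankAccount();
        acct.Deposit(500.00);

        acct.Withdraw(100.00);                    // extension syntax, on a derived type
        AccountExtensions.Withdraw(acct, 50.00);  // the ordinary static call it is shorthand for

        return acct.Balance;
    }

    public static void Main()
    {
        Console.WriteLine(Run());   // prints 350
    }
}
```

Both calls bind to the same static method; the instance-style syntax is purely a convenience offered by the compiler.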

LINQ Tie In

Now that we know what extension methods are let's take a look at how they are utilized by LINQ.  LINQ is designed to extend query capability to .NET types using extension methods.  The standard query operators of LINQ operate on any type that implements IEnumerable<T>.  There are other technologies in the LINQ family that provide sets of extension methods, for example LINQ To DataSet provides the same extensions, but for types derived from DataSet. 

If we focus for now on vanilla LINQ, i.e. the one that operates on in memory collections, we can think of at least three ways to construct it.

  1. methods that accept IEnumerable<T> parameters using existing, i.e. pre 3.0, syntax
  2. add methods to the IEnumerable<T> interface for things like Select, Join, etc
  3. provide extension methods for the IEnumerable<T> interface

Item one will probably work, but requires the clunky syntax we saw before.  It could possibly be hidden behind new C# keywords, but it may have required more work at the compiler level and would have made doing any type of dynamic LINQ an undue burden on the developer.

Item two is obviously out the window.  For starters, changing a core interface like IEnumerable<T> would require so many rewrites, not only in the framework but in third party code as well, that Microsoft would have had a developer mutiny on their hands.

Item three is what they ultimately went with and is basically the same as item one now that we know how extension methods actually work.  The advantage is the cleaner syntax offered to developers.

Naive, LINQ-like Extension

There are probably a million blogs out there that have the information you've already seen here so far.  So, I am not going to go over the actual LINQ syntax right now.  Instead I am going to show the method by which LINQ is constructed in a very limited, naive case. 

Let's say LINQ doesn't exist and your team was on a project where you were constantly searching through collections of objects using lots of basic criteria.  We could introduce an extension to the IEnumerable<T> type that allows us to specify our criteria and get back a new collection of items that match it.  Our method might look something like:

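// 'this' on the first parameter makes SelectWhere callable on any IEnumerable<TResult>.
public static IEnumerable<TResult> SelectWhere<TResult>(
     this IEnumerable<TResult> source,
     Func<TResult, bool> filter)
{

     // Collect matches eagerly into a new list.
     var results = new List<TResult>();

     // Keep every item for which the filter delegate returns true.
     foreach(var s in source)
          if(filter(s))
               results.Add(s);

     return results;

}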

The above code defines an extension method named SelectWhere that accepts a generic type parameter called TResult.  TResult is used to flesh out the method's remaining arguments as well as its return type.

The return type and first parameter (the one decorated with 'this') are obvious to us at this point.  The second parameter, Func<TResult, bool>, is simply a generic delegate type.  For a method to match the delegate's signature it must accept a single TResult parameter and return a bool.

If we examine the implementation it is pretty straightforward.  We simply iterate over the entire collection and call the delegate for each item to determine if we need to save it.

Now, how can we call this from client code?  There are a few ways, each of which is legal so let's start with the most explicit and work our way towards the sugar.

First we need a method that matches the delegate to act as our filter criteria:

public static bool PersonIsRich(Account acct) 
{

     return acct.Balance >= 10000D;

}

Your definition of a rich person may be different than mine, but I am sure you get the idea.  We return true if the account has 10,000 or more.

Next we call the method on a collection of Accounts:

var richPeople = SelectExtension.SelectWhere<Account>(allPeople, PersonIsRich);

As we can see the above is our more verbose usage and takes us back to the C function from before.  We need to pass in both the collection we are operating on as well as our filter.

var richPeople = allPeople.SelectWhere<Account>(PersonIsRich);

In this call we've used the extension method syntax and are just specifying the generic argument and the filter.

var richPeople = allPeople.SelectWhere(PersonIsRich);

At this point, we are beyond extension methods.  What just happened above is that the C# compiler is smart enough to figure out what our generic argument has to be in order to satisfy the extension method and doesn't bother making us put it there ourselves.  Logically speaking, if I am calling SelectWhere on an IEnumerable<Account>, then there is only one type that TResult can be, i.e. Account.  This is called 'type inference' and is the same reason that I can use the 'var' keyword instead of specifying the type of richPeople statically.  The compiler determines the type for me at compile time.
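For completeness, here is a self-contained sketch that runs all three call styles against the same data.  The Account properties and the sample balances are invented for illustration, and a small Count helper stands in for LINQ since we are pretending it doesn't exist.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Account shape; only Balance is implied by the post.
public class Account
{
    public string Owner { get; set; }
    public double Balance { get; set; }
}

public static class SelectExtension
{
    public static IEnumerable<TResult> SelectWhere<TResult>(
        this IEnumerable<TResult> source,
        Func<TResult, bool> filter)
    {
        var results = new List<TResult>();
        foreach (var s in source)
            if (filter(s))
                results.Add(s);
        return results;
    }
}

public static class InferenceDemo
{
    public static bool PersonIsRich(Account acct)
    {
        return acct.Balance >= 10000D;
    }

    // Tiny helper so the sketch does not depend on System.Linq.
    private static int Count<T>(IEnumerable<T> items)
    {
        int n = 0;
        foreach (T item in items) n++;
        return n;
    }

    public static int[] CountRich()
    {
        var allPeople = new List<Account>
        {
            new Account { Owner = "A", Balance = 25000D },
            new Account { Owner = "B", Balance = 120D },
            new Account { Owner = "C", Balance = 10000D }
        };

        // Plain static call with an explicit type argument.
        var viaStatic = SelectExtension.SelectWhere<Account>(allPeople, PersonIsRich);
        // Extension call with an explicit type argument.
        var viaExtension = allPeople.SelectWhere<Account>(PersonIsRich);
        // Extension call with TResult inferred from IEnumerable<Account>.
        var viaInference = allPeople.SelectWhere(PersonIsRich);

        return new int[] { Count(viaStatic), Count(viaExtension), Count(viaInference) };
    }

    public static void Main()
    {
        int[] counts = CountRich();
        Console.WriteLine("{0} {1} {2}", counts[0], counts[1], counts[2]);   // prints 2 2 2
    }
}
```

All three calls resolve to the same method with TResult bound to Account, so they return the same two accounts.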

Summary

We have now seen what an extension method is and how it relates to LINQ.  We also now know that they are not magic and that we could get the same behavior from a normal, static method.  The new syntax helps add clarity though, and in development, clarity is a good thing.

As a teaser for the next part, take a look at this:

var richPeople = allPeople.SelectWhere(acct => acct.Balance >= 10000D);

This is the final magic.  Where did our delegate function go? What is that crazy syntax?  Well, that is a lambda expression, and lambdas will be the topic of my next post.

Download Solution - LinqOverview.zip


Friday, November 09, 2007 #

I realize that these next few posts may be late to the game.  The LINQ CTPs for .NET 3.0 have been out for quite a while and everyone already knows about the massive amount of improvements Microsoft made in the Beta1 & Beta2 releases of .NET 3.5.  Further, we are supposedly less than a month away from seeing the official Visual Studio 2008 release.  All that being said I am going to spend a little time introducing the foundational pieces of LINQ so that I can lay the ground work for a more ambitious series.

First, it is important to understand that LINQ is not magic.  It is purely syntactic sugar and is built from the ground up using some new primitives introduced in .NET 3.0/3.5.  The three most important are:

  1. Extension Methods
  2. Lambda Expressions
  3. Anonymous Types

To tie it all together the LINQ team added several new keywords to the .NET language pool, but don't be mistaken, these are just conveniences like property getters and setters or the event keyword.  Everything done using the keywords can be done using raw code.

My plan is to go from the ground up and demonstrate each new language feature briefly.  Doing so will allow us to examine compile time LINQ usage with all the cards on the table and nothing hidden behind syntax.

While I take a day to put together part one, you might find the following links interesting.  Some of them are a bit heavy, but they are all valuable.  You'll also want to have access to some variant of Visual Studio 2008.  The Express versions are available free from Microsoft.

Visual Studio 2008 Beta2 - The Express editions are always going to be free and the downloads are a lot smaller, but I tend to stick with the Professional versions as I get them via MSDN.

Official LINQ Project website - Contains basic documentation and downloadable samples.

Matt Warren's Blog - A higher-up on the LINQ Project team, invaluable.


Thursday, November 08, 2007 #

Software development is complicated. Everyone has their own opinion on how it should be approached and from time to time you get little clusters of folks that follow the same mantra, whether it be “agile methods”, “extreme programming”, “waterfall model”, etc. The underlying argument tends to be whether one thinks of programming and development as a science or an art. Before we get into the good stuff I thought it’d be nice if I could express my view on the matter and set some expectations when it comes to my code and even the topics I choose to post about. To me, software development isn’t an art or a science. It lives in that happy middle ground I commonly call a craft.  That's right, I'm a digital basket weaver.

First let's talk about my view on science.  Science is a well defined method of making observations on laws and systems that you did not create and therefore have no control over.  If you understand the system well enough you can influence it and that is what I like to think of as application of scientific principles, or engineering.  One thing is very important here: scientists don't create or produce anything.  For example, Newton didn't create gravity; he discovered it.  Further, an engineer building a bridge doesn't defy or change gravity to make the bridge stay up; he simply offsets its effect with other forces.  Eventually gravity wins and the bridge falls; the engineer just applied scientific knowledge to delay the inevitable.

Given my definition of science, software development doesn't really fit in there.  Sure, you can take a look at the basic principles of the theory of computation and say, "Look, there's the science!" You'd be right of course, that is science, Computer Science in fact, which isn't what we are discussing. 

Software development isn't exactly engineering either, is it? One could argue that development is just the application of those scientific principles that Computer Science gives us and in some cases I think that line of reasoning would bear fruit.  What makes software development different from traditional engineering and allows us to call it a craft boils down to two things.

  1. Software development introduces the concept of easy way and hard way
  2. Software development introduces the concept of beauty

As to item one, the theory of computation has the potential to tell us exactly what is and is not computable (it hasn't given us that yet, but I digress).  Further, it can tell us the optimal way to compute something.  It can not, however, tell us the optimal way to build the computation in terms of time, materials, and customer satisfaction.

Item two is most likely the biggest digression from engineering.  Bridges are functional.  Machines are, or can be made to be, ergonomic.  Rarely is beauty considered and when it is, it is approached solely from the perspective of pure art.  That is, what the human mind and eye see as beautiful.

In software development beauty has some of those elements, predominantly in the user interface, but it also has a structural element.  Even in architecture who is looking at the beauty of the steel girders holding up the building? In software  development, the girders are supposed to be beautiful too.

No method can provide that beauty with any form of guarantee in the same way that calculations can be performed to build a functional bridge.  We are victims of our own minds.  In software development there is no golden ratio.

So, like a basket weaver, we as developers must produce using nothing but the most basic of principles and to our own (or our client's) perception of beauty.  This typically requires long apprenticeship under more senior craftsmen and a whole lot of trial and error.

Methodologies, or perhaps more accurately philosophies, of software design and implementation like extreme programming don't map to science because they don't provide a universal system that works in all cases.  Science is predicated on a single methodology called the scientific method.  Employing the scientific method guarantees that a scientist's work is valid. 

There is no such guarantee when a developer uses any of the prevailing design philosophies because even if you end up with a working product it may not satisfy the client.  Nearly everything is subjective and the developer is at the mercy of the client.  The product either isn't fast enough, pretty enough, or easy enough to use, and it isn't going to matter to the client whether it was built using extreme programming or agile methods.  This doesn't occur with science as we can see with another Newton example.  I doubt anyone pointed at Newton's work and said, "Gravity isn't elegantly constructed or convenient to use, we are opting to wait for Gravity 2.0".

Alright, so what is my philosophy and is it universal?  I don't really have one, so no it isn't universal.  I try and organize my architecture in the way that best allows me to execute on my client's wishes and I try to keep my clients involved as much as possible by delivering relevant iterations of the software.  The relevant part is the key as a typical client isn't going to care that in this version we structure our stored procedures thusly while blah blah blah.  A client only cares about the parts they care about (makes sense right?) and our job as developers is to show them that.

I am not saying that extreme programming or any of the other philosophies don't work, I am sure there is enough evidence out there to show they do in fact work.  What I am saying is that you don't know if they are going to work, only that they have.  Methodologies are developed from a set of project post-mortems so they are only going to have good coverage for projects that you've already completed and maybe a project you think might be close enough to the others you've already done for this particular approach to work. 

Well, as Forrest said, that's all I have to say about that.  From here on out it will be