Justin a.k.a. The Code Monkey

Code Monkey [kohd muhng'-kee] n : 1. Creature known for its ability to transform caffeine into code. 2. Justin Jones
posts - 10 , comments - 27 , trackbacks - 0

Monday, April 8, 2013

The Hero’s Journey, Originality, and Other Musings

This post isn’t about code, but I’m rolling with it anyway. Feel free to skip to the next code-related post. Nobody said every post had to be tech-related. Did they?

I was having an interesting conversation with my son tonight. Somehow we started on art and meandered over into the subject of story patterns. I was encouraging him to write, and making sure he wasn’t holding himself to the unrealistic standard of writing a masterpiece the first time he wrote a story. That’s the kind of thing that takes a lot of time and a lot of writing to master. His frustration came from the fact that he was having trouble writing a story that was completely original. As we discussed this, it occurred to me that I don’t think any story is completely original. We took, as an example, the comic book character I created in high school. Somewhere along the way I more or less envisioned the outline of a movie script involving the same character. I realized, thinking it through, that my entire story had more or less no original elements; everything about it was borrowed from something else.

To expound: The character’s name is David Death. Hey, it seemed like an awesome comic book character name when I was in high school. The movie script more or less follows the Revenge Quest story line, which has been done countless times by countless action movie stars; nearly every Steven Seagal movie, for instance. If you took Batman, The Punisher, Darkman, The Crow, Buckaroo Banzai, John McClane, any Jet Li character, and Jason Voorhees and put them all in a blender, out would come David Death. There was almost nothing original about the story. The idea that I might still be able to scratch together a script and sell it to Hollywood is based solely on the fact that Hollywood seems to have completely run out of ideas, and might actually jump at the chance to make something that at least appears to be original. We’ll see if I ever actually get around to that or not.

Then we discussed Harry Potter. The fact is, Harry Potter is hardly original either. The story follows The Hero’s Journey and is basically Star Wars with a secret wizard school in a hidden world of magic. Rowling’s genius came not in the originality of the plot, but in A) a unique setting in which to put the plot and B) a gift for compelling writing. No matter how little you actually have in common with Harry Potter, you in some way identify with the character and actually care what happens to him. This also, from what I can tell, highlights the main difference between Harry Potter and Twilight, but that’s neither here nor there. Someone (for some reason I’m thinking it was Asimov) once said that good science fiction is about the story, not the technology that makes up the backdrop.

Star Wars itself is blatantly an implementation of The Hero’s Journey. George Lucas has said as much, having studied the work of Joseph Campbell in crafting the story. There have been countless implementations of this pattern, dating back to (I believe) the ancient Greek myth of Perseus and probably further. I’ve seen several recent fantasy implementations of it as well, including Legend of the Seeker (on TV), Eragon, and The Belgariad. Star Wars, for all its perceived originality, actually has very little. The Jedi were based on samurai, the Force is based on the Tao, the battle sequences were based on World War II movies, and the entire thing has an “Old West” feel about it, being most likely inspired by old serial westerns. The genius of George Lucas, again, is the ability to recombine all of these things in a new way and to write a compelling story.

It occurred to me that I can’t think of a single story that doesn’t have at least some kind of predecessor, and a myriad of influences. Perhaps a lot of us frustrated writers who never tried to actually publish anything are holding ourselves to an unrealistic standard. Perhaps every story ever told is simply a retelling of another story, going back generations to primitive days when the first stories were probably just embellished recountings of actual events.

I don’t know, I’m just musing. It’s a blog, not a research paper. Anyway, thoughts and input are welcome. Back to code next time.

Posted On Monday, April 8, 2013 12:14 AM | Comments (0) |

Sunday, March 24, 2013

Variations on a Repository Pattern: Part II

In my previous post I laid out my current pet project and began showing the framework I set up to isolate CSLA business objects from the data access code. I listed (sans code comments for brevity) the static factory class and interfaces for a data access abstraction. You can find this post here: http://geekswithblogs.net/TheCodeMonkey/archive/2013/03/24/variations-on-a-repository-pattern-part-i.aspx.

Now I need something more specific to the project at hand. Still without any concrete implementations, I define another set of interfaces explicitly for the project’s data. There’s only a handful of entities in the project, so I can show most of it here.

The project is a relatively simple application that takes video or audio files and makes a podcast feed out of them. That’s all it really does, actually. I’ve simplified the model a little bit for brevity. In the data layer code I define the following XML file:

<?xml version="1.0" encoding="utf-8"?>
<!-- The root element name is an assumption; the original root and closing tags were lost from this listing. -->
<Entities>
    <Entity name="Feed">
        <Property name="Id" type="int" key="true" />
        <Property name="Name" type="string" />
        <Property name="FeedUrl" type="string" />
        <Property name="Image" type="byte[]" />
        <Property name="ImageName" type="string" />
    </Entity>
    <Entity name="FeedItem">
        <Property name="Id" type="int" key="true" />
        <Property name="Title" type="string" />
        <Property name="Description" type="string"/>
        <Property name="Date" type="DateTime?"/>
        <Property name="Path" type="string"/>
        <Property name="FeedId" type="int"/>
        <Property name="EnclosureType" type="string"/>
        <Property name="Enabled" type="bool"/>
        <Property name="Length" type="long"/>
        <Property name="Duration" type="long?"/>
    </Entity>
</Entities>

You’ll notice that there’s a few scalability problems with the way I’ve approached this. That’s fine for my project because it has a user base of 1 (me). It doesn’t need to scale, but if you want to adopt this approach for your own project, you’ll want to address these issues. My scaled down model basically has two entities in it, Feed and FeedItem. The properties match the database schema, which makes some of the implementation issues easier to surmount, but it certainly doesn’t have to match. Keep in mind though, we’re defining data objects, not business objects. To me it makes logical sense for data objects to follow the database schema. However, what we define here is all that the business layer is going to know of the data layer.

The next step could certainly be done by hand, but I decided to use a T4 template to generate the next set of code. Here’s a sample of the template:

    public partial interface I<#= name #>Dto
    {
<#        foreach(var field in entity.Elements("Property"))
        {
            var fieldName = field.Attribute("name").Value;
            var type = field.Attribute("type").Value;
#>        <#= type #> <#= fieldName #> { get; set; }
<#        }
#>    }

    public partial class <#= name #>Dto : I<#= name #>Dto
    {
<#        foreach(var field in entity.Elements("Property"))
        {
            var fieldName = field.Attribute("name").Value;
            var type = field.Attribute("type").Value;
#>        public <#= type #> <#= fieldName #> { get; set; }
<#        }
#>    }

    public partial interface I<#= name #>Dal : IDal<I<#= name #>Dto,<#= keyType #>>
    {
    }

Without going into too much detail about how T4 templates work, this generates the basic interfaces and DTOs that I need. The generated code looks like so:

    public partial interface IFeedDto
    {
        int Id { get; set; }
        string Name { get; set; }
        string FeedUrl { get; set; }
        byte[] Image { get; set; }
        string ImageName { get; set; }
    }

    public partial class FeedDto : IFeedDto
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string FeedUrl { get; set; }
        public byte[] Image { get; set; }
        public string ImageName { get; set; }
    }

    public partial interface IFeedDal : IDal<IFeedDto,int>
    {
    }

The same is generated for FeedItem. This provides the business object with an interface that it can talk to for all CRUD operations, an interface for the DTO, and a simple concrete implementation that it can use when interacting with the interface. At this point, as far as unit testing is concerned, the data layer is done. I’ll cover some concrete implementations in the next post, but first there’s something missing. To be efficient about, for instance, a fetch operation, you want to filter the FeedItems returned to just the items related to the feed you’re instantiating. Notice that the generated Dal interface is partial. This allows me to add operations to it at the project level, so in another file I can add this:

public partial interface IFeedItemDal
{
   IEnumerable<IFeedItemDto> GetForFeed(int feedId);
}

Any concrete implementation of IFeedItemDal is now required to implement this method as well. In Entity Framework or LINQ to SQL, you would accomplish this by writing a LINQ query against the model, which in turn generates a SQL statement to run against the database. That is still possible here, but I’ve pushed it down into the implementation. Exposing LINQ through the interface itself would have meant writing a custom LINQ provider, which was more than I wanted to get into for this project. Basically I’m taking the approach that the interface provides predefined queries and the implementation can accomplish them any way it wants to, including LINQ statements against EF providers. This is certainly extensible and could include custom types defined at this level for more complex queries.
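To make the "predefined query, any implementation" idea concrete, here is a minimal in-memory sketch. It is not the project's code: the DTO is trimmed to three properties, and InMemoryFeedItemDal is a hypothetical name that only implements the one query rather than the full generated interface.

```csharp
using System.Collections.Generic;
using System.Linq;

// Trimmed-down stand-ins for the generated types (only the members needed here).
public interface IFeedItemDto
{
    int Id { get; set; }
    string Title { get; set; }
    int FeedId { get; set; }
}

public class FeedItemDto : IFeedItemDto
{
    public int Id { get; set; }
    public string Title { get; set; }
    public int FeedId { get; set; }
}

// Hypothetical in-memory implementation of the predefined query.
public class InMemoryFeedItemDal
{
    private readonly List<IFeedItemDto> _items;

    public InMemoryFeedItemDal(IEnumerable<IFeedItemDto> items)
    {
        _items = items.ToList();
    }

    // The caller asks for "items for this feed"; how that is answered
    // (LINQ to objects here, SQL or EF elsewhere) stays hidden.
    public IEnumerable<IFeedItemDto> GetForFeed(int feedId)
    {
        return _items.Where(i => i.FeedId == feedId).ToList();
    }
}
```

A database-backed version would keep the same method signature and swap the body for a query, which is the whole point of keeping the query behind the interface.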

So now my DataPortal_Fetch method is as simple as this:

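private void DataPortal_Fetch(SingleCriteria<int> criteria)
{
    using (BypassPropertyChecks)
    {
        var item=DalFactory.GetManager().GetProvider<IFeedDal>().Retrieve(criteria.Value);
        Mapper.Map(item, this); // this line was missing from the listing; reconstructed from the description below
    }
}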

The Mapper.Map line is AutoMapper. If you’re not using AutoMapper, check it out. It’s a life saver on this project. In this case, the DTO closely resembles the actual business object, so I can create a map without any custom mapping code. We’ve probably all written mapping code at some point in our careers. I had one of my own that matched up the two objects based on property names and types and only copied the matches. AutoMapper does pretty much the same thing, but it caches the map instead of rebuilding it every time, which is a huge performance gain.
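For anyone who hasn't written one of those hand-rolled mappers, this is roughly what the name-and-type matching looks like. It is a sketch with a hypothetical class name, not AutoMapper and not my old code, and it reflects over both types on every call, which is exactly the cost that caching the map avoids.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical helper: copies every readable source property onto a writable
// target property with the same name and an assignable type.
public static class SimplePropertyMapper
{
    public static void Copy(object source, object target)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (target == null) throw new ArgumentNullException("target");

        // Public, settable, non-indexer properties on the target, keyed by name.
        var targetProps = target.GetType()
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.GetSetMethod() != null && p.GetIndexParameters().Length == 0)
            .ToDictionary(p => p.Name);

        foreach (var sourceProp in source.GetType()
            .GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (sourceProp.GetGetMethod() == null || sourceProp.GetIndexParameters().Length > 0)
                continue;

            PropertyInfo targetProp;
            if (targetProps.TryGetValue(sourceProp.Name, out targetProp)
                && targetProp.PropertyType.IsAssignableFrom(sourceProp.PropertyType))
            {
                // Only the matches are copied; everything else is silently skipped.
                targetProp.SetValue(target, sourceProp.GetValue(source, null), null);
            }
        }
    }
}
```

Calling SimplePropertyMapper.Copy(dto, businessObject) copies whatever lines up and ignores the rest, so a renamed property fails silently. That is one more reason to prefer a library that can validate its maps.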

So obviously FeedItemCollection has the following code


private void Child_Fetch(int feedId)
{
    var items=DalFactory.GetManager().GetProvider<IFeedItemDal>().GetForFeed(feedId);
    RaiseListChangedEvents = false;
    AddRange(from item in items
             select DataPortal.FetchChild<FeedItem>(item));
    RaiseListChangedEvents = true;
}

And FeedItem has this:


private void Child_Fetch(IFeedItemDto item)
{
    // Body missing from the original listing.
}

Now I can start my unit tests with this setup code:


var repository=new MockRepository();
var dalManager=repository.StrictMock<IDalManager>();
var feedDal=repository.StrictMock<IFeedDal>();
dalManager.Expect(n => n.GetProvider<IFeedDal>()).Return(feedDal);
feedDal.Expect(n => n.Retrieve()).Return(new[]
    {
        new FeedDto {Id = 1, FeedUrl = "test.rss", Name = "Test Feed"},
        new FeedDto {Id = 2, FeedUrl = "test2.rss", Name = "Test Feed 2"}
    });
DalFactory.ManagerInstance = dalManager; // missing from the listing; reconstructed from the description below

As you can see I pass my mocked IDalManager directly into ManagerInstance. What the business object calls will be the mocked object and get back whatever I tell it to return. The business layer is fully unit testable now. In my next post, I’ll cover some implementations of the data layer I’ve thrown together.

Posted On Sunday, March 24, 2013 7:14 PM | Comments (1) |

Variations on a Repository Pattern: Part I

Ok, Let’s try this again.

<ObligatoryIShouldBlogMoreSection Skip='DontCareJustShowMeTheCode'>

Way back when, I decided to blog more often because, well, mainly someone told me I should. That didn’t work out, but I still like Scott Hanselman’s idea of a blog as a place to keep cool code ideas. Well, he says that’s why he started the blog anyway. As an aside, he also says you should maintain your own site and own your own content. I’m working on that, and already have a cool domain name picked out. While I scrape together the funds to put that together, I’ll continue to blog here in the meantime.

A little history and my plans going forward.

So for various personal reasons my life has been filled with personal drama for about three years or so now. This had the unfortunate effect of distracting me from keeping up with A) this blog and B) my skills. In IT terms that amount of time is a small eternity. Entire programming paradigms rise and fall in that amount of time. How software is made can completely change. JavaScript is cool now. Web Forms are not cool now. MVC is cool. .NET seems to be on its way out. jQuery is a requirement. So I find myself with my skills a bit stagnated and the monumental task ahead of me of making up for lost time. To look at it positively, this gives me a lot of blog fodder. I’ve resurrected an old pet project (that I actually use a fair bit on my own network) and begun the process of over-engineering the hell out of it so that I can play with some new technologies. It is, of course, interspersed with a lot of old standby technologies that I still love and feel are still relevant. As I’m making progress on it, my plan going forward is to take different areas that I’m working on and blog about them here. Mostly so that I can refer back to the patterns easily. If you get something out of it too, bonus for you. ;) Ok, enough meta-blogging, back to the code.

</ObligatoryIShouldBlogMoreSection>



I’m a fan of CSLA, so of course I wrote the guts of this in CSLA. I’ve recently returned to web work, and for various reasons proposed using CSLA as the basis of a website rewrite. In fairness, the web is not always where CSLA shines, at least not in Web Forms, but it gives you a solid middle tier to work with. The problem, though, is it’s hard to unit test CSLA objects. The framework’s focus on mobile objects makes heavy use of the factory pattern, which makes dependency injection problematic at best. The solution I came up with was heavily based on code examples by Rocky Lhotka himself on one way of handling this problem.

So I started with an assembly which defines some basic interfaces for abstracting away the data layer. The idea here is that I can inject a mock data layer during unit testing so that I can test the objects without a database. I placed this in a framework assembly and posted it as a NuGet package in my own feed (more on that later). Bear in mind that this is a work in progress.

I start with a static class called DalFactory

    public static class DalFactory
    {
        public static Type DalType { get; set; }

        public static IDalManager ManagerInstance { get; set; }

        public static IDalManager GetManager()
        {
            if(ManagerInstance == null)
            {
                if(DalType == null) // the conditions and the Type.GetType call were lost from the listing; reconstructed
                {
                    var dalTypeName=ConfigurationManager.AppSettings["DalManagerType"];
                    if(string.IsNullOrEmpty(dalTypeName))
                        throw new NullReferenceException("No DalManagerType specified.");
                    DalType=Type.GetType(dalTypeName);
                    if(DalType == null)
                        throw new ArgumentException(string.Format("Type {0} could not be found",dalTypeName));
                }
                ManagerInstance= (IDalManager)Activator.CreateInstance(DalType);
            }
            return ManagerInstance;
        }
    }

Don’t h8 on my spacing conventions. I started out in C++.

Concurrency issues aside, what this does is allow me to specify the type of manager to return in the configuration file, or by setting a Type or an actual instance on the factory. The instance will override the type or the configuration. To reset the factory, set ManagerInstance to a different instance or to null.

What did that get me? At unit test time I can create a mock of type IDalManager and pass it directly into ManagerInstance, giving me a type of dependency injection. At runtime I can specify the actual type in the configuration file and keep the code ignorant of the actual data implementation.
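Here is a small sketch of that injection at work, using a hand-rolled fake rather than a mocking library. To keep it self-contained the factory is a trimmed copy (the configuration lookup is left out), IDalManager is shown without IUnitOfWork, and FakeDalManager is a hypothetical name.

```csharp
using System;

public interface IDal { }

// Trimmed: the real IDalManager also derives from IUnitOfWork.
public interface IDalManager
{
    T GetProvider<T>() where T : class, IDal;
}

// Trimmed copy of the factory: the configuration-file lookup is left out.
public static class DalFactory
{
    public static Type DalType { get; set; }
    public static IDalManager ManagerInstance { get; set; }

    public static IDalManager GetManager()
    {
        if (ManagerInstance == null)
        {
            if (DalType == null)
                throw new InvalidOperationException("No DalType or ManagerInstance set.");
            ManagerInstance = (IDalManager)Activator.CreateInstance(DalType);
        }
        return ManagerInstance;
    }
}

// Hypothetical hand-rolled fake used in place of a mocking library.
public class FakeDalManager : IDalManager
{
    public int Calls { get; private set; }

    public T GetProvider<T>() where T : class, IDal
    {
        Calls++;
        return null; // a real fake would hand back canned providers
    }
}
```

In a test, setting DalFactory.ManagerInstance to a FakeDalManager means every GetManager() call in the business layer returns that fake. Setting it back to null afterwards makes the factory fall back to DalType, which keeps one test's fake from leaking into the next.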

IDalManager looks like this:

    public interface IDalManager: IUnitOfWork
    {
        T GetProvider<T>() where T : class, IDal;
    }

The Unit Of Work idea was an afterthought, but it solved some logistical issues, mainly the ability to support transactions. I called it IUnitOfWork, but it’s not a strict implementation of the pattern, more of a variation on the pattern. IUnitOfWork looks like this:

public interface IUnitOfWork : IDisposable
{
    void Commit();
    void Rollback();
    event EventHandler Committed;
    event EventHandler RolledBack;
}


If you’re familiar with the pattern, you’ll see some things missing. The main thing I was after here was the ability to transact the entire process. The events at the bottom were another logistical addition, because child objects needed to subscribe to the transaction but wouldn’t get an updated Id until the entire thing was committed. The event allows them to come back afterwards and get the Id from the DTO they were working with.
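The Id hand-off is easier to see in code. The sketch below is an assumption-heavy illustration, not the project's implementation: InMemoryUnitOfWork, ItemDto, and ChildObject are hypothetical, and Ids are handed out at commit time the way an identity column would do it.

```csharp
using System;
using System.Collections.Generic;

public interface IUnitOfWork : IDisposable
{
    void Commit();
    void Rollback();
    event EventHandler Committed;
    event EventHandler RolledBack;
}

// Hypothetical DTO whose Id is only known once the work is committed.
public class ItemDto
{
    public int Id { get; set; }
}

// Hypothetical in-memory unit of work: inserts are queued and get Ids on commit.
public class InMemoryUnitOfWork : IUnitOfWork
{
    private readonly List<ItemDto> _pending = new List<ItemDto>();
    private int _nextId = 1;

    public event EventHandler Committed;
    public event EventHandler RolledBack;

    public void Insert(ItemDto dto) { _pending.Add(dto); }

    public void Commit()
    {
        foreach (var dto in _pending) dto.Id = _nextId++;
        _pending.Clear();
        var handler = Committed;
        if (handler != null) handler(this, EventArgs.Empty);
    }

    public void Rollback()
    {
        _pending.Clear();
        var handler = RolledBack;
        if (handler != null) handler(this, EventArgs.Empty);
    }

    public void Dispose() { _pending.Clear(); }
}

// Hypothetical child object: it copies the Id back from its DTO after commit.
public class ChildObject
{
    private readonly ItemDto _dto = new ItemDto();
    public int Id { get; private set; }

    public void Save(InMemoryUnitOfWork uow)
    {
        uow.Insert(_dto);
        EventHandler onCommitted = null;
        onCommitted = (s, e) =>
        {
            Id = _dto.Id;
            uow.Committed -= onCommitted; // one-shot subscription
        };
        uow.Committed += onCommitted;
    }
}
```

The child's Id stays 0 until Commit() runs, then the handler copies the real value over. A fuller version would also drop the subscription on RolledBack.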

At this point the code can call DalManager.GetProvider<> and get what effectively is an interface to a particular entity type’s repository. You’ll notice the IDal restriction on GetProvider. IDal has nothing defined in it; it’s simply there to make sure that the return type is restricted to an implementation of IDal. IDal is the parent of IDal<T,K>, which is what we really want to return here.

    public interface IDal<T,K> : IDal
    {
        IEnumerable<T> Retrieve();
        T Retrieve(K key);
        bool Exists(K key);
        void Insert(T item);
        void Update(T item);
        void Delete(K key);
    }


T is the type of entity we’re working with, K is the type of the key. I’m having a discussion with another developer over how appropriate it is to define the key type here, so I may change direction on this, but for now it works.
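To give a feel for what implementing this interface involves, here is a dictionary-backed sketch that could serve as a test double. InMemoryDal and its keySelector constructor argument are assumptions of mine; the interface itself says nothing about how the key is pulled out of an item, which is part of what that discussion about the key type is about.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IDal { }

public interface IDal<T, K> : IDal
{
    IEnumerable<T> Retrieve();
    T Retrieve(K key);
    bool Exists(K key);
    void Insert(T item);
    void Update(T item);
    void Delete(K key);
}

// Hypothetical dictionary-backed implementation, useful as a test double.
public class InMemoryDal<T, K> : IDal<T, K>
{
    private readonly Dictionary<K, T> _store = new Dictionary<K, T>();
    private readonly Func<T, K> _keySelector;

    // keySelector tells the store how to find an item's key (an assumption of this sketch).
    public InMemoryDal(Func<T, K> keySelector)
    {
        if (keySelector == null) throw new ArgumentNullException("keySelector");
        _keySelector = keySelector;
    }

    public IEnumerable<T> Retrieve() { return _store.Values.ToList(); }

    // Returns default(T) (null for reference types) when the key is unknown.
    public T Retrieve(K key)
    {
        T item;
        return _store.TryGetValue(key, out item) ? item : default(T);
    }

    public bool Exists(K key) { return _store.ContainsKey(key); }

    // Throws ArgumentException if the key is already present.
    public void Insert(T item) { _store.Add(_keySelector(item), item); }

    public void Update(T item)
    {
        var key = _keySelector(item);
        if (!_store.ContainsKey(key))
            throw new KeyNotFoundException("No item with key " + key);
        _store[key] = item;
    }

    public void Delete(K key) { _store.Remove(key); }
}
```

For the Feed entity this would be constructed as new InMemoryDal<IFeedDto, int>(f => f.Id).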

This is a good starting point. Next we need an actual implementation specific to the entities that exist in the project. This is rather involved in and of itself, so I’ll save that for the next post.

Comments and criticisms are welcome. This is basically what I came up with to solve a particular issue. If you have another idea, please post it in the comments.

Posted On Sunday, March 24, 2013 2:06 AM | Comments (2) |

Saturday, September 15, 2012

Remember way back when we had a free decompiler?

I, like probably so many of the rest of you, was mortified when Reflector was sold to RedGate. I knew where it was going. Suddenly you had to install it instead of just downloading and running it. I had a deep down feeling that one of the most useful tools in my arsenal was about to become a corporate product and no longer belong to the world of free tools. Sure enough it did. For a while now I’ve limped by without my favorite decompiler. This was made a little easier by the fact that you can now debug into the .NET Framework, but I still missed Reflector.

JetBrains, makers of the superawesome and well worth the cost ReSharper (no it’s not free) have made their own decompiler that is comparable with Reflector, and it’s free. It’s still a corporate product, and JetBrains isn’t exactly known for making free software, but for now we have an option back on the table until some other industrious developer makes the next Reflector.

dotPeek can be downloaded here.  http://www.jetbrains.com/decompiler/



Posted On Saturday, September 15, 2012 1:59 PM | Comments (2) |

How to write your unit tests to switch between NUnit and MSTest

On my current project I found it useful to use both NUnit and MSTest for unit testing. When running unit tests from ReSharper, NUnit simply works better, and on large projects NUnit tends to run faster. We would have used NUnit for everything, but MSTest gave us a few bonuses out of the box that were hard to pass up: namely code coverage (without having to shell out thousands of extra dollars for the privilege) and tests integrated into the build process. I’m one of those guys who wants the build to fail if the unit tests don’t pass. If they don’t pass, there’s no point in sending that build on to QA.

So making the build work with MSTest is easiest if you just create a unit test project in your solution. This adds the right references and project type GUIDs to the project file so that everything automagically just works. Then (using NuGet of course) you add in NUnit. At the top of your test file, remove the using statements that refer to MSTest and replace them with the following:

#if NUNIT
using NUnit.Framework;
#else
using TestFixture = Microsoft.VisualStudio.TestTools.UnitTesting.TestClassAttribute;
using Test = Microsoft.VisualStudio.TestTools.UnitTesting.TestMethodAttribute;
// Note: TestInitialize runs before every test, unlike NUnit's once-per-fixture TestFixtureSetUp.
using TestFixtureSetUp = Microsoft.VisualStudio.TestTools.UnitTesting.TestInitializeAttribute;
using SetUp = Microsoft.VisualStudio.TestTools.UnitTesting.TestInitializeAttribute;
using Microsoft.VisualStudio.TestTools.UnitTesting;
#endif

Basically I’m taking the NUnit naming conventions and redirecting them to MSTest. You can go the other way, of course. I only chose this direction because I had already written the tests as NUnit tests. NUnit and MSTest provide largely the same functionality with slightly differing class names. There are a few actual differences between them, but I have not run into them on this project so far.

To run the tests as NUnit tests, simply open up the project properties and add the conditional compilation symbol NUNIT on the Build tab. Remove it, and you’re back in MSTest land.


Posted On Saturday, September 15, 2012 1:17 PM | Comments (3) |

Tuesday, June 26, 2012

Exception Handling And Other Contentious Political Topics

So about three years ago, around the time of my last blog post, I promised a friend I would write this post. Keeping promises is a good thing, and this is my first step towards easing back into regular blogging. I fully expect him to return from Pennsylvania to buy me a beer over this. However, it’s been an… ahem… eventful three years or so, and blogging, unfortunately, got pushed to the back burner on my priority list, along with a few other career minded activities. Now that the personal drama of the past three years is more or less resolved, it’s time to put a few things back on the front burner.

What I consider to be proper exception handling practices is relatively well known these days. There are plenty of blog posts out there already on this topic which more or less echo my opinions on this topic. I’ll try to include a few links at the bottom of the post. Several years ago I had an argument with a co-worker who posited that exceptions should be caught at every level and logged. This might seem like sanity on the surface, but the resulting error log looked something like this:

Error: System.SomeException
Followed by small stack trace.

Error: System.SomeException
Followed by slightly bigger stack trace.

Error: System.SomeException
Followed by slightly bigger stack trace.

Error: System.SomeException
Followed by slightly bigger stack trace.

Error: System.SomeException
Followed by slightly bigger stack trace.

Error: System.SomeException
Followed by slightly bigger stack trace.

Error: System.SomeException
Followed by slightly bigger stack trace.

Error: System.SomeException
Followed by slightly bigger stack trace.


These were all the same exception. The problem with this approach is that the error log, if you run any kind of analytics on it, becomes skewed depending on how deep in the call stack your exception was thrown. To mitigate this problem, we came up with the concept of the “PreLoggedException”. Basically, we would log the exception in the first catch block that saw it and then throw it back up the stack wrapped in this pre-logged type, which our logging system knew to ignore. Now the error log looked like this:

Error: System.SomeException
Followed by small stack trace.

Much cleaner, right? Well, there’s still a problem. When your exception happens in production and you go about trying to figure out what happened, you’ve lost more or less all context for where and how this exception was thrown, because all you really know is what method it was thrown in, but really nothing about who was calling the method or why. What gives you this clue is the entire stack trace, which we’re losing here. I believe that was further mitigated by having the logging system pull a system stack trace and add it to the log entry, but what you’re actually getting is the stack for how you got to the logging code. You’re still losing context about the actual error. Not to mention you’re executing a whole slew of catch blocks which are sloooooooowwwww………

In other words, we started with a bad idea and kept band-aiding it until it didn’t suck quite so bad.

When I argued for not catching exceptions at every level but rather catching them following a certain set of rules, my co-worker warned me “do yourself a favor, never express that view in any future interviews.” I suppose this is my ultimate dismissal of that advice, but I’m not too worried.

My approach for exception handling follows three basic rules:

Only catch an exception if

1. You can do something about it.
2. You can add useful information to it.
3. You’re at an application boundary.

Here’s what that means:

1. Only catch an exception if you can do something about it.

We’ll start with a trivial example of a login system that uses a file. Please, never actually do this in production code; it’s just a concocted example. So if our code goes to open a file and the file isn’t there, we get a FileNotFoundException. If the calling code doesn’t know what to do with this, it should bubble up. However, if we know how to create the file from scratch, we can create the file and continue on our merry way. When you run into situations like this, though, what should really run through your head is “How can I avoid handling an exception at all?” In this case, it’s a trivial matter to simply check for the existence of the file before trying to open it. If we detect that the file isn’t there, we can accomplish the same thing without having to handle it in a catch block.

2. Only catch an exception if you can add useful information to it.

Continuing with the poorly thought out file-based login system we contrived in part 1, if the code calls a Login(…) method and a FileNotFoundException is thrown somewhere inside it, the code that calls Login must account for a FileNotFoundException. This is kind of counterintuitive because the calling code should not need to know the internals of the Login method, and the data file is an implementation detail. What makes more sense, assuming that we didn’t implement any of the good advice from step 1, is for Login to catch the FileNotFoundException and wrap it in a new exception. For argument’s sake we’ll say LoginSystemFailureException. (Sorry, couldn’t think of anything better at the moment.) This gives us two stack traces, preserving the original stack trace in the inner exception, and is also much more informative to the calling code.

3. Only catch an exception if you’re at an application boundary.

At some point we have to catch all the exceptions, even the ones we don’t know what to do with. WinForms, ASP.NET, and most other UI technologies have some kind of built-in mechanism for catching unhandled exceptions without fatally terminating the application. It’s still a good idea to gracefully exit the application in this case if possible, because you can no longer be sure what state your application is in, but nothing annoys a user more than an application just exploding. These unhandled exceptions need to be logged, and this is a good place to catch them. Ideally you never want this option to be exercised, but code as though it will be. When you log these exceptions, give them a “Fatal” status (log4net, for example, has a level for this) and make sure these bugs get handled in your next release.

That’s it in a nutshell. If you do it right each exception will only get logged once and with the largest stack trace possible which will make those 2am emergency severity 1 debugging sessions much shorter and less frustrating.
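The three rules can be sketched in code using the same contrived file-based login. LoginSystemFailureException is the name from rule 2; FileLoginStore, AppBoundary, and the log delegate are hypothetical stand-ins, and none of this is meant as a real login system.

```csharp
using System;
using System.IO;

// Name taken from rule 2; the class body is a minimal sketch.
public class LoginSystemFailureException : Exception
{
    public LoginSystemFailureException(string message, Exception inner)
        : base(message, inner) { }
}

// Hypothetical file-based login store (still not something to ship).
public class FileLoginStore
{
    private readonly string _path;

    public FileLoginStore(string path) { _path = path; }

    // Rule 1, preferably without a catch at all: create the file up front
    // when we know how to recover from it being missing.
    public void EnsureStoreExists()
    {
        if (!File.Exists(_path))
            File.WriteAllText(_path, string.Empty);
    }

    // Rule 2: callers should not have to know a file is involved, so the
    // low-level exception is wrapped; the original survives as InnerException.
    public bool Login(string user, string password)
    {
        try
        {
            foreach (var line in File.ReadAllLines(_path))
            {
                if (line == user + ":" + password) return true;
            }
            return false;
        }
        catch (FileNotFoundException ex)
        {
            throw new LoginSystemFailureException("The login store is unavailable.", ex);
        }
    }
}

public static class AppBoundary
{
    // Rule 3: the outermost layer catches whatever is left, logs it once,
    // and reports failure. The log delegate stands in for a real logger.
    public static int Run(Action body, Action<string> log)
    {
        try
        {
            body();
            return 0;
        }
        catch (Exception ex)
        {
            log("FATAL: " + ex); // Exception.ToString() includes inner exceptions and stack traces
            return 1;
        }
    }
}
```

Everything between Login and AppBoundary.Run has no catch blocks at all, so the exception arrives at the boundary with its full stack trace and is logged exactly once.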

Here’s a few people who also have interesting things to say on this topic: 



I know there’s more but I can’t find them at the moment.

Posted On Tuesday, June 26, 2012 10:32 PM | Comments (2) |

Sunday, May 17, 2009

Remote Debugging across Domains made easy

Back from hiatus. 

I've never had an excuse to do remote debugging until recently, but I've always heard that it's a serious pain in the ***, hence, I avoided it.  Recently I really needed it to work, and of course, it didn't. 

Remote debugging seems to be a little easier than it used to be, and if you do enough searching you'll eventually find what you need.  Here's the short version:

Step 1: Go here: http://www.microsoft.com/downloads/details.aspx?FamilyID=440ec902-3260-4cdc-b11a-6a9070a2aaab&displaylang=en
This is the remote debugging service installer.  Assuming that you don't want a full Visual Studio installation on the remote machine, you can install this service on the remote machine.

Step 2: Add permissions.  The debugger service you downloaded in step 1 needs to run under an account that has permission to your local machine.  I think it may need admin rights on both boxes, but I'm not sure.  In any case, the accounts I used to get it to work were admins on the machines they were on.

Step 2.5: Fake it out.  Here's where it got wonky for me.  I couldn't add the permissions in step 2.  My dev box was in one domain, the box being debugged was in another domain (dev domain), and the domains did not have a trust relationship.  Therefore, I couldn't add my local account to the dev box, nor could I add the remote account to my local box.  Game over?  Not quite.  There is a workaround.  Create a local account on your dev box with the same username/password as your domain account.  Do the same on the remote box.  Run the service as that account.  Magically, it all just works.

Step 3: Miscellaneous debris.  I also was not able to resolve the remote machine's name through DNS, but when you connect to the service, it wants to resolve the name.  Get around this by adding the remote machine's name to your hosts file (at c:\windows\system32\drivers\etc\hosts).  It just works better.

Step 4: Do it.  Select Debug->Attach To Process.  Under Qualifier, type the remote machine's name (or IP address), or select Browse if you have name resolution.  The processes from the other machine should show up.  Select the one you want.  (Don't forget to set a breakpoint.)

That's it.  Easy when you know how.  Serious pain when you don't.

Posted On Sunday, May 17, 2009 3:41 PM | Comments (9) |

Friday, January 23, 2009

Can you tell me the meaning of the word "Polymorphism"?

This is one interview question I ask in every interview, and I get a lot of grief for it.  I've done it for years.  I used to work for a guy who was primarily a Delphi developer, and he as much as ordered me to not ask that anymore.  I think it's a fair question.  People who work in IT using an object oriented language should have a basic grasp on what the three tenets of Object-Oriented Programming are.  If you work in an object-oriented language, you know what they are, but you may not know what they're called.

  • Encapsulation: This is basically information hiding.  All it means is that data internal to the class that other classes have no business messing with is inaccessible to the naughty classes trying to mess with it.  That's what the "private" and "protected" keywords are for.  If you make everything public, you might want to go pick up the old Bjarne Stroustrup book.  This is your craft, time to learn it.
  • Inheritance: This one is easy.  It's what comes after the colon in your class declaration (in C-style languages, anyway).  Most everybody knows what this is.  One hopes they do, anyway.
  • Polymorphism: This is the one everybody forgets.  It's also the hardest one to describe.  In layman's terms it means you have a virtual function/property/whatever declared in an ancestor class that is overridden in a descendant class.  For instance, if you have a Shape base class with descendants Square, Rectangle, and Circle, you might have an Area() method on the base class that is implemented differently by each descendant class.  A Shape pointer/reference that actually points to a Circle will calculate (Pi)r^2.  A Square will calculate W^2, and a Rectangle will calculate L*W.  The more common example is the Employee base class that calculates salary differently for hourly and salaried employees, without the need to know what type they are.  You just automatically get the right one because the method is virtual.
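To make the Shape example concrete, here is a minimal sketch in C#.  The class and member names are just my illustration of the example above, and I've made Area() abstract (which is still a virtual method) since a bare Shape has no sensible area of its own.

```csharp
using System;

// Illustrative hierarchy for the Shape example; all names are hypothetical.
public abstract class Shape
{
    // Abstract implies virtual: each descendant supplies its own calculation.
    public abstract double Area();
}

public class Circle : Shape
{
    private readonly double radius;
    public Circle(double radius) { this.radius = radius; }
    public override double Area() { return Math.PI * radius * radius; }
}

public class Rectangle : Shape
{
    private readonly double length;
    private readonly double width;
    public Rectangle(double length, double width) { this.length = length; this.width = width; }
    public override double Area() { return length * width; }
}

public class Square : Shape
{
    private readonly double side;
    public Square(double side) { this.side = side; }
    public override double Area() { return side * side; }
}

public static class ShapeDemo
{
    // The caller only knows about Shape; the runtime type picks the formula.
    public static double TotalArea(Shape[] shapes)
    {
        double total = 0;
        foreach (Shape shape in shapes)
        {
            total += shape.Area();
        }
        return total;
    }
}
```

TotalArea never checks what kind of shape it was handed.  That is the whole point: the call goes through a Shape reference and the right Area() runs anyway.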

Most people are familiar by now with how virtual methods work in their favorite OO language, but most people don't know there's a computer science term associated with that behavior.  That's why I ask this question.  I don't expect most people to get it right, it's about the reaction to the question.

The results to this question are always fascinating.

At a previous job, one candidate answered that he wasn't sure if he had the right answer, but tried anyway.  He went on to describe Encapsulation.  I gave him credit, because A) he was honest that he wasn't sure, and B) I was pretty sure he knew the answer at one time, because describing Encapsulation means he actually learned the three tenets at some point.  We hired him.  He realized what a nuthouse he'd gotten himself into and left.  C'est la vie.

I told a friend of mine about this question and she used it in an interview.  The interviewee responded "Oh, that's just a buzz word".  That's the kind of thing that sets me off.  You can't possibly think I would ask a question I didn't know the answer to?  Don't bluff.  Calling one of the staples of computer science a "buzz word" shows that A) you're bluffing, B) you don't really know, and C) you've dismissed it as something not necessary to learn to do this grunt work thing called programming.  Any monkey can do this, right?  Calling polymorphism a buzz word is a bit like saying the X-Ray is a passing fad in the medical field.  This kind of answer tells me that coding is not a craft to this person, it's a 9-5 job.  Programmer is not what they are, it's just their job.  That doesn't make them a bad person necessarily, just not somebody I want to work with.

A more recent interviewee responded simply "I don't know".  Surprisingly, this is a good answer.  It shows honesty and integrity.  It tells me that I can count on a straight answer from you when I need it.  Besides, I was pretty sure he knew it, he just didn't know its name.  There's a Taoist tenet that states "knowing gets in the way of learning."  This sounds absurd at first, as do most Taoist tenets, but someday it will click, if it hasn't already.  Admitting that you don't know means you're able to learn it.  That's a good thing.

You can learn a lot from an interviewee from this question.  Short of having them actually write code (which I think we should), it's one of the best ways to cut through to what kind of programmer this person is.

I've been telling my boss recently that I think the interview process for developers is a bit flawed.  We hire people based on their ability to interview well, not their ability to write good code.  Jeff Atwood of Coding Horror wrote last year about the FizzBuzz test.  Jeff is one of my favorite bloggers.  He takes his craft seriously and has trouble with people who don't.  He defines the type of person I want to work with.  Luckily, most of the people I work with currently qualify.  I've suggested (seriously, even though I think it wasn't taken as such) that we should administer the FizzBuzz test.  The test is simple.

Write a program that prints the numbers from 1 to 100. But for multiples of three print "Fizz" instead of the number and for the multiples of five print "Buzz". For numbers which are multiples of both three and five print "FizzBuzz".

Since I work with good developers, I find it hard to believe that 199 out of 200 applicants could not pass this test.  I would hope most of the candidates I see could.  Maybe they can't.  This test would tell us a few things though.

A) Whether or not they can write code at all.  That's a good thing to know when you're hiring a developer at anything near a developer's salary.  Some people interview really well, and can't write code to save their mother.  That's not necessarily easy to glean in a 30 minute interview unless you ask them to write code.

B) How they code.  Truthfully, there's more than one way to solve the FizzBuzz test.  The "obvious" choice is to use the modulus operator, but there's always another way.  Say using counters.  How the applicant solves the problem is telling.  Also, seeing them code the answer gives you a clue to how they work, something you can't tell from a finished product, even if they get it right.
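For illustration, here are the two approaches mentioned above side by side.  This is only a sketch; the class and method names are mine, and there are plenty of other valid solutions.

```csharp
using System;
using System.Collections.Generic;

public static class FizzBuzz
{
    // Modulus version: test divisibility directly.
    public static List<string> WithModulus(int max)
    {
        var lines = new List<string>();
        for (int i = 1; i <= max; i++)
        {
            if (i % 15 == 0) lines.Add("FizzBuzz");
            else if (i % 3 == 0) lines.Add("Fizz");
            else if (i % 5 == 0) lines.Add("Buzz");
            else lines.Add(i.ToString());
        }
        return lines;
    }

    // Counter version: no modulus, just two countdowns that reset at zero.
    public static List<string> WithCounters(int max)
    {
        var lines = new List<string>();
        int untilFizz = 3;
        int untilBuzz = 5;
        for (int i = 1; i <= max; i++)
        {
            untilFizz--;
            untilBuzz--;
            string line = "";
            if (untilFizz == 0) { line += "Fizz"; untilFizz = 3; }
            if (untilBuzz == 0) { line += "Buzz"; untilBuzz = 5; }
            lines.Add(line.Length > 0 ? line : i.ToString());
        }
        return lines;
    }
}
```

Printing the result of FizzBuzz.WithModulus(100) one entry per line answers the question as asked.  The counter version produces the same list, and which one a candidate reaches for first says something about how they think.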

Anyway, there's no summary point to this, just a final suggestion.  Take your craft seriously.  It is a craft, it is an art.  Your college COBOL teacher was wrong.  It's what we are, not just what we do.  If you don't "go try that out at home" occasionally, if you don't have some sort of IDE installed on your home PC, you might reconsider whether this is the right career for you.  On the other hand, if you're reading this blog, it's highly unlikely that you fall into that category.

Posted On Friday, January 23, 2009 12:05 AM | Comments (6) |

Sunday, December 21, 2008

To var or not to var, that is the question

I started this blog back in September with a particular purpose in mind.  Every yahoo and his brother has a blog these days, and by far the majority of them are absolute trash, but every so often there's a gem.  As developers, we seem to mostly agree on which ones are the gems.  Non-developers most likely have different lists, depending on their focus.  There's a long list of blogs I love to read, and couldn't possibly hope to be counted among them, but one day I realized something.  There's a lot of things I know that many others don't.  Sure, somebody somewhere knows that particular little piece of knowledge, but they may not have a blog of their own.  Fairly often when googling for the answer to a problem I can't find just the right answer I'm looking for.  Somebody somewhere knows the answer, but they didn't blog it.  That's when I saw the value of starting my own blog, so I created The Code Monkey (inside joke) for just that purpose.  It's a place to post those little tidbits of knowledge I've acquired that not everybody else knows.  Then life caught up with me and I got busy.  That's why I haven't posted since September.

There's been this little voice in the back of my head bugging me for weeks now.  I'm pretty sure it's Jeff Atwood.  I have a lot of respect for Jeff, and he talked about how difficult it is to maintain a blog in a DotNetRocks interview last year. In it he talked about how he made a decision to write X many times per week.  While that's very admirable, I think I'm going to have to set my sights a little lower to start with.  I'll aim for one post every two weeks or so.  Why am I telling you this?  So that you'll mock me if I fail to keep it up. 

Peer pressure, you can make it work for you too.

Just to top it off, a friend of mine challenged me and another friend to write two posts in two weeks.  The one who fails to has to buy the beer.  That's pretty motivating.  If you blog and have friends who blog, you should try it out.  Unfortunately I think I've already missed the two week deadline, but I can still beat the other guy and get free beer.  Helping out your fellow developer and getting free beer at the same time, you can't beat that.

So began my search for a blog topic.  Turns out it found me.  A couple of months back I was working for Magenic in Minnesota.  Check out the link, I was excited because I got to help a friend debug some of the Silverlight on the front page.  Anyway, Rocky Lhotka, creator of CSLA and Microsoft Software Legend for those who don't know, posted a question on the internal forum server which got me thinking.  I don't remember if I put my $.02 in or not, but out of curiosity I asked the same question on the forum server at my new job.  The results surprised me a bit, but they probably shouldn't have. 

The question was "Should var be used for variable declarations outside of using LINQ?"  This is a surprisingly polarizing question, and it's shown up in a few other places recently.  One of the managers at ReSharper, Ilya Ryzhenkov, had this to say:

"Some cases where it seems just fine to suggest var are:

  1. New object creation expression: var dictionary = new Dictionary<int, string>();
  2. Cast expression: var element = (IElement)obj;
  3. Safe Cast expression: var element = obj as IElement;
  4. Generic method call with explicit type arguments, when return type is generic: var manager = serviceProvider.GetService<IManager>()
  5. Generic static method call or property with explicit type arguments, when return type is generic: var manager = Singleton<Manager>.Instance;"

I have to admit, ReSharper was a big influence in overcoming my aversion to the "var" keyword.  It's no secret that I'm a fan of ReSharper and think that it helps standardize and improve code.  It doesn't hurt that each subsequent version of Visual Studio implements a lot of the functionality that ReSharper already provided, forcing them to work hard to stay ahead of the curve to justify purchasing it. 

The main argument against the var keyword is that it reduces code readability.  Looking at the five listed scenarios above, I don't see how it reduces readability.  Take example one.  Is it really necessary to say "Dictionary<int, string>" twice?  I don't think so.  The same is true for 2-5.  Each example already states the type quite clearly, and var allows you to say it once. 

However, take this example.

var rdr = DataWrapperClass.ExecuteReader();

It's not immediately clear what type of variable rdr is.  I still think var is a good choice here, and I have backing reasons for it.  This is taken from an actual scenario I ran into in my current job.  In this case we had a wrapper class around all of our data access code, and it provided some useful functionality.  We use CSLA in our shop, and this method returned an instance of CSLA's SafeDataReader, which is itself a wrapper around the IDataReader interface.  However in the current project it fell a little short.  We're upgrading a rather dated codebase that was originally written in C++.  Much of the code dates back years.  Having started out in C++ myself, I remember when C++ didn't have a native boolean type.  What you used was integers.  0 was false, non-zero was true.  The database we were working with followed this same convention.  Many boolean values were stored as integers.  To help out with this, I extended SafeDataReader to allow reading an integer column as a boolean.  Next, I changed the return type of ExecuteReader to return my new class. All references to ExecuteReader explicitly declared the return value to be SafeDataReader.  There's a number of ways this can go.  The particular method I overrode happened to be virtual, so it turns out the code would work as is, but seeing this in the code

SafeDataReader rdr = DataWrapperClass.ExecuteReader();

can be even more confusing than var, because while you think you're working with SafeDataReader, you're not.  But what if the method I had overridden hadn't been virtual?  The code would still compile, but when you tried to read an integer column as a boolean, the code would blow up.  Why?  Because even though the actual instantiated type is my custom derived class, you are still accessing it through a SafeDataReader reference, so it would call GetBoolean from SafeDataReader, which doesn't expect integers.  That becomes a runtime error, my friend, not a compile time error.
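Here's a stripped-down sketch of that difference.  BaseReader and IntAwareReader are hypothetical stand-ins I made up for this post, not CSLA's actual SafeDataReader API, and the "column" is just a single boxed value.

```csharp
using System;

// Hypothetical stand-in for a library reader; NOT the real SafeDataReader.
public class BaseReader
{
    private readonly object value;
    public BaseReader(object value) { this.value = value; }

    protected object RawValue { get { return value; } }

    // Virtual: a derived class can replace this through any reference.
    public virtual bool GetBoolean() { return (bool)value; }

    // Non-virtual: a derived class can only hide it, not override it.
    public bool GetBooleanNonVirtual() { return (bool)value; }
}

public class IntAwareReader : BaseReader
{
    public IntAwareReader(object value) : base(value) { }

    // Treats a stored integer as a boolean: 0 is false, non-zero is true.
    private bool ReadAsBoolean()
    {
        if (RawValue is int)
        {
            return (int)RawValue != 0;
        }
        return (bool)RawValue;
    }

    public override bool GetBoolean() { return ReadAsBoolean(); }

    // 'new' hides the base method; it is only chosen when the
    // compile-time type of the variable is IntAwareReader.
    public new bool GetBooleanNonVirtual() { return ReadAsBoolean(); }
}

public static class DataWrapper
{
    // Returns the derived type, like the changed ExecuteReader in the text.
    public static IntAwareReader ExecuteReader() { return new IntAwareReader(1); }
}
```

With `BaseReader rdr = DataWrapper.ExecuteReader();`, rdr.GetBoolean() returns true because the call is virtual, but rdr.GetBooleanNonVirtual() runs the base version and throws an InvalidCastException when it tries to unbox an int as a bool.  With `var rdr = DataWrapper.ExecuteReader();`, both calls go to the derived class and return true.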

Here's another example.  In this same class I implemented a new reading function that would translate "O" and "P" to true/false.  Don't ask, but it actually happened.  This method is not accessible from SafeDataReader, it's only accessible if you use my derived class. 

Both of these examples can be shrugged off, with a "We'll fix them as we find them" attitude.  On the other hand, what if I had needed to implement my own class instead of inheriting from SafeDataReader, even though the interface is similar?  I now have to fix hundreds of lines of code before I can recompile.  If they had been declared with var instead of SafeDataReader, I could have done the same thing in a couple of lines of code. 

Dare Obasanjo responded to the ReSharper post with examples of how the codebase for RSS Bandit had become less legible as a result of ReSharper's suggested use of var.  I like Dare, but I have to take exception to some of his responses.  They're not in any way suggesting we should go back to Hungarian naming conventions, and I would sooner write Cobol than see that.  They're suggesting better names like "currentElement" instead of "e" or "element", which we should all be doing anyway.  He even invokes the holy Microsoft C# Language Reference to make his point. 

Jared Parsons also chimed in in response, and quite observantly grouped people into three camps on this topic. 

"There appear to be three groups in this debate.

  1. Whenever it's possible
  2. Only when it's absolutely clear what the type is
  3. Never, type inference is evil "

I'm still surprised at how many people, even in my own office, fall into category 3.  I'm somewhere between 1 and 2, but I lean heavily towards 1.  I'm still not convinced there's any value to

var someValue = 5;

But I can't quite articulate why that one bothers me and the others don't.  The main reason I like var, however, was listed on Jared's page rather than among the ReSharper team's reasons.

"Makes refactoring easier.  I re-factor, a lot.  I constantly split up or rename types.  Often in such a way that refactoring tools don't fixup all of the problems.  With var declarations I don't have to worry because they just properly infer their new type and happily chug along.  For explicit type cases I have to manually update all of the type names. "

Yep, as noted above, I ran into that one myself. 

I understand the readability remark, given that you're using Notepad to write your code.  It's not a problem for me because, apparently unlike the denizens of Camp #3, I use Visual Studio for my day to day work.  For those who can't afford VS, there's a free version as well.  It's got this really cool feature that Notepad doesn't give you.  If you move your mouse over the "var" keyword, you see this:

[Screenshot: Visual Studio tooltip showing the type the compiler inferred for the variable]


Yep, that's the compiler inferred type right there.  If you're not sure what type rdr is, all you have to do is move your mouse. 

I probably sound a little snide with this example, but really people!  We've moved on.  Code is too complex to be done in Notepad these days.  IDEs handle most of these kinds of problems out of necessity. It's even been suggested that the complexity of modern code could be mitigated by storing source in a custom format rather than text files.  We've been using text for decades and it might be time to upgrade.  That's a topic for another post, however.

As usual, I'm the Black Sheep in my company on this topic, and have taken a fair amount of ribbing about it.  My last parting thought to the citizens of camp #3: Check out C# 4.0.  It introduces the dynamic keyword.  It's coming people, get used to it or become the next Cobol programmers.

(My apologies to any Cobol programmers reading this.)

Posted On Sunday, December 21, 2008 9:16 PM | Comments (1) |

Wednesday, September 10, 2008

Delegates 101

It doesn't seem like all that long ago I was trying to get my head around exactly what a delegate was.  Honestly it wasn't that hard for me coming from a C++ background, since they were an awful lot like function pointers.  But still, the concept was just a little different. 

In the interest of making your UI more responsive, it's a good idea to thread some of the more intense work that occurs in your application.  As soon as you say "Threading", a lot of people blanch and go pale, but it's really not that difficult.  In fact, one class makes it downright easy.

The BackgroundWorker component (System.ComponentModel.BackgroundWorker) is specifically designed to take some intensive work and make threading it easy.  It's a component, so you can drop it right on your form and simply subscribe to the events DoWork and RunWorkerCompleted.  DoWork is the event that fires on the worker thread and is where your intensive work that would otherwise freeze the UI would occur.  RunWorkerCompleted is the event that fires back on the main thread (or more accurately the thread that initiated the work) and allows you to get the results of the threaded work.  RunWorkerAsync() is what tells the worker to spawn off the worker thread and do the work.

I like using this component, because it takes a lot of the complexity of working directly with the Thread class away and handles it for you.  Less plumbing to maintain.

I recently ran into a case where I wanted a component to behave similarly, where you would simply call a method to say "go do your thing" and then fire an event when it was done letting the caller know it could get the data now.  The work would occur on a worker thread so that it would not freeze the UI, but the UI would not need to know anything about that.  As far as the UI was concerned, it called a method and later on got an event, without ever having called any threading code.

Here's how that played out.  I've created some demo code structured in the same manner, but it's been genericized.  First off I needed something for this code to do, so it will return an IEnumerable instance of DataClass.  DataClass is just a data class that I can bind to a grid on a form. 

I drop a DataGrid, a binding source pointed at DataClass, and a few buttons and I've got my framework done.  I created DataService to handle the complexities of retrieving the data, so that the button click event simply calls dataService.GetDataAsync().  Then the DataRetrieved event fires so that the form can retrieve the data from the event args and plug it into the binding source, updating the grid.

Inside the DataService is where the cool stuff happens.  Here's GetDataAsync():

public void GetDataAsync()
{
    using (BackgroundWorker worker = new BackgroundWorker())
    {
        worker.DoWork += new DoWorkEventHandler(worker_DoWork);
        worker.RunWorkerCompleted += new RunWorkerCompletedEventHandler(worker_RunWorkerCompleted);
        worker.RunWorkerAsync();
    }
}

I could have dropped BackgroundWorker onto the design surface of DataService, since I made it a component, but I decided not to so that I can show some delegate goodness.  What you see here is DataService subscribing to DoWork and RunWorkerCompleted, then kicking off the thread.  What we've done here is pass a delegate to each event.  DoWorkEventHandler and RunWorkerCompletedEventHandler are both predefined delegate types, each with a specific signature.  Constructing one simply takes a reference to the method that should handle the event.  If the method does not match the delegate's signature, you get a compile error.
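For context, here is roughly what the rest of DataService could look like around that method.  Treat it as a sketch under assumptions: the DataRetrievedEventArgs class, the shape of the DataRetrieved event, and the body of GetSomeData are placeholders I've filled in for the demo, and DataClass is trimmed to a single Beta property.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

// Trimmed-down data class; the real one would have more properties.
public class DataClass
{
    public int Beta { get; set; }
}

// Hypothetical event args carrying the retrieved data back to the caller.
public class DataRetrievedEventArgs : EventArgs
{
    public DataRetrievedEventArgs(IEnumerable<DataClass> data) { Data = data; }
    public IEnumerable<DataClass> Data { get; private set; }
}

public class DataService
{
    public event EventHandler<DataRetrievedEventArgs> DataRetrieved;

    public void GetDataAsync()
    {
        using (BackgroundWorker worker = new BackgroundWorker())
        {
            worker.DoWork += new DoWorkEventHandler(worker_DoWork);
            worker.RunWorkerCompleted += new RunWorkerCompletedEventHandler(worker_RunWorkerCompleted);
            worker.RunWorkerAsync();
        }
    }

    // Fires on the worker thread; the result travels back via e.Result.
    private void worker_DoWork(object sender, DoWorkEventArgs e)
    {
        e.Result = GetSomeData();
    }

    // Fires once DoWork has finished.
    private void worker_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
    {
        OnDataRetrieved((IEnumerable<DataClass>)e.Result);
    }

    protected virtual void OnDataRetrieved(IEnumerable<DataClass> data)
    {
        EventHandler<DataRetrievedEventArgs> handler = DataRetrieved;
        if (handler != null)
        {
            handler(this, new DataRetrievedEventArgs(data));
        }
    }

    // Placeholder for the slow work (a database call, for example).
    private IEnumerable<DataClass> GetSomeData()
    {
        List<DataClass> result = new List<DataClass>();
        for (int i = 0; i < 5; i++)
        {
            result.Add(new DataClass { Beta = i * 50 });
        }
        return result;
    }
}
```

On the form side, the button click handler would subscribe to DataRetrieved once and then call GetDataAsync().  No threading code shows up in the UI at all.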

This is how we had to do it in .NET 1.0, 1.1 days, and if you subscribe to events in the designer, this method is still used, and the code is generated and placed in InitializeComponent.  However, we can make it cooler.

.NET 2.0 gave us anonymous delegates, which were essentially a way of declaring an anonymous method.  The reason I like this approach is because it keeps all the code related to a given task in the same place rather than spread out among several methods.  This isn't always practical or the best design, but in a lot of cases it is.  That code would look like this:

public void GetDataAsync()
{
    using (BackgroundWorker worker = new BackgroundWorker())
    {
        worker.DoWork += new DoWorkEventHandler(delegate(object sender, DoWorkEventArgs e) { e.Result = GetSomeData(); });
        worker.RunWorkerCompleted +=
            new RunWorkerCompletedEventHandler(delegate(object sender, RunWorkerCompletedEventArgs e)
                                                   {
                                                       IEnumerable<DataClass> data = (IEnumerable<DataClass>) e.Result;
                                                       OnDataRetrieved(data);
                                                   });
        worker.RunWorkerAsync();
    }
}

I've taken the contents of worker_DoWork and worker_RunWorkerCompleted and moved them inline to an anonymous delegate.  The signature still needs to match the delegate, and Intellisense won't help you out much here.  What's important to realize is that the code will operate exactly the same way as the other code, so the code in the delegate won't actually execute until the event fires.  There are several resources that explain what the compiler actually does to make this work, so I won't cover it now.

There's still some redundancy here.  The compiler is smart enough to know what the signature of the delegate is, so we don't necessarily have to define it.  We can actually write the same method like this:

public void GetDataAsync()
{
    IEnumerable<DataClass> data = null;
    using (BackgroundWorker worker = new BackgroundWorker())
    {
        worker.DoWork += delegate { data = GetSomeData(); };
        worker.RunWorkerCompleted += delegate { OnDataRetrieved(data); };
        worker.RunWorkerAsync();
    }
}

Oops, I snuck in another change as well.  You'll notice that data is defined as a variable in the method.  No worries!  The compiler handles all this for you too.  Again, the classes and stuff that are generated here are discussed in plenty of other places, so I'll leave the details alone for now, but it does work.  What this allows us to do is skip casting the data to and from object through the Result property of the event args.  Since we no longer need event args, we don't even have to specify them.  The compiler knows that DoWork takes a DoWorkEventHandler, and that the anonymous delegate can fit that bill, so it allows it.  The code is just about as small as it was initially, but we've refactored out two event handler methods.  That's pretty cool.
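If you're curious how a local variable can outlive the line it was declared on, here's a hand-written approximation of the trick.  The real generated class and member names are compiler internals and look nothing like this; ClosureDemo and CapturedLocals are names I invented for the sketch.

```csharp
using System;

public static class ClosureDemo
{
    // What you write: the anonymous delegate captures the local 'counter'.
    public static int WithAnonymousDelegate()
    {
        int counter = 0;
        Action increment = delegate { counter++; };
        increment();
        increment();
        return counter;
    }

    // Roughly what the compiler arranges: the captured local becomes a
    // field on a hidden class, and the delegate body becomes a method on it.
    private sealed class CapturedLocals
    {
        public int counter;
        public void Increment() { counter++; }
    }

    public static int WithHandWrittenClass()
    {
        CapturedLocals locals = new CapturedLocals();
        locals.counter = 0;
        Action increment = new Action(locals.Increment);
        increment();
        increment();
        return locals.counter;
    }
}
```

Both methods return 2.  In GetDataAsync the same thing happens to data: both delegates share one hidden object, which is why the value set in DoWork is visible in RunWorkerCompleted.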

But I'm not done yet.  We can take it one step further.  In 3.5 we got Linq, and Linq gave us some pretty cool stuff, and a lot of it can be used outside of the context of Linq.  Specifically I'm talking about Lambda expressions. 

A Lambda expression is basically a syntax for specifying an anonymous delegate.  Again, the compiler is smart enough to know the signature of the delegate, so we don't always have to type that code out.  You've probably seen code like

data.Where(x => x.Beta < 100);

Since data is IEnumerable<DataClass> and Where takes a Func<DataClass, bool> as its predicate, the compiler knows that the single parameter should be a DataClass and the return value is a bool.  Therefore this code equates to

data.Where(delegate(DataClass x)
               {
                   return x.Beta < 100;
               });

We can also use Lambda expressions for all kinds of anonymous delegates.  Now we can write the GetDataAsync code like this:

public void GetDataAsync()
{
    IEnumerable<DataClass> data = null;
    using (BackgroundWorker worker = new BackgroundWorker())
    {
        worker.DoWork += ((sender, e) => data = GetSomeData());
        worker.RunWorkerCompleted += ((sender, e) => OnDataRetrieved(data));
        worker.RunWorkerAsync();
    }
}

How cool is that?

Posted On Wednesday, September 10, 2008 2:01 PM | Comments (2) |
