Dave Chestnutt: SparklingCode and CodeGaffes

Writing better code; for fun and profit

  Home  |   Contact  |   Syndication    |   Login
  18 Posts | 1 Stories | 49 Comments | 21 Trackbacks

News

Tag Cloud


Article Categories

Archives

Post Categories

Saturday, March 28, 2009 #

We’ve all heard the mantra – Unit Testing is good for the soul.  We hear about the goodness of things like JUnit, NUnit, and TDD from other people.image

But you’re not convinced.

After all, it takes more time to write unit test code – and you’d rather get on with the next feature.  Besides, whether it’s true or not, you certainly feel like your progress is measured by how many features you crank out.

In my own journey, I found I went through three distinct stages to testing heaven.  Look at my “diary” for the first few unit tested projects I was on.

Stage 1: "I believe, it's stupid but I believe"

Unit Testing, huh?  Great, now I have to spend time writing these stupid tests before I can say I’m done. It’s not really hard, but it takes additional time to create these tests. Time that I think would be better spent moving the project along with more features.  I was told to do it, so I will, but I don't see or feel the benefit yet.  I write tests because I have to.

imageStage 2: OK, it’s good for the project

Hmm, I added a new feature today – and someone else’s test broke.  As I checked it out, I found that indeed, I had modified some code that no longer worked for another scenario.  So, for the first time I saw a benefit from the tests.  Of course, I still think my time is better spent adding new features – but I begin to see that they’re a good thing to have, especially if other people write them.  So now I’m voluntarily writing tests for some of the new code I write.  The project is better for it.

Stage 3: Hey, I can code faster

And gradually it happened.  Bit by bit the the second level effects of testing crept into my work.

  • I can write a small test to reproduce a bug, and I don’t need to get a whole test environment set up.  This lets me debug my solution much much faster.
  • I can write a small test to exercise some code that I can’t even get to from the main application yet.  This is cool, because I can implement a feature before someone else needs it.  And put it through its paces.
  • Now that I’ve gotten the hang of writing tests, I find it extremely easy to write lots of test cases for the same piece of functionality.  After writing the first test, it’s real easy to tweak it for many test cases.  We always talked about testing all the edge cases – now I can actually do so.
  • I can modify code (like redesign the innards of a class) with high confidence that I won’t break something that depends on it – because I have a set of tests to make sure I don’t break behavior.  This is where Unit Testing plus Refactoring have a real synergy.  The two combined offer a great advantage over either practice by itself.  Better quality code that’s easier to maintain.

So that’s how it finally happened. I actively use tests to develop and debug code.  In fact – TDD (writing the tests first) as a process kind of snuck up on me. I can’t imagine writing code without tests anymore.

So if you’re not there yet – don’t worry.  Especially if your boss is on your case.  It’ll come.  And when it does, you’ll wonder how you ever ever did without it.

Technorati tags: , , ,

Tuesday, March 13, 2007 #

The CLR is full of surprises.

 

Microsoft has a built-in System.IO.Path class, to deal with all the path parsing issues we often run across.  One of the most useful methods in there can get rid of all the special case code we write to combine path strings.  How many times have you written something like this to combine two path strings -- taking into account whether you need to add a slash between the path strings or not:

String path1 = @"C:\ABC";   // may or may not end in a slash

String path2 = @"DEF";      // may or may not start with a slash

 

String path = path1.TrimEnd('\\') + "\\" + path2.TrimStart('\\')

That one line of code works, but it creates 3 strings, calls two methods, and in general looks ugly.

 

Enter the Path class.

 

Path.Combine( path1, path2 )

 

This is a sweet routine - you use it to combine two strings and it takes care of figuring out whether you need a slash between the parts of the path or not.  But there's a hitch.

 

See if you can predict the output of these lines:

 

Console.WriteLine( Path.Combine( @"C:\ABC",  @"DEF" ));

Console.WriteLine( Path.Combine( @"C:\ABC\", @"DEF" ));

Console.WriteLine( Path.Combine( @"C:\ABC",  @"\DEF" ));

Console.WriteLine( Path.Combine( @"C:\ABC\", @"\DEF" ));

 

When you think you know the answer, read on...

 

The first two work as expected.  It correctly builds the path.

Path.Combine( @"C:\ABC",  @"DEF" ) -> C:\ABC\DEF

Path.Combine( @"C:\ABC\", @"DEF" ) -> C:\ABC\DEF

 

But here's the surprise.  If your second piece of the path starts with a backslash, it assumes you must want it to be the root.  So it ignores the first parameter!

Path.Combine( @"C:\ABC",  @"\DEF" ) -> \DEF

Path.Combine( @"C:\ABC\", @"\DEF" ) -> \DEF

 

The documentation for Path.Combine makes this clear in its remarks.  But really now, is this a useful feature?  Path.Combine would have been much more useful if it simply took two paths and concatenated them properly. For my money, having these four examples return C:\ABC\DEF would have been a better choice.

 

What Path.Combine IS useful for would be better documented like this:

Path.Combine( defaultDirectory, userPath );

If I had seen that as the description of the parameters, this annoying behavior would not have surprised me.  More importantly: I wouldn't have dreamed of using it for "combining" paths.

 

Ok, I'll stop dreaming.

 

Technorati tags: ,

Monday, December 11, 2006 #

I use a coding tool called ReSharper - and I was pleasantly surprised the other day when it pointed out a copy & paste bug that was waiting to explode.  You see, copy and paste programming is so very easy to do - that we all do it.  Even the best of us accidentally leave in duplicate code snippets from time to time.

 

Why Cut & Paste is so bad

There are two big reasons why this is an anti-pattern:

  1. If the code you cut and pasted has a bug in it (and it invariably will), what do you think are the chances that you'll find and fix all occurrences of it?
  2. If you're rubber-stamping code (copy and paste, paste, paste…) so that you can make minor edits to the pasted version, it's too easy to make a mistake - and the code will compile fine.  It won't fall over until a customer runs it.  Rubber-stamping typically happens in a switch statement or a batch or if/else-if statements (where the cases are nearly identical).

 

It's this second case that ReSharper showed me a bug that wouldn't have been found until run-time.  It surprised me because ReSharper doesn't look for duplicate code snippets.  But one of its rules can get triggered if you copy & paste code incorrectly.

 

ReSharper is...

This is an add-in to Visual Studio.  It does many things, but one thing it does well is point out flaws in your C# code via highlighting - so you can quickly see suspicious or questionable practices.  There are other tools that do the same for C#, and there are similar tools for Java. [Disclaimer: I do not work for JetBrains. I paid for my own copy out of my own pocket. I'm just a happy user.]

 

This is not a tutorial on using ReSharper - I'm just showing one tiny example of how it can help you.  Here is some legacy code opened in Visual Studio with ReSharper:

 

 

Notice the highlights on the right of the source window.  Those give an indication of where there might be problems in the source.  You can click those lines and it will jump to the place in the code with the problem.  It highlights the error and offers fixes, too.  The orange lines are approximately lined up with where the scroll bar would be when you view the code.  Orange means warning, and red means error.  Opening up legacy code often shows many many orange lines because ReSharper warns about suspicious or bad practices.  These are things that the compiler is happy with, but you'll be unhappy about in the long run.

 

Here's what ReSharper found

So, back to how I was pleasantly surprised.  Look at this code example.  It's pretty obvious at first glance that this code was rubber-stamped to handle many cases - with slight edits in each version.

 

This code will compile, but it won't work right.  Here's what it looks like in normal Visual Studio:

 

 

 

Do you see the problem?  It's in line 41.  When copying and pasting, the person forgot to modify the if statement to use htmlButtonMatch instead of htmlTextboxMatch.  It's clearly wrong, but it might not show up as a bug until months after someone wrote this.

 

Look at what ReSharper shows (it does additional highlighting of things like method names, etc.):

 

 

 

With this cool tool, I'm immediately alerted to the problem.    The tool tip for the grayed out variable warns me that it wasn't used.  From that, I can easily spot the error in line 41.  The important thing is that ReSharper alerted me that there was a problem.

 

This was some actual legacy code that I opened up one day - when I noticed the problem.  I was fix it and find out why unit tests didn't cover this case.

 

You know, if I didn't tell people I used ReSharper, they'd think I was really smart.

 

But I am smart enough to use good tools.

 

 

See also: http://en.wikipedia.org/wiki/Copy_and_paste_programming

 

Technorati tags: , ,

Thursday, November 09, 2006 #

Christopher Diggins has an interesting post on Intentional Programming.   This is a new way of applying top-down design.  It's a very good idea: you write your code nice and clear, with calls to sub-methods that "do the right thing" - sub-methods that you'll write later.  The advantage of this is that your code will be quite readable.  And since our code always lasts much longer than we ever intended, clear readable code is more maintainable (and likely to haveless bugs, too).  An idea well worth thinking about and adding to your bag of tricks.
 
See: http://www.artima.com/forums/flat.jsp?forum=106&thread=179611
 
Technorati tags: ,

Wednesday, November 08, 2006 #

Imagine hanging onto one of the solid rocket boosters when the Space Shuttle goes up.
 
This is a cool 4 minute video of a camera on one of the Atlantis flights.  It's on from takeoff - to separation - to splashdown.  There's no sound, but just try to imagine everything shaking around you...
 
http://mfile.akamai.com/18566/asf/etouchsyst2.download.akamai.com/18355/wm.nasa-global/sts-115/fd01/115_SRB_LeftSide.1.asx
 

Saturday, October 21, 2006 #

The other day, I had to fix a particularly nasty bug with a NullReferenceException.  During the process, I came up with a couple alternatives to fix it and decided to document why Checking for Null Fields is BAD
http://geekswithblogs.net/dchestnutt/articles/94744.aspx
 
Technorati tags: ,

Monday, January 16, 2006 #

In my company, I've been able to join various projects over the years.  As a result, I'm usually modifying or adding to legacy code.  Of course, in reality, all code is legacy code after about a week.
 
In looking at code (written by others as well as myself) I sometimes see bad coding practices.  These are always obvious in hindsight. 
 
In the spirit of blogging, I'm going to share these as CodeGaffes.   Look to understand the reason why these are bad, as well as how to fix or avoid them.  In a sense, these are like design patterns, only in miniature.  By and large, they apply to all three languages, C#, Java, and C++.
 
Hopefully you can be amused (as you recognize them) or instructed (by avoiding them).
 
I am certainly not the first (nor the last) to point out these coding problems.  You can find them mentioned elsewhere in blogs and books, I'm just pulling some of them together.
 
Feel free to add comments, whether you agree or not.  And, of course, we all love to hear war stories where you've seen the effects of such gaffes. 
 
Sparkling code - this is what you strive for.  Stuff to be proud of.

See recent posts: http://geekswithblogs/dchestnutt
 
Technorati tags: ,

One of the benefits of object oriented design, is that some problems show up during compile-time instead of at run-time.  And you know that run-time issues always show up at the worst possible time, like at a customer site.
 
This CodeGaffe happens when a programmer writes code that "enforces" something with comments, or Assertions.  Have you ever seen a method defined in a base class with a comment or Assertion telling you to override it?  Of course, if the subclass doesn't override it, you can be certain the method will be called at an inopportune time. This is a simple misunderstanding of an object oriented feature because there is a way to make the language work for you to force the subclass author to override the method.
 
So what's this CodeGaffe look like?

 
    public class TableBase
    {
        public virtual int CountLines()
        {
            Debug.Assert(false, "override me");  // *** Don't do this
            return 0;
        }
    }
 
    public class MyTableSubclass : TableBase  // subclass
    {
    }
 
    public static void Main(string[] args)        // here is code to use it
    {
        MyTableSubclass table = new MyTableSubclass ();
        Console.WriteLine(table.CountLines());  // prints 0 after assert
    }


 
See the problem?  As the original author of TableBase, you want to make sure people who write subclasses implement their own CountLines method.  But if they forget to override it, they won't know until the method is called at runtime.   Possibly at a customer site, but more likely during testing.  
 
Instead, define your base class this way:

 
    public abstract class TableBase           // *** Do use "abstract"
    {
        public abstract int CountLines();     // *** Do use "abstract"
    }
 
    public class MyTableSubclass : TableBase  // now this won't compile
    {
    }

 

Now, if someone extends your class, they'll find out at compile-time -- that's about as immediate as you can get without standing behind them, watching over their shoulder as they code.  And fixing the subclass at this time is trivial because the error points to the actual problem: a missing definition for CountLines.

 


 

    public abstract class TableBase
    {
        public abstract int CountLines(); // *** "abstract" forces your intent
    }
   

    public class MyTableSubclass : TableBase
    {
        public override int CountLines() // the new method definition
        {
            return 1;
        }
    }

 

    public static void Main(string[] args) // here is code to use it
    {
        MyTableSubclass table = new MyTableSubclass ();
        Console.WriteLine(table.CountLines());  // outputs "1", as it should 
    }
       


 

The moral?  If you find yourself creating (or maintaining) a method with comments or assertions telling people to override it -- make the method abstract.  And you'll enforce your intention.

 

Technorati tags: ,


Wednesday, January 18, 2006 #

I was fixing a bug the other day, when I ran across this comment (marked in red):

 
if (condition1)
{
    //… some useful code …
}
else if (condition2)
{
    //… some useful code …
}
else
{
    // this can never happen
}
 
My first thought was, if this can never happen then why not throw an exception in case it does? So I added a line to throw an exception.
 
if (condition1)
{
    //… some useful code …
}
else if (condition2)
{
    //… some useful code …
}
else
{
    // this can never happen
    throw new ApplicationException("this can never happen");
}
 
And you can guess what happened. The next night, some of our tests failed. Because of course "it" did happen.

 

There are a couple of reasons why this pattern appears in code.

  • Overconfidence. The programmer is sure that their prior code has caught all cases, so it won't happen – but they want to be good citizens and document that fact with a comment or assertion. And their comment is correct – at first. But of course, all code gets modified and unfortunately, over time the original assumptions no longer hold. Since the assumptions are just that – assumed – and not checked by the code, you end up with brittle code.
  • Time Crunch. The programmer, under the gun to fix a bug or finish a feature, doesn't really know what the code should do if this actually happens. There's no apparent recovery that the code can do, but they're pretty sure it won't happen. They're not 100% sure so they leave a comment or Debug Assertion behind.  In this scenario, unit tests are usually skipped, too.

So what's the right thing to do?

 

  1. Throw an exception. You can leave the comment or assertion in, but DO throw an exception so you see the problem exactly when it happens.  It will be much easier to spot the problem when you have an exception pointing directly at the offending line, than it is to figure out what's wrong when the code doesn't fall over until another thousand lines of code execute. When it breaks, as it did in my case, you can fix the code to handle the new situation and move on.
  2. Write recovery code.  Sometimes, there's a perfectly reasonable action you can do - so recover and let the code continue. For example, this code was checking values from a settings file, and if the file was ever corrupted, there was a perfectly reasonable default we could use.

 

In essence, this CodeGaffe is like neglecting to add a default case to a switch statement. Many times, you really should have a final else on your if statements. They won't all throw exceptions of course, but you should always think about what should happen if your execution did fall through your if statements. Then do the right thing with some actual code, not with clueless comments.  Ah, if only the compiler would do what I meant, not what I wrote.

 

Have you ever seen this gaffe in your projects?

 

Technorati tags: ,


Monday, January 30, 2006 #

One of the benefits of using modern editors is that it's really easy to navigate your code base. If you're looking at a method call, for example, you can use a keystroke combination to jump to the method definition.
 
But there's a drawback to this.
 
If you're not careful, as you add new code you can end up with monster methods, or monster-sized classes. Let me ask you this - what do you think the maximum size of a class should be? 100 lines? 500 lines? 10,000 lines? While we won't agree on an exact number, we can all agree that a class with 10,000 lines in it is much more likely to have bugs in it than one with 100 lines.
 
But no one sets out to write such a huge class. How does it get this way? Simple - over time, various people make little modifications to a class. How many times have you looked at a modification and thought to yourself, "Perfect, I'll just add these lines of code here, and it'll support the new feature." After enough people do that, your 500 line class grows and grows. And grows!
 
Here's a fairly innocuous bit of code. If I had just added this, without line numbers being displayed, I might not realize how big the class already is. But, as the screen shot on the right shows, I might be more inclined to realize that I shouldn't grow this class any further since it already has nearly 20,000 lines of code in it!
 
No Line Numbers Line Numbers show a problem!
     
 
Thus, my advice: turn on line number display in your editor. In Visual Studio, turn on Line Numbers by going to Tools | Options | Text Editor and check Line Numbers
 
 
Now, when you're fixing bugs or adding features, you'll see line numbers. And when you get near the bottom of a file, you'll have another reminder that the class is growing out of control. Then you can plan some refactoring to break up the monster method or class. And improve your code base.
 
Turn on line numbers in your editor. They're an early warning system.
 
Technorati tags: ,

Monday, February 06, 2006 #

This CodeGaffe covers two similar problems.  The first one involves Booleans, while the second covers any variable type. 

 

Here's some code that contains 2 bugs in one line.   This is a practice to avoid.  Can you spot the 2 bugs? 

 

if (m_condition = true) { // *** DON'T DO THIS!
  // do something
}

 

If you had any trouble spotting the problem, that's because you're normal.  And if you didn't have any trouble, that's because you were looking for a problem.  If you didn't know there was anything wrong with that code, you might just have easily read the code as the author undoubtedly intended:

 

if (m_condition == true) { // legal but error prone
  // do something
}

 

And while you wouldn't accidentally leave out one of the equal signs, it's a bear to find this problem when someone else does.  And this is a case where the compiler doesn't help you.  It's quite happy to compile this code.  Which is why we get two bugs: 1) it always sets m_condition to true and, 2) always enters the block of code.

 

Instead, let a Boolean be a Boolean and test it naturally:

 

if (m_condition) {    // natural test for bool
  // do something
}

 

This is a problem for all three languages, C#, Java, and C++.  But, test Booleans naturally and you'll avoid the problem.

 

Now, for the second half of the CodeGaffe.  What about other types besides Boolean, like integers?  Don't you have to test them this way?  Well it turns out, that because this was a source of bugs, the designers of Java and C# made the language prevent you from putting anything other than a Boolean expression inside an if statement.  So, dropping an equal sign on the floor will result in a compiler warning.  If you write this it won't compile.  The compiler will save you:

 

if (m_flag = 3) {    // won't compile in Java and C#
  // do something
}

 

But what about C++?  That code will compile, and suffer the same bugs as the first example with Booleans.  So, to be safe, get into the habit of doing this (I know, it looks weird at first):

 

if (3 == m_flag) {    // put constant first
  // do something
}

 

Now the compiler is your friend.  If you inadvertently forget the second equal sign, you'll get a compiler error instead of a run-time bug.  And that's much more productive!  I'll take a compiler error over a runtime bug any day of the week.

 

Technorati tags: ,


Thursday, February 16, 2006 #

Today's CodeGaffe is something I see all the time.  It creeps into your code over time.  Here's the scenario: Your class has grown from its humble beginnings, and there are now fields in your class that should not be used by some of your methods.  In essence, you want some of your data to be private from part of your class.  As in "really private".

 

If you've ever found yourself thinking, "Most of this class shouldn't access field foo,” then you've experienced this.

 

Here's an example I ran across recently - this class has two slightly different copies of an object, called a Thingee, but only one should be used by the majority of the class.  Which one would you use if you were adding new code?

 

/// 
/// This class has a Thingee that it works with.
/// But, in reality it contains 2 Thingee objects, 
/// and dynamically chooses which one should be used.
/// 
public class PrivatePrivateBad
{
    private object OurThingee
    {
        get {
            if (m_CatIsAround)
                return m_Thing1;
            else
                return m_Thing2;
        }
    }
 
    private bool m_CatIsAround;
    private object m_Thing1;
    private object m_Thing2;
 
    public PrivatePrivateBad()
    {
        m_CatIsAround = ...;
        m_Thing1 = ...;
        m_Thing2 = ...;
    }
 
    public void SomeMethod()
    {   
        // I can access both fields, but really shouldn't
        Console.Out.WriteLine(m_Thing1);
        Console.Out.WriteLine(m_Thing2);
        
        // Accessing the property is OK
        Console.Out.WriteLine(OurThingee);
    }
}

 

The most common solution to this problem is to dictate (in comments, where else?) that you must always use the property OurThingee instead of accessing the two fields directly.  And that will work - as long as everyone follows the rules.  But there's no way to enforce it.

 

Without enforcement, as the code evolves, eventually someone will not know about this convention.  They'll access m_Thing1 or m_Thing2 directly.   After all, both fields are private to this class, so it is reasonable for a new person on your project to use any of the fields as they add new code.  And in a large class, once someone does this, future developers are much more likely to follow the same example and reference stuff that should have been "really private".  This is how good code goes bad.

 

So, the intent of the original design gets lost.  And bugs appear.

 

Fortunately, there's a simple way to enforce your design. You need to create another class.  The new class will contain both Thingee objects, and hide the details of which Thingee you're supposed to use.  It will simply have a property or assessor method to return the proper object. The new class will enforce our design, so we'll call it a ThingeeEnforcer, like so:

 

/// 
/// This class hides the details of how we dynamically
/// choose which Thingee should be used.
/// 
public class ThingeeEnforcer
{
    public object CurrentThingee
    {
        get
        {
            if (m_CatIsAround)
                return m_Thing1;
            else
                return m_Thing2;
        }
    }
    private bool m_CatIsAround;
    private object m_Thing1;
    private object m_Thing2;
 
    public void SetThingee()
    {
        m_CatIsAround = ...;
        m_Thing1 = ...;
        m_Thing2 = ...;
    }
}

 

Refactoring our original class is actually pretty easy.  You need to move the relevant fields to the enforcer class, and change the OurThingee property to access our new class.  Wherever you set the initial values of the objects, you now delegate the work to the new enforcer class.  Here's what the refactored original class looks like.  Notice you can't possibly reference the wrong object anymore:

 

/// 
/// In this class, you can only access Thingee via
/// the ThingeeEnforcer, so there's no way to 
/// get the wrong one.
/// 
public class PrivatePrivateGood
{
    private ThingeeEnforcer te = new ThingeeEnforcer();
 
    private object OurThingee
    {
        get { return te.CurrentThingee; }
    }
 
    public PrivatePrivateGood()
    {
        te.SetThingee();
    }
 
    public void SomeMethod()
    {
        // This is the only Thingee we can access
        Console.Out.WriteLine(OurThingee);
    }
}

 

The steps for refactoring the Thingee stuff into a separate class are straightforward. 

 

  1. First, review the original class's code and make sure all code references the appropriate property or accessor method.
  2. You're likely to see only two or three places that truly need to access the fields directly:
    1. The constructor (or some initialization method) that sets their values
    2. The property, or accessor method that gets the right value
    3. Possibly a method that clears their values
  3. Create a new class; we called it ThingeeEnforcer as it enforced the rules of access.
  4. Move the fields to the new class
  5. Add method(s) to set/clear the values
  6. Add a property or accessor to get the correct value
  7. Convert all code in the original class to use the new methods/accessors from the enforcer class.  In our case, this was pretty simple since (in step 1) we made sure all the code was accessing our property.  The good thing is, by moving the fields (in step 4) the compiler will help you find every single reference that needs fixing.

 

Now, when people add new features to your class next time, they won't be tempted to access the "really private" fields.  They're safely tucked away in your enforcer class.

 

As I said at the outset, the key is to realize that as soon as you find yourself thinking, "Most of my class shouldn't access something" you've hit the point where you need another class to impose that restriction.

 

It's OK to create small classes like this to enforce some condition.  That's exactly what object oriented design is all about.  And your code will sparkle.

 

Technorati tags: ,


Wednesday, April 05, 2006 #

Microsoft added a new keyword to C# and VB for 2005 (CLR 2.0): partial

Don't use it.

partial is used to physically break up a class definition into multiple files. When the compiler sees the keyword partial it finds all the related partial files in order to compile the class. This makes it possible to split the code for a single class across multiple files. By and large, though, this is a bad idea. Let’s look at why that is.

If you find
yourself typing
"p-a-r-t-i-a-l"
Stop!

Reasons Given To Use the “partial” keyword

There are three main incentives I’ve heard for using partial, according to Microsoft and others:

  1. If a class is really large, break it into multiple files so more than one person can work on it at a time.
  2. If a class implements multiple interfaces, put the methods that implement each interface into a separate file.
  3. When the IDE generates code, it puts the code into a separate file so you don't see it or accidentally modify it.

Let's examine each of these reasons in detail.

1. Avoid using “partial” to break large files into many smaller files—so multiple people can work on them. This is a bad practice.

Why?

No matter how you try to group code in various partial files, you will never see the big picture for the class anymore. It’ll become brittle, easily broken. By only viewing a portion of the class, you can too easily break some assumption or dependency in the rest of the class. For example, you won’t see all the internal fields. If a field needs updating, and it’s not referenced in the portion of the class you’re editing, you’ll never even realize that you are breaking the class. People will make incompatible changes and no one will know until Humpty Dumpty is put back together again. Are your unit tests good enough to catch this?

So what's the right solution for really large files?

If you’re thinking of using partial because your class file is large, then consider it a warning sign that it's time to break the class up—not by using partial but by using object oriented design. Break your monster class into multiple classes, grouped by responsibility. That way, you can hide details inside classes, and it will be much harder for developers who work on the new smaller classes to make incompatible changes with each other. As a bonus, breaking it up will also make your code easier to read, test, and maintain.

2. Avoid using “partial” to organize your class file. Some people suggest putting the implementations of each interface into their own files. This sounds reasonable at first, but it contains a pitfall—the class implements 2 or more interfaces, but it is still one class that has its own private data (and all the assumptions that go with that). If you only read code in one file and make changes, you may not realize the side effects you introduce on other methods that are in other partial files. This is a bad practice for the same reason that it was for reason #1, large class files.

3. “partial” is ok if the compiler puts it there. This is the only safe reason to use partial. When the compiler adds partial to your class, it does so to hide the code that it generates. While there are other solutions to handling generated code, this is the one that Microsoft chose, and it works fine. You can't ever modify the generated code (well, not easily) so it's just as well that it's hidden in a separate file. Note the key phrase here: the compiler adds partial—not you.

Recommendations

  • If you find yourself typing "p-a-r-t-i-a-l" — stop! Don’t do it.
  • If the class you’re working on is marked partial, make sure that the other partial bits are compiler-generated. If they’re not, you should consider putting the class files back together and re-factoring the large class into smaller classes.

One final Note: in VB, since the partial keyword is optional, you can really confuse someone reading the code because they may not even realize that other code exists for this class. If you work in VB, be aware that your entire class may not be in the file you’re looking at.

For another viewpoint, see these articles:

Technorati tags: , , , ,

Tuesday, May 30, 2006 #

Visual Studio has a nifty feature called Pre-Build and Post-Build events. These are used to include extra DOS commands before or after the build.

But there's a gotcha! And it will bite you when you least expect it. In Visual Studio, there is NO ERROR CHECKING except at the end of an event. Any errors that happen prior to the final step are lost. Keep reading to see a workaround.

Setting Build Events

Build Events are accessed by right-clicking on a Project in the solution explorer and choosing "Build Events":

It looks like these were originally intended as one-line DOS commands to do tweaks in your build. But they obviously grew - and you can put any number of DOS commands in there to copy files, clean up directories, even run unit tests!

Before describing the gotcha, there are a few things you need to know if you've never used these before. One is that you can put any DOS commands in here. It's really nothing more than a place to save a batch file that's run before (or after) the project build. It also offers "macros". The build process actually creates a batch file from your commands by replacing the macro text.

In this example, MYDEPLOY is just a regular DOS environment variable. The advantage to macros is they take into account the various differences between developers (and their file directory structure) as well as handling Debug vs. Release builds.

For example, this command:

Copy $(OutDir)*.* %MYDEPLOY%

Gets converted to this, and placed in a batch file:

Copy bin\Debug\*.* %MYDEPLOY%

The Gotcha

So, what's the problem?

Well, VS puts these commands in a batch file and runs it. VS checks for build failures so it can tell you about them. However - it only checks for a failure of the batch file that it created - not for any failures within the batch file.

What does this mean? If you only place one command in the build event, VS will correctly report any errors that happen. But if you place multiple commands in here, the net effect is that VS will only detect an error in the last statement.

So, for example, if you put three copy commands in here, VS will only report an error if the final command fails.

Ouch.

The Workaround

So how can we fix this? Since VS is just creating a batch file, we use normal batch file commands to detect errors and report them. Here's what I do:

  • I add a check for error after every meaningful step:
if errorlevel 1 goto BuildEventFailed
  • I add code at the end to handle the error so VS will report it:
REM Exit properly because the build will not fail
REM unless the final step exits with an error code
goto BuildEventOK
:BuildEventFailed
echo POSTBUILDSTEP for $(ProjectName) FAILED
exit 1
:BuildEventOK
echo POSTBUILDSTEP for $(ProjectName) COMPLETED OK


Here's an example of one of my actual post build events:

echo POSTBUILDSTEP for $(ProjectName)
 
xcopy "$(TargetPath)" "$(SolutionDir)$(OutDir)" /i /d /y
if errorlevel 1 goto BuildEventFailed
xcopy "$(TargetDir)$(TargetName).pdb" "$(SolutionDir)$(OutDir)" /i /d /y
if errorlevel 1 goto BuildEventFailed
 
echo GAC of: "$(SolutionDir)$(OutDir)$(TargetFileName)"
gacutil.exe /nologo /i "$(SolutionDir)$(OutDir)$(TargetFileName)" /f
if errorlevel 1 goto BuildEventFailed
 
REM Exit properly because the build will not fail 
REM unless the final step exits with an error code
goto BuildEventOK
:BuildEventFailed
echo POSTBUILDSTEP for $(ProjectName) FAILED
exit 1
:BuildEventOK
echo POSTBUILDSTEP for $(ProjectName) COMPLETED OK

You don't have to be as fancy, with the extra "echo" statements, but do the error checking you need so errors don't slip through the cracks.

One final note: don’t put too much in these events. They’re pretty hard to debug when things go wrong. If you have something complex to do, put the commands in a batch file and reference the batch file yourself in the event.

But now at least, you can get notified if (make that when) something fails.

Technorati tags: , ,


Sunday, August 20, 2006 #

We all have old code snippets in our code base. Whether it’s a method that’s no longer used, or a few lines that we’ve replaced - our code has sections commented out. When should we remove them? How should we comment them out? If you’re not careful, commented out code can cause future problems. Read on.

Code Gaffe #1: Sneaky Commented out code

Commented out code should be, well, really commented out.

 

Really comment it out

Don't put a /* at the top and */ at the bottom of code you want to remove:

 

/*

code block line 1

code block line 2

*/

 

Why? If the code block is of any size, someone reading the file may not realize that the code is commented out. And you're simply causing them to waste time. Instead, comment out each line:

 

//

// code block line 1

// code block line 2

//

 

Of course, in modern editors with syntax coloring, this should not be an issue because the commented out code will be colorized as a comment. But we don't all have color printers, and sometimes we still use printouts. And we use other tools besides editors (like file differencing tools, change reports, etc.) that don't have syntax smarts.

 

Besides, if you have a modern editor (and even with some ancient ones), you can select the code block to comment out, press a key, and it'll add "//" to every line - so there's really no excuse to not do it this way.

 

I knew a guy who was very puzzled while trying to fix code in an overloaded method that he didn't realize was commented out. Due to the nature of the code, he couldn't use a debugger, he had to dump info to a log file for debugging (and he didn't have syntax coloring). After wasting a day doing this, he finally saw the comment markers on an early page, and with appropriate *&^*% language, found the correct overload.

 

CodeGaffe #2: Old Commented out Code

The real reason we can't bear to delete old code is that we spent so much time writing the code in the first place

Of course, the best tack by far is to delete the unused code. The code will rot when it's unused, and after a while, you wouldn't be able to simply uncomment it anyway. If you really can't bear to delete it, make it obvious that it's commented out.

 

What do I mean by "rot"? Once the code is commented out, it's never getting executed. As the rest of the code base evolves (adding new features, fixing bugs, refactoring), the code will be out of date—perhaps very subtly.

 

Delete the old code (I know, it’s hard to part with it )

The rule of thumb I use is to comment out code if I think I might need it again very soon. Then, the very next time I'm in the same module, I delete it. Thanks to the wonders of source control systems, it's now very easy to see that I intended to remove the code, since it'll be commented out in one version and gone in the next. That's a little more obvious than simply removing it.

 

This suggestion (comment then delete) is really for those of us who hate to delete the code outright, "because I might want it back someday". Usually though, the real reason we can't bear to part with it is that we spent a lot of time writing the code in the first place. But my advice is to delete it once it's no longer used.

 

So clean up that old code. Get to it!

 

Technorati tags: ,