Blog Stats
  • Posts - 36
  • Articles - 0
  • Comments - 14
  • Trackbacks - 26

 

Wednesday, January 09, 2008

Why Developers should like Refactoring!

The refactoring book http://www.amazon.com/exec/obidos/ASIN/0201485672 is really one of the best books that I have seen on programming.   If you are new to programming, I highly recommend giving the book a read.   Learning some of these refactors and thinking about them as you write code will help you write better code.   Being familiar with the various refactors will help you quickly identify ways to improve code.   The concept of code smells is pure genius.

The concept of refactoring is very liberating and freeing.  You don’t have to worry about getting everything right the first time.   You never will.  Refactoring provides a general framework for not only reversing the inevitable deterioration of good code into bad code, trust me it happens.  It also provides a framework to move code that started out not so good into better code.

No one sets out to write bad code, but still it happens.  

Sometimes requirements change during development.

Sometimes requirements are not well understood leading to a bad design from the outset.    

Sometimes you simply run out of time and do the best you can knowing that it is not as you would like to write, but having to do it anyway.  

Sometimes you are learning a new technology and simply don’t know any better.    It’s dangerous writing software while best practices are still emerging.

Sometimes multiple people are working on the same thing and not everyone understands the design, assuming that there was an original design.  

Sometimes new functionality is “bolted on” awkwardly.  

Sometimes incremental changes skew the code base in awkward directions.

There are many reasons bad things can happen to code software.   When you see software that has gone bad, be careful casting blame.  There is no telling how bad your code may look over time.  Let’s face it; most code simply doesn’t age well.

Refactoring helps us limit how long we have to live with our mistakes, helps us improve code over time, and stop the deterioration.

Future blogs will explore some of the refactors that covered in this pivotal book.   What are some of your experiences with refactoring?

Generic CopyTo Function

One of the requirements for the MVP pattern is to keep everything out of the UI (View) that does not have to be in the View.   Data mapping is one of those things that we would like to move out of the UI.   I tend to think of it along these lines, “If I wanted to expose this same functionality in a different UI what do I not want to have to rewrite?”

If the View defines an interface exposing read write properties for all of the data needed, which it should, then the interface could be simply passed to the Presenter and the presenter handling the data mapping.   The View does not have to get involved.

The Model should also define an interface exposing read write properties for all of the data that it has.   The presenter deals with the Model and the View exclusively through these interfaces.   This results in code similar to this:

private void MapData()
{
    View.Field1 = Model.Field1;
    View.Field2 = Model.Field2;
    View.Field3 = Model.Field3;
    View.Field4 = Model.Field4;
    ...
    View.FieldN = Model.FieldN;
}

This is very straightforard.   All of the nuances of how to display the mapped data is handled by the View as it should be.  All of the details for how to get the data is handled by the model as it should be and the Presenter is left tieing the two together.

But this looks like tedious code.   Tedious code is error prone code and code that we would like to avoid.   If both interfaces,the one exposed for the View and the one exposed for the Model, require the same set of properties, then surely we can use reflection to handle writing this tedious code for us.

Consider this code:

public static void CopyTo(object sourceObject, object targetObject)
{
    object value = new object();
    object[] param = new object[0];

    foreach (PropertyInfo propertyInfo in sourceObject.GetType().GetProperties())
    {
        MethodInfo getMethod = propertyInfo.GetGetMethod();
        // Ensure that there is a get method for the source object.
        System.Reflection.PropertyInfo targetPropertyInfo =
             targetObject.GetType().GetProperty(propertyInfo.Name);
        if ((getMethod != null) && (targetPropertyInfo != null))
        {
            // Ensure that there is a set method in the target object.
            if (targetPropertyInfo.CanWrite )
            {
                value = getMethod.Invoke(sourceObject, param);
                targetPropertyInfo.SetValue(targetObject,
                  ConvertValueType(value, propertyInfo.PropertyType),
                                             null);
            }
        }
    }
}

This function expects two objects to be passed in.  We loop through the properties of the first object looking for matching properties in the second object.   If the second object has a matching property and the second object’s property is not read only, we will get the value from the first objest’s property and set it to the second object’s property.   The only thing that is required is that the any properties tha the two objects have in common must be of the same type.

Armed with such as CopyTo method, our MapData method from earlier can be reduced to:

public void MapData()
{
    CopyTo(Model, View);
}

Regardless of how many properties are involved this one call will handle the data mapping.   All properties that are in common will be copied.   Properties in the Model that are not in the View will be ignored.   Properties that are in the View that are not in the Model will be uninitialized.

Additionally, a generic update method might look similar to this:

public void Update()
{
    Model = Model.GetCurrent();
    CopyTo(View, Model);
    Model.Save();
}
Here we make sure that we have the current model data.  We then overwrite any common properties with the values currently in the View.   Finally, we save the Model.   This works regardless of the number of properties involved.   Properties in the Model that are not in the View will not be affected.  Properties in the View that are not in the Model will not be persisted.

The presenter makes no assumptions about how the Model gets its current data or what happens when the Save is called.   The presenter makes no assumptions about how the View displays its data.   The seperation between the UI, business logic, and data storage is maintained.   Common, redundnat, potentially error prone data mapping logic is moved out of all three components into a reusable library component called by the presenter.

Teasing apart Business Logic from Data Access Logic


In an earlier post, we started with logic similar to this:

  IDataReader dr;
    try
    {
        dr = DAL.GetSomeData(filter);                                                                                
        while (dr.Read())
        {
          if ( dr["field"] != System.DBNull.Value)
            {
                DateTime convertedDate = Convert.ToDateTime(dr["field"].ToString());
            }

         if ( dr["nextField"] != System.DBNull.Value)
            {
                Double convertedDouble = Convert.ToDouble(dr["nextField"].ToString());
            }
              . . . .

  catch (Exception ex)
    {
        // Handle exception
    }
    finally
    {
        dr = null;
    }
}

And showed how to reduce the Complexity to something similar to this:

IDataReader dr;
    try
    {
        dr = DAL.GetSomeData(filter);                                                                                
        while (dr.Read())
        {
            DateTime convertedDate = (DateTime) RetrieveDataFromDataReader (dr, typeof (DateTime), “field”);
            Double convertedDouble = (Double) RetrieveDataFromDataReader (dr, typeof (Double), “nextField”);
           
              . . . .

         }

   }

  catch (Exception ex)
    {
        // Handle exception
    }
    finally
    {
        dr = null;
    }
}

This is good in that it reduces the Complexity to be a constant instead of making it dependent on the number of fields being retrieved.   We are also doing something nice worth pointing out.   By using IDataReader instead of SQLDataReader or OracleDataReader, we have removed a direct tie to a particular database platform.

The problem comes in when business logic is dependent on the field names.   This dependency in essence decentralizes the data mapping logic.  Also, there is no compiler support for getting the field names right.    If you get the name of the field wrong, you get a run time error instead of a compile time error.   We prefer compile time errors.   If the name of a field changes for any reason and this mapping logic is embedded with business logic, then there are multiple places that may need to be changed.

A better approach is to encapsulate this data mapping logic in a proxy object.   Business logic is then written against this proxy object and can be oblivious to any data mapping changes.   We bragged earlier that using an IDataReader we removed a direct dependency to SQLDataReader or OracleDataReader.   By using a proxy object, we remove ourselves from the dependency that a database is even used.

If the proxy object exposes a strongly typed property for every field in the underlying data source, then the business logic can be implemented against them and never have to worry about data mapping logic or data type conversion logic.   If the underlying data source changes, only the proxy object has to change.   The business logic that was written against the proxy object stays the same.

With such a strong separation between the data access logic and business logic, the business logic won’t get lost in the “noise” of the data access logic.    The mechanics of data type conversion and data mapping can be safely hidden away allowing the business logic to focus on business rules.    This will make the business logic easier to follow, less likely to include bugs, and be more easily maintained.

Reducing Software Complexity

In an earlier blog, we discussed the need to reduce complexity and some of the problems that complex code can cause.  We even discussed a reliable quantifiable metric for evaluating complexity.

For review, we want to use the CodeMetrics add-in to Reflector to track Cyclomatic Complexity and target methods that have a complexity greater than 10.

One common source for elevated complexity is leaving artifacts from the data access in the business logic.   We are probably all familiar with code that looks like this:

  IDataReader dr;
    try
    {
        dr = DAL.GetSomeData(filter);                                                                                
        while (dr.Read())
        {
          if ( dr["field"] != System.DBNull.Value)
            {
                DateTime convertedDate = Convert.ToDateTime(dr["field"].ToString());
            }

         if ( dr["nextField"] != System.DBNull.Value)
            {
                Double convertedDouble = Convert.ToDouble(dr["nextField"].ToString());
            }
              . . . .

  catch (Exception ex)
    {
        // Handle exception
    }
    finally
    {
        dr = null;
    }
}

Here we have a common scenario of reading data from a data reader and verifying that the columns are not null.   If they are not null, we convert the field to a strongly typed variable and then proceeded to implement business logic with the strongly typed variables.   As you can see from the above code fragment, we are dealing with two fields from the database and we already have a complexity of at least 6.   There is a direct correlation between the number of fields being processed and the complexity of the method.   We cannot have a linear relationship between complexity and the number of fields that we are referencing and expect to keep complexity below 10.   A better approach is needed.

A better way might mean adding a function similar to this:

        public object RetrieveDataFromDataReader(IDataReader dr, Type dataType, string whichField)
        {
            if (dr[whichField] == System.DBNull.Value)
            {
                return GetNullRepresentationValue(dataType);
            }
            else
            {
                return Convert.ChangeType(dr[whichField], dataType);
            }

        }

Armed with such a method, we can now rewrite our original method tobe something like:

  IDataReader dr;
    try
    {
        dr = DAL.GetSomeData(filter);                                                                                
        while (dr.Read())
        {
            DateTime convertedDate = (DateTime) RetrieveDataFromDataReader (dr, typeof (DateTime), “field”);
            Double convertedDouble = (Double)  RetrieveDataFromDataReader (dr, typeof (Double), “nextField”);
           
              . . . .

         }

   }

  catch (Exception ex)
    {
        // Handle exception
    }
    finally
    {
        dr = null;
    }
}

This new version should have a constant complexity regardless of the number of fields introduced.   This is a great improvement.

Now, there most likely is still business logic embedded in the data access logic.  This is also bad, but it is not a problem made apparent through a strict review of code complexity metrics.

Next time, we will review why it is bad to merge the data access logic and business logic and review some easy ways to tease apart these interdependencies.

Your project might be in trouble if …


I recently worked on a project that failed.   There is no other way to put it.   The project failed!    It’s a hard thing to admit.

There are many reasons why bad things happen to good projects.   There are many reasons why despite our best intentions, not every project is successful.    There is always plenty of blame to go around.   The business market changed.    The technology changed.   We could not find the right resources.   Critical resources left.   The project lost funding.    The business lost interest…

Many of these reasons can be boiled down to, IT didn’t know what the business needed and the business didn’t know what IT was doing.   In other words requirements were not well documented.

From my recent experiences, I have come up with the following red flags to help you identify when your project might be in trouble

Red flags

·        If you have never met your subject matter  experts, your project might be in trouble
·        If there are no subject matter experts, your project might be in trouble
·        If new requirements are deemed critical but ignored for the sake of the schedule, your project might be in trouble
·        If a user that you have never heard of claims to be a project sponsor, your project might be in trouble
·        If the requirements change so much from one week to the next that you feel like it’s a different project, your project might be in trouble
·        If new people join the team and its weeks before you even meet them, the project might be in trouble
·        If you look around and there is no one to QA your code, your project might be in trouble
·        If you spend more time deploying than you do coding, your project might be in trouble
·        If the same project has failed over three times already, your project might in trouble
·        If no one can succinctly state the business needs being solved, your project might be in trouble
·        If you have already deployed once, but still second guessing the platform, your project might be in trouble
·        If you can submit last month’s status report for this month and no one notices, your project might be in trouble
·        If you can’t tell last month’s status report from this month’s, your project might be in trouble
·        If you have worked on the project for over a month and still don’t know where the database is, your project might be in trouble
Guidelines:

·        Write requirements down
·        Don’t get bogged down in format or structure just content
·        Get consensus that what you are working on is what needs working on
·        Talk to as many users are you can.   They rarely will agree with each other
·        Deploy new functionality as often as you can so that the users don’t worry that you have forgotten about them
·        Keep deployment schedules spread out enough so that you don’t get bogged down in deployment overhead
What red flags have you seen?   What are some guidelines that you have seen work?

Scared of Complex Technologies? Embrace Them!

Many people seem to be intimidated by some of the most exciting programming features in DotNet.   Reflection comes to mind.

I once did a little client training and on the first day was lectured about this “Reflection Crap.”   We often fear what we don’t understand.  We often avoid technologies that we don’t have to use and cling to more familiar technologies and techniques.  It’s no fun admitting what we don’t know, and it is often faster to do something the old familiar way than it is to use new unfamiliar techniques, even when the new technique is better.

The result is that many people never use reflection or regular expressions, or design patterns or threads, or … until there is no other way to accomplish whatever task they are working on.

The result is that their first exposure to the new technique is pretty complicated and scary.   If the first time you use a regular expression is to handle data validation where nothing else would work, then your first regular expression will be a pretty tough one.   If the first time that you try to understand a piece of reflective logic is while production is down and you are tasked with untangling a failed reflective call, without a doubt you will be tempted to rip out the “reflection crap”.   If your first foray into multi-threaded program is when production is reporting concurrency issues that are costing the company money, you will hate threads.

If on the other hand, you play with regular expressions between projects and start with simple pattern matches and work your way up to more complex patterns, you will have no problem using a 20 character expression to parse a string instead of 40 lines of string manipulation logic code.

If you play with Reflection during design and see how easy it is to interrogate the metadata when you don’t have a business manager breathing down your neck, you will not be afraid to use reflection to loop through the properties of an object at runtime to initialize them to known safe values or loop through the methods of an object searching for new methods to run at.

Playing with these technologies when it is safe, gives you more confidence to be able to pull them out and show them off confidently when it may not be so safe, but when they are most useful.

What are you afraid of?   What were you afraid of that you know use with confidence?   What have you been putting off learning?

Reflecting on Software Metrics

At the risk of sounding pedantic, I like software metrics.   They can prove to be invaluable in analyzing source code.   To be clear, I am not proposing that developers be bonused based on metrics and their use in estimating and scheduling should be limited at best.  I am however a fan of software metrics as benchmarks for evaluating design and understanding how software works.

There is an add-in for reflector called CodeMetrics that allows you to easily calculate a wide range of code metrics from within Reflector.   We are most interested in one metric, Cyclomatic Complexity.

Cyclomatic Complexity or CC is the number of logical paths through a piece of code.   This can be used to define a lower bound for the number of test cases needed to ensure proper coverage in unit testing.   This can also be used to predict ongoing maintenance costs and be scaled to predict the likelihood of actually resolving a particular defect or introducing additional defects.

Tables similar to this are common:

CC
Testing Requirements
Ongoing Maintenance Costs
Resolution Likelihood
1-0 A simple procedure Low High
11-50 Complex Moderate to High Medium to low
>50 Very Difficult High Risk Low

 

The actual bands may vary depending on the line of business and skills of the developers involved, but these are generally good benchmarks.

A CC of 10 would indicate that 10 test cases are needed to fully test all of the functionality.  Depending on what you are working on, that may be more complex than you want to consider or that may still be fairly trivial.   As a general rule, I try to keep methods below 10.  

When I am working on existing functionality, I try to review the code and refactor the code in the same module with the highest complexity, systematically reducing the overall complexity.  Sometimes the best you can do is to review the more complex functions and add comments on how they may be refactored.   Often times, the actual refactoring may be out of scope for the change that got you involved in the first place.

It is important to review the complexity of any code before making code changes.   If you are about to change code that already has a high complexity, you want to make sure that your changes do not add more complexity.   You need to be aware that with higher complexities comes a higher chance of introducing new defects and you are also less likely to resolve the underlying defect if it is embedded in complex code.   This may change your estimates and should change your testing strategy.

Sometimes, complex code masks cohesion problems.  If the code is very complex, there is a good chance that the particular method is trying to do more than one thing.   If it is doing more than one thing, than simplifying the code may be nothing more than separating all of the different tasks that the method is trying to do.   For example, code is often complex because data validation logic is blended with business logic or data mapping logic is blended with business logic.    Simply pulling all of the data validation logic to a separate method and all of the data mapping logic to a separate method and isolating all of the business logic to a single method, you end up with several methods each of which is less complex than the original.   Each of these methods will be much more cohesive.   With cohesive methods, it is easier to know where to add new functionality or where to go when reviewing existing functionality.

Future blogs will explore some ways to reduce complexity.

 

 

Copyright © Nick Harrison