For the past few years I’ve been working with a fairly large ASP .NET web application. The app began its life as Classic ASP and was later ported to ASP .NET web forms written in VB .NET. Over the years the language preference of the development team shifted and we started finding ways to leverage C# alongside the existing VB .NET web forms project. This shift led to a collection of class libraries, Windows services, and assorted utilities written in C# that were built up around the aging VB .NET code base. We found ways to push a lot of code out of the VB .NET web forms project, but still found ourselves having to add or edit code in the VB .NET project often enough to make it painful. In addition, building our main solution containing the VB .NET web forms project and our various C# class libraries took forever, apparently because Visual Studio didn’t like our mixed-language projects. During our first annual DevCon (a yearly week-long meeting of our distributed team) in early 2011 we identified the presence of VB .NET as our primary source of pain on the development team and committed ourselves to eliminating it.
While the majority of this post will be focused on how the conversion was done I think it’s important to briefly talk about why we did it. Taking a large and stable codebase and changing the language it’s written in might seem foolhardy. At a past job I had explicitly decided against converting a much newer and smaller VB .NET application to C# based solely on the fact that it didn’t add any value for our users. I was working as a contractor at the time and couldn’t tell our client that we wanted to spend 2-3 weeks doing a “re-write” that wasn’t going to add any of the new features that were needed. At my current job, however, we strive to devote a portion of our development time each sprint toward eliminating developer pain. We understand that alleviating this pain may not immediately add value to our product, but it will increase productivity and reduce attrition in the future. Some teams use the term “technical debt” when referring to the painful areas of their codebase and recognize that, just like any debt, it’s most cost effective to pay it off quickly.
You could probably make the argument that programming language choice is not really technical debt, but technical debt or not we knew one thing: everyone hated working with VB .NET and that alone was reason enough to get rid of it.
Our VB .NET web forms project consisted of roughly 300,000 lines of code spread over about 600 different pages (.aspx files). Due to the sheer size of the project we opted to use an automated conversion tool to do the bulk of the work for us. There are a number of VB .NET –> C# converters out there, but we ultimately decided to use the VBConversions VB .NET to C# Converter because it offered the fastest, easiest, and highest quality conversion of any of the tools that we evaluated. That said, no automated conversion tool will produce 100% perfect results for all conversions and no one on the team was comfortable blindly trusting an automated tool, so we also had to manually review every converted file to clean up compiler errors and other incorrectly converted code that we found. The code review was followed by a regression test of the application before finally releasing it to the wild.
While we identified the elimination of VB .NET as a top technical debt priority at our DevCon in early 2011, we didn’t actually complete the C# conversion project until our DevCon in early 2012. While the project spanned roughly 1 year, it certainly didn’t take a year’s worth of developer hours to complete. Any time spent on the project in the months between the 2011 and 2012 DevCon meetings was spent in small increments (i.e. a few hours here and there) doing preparation work.
The preparation work for this project consisted mostly of identifying potential issues with the VB .NET code that would trip up the converter. The VBConversions tool will generate a list of suggested changes for the source VB .NET code that help it perform a more accurate conversion. In this way it encourages you to run a conversion, fix potential issues, and run the conversion again to see the updated results. Our initial pass at converting the VB .NET code yielded a C# project with well over 5,000 compiler errors. By the time we were ready to make the final conversion and begin our manual review process we were down to several hundred compiler errors. If we had taken more time with this step we likely would have ended up with an even cleaner conversion, but I don’t think that we would have ever had a 100% perfect result.
Prior to conversion we froze all other development efforts because we knew that we needed to focus solely on the conversion until it was completely finished before trying to pick up any new work. Trying to balance new development work with the conversion and the potential bug fixes that would be needed following the conversion would be too difficult, so we had to get buy-in from our product management to let us tackle this conversion effort while putting other work on hold for a while. Once the conversion process began the only changes going into source control were related to the conversion itself.
Because the entire development team would be in the same conference room at the same time for a week during our 2012 DevCon meeting we decided to finally pull the trigger and get the conversion done during that week. Over the weekend prior to the meeting I ran the final conversion and committed the converted C# project to source control alongside the VB .NET project. At this point we had code in source control that was known to be broken. This is normally a big no-no, but having everything in source control was going to make collaboration much easier. Also, we left all of the existing VB .NET code and build scripts untouched, meaning that all of our CI builds and QA deployment scripts continued to work even though we had a big pile of broken code in source control. We started the week with two Visual Studio solutions that were under the same top level folder in source control:
- Original Website Solution
  - Core Class Libraries and Services (already written in C#, left untouched during the conversion)
  - Original VB .NET Web Forms project (also left untouched during the conversion)
- C# Website Solution
  - Core Class Libraries and Services (same projects referenced in the original solution)
  - Converted C# Web Forms project (the converted project that needed to be reviewed)
Having both the original VB .NET and converted C# exist side-by-side in source control allowed us to open both projects for side-by-side comparison during the review process. We were also able to make commits of the cleaned up C# code after each file was reviewed.
We used a shared Google Docs spreadsheet during the conversion to help keep track of our progress. This spreadsheet had the following columns:
- File Path – Relative path and file name of the VB .NET code file (since the conversion produced a one-to-one mapping of code files we ended up with one row per file to be reviewed)
- Lines (approx) – Rough count of the lines of code in the VB .NET file (we came up with this count using a simple PowerShell script that you can find here: https://gist.github.com/1674457)
- Reviewer – As developers picked up files to review they would put their names in this column to ensure that we didn’t duplicate our efforts.
- Status – After items were reviewed we’d put ‘Reviewed’ in this column. We’d also use color highlighting in this column to indicate items that might need a second review or that were particularly troublesome to clean up.
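To give a feel for how a tracking sheet like this can be seeded, here is a minimal Python sketch. It is not the PowerShell script from the gist; the function names are hypothetical, and counting only non-blank lines is an assumption made for illustration. It walks a project folder, counts lines in each .vb file, and produces CSV text with the four columns described above, which could then be imported into a spreadsheet.

```python
import csv
import io
from pathlib import Path

# Column names taken from the tracking spreadsheet described above.
COLUMNS = ["File Path", "Lines (approx)", "Reviewer", "Status"]


def count_lines(path: Path) -> int:
    """Return the number of non-blank lines (an assumed definition of 'lines')."""
    with path.open(encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.strip())


def build_tracking_rows(project_root: Path) -> list[dict]:
    """Build one row per .vb file, with Reviewer and Status left empty."""
    rows = []
    for vb_file in sorted(project_root.rglob("*.vb")):
        rows.append({
            "File Path": vb_file.relative_to(project_root).as_posix(),
            "Lines (approx)": count_lines(vb_file),
            "Reviewer": "",
            "Status": "",
        })
    return rows


def rows_to_csv(rows: list[dict]) -> str:
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `rows_to_csv(build_tracking_rows(Path("path/to/website")))` (the path is a placeholder) yields text that can be pasted or imported into a shared sheet, after which reviewers fill in the last two columns by hand.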
We used the following general approach while reviewing each converted file:
- Check the ‘Conversion Issues’ output from the conversion tool for any issues reported with that file (this output was dumped into another shared Google Docs spreadsheet after the initial conversion was complete)
- Fix any compiler errors (ReSharper was very helpful as it was easy to see how many compiler errors were present in any given file).
- Force a regeneration of the .designer.cs files by making a minor no-op tweak (e.g. add a whitespace and then remove it) to the .aspx markup and saving it again.
- Use ReSharper to clean up any ‘using’ statements that were importing unneeded namespaces (the conversion pulled a ton of unneeded using statements into each code-behind file).
- If any bit of code seems odd, compare it with the original VB .NET.
Our DevCon meetings take a Monday-Friday schedule and we began the conversion review and cleanup on Monday morning. By Wednesday night we had the project compiling successfully and were able to update our build scripts to start pushing the newly converted code out to our QA machines for regression testing.
The regression testing effort turned into an all-company affair as we ended up enlisting members of the support and product management team in addition to developers and QA analysts to get it all done. Even with the added resources, however, it wasn’t feasible to fully regression test all 600 pages in the application. We did some quick brainstorming and came up with a prioritized list of the pages and services exposed by the web application that would need the most rigorous regression testing. We used two pretty simple criteria to determine what areas of the application needed the most attention:
- Does the page/service alter data or just display it? (i.e. is there a potential for data corruption?)
- How often does the page get used? (Google Analytics, IIS logs, and some home-grown analytics tools helped us answer this question.)
Areas of the application that both alter data and are used a lot got the most attention from our QA team, while those that only read data but are also used frequently were the second highest priority. Areas of the site that were not used very frequently and had low risk for data corruption were only “smoke-tested” to ensure that they executed without throwing any errors. We added a couple of columns to the shared Google Docs spreadsheet that we used to review each file to keep track of files that had been tested and whether or not any issues had been found. One nice side effect of this exercise was the identification of a few pages in our website that were not being used anymore and could be deleted.
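The two criteria above boil down to a small decision table, which can be sketched in Python as follows. Everything named here is hypothetical: the `Endpoint` fields, the `heavy_use_threshold` value, and the page paths are invented for illustration, and we did this triage by hand rather than with a script. The post also doesn't say how pages that alter data but are rarely used were ranked, so placing them in the second tier is purely an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    path: str
    alters_data: bool   # criterion 1: could a bug here corrupt data?
    monthly_hits: int   # criterion 2: usage, e.g. from analytics or IIS logs


def regression_priority(endpoint: Endpoint, heavy_use_threshold: int = 1000) -> int:
    """Return 1 (most rigorous testing), 2, or 3 (smoke test only)."""
    heavily_used = endpoint.monthly_hits >= heavy_use_threshold
    if endpoint.alters_data and heavily_used:
        return 1
    if heavily_used:
        return 2  # read-only but used frequently
    if endpoint.alters_data:
        return 2  # ASSUMPTION: rarely used but can alter data; not covered in the post
    return 3      # rarely used and low risk: smoke test only


def prioritize(endpoints: list[Endpoint]) -> list[Endpoint]:
    """Order endpoints by priority tier, then by descending usage within a tier."""
    return sorted(endpoints, key=lambda e: (regression_priority(e), -e.monthly_hits))
```

An endpoint with zero hits over a long enough window lands at the bottom of the list, which is also how unused pages that are candidates for deletion tend to surface.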
One of the biggest challenges we had during the test effort was figuring out how to test certain areas of the code that are not easily accessible. For example, code-behind files for shared user controls (.ascx) had to be converted and reviewed, but our QA team didn’t necessarily know what pages those controls were used on or how to exercise their code. In some cases a user control might only appear when a certain configuration is enabled and that required that a developer work with them to figure out whether or not each bit of shared code had been exercised adequately.
The regression testing effort ended up spilling over into the week following the DevCon meeting, but we were ready to deploy the converted code by the middle of the week following.
We knew that we hadn’t been able to fully regression test every bit of the converted code and were certain that issues would crop up once the code was out and being used in our production environment. We maintain a production environment that many of our customers use in a SaaS model and several of our customers self-host our application on their own hardware. Our production environment has a load-balanced web farm which we decided to use to our advantage for the initial rollout of the converted code.
Once our regression testing was finished we created a release branch of the converted code just like we would for any other normal deployment. We then took some of the servers in the farm out of the load balancer rotation and modified our deployment scripts to push the converted code from the release branch to those now offline web servers. Once the deployment was done, we swapped them back into the load-balancer rotation and removed the ones with the older VB .NET code.
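The swap described above can be modeled with a toy Python class. This is only a sketch of the idea, not our deployment scripts: the `WebFarm` class, the server names, and the version labels are all hypothetical, and a real rollout would drive the load balancer through its own tooling. The model tracks which servers take traffic, deploys only to servers that are out of rotation, and then exchanges the two pools.

```python
from dataclasses import dataclass, field


@dataclass
class WebFarm:
    """Toy model: which servers take traffic, and which build each one runs."""
    in_rotation: set[str]
    versions: dict[str, str]
    offline: set[str] = field(default_factory=set)

    def take_offline(self, servers: set[str]) -> None:
        """Remove some (never all) servers from the load balancer rotation."""
        servers = set(servers)
        if not servers <= self.in_rotation:
            raise ValueError("can only remove servers that are currently in rotation")
        if servers == self.in_rotation:
            raise ValueError("refusing to take every server out of rotation")
        self.in_rotation = self.in_rotation - servers
        self.offline = self.offline | servers

    def deploy(self, version: str) -> None:
        """Push a build to the offline servers only; live servers are untouched."""
        for server in self.offline:
            self.versions[server] = version

    def swap(self) -> None:
        """Exchange the live and offline pools."""
        if not self.offline:
            raise ValueError("no offline servers to swap in")
        self.in_rotation, self.offline = self.offline, self.in_rotation

    def live_versions(self) -> set[str]:
        """Return the set of builds currently serving traffic."""
        return {self.versions[server] for server in self.in_rotation}


# Hypothetical four-server farm, all initially running the VB .NET build.
servers = ["web1", "web2", "web3", "web4"]
farm = WebFarm(in_rotation=set(servers), versions={s: "vb" for s in servers})
farm.take_offline({"web3", "web4"})
farm.deploy("csharp")
farm.swap()  # web3 and web4 now serve the converted code
```

Because `swap` is symmetric, calling it a second time puts the untouched servers with the older build back in rotation, which is what makes this arrangement convenient when a problem shows up after release.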
When issues were found and reported, we were able to quickly swap out the servers and get the older reliable VB .NET code running in production again while we patched the code in the release branch, re-deployed to the offline servers and swapped them back in. I don’t recall the exact numbers, but I don’t think we had to do this more than 5-6 times. Because we had created a release branch we were able to let most of the development team get back to new development work against trunk while a portion of the team focused on fixing any issues that cropped up with the conversion in the release branch. Any fixes that were made in the release branch were merged back to trunk. By the end of the week following our DevCon the conversion project was officially finished and the entire team returned to our normal flow of development work. I think we still fielded a few minor bugs related to the conversion here and there over the next couple of weeks, but most of the issues shook out within the first week.
I have a few key takeaways from this effort:
- We had no idea how long this would take us before we started and we didn’t bother trying to do much in the way of estimating. Instead, we just committed ourselves to getting it finished and powered through for as long as it took. We had to make some adjustments along the way (e.g. pulling in the support team to help with testing and only keeping 2-3 developers to fix the conversion issues after the initial rollout), but I think it all came together as well as it could schedule-wise.
- I wish I had done more to track how many hours we spent on the conversion project. I could estimate based on my (likely faulty) recollection of the project, but it would be nice now to have a feel for how much the conversion really cost us in terms of hours.
- The process of identifying every endpoint in our web application and, at a minimum, smoke-testing each one is an excellent housekeeping exercise. We identified a few areas of our application that are not used anymore and identified a few long-standing bugs that were not related to the conversion effort at all. I’d like to have us go through this process again, maybe once every 2 years or so. Also, by enlisting the majority of the company in this effort we ended up exposing some folks to areas of the app that they normally never see and ended up sharing a lot of valuable application-specific knowledge with each other.
- I don’t think we could have done this as quickly as we did if we didn’t have the entire team in the same room for a week. I might even go a step further and say that we might not have even been able to pull it off at all.
- I think that the collective confidence of the team in a successful outcome for this project ebbed and flowed quite a bit throughout the project. There were a few occasions when we encountered a very odd issue that gave us pause and made us wonder if we were in for a disaster. Having a boss who really understood the value of the project and encouraged us to press on and complete it despite the bumps in the road was crucial to the success of the project.
- Because I did the initial conversion and committed it to source control many of the files in our website project now only have a single commit in their history (with my name on it). We moved the original VB .NET project to an “archive” location in our source control tree so we can still go digging and find earlier revision history for a file if we need to, but until then all questions of, “who last worked on this file” usually end up being answered with my name. If I were to do this over again I might investigate a method to migrate the commit history of each of the original .vb files to their .cs counterparts, but at this point it’s not worth the effort.
Within a few months of finishing the conversion project, it was hard to imagine that we used to have so much VB .NET in our application. By doing this conversion we were able to remove the “VB .NET (but we’re phasing it out)” bullet point from our developer job postings and reduced the various VB .NET related grumblings in our team chat logs by 100%. It was a big job, but I think it was worth it in the end.