Dylan Smith

ALM / Architecture / TFS


Sunday, January 27, 2013

For probably over a year now I’ve been hearing lots of hype around package managers, and NuGet in particular.  I never really “got it” until last week.  So what, NuGet will download the NHibernate assemblies for me.  I can do that myself easily enough, so why on earth do I need a specialized tool for it?!  But it will download not only NHibernate, but all of NHibernate’s dependencies too!  Big deal, that’s never been an issue for me before; these third-party packages usually come with all the necessary dependencies (NHibernate includes ANTLR, Moq includes Castle, etc).  I challenged a couple of people I respect to convince me that I need NuGet, and despite their best efforts I was never convinced.

Last week I was working with a client who has multiple teams working on multiple components/products in parallel, all of which get released at once as part of one big release.  The teams write code that depends on components owned by other teams.  Each component evolves independently of the others, but eventually they all need to arrive at a final version that works with all the other final versions that make up the release.  I like to compare it to the TFS team at Microsoft.  When they were developing TFS 2012, they wanted it to work with SQL 2012, Visual Studio 2012, and .NET 4.5, none of which actually existed at the time (they were all still under development too).  Developing against a dependency that is also a moving target presents a number of problems, and it turns out NuGet can be a tremendous help here.

The problem is that there are many projects (let’s call them packages) with interdependencies between them, potentially developed by different teams on different release cycles. The challenge is that we want to ship an updated product that contains updated versions of many of the packages, and we need to be confident that they all work well together.

The common approach is to treat each package as a separate project, and to treat any dependency on another project as an external dependency (just like a third-party dependency such as log4net). This is usually handled by having a lib folder within your source tree and checking in the binaries for the external dependencies. This allows each project team to make an explicit decision about which version of each external dependency they develop against, and to choose if and when to update to the latest version, which reduces disruption to their development cycle.

Another important practice: if a team is developing a package that other teams depend upon, they often want to control which versions of their package are available for other teams to consume. They don’t want every check-in to produce a build that other teams can consume, because these builds will often contain half-finished features. Typically a team will want to set a higher quality bar for the builds they share with the rest of the world. This is usually handled by using a DEV branch and a MAIN branch. The quality standard for code to get into the MAIN branch is higher (no half-finished features), and every MAIN build is available for other teams to consume. Typically a team will update MAIN at the end of each Sprint.

There is another, more subtle problem that becomes more significant as the number of packages, and the dependency graph between them, grows. Let’s look at an example:

[Figure: Dependency Example 1. A depends on B and C; B and C each depend on D.]

Imagine that all 4 packages are at 1.0 to start with, and that we belong to the dev team for A. We are working towards shipping 2.0 of our product, which will include updated versions of all 4 packages, with 4 teams each working on a different package.

Our source tree for A contains a lib folder with sub-folders for B and C. And since B and C each depend on D we can either have a separate sub-folder for D and reference it directly from A, or we can include copies of the D binaries in the lib sub-folders for both B and C. Let’s assume that we do the latter, and both the B and the C sub-folders contain the binaries for D.

Team D finishes its work and makes 2.0 available in its MAIN branch. Team B immediately updates their project to pull in the D-2.0 binaries and updates their code to work with it (B-2.0). However, Team C has not yet pulled in the updated D-2.0 binaries (and possibly won’t for a while). As Team A developers we want to pull in the updated B package so we can recompile against it and ensure our A code still works properly. However, we also depend on C, which is still 1.0 and depends on D-1.0. So which version of the D binaries should we use? We can’t have both D-1.0 and D-2.0. If we have D-1.0 it will potentially break B; if we have D-2.0 it will potentially break C. What are we to do? It’s versioning hell!
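To make the conflict concrete, here is a minimal Python sketch of the scenario. The `PINS` table and the `find_conflicts` helper are made up for illustration (they are not part of NuGet or any real tool); they just record which exact version each package was built against and report any package that gets demanded at more than one version.

```python
from collections import defaultdict

# Exact-version pins from the scenario above (illustrative data only):
# each package lists the single version of each dependency it was built against.
PINS = {
    "A": {"B": "2.0", "C": "1.0"},
    "B": {"D": "2.0"},
    "C": {"D": "1.0"},
    "D": {},
}

def find_conflicts(root, pins):
    """Return {package: sorted versions} for every package that is pinned
    to more than one exact version somewhere below `root`."""
    demanded = defaultdict(set)
    stack, seen = [root], set()
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        for dep, version in pins[pkg].items():
            demanded[dep].add(version)
            stack.append(dep)
    return {dep: sorted(versions)
            for dep, versions in demanded.items() if len(versions) > 1}

print(find_conflicts("A", PINS))  # {'D': ['1.0', '2.0']}
```

With exact pins there is no single D that keeps both B and C happy, which is exactly the dead end described above.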

The way teams deal with this is to use versioning policies (either explicitly or implicitly). Rather than saying B depends on D-2.0 and C depends on D-1.0, what you need are versioning policies that say B depends on D-2.0 *and up*, and C depends on D-1.0 and up. This way the above scenario can be made to work by using D-2.0. In order to implement this “versioning policy” there are a few options:

1. Update the B and C csproj files so they don’t demand a specific version and don’t reference the strong name of the D assembly. Then include the D-2.0 binaries in your A source tree.

2. Leave the csproj files alone, and instead introduce a “binding redirect” that indicates that anything that references D-1.0 should instead use D-2.0. This can be configured in the app.config/web.config.
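Whichever option you use, the policy itself boils down to a simple rule: find a version of D that is at or above every stated minimum. Here is a rough Python sketch of that rule. The `parse` and `pick_version` helpers are hypothetical names, and picking the lowest qualifying version is a choice made for this illustration, not a statement about what any particular tool does.

```python
def parse(version):
    """Turn '2.0' into (2, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def pick_version(minimums, available):
    """Pick the lowest available version that satisfies every 'X and up'
    policy in `minimums`. Returns None if no available version qualifies."""
    floor = max(parse(m) for m in minimums)
    candidates = [v for v in available if parse(v) >= floor]
    return min(candidates, key=parse) if candidates else None

# B requires D-2.0 and up, C requires D-1.0 and up.
print(pick_version(minimums=["2.0", "1.0"], available=["1.0", "2.0"]))  # 2.0
```

Note that the policy only promises that the versions line up on paper. It is still up to Team C to make sure their code actually works against D-2.0.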

Using a package manager such as NuGet can automate a lot of this work for you. Instead of having to manually walk the dependency graph to figure out which version of D binaries you require (transitive dependencies), NuGet will do this for you and automatically download all the necessary binaries for both direct and transitive dependencies according to the various versioning policies specified.  Then you simply check them in to your lib folder (usually called packages when using NuGet instead of lib).  NuGet not only walks the dependency graph to figure out the appropriate versions of all binaries you require, but it will create the appropriate binding redirects for you in your config file also.
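As a rough idea of what "walking the dependency graph" involves, here is a simplified Python sketch. It is not NuGet's actual algorithm: the `FEED` table, `lowest_at_least`, and `resolve` are all hypothetical, the sketch only understands "this version and up" policies, and it never retracts a constraint once a package has been upgraded.

```python
# Hypothetical private feed: (package, version) -> {dependency: minimum version}.
FEED = {
    ("B", (1, 0)): {"D": (1, 0)},
    ("B", (2, 0)): {"D": (2, 0)},
    ("C", (1, 0)): {"D": (1, 0)},
    ("D", (1, 0)): {},
    ("D", (2, 0)): {},
}

def lowest_at_least(name, minimum, feed):
    """Lowest version of `name` in the feed that is >= `minimum`."""
    versions = sorted(v for (n, v) in feed if n == name and v >= minimum)
    if not versions:
        raise LookupError(f"no version of {name} satisfies >= {minimum}")
    return versions[0]

def resolve(direct, feed):
    """direct: {package: minimum version} for the root project.
    Returns {package: chosen version} covering direct and transitive dependencies."""
    chosen = {}
    pending = list(direct.items())
    while pending:
        name, minimum = pending.pop()
        version = lowest_at_least(name, minimum, feed)
        if name in chosen and chosen[name] >= version:
            continue  # already satisfied by an equal or newer choice
        chosen[name] = version
        # Queue the dependencies of the version we just picked.
        pending.extend(feed[(name, version)].items())
    return chosen

# Team A asks for B-2.0 and up, and C-1.0 and up.
packages = resolve({"B": (2, 0), "C": (1, 0)}, FEED)
```

For the example above, `packages` ends up with B at 2.0, C at 1.0, and D at 2.0, even though A never mentioned D directly.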

All you need to do is create some automated builds on the various MAIN branches that will publish each package to your own private NuGet server.  Now when a team wishes to update their dependencies they simply update the versioning policy to specify a newer version number, let NuGet do the necessary work, and check in any updated binaries.

Note: Everything above assumes that there are no cycles in your dependency graph. If there are, you’re basically screwed.  Refactor to eliminate cycles immediately.
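If you are not sure whether your graph has cycles, a depth-first search will tell you. Here is a small Python sketch; `find_cycle` and the graph dictionaries are made up for illustration, with each package mapped to the list of packages it depends on.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of package names
    (first name repeated at the end), or None if the graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {node: WHITE for node in graph}
    path = []

    def visit(node):
        state[node] = GREY
        path.append(node)
        for dep in graph.get(node, ()):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path: we found a cycle.
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for node in graph:
        if state[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None

diamond = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
broken = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["A"]}
print(find_cycle(diamond))  # None
print(find_cycle(broken))   # ['A', 'B', 'D', 'A']
```

Something like this could run as a check in the build, so a cycle gets caught the moment someone introduces it.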