Dylan Smith

ALM / Architecture / TFS



Wednesday, December 07, 2011 #

PrairieDevCon is a great conference hosted in Canada a few times a year.  For the first time it’s coming to Calgary in March and I couldn’t be more excited.  I’ve participated as a speaker in every PrDC to date (2 in Regina, 1 in Winnipeg), and that streak will continue into Calgary.

In addition to the 2 conference sessions I’ll be doing:

  • Why do we Suck at Estimating? And How to Get Better
  • Evolve Your Code: Fundamental Design Principles

I’m also doing a full-day Pre-Con Workshop on TFS Build.  I plan to walk through the process of creating an “enterprise-class” build from scratch, exploring along the way what the heck an enterprise-class build even means.  What type of things should you be doing in your automated build/deploy process?  How do you customize and tweak TFS Build to make it dance just the way you want?  We’ll get down into the guts of TFS Build, and explore a lot of technology-agnostic build best practices along the way.  I can’t wait!

Here are the full abstracts from the conference site: http://www.prairiedevcon.com/Workshops

Creating Powerful Build and Deploy Processes with TFS Build

In this day-long session we’re going to focus on unleashing the power of TFS Build to create a powerful, fully featured automated build and deploy for a sample application. We’re going to get down into the guts of the tool, including an in-depth look at customizing Build workflows and creating custom build activities, and most importantly discuss a lot of best practices around creating valuable build and deploy processes. We’ll also get a chance to bring TFS Lab Management into the mix, set up some virtual testing environments, and extend our build processes to automatically deploy to the various environments.

Some of the common questions that I hear that we’ll be covering are:

  • What’s the role of MSBuild now that TFS Build is Windows Workflow based? When should I be using MSBuild vs Workflow?
  • How do I test custom build activities?
  • How do I auto-version my assemblies via my build process?
  • How do I auto-deploy my application to a testing environment as part of my build?
  • How do I upgrade my old TFS 2008 Builds?
  • My build takes forever, what can I do?
  • What type of things should I do in a CI Build vs Nightly Build?

At the end of the day you’ll have a solid understanding of how TFS Build works, how to deploy and configure it in your organization, and how to create customized builds to meet your team’s needs. Not only that, but you’ll hopefully have a new understanding of the immense power of a rich build system tailored to your project.

 

Why we Suck at Estimating (And How to get Better)

Few things are dreaded more than estimating how long it will take to develop software. Why are we so terrible at this activity that is an important part of *every* project we work on? In this session we’ll examine what the common problems with estimation are, and take a look at some strategies we can use to be more effective at estimating. The techniques discussed will improve your ability to estimate in all types of projects (Agile, Waterfall, etc). At the end of the session you’ll be able to immediately apply some of these practices to improve your estimates today, and approach estimates in the future with a different mindset to ensure you are providing the business with the information it requires.

 

Evolve Your Code: Fundamental Design Principles

In this session I'm going to pretend I can travel back in time and give the rookie developer version of myself an hour of advice. This session will be that advice. I'm specifically focusing on advice that will help us write better code. I've written a lot of horrible code in my career, and learned a lot of lessons in the process. In my day job I also get the opportunity to look at a lot of other teams' code, and help them clean it up. Over my career I've gotten a pretty good idea of what I think the most important lessons are when learning to write "good" code. And here's a hint, it's not neatly captured in the SOLID set of principles that seem to be so popular at conferences these days. Expect a lot of examples and concrete advice that you can take away and start applying immediately to improve your code.


Friday, December 02, 2011 #

In the 2nd Code Clone result it’s pointing out a block of code that is duplicated across 5 different screens.

CloneCode

 

TLDR: Refactored the code out to an abstract ViewModel base class. Also identified a bunch of other obvious code that belonged in the base class and deleted a TON of duplicate code in the process (over 4% of Rawr’s total code!).

Net Lines Of Code Deleted: 4625 (!!!)

 

Each character class has a separate screen allowing the user to input stats for their character and then it performs some character-specific calculations.  The problem is there are some base stats that pretty much every character class uses, then a handful of character-specific stats that only some classes use.  This block of code takes the value of a bunch of checkboxes for these stats and builds an array out of them. And it is almost identical across many different character classes.

When I looked into it, I discovered that each character class has a separate options screen that looks very similar to those of every other character class.  This specific code relates to the Stat Graph section at the bottom:

Screenshot

This Stat Graph section is repeated across many separate Character Options screens, the only difference being slightly different stat checkboxes available depending on the character type.  My first thought was to wrap this up in its own User Control and have the character-specific code simply supply the list of stats it was interested in.

As I dug into the code a little more, I discovered that these Checkboxes are data-bound to a separate class (CalculationOptionsRogue in this case) that is basically acting as a ViewModel. So I figured I could achieve the goal I wanted (eliminate the duplicate code) with a lighter-weight refactoring than creating a whole new User Control (and having to get my hands dirty with XAML – blech!).  I first pushed the function that Code Clone identified down into the ViewModel (since the ViewModel has the properties that the Checkboxes are bound to, the function has access to all the data it needs).  Then I refactored out a base class from the ViewModel that contained properties for every character stat, and the BuildStatsList function that can be reused between subclass ViewModels.

As you can see in the below code, I also moved a couple of functions and interface implementations up to the base class since they were duplicated across all of the subclass ViewModels: ICalculationOptionBase.GetXml & INotifyPropertyChanged.OnPropertyChanged.  The end result was over 4500 net lines of code removed, over 4% of the total codebase of Rawr!

 

RefactoredCode
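For readers who can't see the screenshot, the shape of this refactoring can be sketched roughly as follows. This is a minimal Python illustration rather than the actual Rawr C# code: CalculationOptionsRogue and BuildStatsList are names from the post, while CalculationOptionsMage, the stat names, and the `checked` dictionary are made up for the example. It also lets each subclass declare its own stats instead of putting a property for every stat on the base class, which is a simplification.

```python
from abc import ABC, abstractmethod


class CalculationOptionsBase(ABC):
    """Hypothetical base ViewModel: one checkbox flag per stat."""

    def __init__(self):
        # One boolean per stat checkbox, all unchecked by default.
        self.checked = {stat: False for stat in self.available_stats()}

    @abstractmethod
    def available_stats(self):
        """Return the stat names this character class shows on its screen."""

    def build_stats_list(self):
        # Shared logic that used to be copy-pasted into every screen:
        # collect the stats whose checkbox is ticked, in declared order.
        return [s for s in self.available_stats() if self.checked[s]]


class CalculationOptionsRogue(CalculationOptionsBase):
    def available_stats(self):
        # Illustrative stat names only.
        return ["Strength", "Agility", "Attack Power", "Crit Rating"]


class CalculationOptionsMage(CalculationOptionsBase):
    def available_stats(self):
        return ["Intellect", "Spirit", "Crit Rating"]


rogue = CalculationOptionsRogue()
rogue.checked["Agility"] = True
rogue.checked["Crit Rating"] = True
print(rogue.build_stats_list())
```

Each subclass now only says which stats it cares about; the list-building logic lives in exactly one place.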


Monday, November 28, 2011 #

In this post we’ll take a look at the first result from the Code Clone Analysis, and do some refactoring to eliminate the duplication.  The first result indicated that it found an exact match repeated 14 times across the solution, with 18 lines of duplicated code in each of the 14 blocks.

 

Net Lines Of Code Deleted: 179

 

CloneResults

 

In this case the code in question was a bunch of classes representing the various Bosses.  Every Boss class has a constructor that initializes a whole bunch of properties of that boss, however, for most bosses a lot of these are simply set to 0’s.

 

ClonedCode

 

Every Boss class inherits from the class MultiDiffBoss, so I simply moved all the initialization of the various properties to the base class constructor, and left it up to the Boss subclasses to only set those that are different than the default values.

In this case there are actually 22 Boss subclasses, however, due to some inconsistencies in the code structure Code Clone only identified 14 of them as identical blocks.  Since I was in there refactoring the 14 identified already, it was pretty straightforward to identify the other 8 subclasses that had the same duplicated behavior and refactor those also.
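The idea of moving the default initialization into the base class constructor can be sketched like this. It is a simplified Python illustration, not the real Rawr code: MultiDiffBoss is the class named above, but ExampleBoss and all of the property names and values are hypothetical.

```python
class MultiDiffBoss:
    """Base class: sets every property to its common default exactly once."""

    def __init__(self):
        # Hypothetical properties; most bosses leave these at zero.
        self.name = "Generic Boss"
        self.health = 0
        self.berserk_timer = 0
        self.speed_kill_timer = 0
        self.in_back_per_melee = 0.0


class ExampleBoss(MultiDiffBoss):
    """A subclass only overrides what differs from the defaults."""

    def __init__(self):
        super().__init__()
        self.name = "Example Boss"
        self.health = 1_000_000


boss = ExampleBoss()
print(boss.name, boss.health, boss.berserk_timer)
```

The block of zero assignments that was repeated in every constructor disappears from the subclasses, and a new property only needs its default set in one place.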

 

Note: Code Clone Analysis is pretty slow right now.  It takes approx 1 min to build this solution, but it takes 9 mins to run Code Clone Analysis.  Personally, if the results are high quality I’m OK with it taking a long time to run since I don’t expect it’s something I would be running all that often.  However, it would be nice to be able to run it as part of a nightly build, but at this time I don’t believe it’s possible to run outside of Visual Studio due to a dependency on the meta-data available in the VS environment.


Friday, November 25, 2011 #

Code Clone Analysis is a cool new feature in Visual Studio 11 (vNext).  It analyzes all the code in your solution and attempts to identify blocks of code that are similar, and thus candidates for refactoring to eliminate the duplication.  The power lies in the fact that the blocks of code don't need to be identical for Code Clone to identify them: it will report Exact, Strong, Medium and Weak matches, indicating how similar the blocks of code in question are.

CodeCloneMenuItem   CodeCloneSimilarCode
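To make the idea of graded matches concrete, here is a toy sketch of bucketing two code blocks by similarity. To be clear, this is not how Visual Studio implements Code Clone Analysis; it is a naive line-based comparison using Python's difflib, and the thresholds (0.8, 0.6, 0.4) are arbitrary assumptions chosen only for illustration.

```python
from difflib import SequenceMatcher


def normalize(block):
    # Ignore indentation and blank lines so formatting differences don't count.
    return [line.strip() for line in block.splitlines() if line.strip()]


def classify(block_a, block_b):
    """Bucket two code blocks by line-level similarity (made-up thresholds)."""
    ratio = SequenceMatcher(None, normalize(block_a), normalize(block_b)).ratio()
    if ratio == 1.0:
        return "Exact"
    if ratio >= 0.8:
        return "Strong"
    if ratio >= 0.6:
        return "Medium"
    if ratio >= 0.4:
        return "Weak"
    return None  # too different to report as a clone


original = "a = 1\nb = 2\nc = 3\nd = 4\ne = 5\nf = 6"
tweaked = "a = 1\nb = 2\nc = 30\nd = 4\ne = 5\nf = 6"
print(classify(original, original))
print(classify(original, tweaked))
```

A real clone detector would presumably work on tokens or syntax trees rather than raw lines, so that renamed variables still match.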


People that know me know that I'm enthusiastic about both writing clean code, and taking old crappy code and making it suck less. So the possibilities for this feature have me pretty excited if it works well - and that's a big if that I'm hoping to explore over the next few blog posts.

I'm going to grab the source code for Rawr (a World Of Warcraft gear calculator engine) from CodePlex, run Code Clone Analysis against it, then go through the results one-by-one and refactor where appropriate, blogging along the way.  My goals with this blog series are twofold:

  1. Evaluate and demonstrate Code Clone Analysis
  2. Provide some concrete examples of refactoring code to eliminate duplication and improve the code-base

Here are the initial results:

CodeCloneResults

 

Code Clone Analysis has found:

  • 129 Exact Matches
  • 201 Strong Matches
  • 300 Medium Matches
  • 193 Weak Matches

Also indicated is a total of 45,181 potentially duplicated lines of code that could be eliminated through refactoring.  Considering the entire solution only has 109,763 lines of code, that would be roughly 41% of the codebase, which, if true, is pretty significant.

In the next post we’ll start examining some of the individual results and determine if they really do indicate a potential refactoring.


Thursday, October 20, 2011 #

Next month is Prairie Dev Con Winnipeg.  This is the first time the conference has come to Winnipeg and I'm really excited to be a part of it again.  I've spoken at both of the previous PrDC's in Regina, and have 2 slots at the Winnipeg conference.  The conference is Nov 21-22 and there are still registrations available, so if you haven't registered yet I encourage you to do so: http://www.prairiedevcon.com/

Here are some details on the 2 sessions I'm hosting:

Evolve Your Code: Fundamental Design Principles

In this session I'm going to pretend I can travel back in time and give the rookie developer version of myself an hour of advice.  This session will be that advice.  I'm specifically focusing on advice that will help us write better code.  I've written a lot of horrible code in my career, and learned a lot of lessons in the process.  In my day job I also get the opportunity to look at a lot of other teams' code, and help them clean it up.  Over my career I've gotten a pretty good idea of what I think the most important lessons are when learning to write "good" code.  And here's a hint, it's not neatly captured in the SOLID set of principles that seem to be so popular at conferences these days.

 

Database Change Management with Visual Studio

The Database Management functionality in Visual Studio (nicknamed Data Dude) has been around for about 3 years now, but I still see a lot of teams that aren't using it and often aren't even aware it exists.  For those of us that have used it, it's one of those technologies that from that point on we refuse to work without; it's just *that good*!  This session will be an introduction to the functionality available in Visual Studio with some hands-on demos.  Even for those that have used it before, there are probably some areas you might not be familiar with yet (e.g. Data Generation Plans, DB Unit Testing, etc).

 

I hope to see you all there!


Thursday, October 13, 2011 #

I was recently helping a client do some capacity planning for an upcoming TFS Lab Management deployment. I was being kind of lazy so I just sent them over the Capacity Planning Calculator spreadsheet that the great folks in the ALM Ranger team released with their LM guidance: http://ralabman.codeplex.com/

Less than a day later and they're calling me up telling me the spreadsheet doesn’t make any sense. So I fire it up, punch in some numbers for a fictional team, and sure enough the numbers it’s giving me don’t make any sense to me either. This blog post is my attempt to point out what I *think* are mistakes in the spreadsheet. I also created a modified version of the Capacity Planning Calculator that does the calculations I think are correct.

So first here’s my fictional scenario, and we’ll do the math by hand:

=====================================================

Let’s assume we have 3 Projects with the following team sizes:

  • P1 – 5 people
  • P2 – 3 people
  • P3 – 7 people

Each project will have 2 environments, a consolidated environment with all components on 1 VM, and a split environment with separate Web and DB Servers. Let’s assume that the Web/DB/Combined servers are each slightly different for each project so they can’t share VM or Environment templates [in the real world you would most likely be able to use your Web and DB VM Templates across most if not all environments, since a Web Server image is pretty much always the same].

  • VM1 – P1 Web
  • VM2 – P1 DB
  • VM3 – P1 Consolidated
  • VM4 – P2 Web
  • VM5 – P2 DB
  • VM6 – P2 Consolidated
  • VM7 – P3 Web
  • VM8 – P3 DB
  • VM9 – P3 Consolidated

 

  • E1 – P1 Consolidated (VM3)
  • E2 – P1 Web (VM1) + P1 DB (VM2)
  • E3 – P2 Consolidated (VM6)
  • E4 – P2 Web (VM4) + P2 DB (VM5)
  • E5 – P3 Consolidated (VM9)
  • E6 – P3 Web (VM7) + P3 DB (VM8)

Every team member will have a personal instance of the consolidated environment, the TFS Build will have its own consolidated environment, and we’ll say there are 2 instances of the split environment for each project (UAT, Demo, Staging, whatever).

All VM’s will be allocated 1 CPU. The Web Servers will get 2GB RAM, DB Servers 4GB, and Consolidated Servers 3GB.

Every VM will use 60GB disk, and every environment instance will have on average 4 snapshots.

 

First let’s calculate the host requirements:

  • P1
    • 5 Personal (5xE1 = 5xVM3)
    • 1 Build (1xE1 = 1xVM3)
    • 2 Other (2xE2 = 2xVM1 + 2xVM2)
  • P1 Totals = 6xVM3 + 2xVM1 + 2xVM2

 

  • P2
    • 3 Personal (3xE3 = 3xVM6)
    • 1 Build (1xE3 = 1xVM6)
    • 2 Other (2xE4 = 2xVM4 + 2xVM5)
  • P2 Totals = 4xVM6 + 2xVM4 + 2xVM5

 

  • P3
    • 7 Personal (7xE5 = 7xVM9)
    • 1 Build (1xE5 = 1xVM9)
    • 2 Other (2xE6 = 2xVM7 + 2xVM8)
  • P3 Totals = 8xVM9 + 2xVM7 + 2xVM8

 

  • Total = 18 Consolidated VM + 6 Web VM + 6 DB VM = 30 VM’s
  • Total RAM = 18x3GB + 6x2GB + 6x4GB = 90GB
  • Total CPU = 30
  • Total Disk = 60GB x 30 = 1800GB

 

The capacity planning calculator uses a rule of thumb of 10% additional disk for each snapshot. So we need to add 40% to our number resulting in 2520GB.

We should also account for the disk required to persist the RAM from each VM in case they were all paused, so that adds 90GB disk bringing us to 2610GB.

The capacity planning calculator uses some rules of thumb for how much CPU/RAM/Disk buffer you should budget for: 10% CPU, 10% RAM, 20% Disk. This brings our totals to:

  • Host CPU: 33
  • Host RAM: 99 GB
  • Host Disk: 3132 GB
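The host math above can be reproduced with a short script. This is just a sketch of the by-hand calculation for the fictional scenario, not the formulas from the Capacity Planning Calculator spreadsheet; all variable names and the add_percent helper are my own.

```python
def add_percent(value, percent):
    """Return value increased by a whole-number percentage."""
    return value * (100 + percent) / 100


team_sizes = {"P1": 5, "P2": 3, "P3": 7}
RAM_GB = {"consolidated": 3, "web": 2, "db": 4}
DISK_PER_VM_GB = 60
SNAPSHOTS_PER_ENVIRONMENT = 4
SNAPSHOT_DISK_PERCENT = 10          # rule of thumb: 10% extra disk per snapshot
SPLIT_ENVIRONMENTS_PER_PROJECT = 2  # e.g. UAT and Demo

vm_counts = {"consolidated": 0, "web": 0, "db": 0}
for people in team_sizes.values():
    vm_counts["consolidated"] += people + 1  # one per person, plus one for the build
    vm_counts["web"] += SPLIT_ENVIRONMENTS_PER_PROJECT
    vm_counts["db"] += SPLIT_ENVIRONMENTS_PER_PROJECT

total_vms = sum(vm_counts.values())  # also the CPU count: 1 CPU per VM
total_ram_gb = sum(vm_counts[kind] * RAM_GB[kind] for kind in vm_counts)
base_disk_gb = total_vms * DISK_PER_VM_GB
disk_with_snapshots_gb = add_percent(
    base_disk_gb, SNAPSHOTS_PER_ENVIRONMENT * SNAPSHOT_DISK_PERCENT
)
# Room to persist each VM's RAM to disk if every VM is paused at once.
disk_with_paused_ram_gb = disk_with_snapshots_gb + total_ram_gb

host_cpu = add_percent(total_vms, 10)
host_ram_gb = add_percent(total_ram_gb, 10)
host_disk_gb = add_percent(disk_with_paused_ram_gb, 20)

print(host_cpu, host_ram_gb, host_disk_gb)
```

Changing the team sizes or the per-VM numbers at the top is enough to rerun the estimate for a different scenario.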

 

The library should have enough disk to store all VM templates, all Environment Templates, and a copy of each running environment to support team members storing them to the library. (Note: In reality you probably wouldn’t need/want enough library disk to store all running environments to it, but I prefer to do my capacity planning math for the worst case scenario then tweak from there.)

We have 9 VM templates at 60GB each = 540 GB

We have 6 Environment templates containing 9 VM templates at 60 GB each = 540 GB

*** I’m not sure if the Environment templates actually require their own storage, or if an environment template is just some config data that says which VM templates go together to make an environment (and as I mentioned at the start, I’m by nature a pretty lazy guy so I’m not going to check right now). I think it’s the latter, but I’ll do my math as if it’s the former to be safe.

From above we know that all 30 VM’s disk + snapshot = 2520 GB

Adding these up gives 540 + 540 + 2520 = 3600 GB. If we assume the same disk space buffer of 20% this brings us to 4320 GB.

  • Library Disk: 4320 GB
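The library figure can be checked the same way. Again this is only a sketch of the worst-case arithmetic described above (including the cautious assumption that environment templates need their own storage), with variable names of my own choosing.

```python
DISK_PER_VM_GB = 60
DISK_BUFFER_PERCENT = 20

vm_template_count = 9
# VMs in each environment template E1..E6 (consolidated = 1, split = 2).
vms_per_environment_template = [1, 2, 1, 2, 1, 2]
# Disk plus snapshots for all 30 running VMs, from the host calculation.
running_environments_gb = 2520

vm_templates_gb = vm_template_count * DISK_PER_VM_GB
environment_templates_gb = sum(vms_per_environment_template) * DISK_PER_VM_GB
subtotal_gb = vm_templates_gb + environment_templates_gb + running_environments_gb
library_disk_gb = subtotal_gb * (100 + DISK_BUFFER_PERCENT) / 100

print(vm_templates_gb, environment_templates_gb, subtotal_gb, library_disk_gb)
```

If environment templates turn out to be just configuration data, setting environment_templates_gb to 0 shows how much the estimate drops.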

 

=====================================================

 

When I punch these numbers into the Capacity Planning Calculator I get very different results (specifically disk) of:

  • Host CPU: 33
  • Host RAM: 99 GB
  • Host Disk: 12247 GB
  • Library Disk: 1134 GB

 

Here is a copy of the original spreadsheet with these results.

 

Here is an updated version of the spreadsheet with what I believe to be the correct formulas.

 

If I remember correctly, the main changes I made were the way it calculates host storage required for snapshots, and added library storage space to allow storing running environments to the library.


Wednesday, March 30, 2011 #

I’m trying to set up TFS Lab Management on a new server and I ran into a really weird issue while configuring it. I figured I’d share the solution in case anybody else encounters it.

This was a brand new machine: I installed Windows Server 2008 R2 and all the Windows Updates, joined the machine to the domain, then started running through the Lab Management Install Guide: Configuring Lab Management for the First Time

I had a previously created Domain Account called TFSLAB, created specifically to be used by Lab Management.  I logged into the server as this account and installed Hyper-V, SCVMM Server, and SCVMM Admin Console.  I set up SCVMM to use TFSLAB as its service account.  At this point everything looked OK.

I remoted into my TFS Server, installed SCVMM Admin Console and fired up TFS Admin Console to try and configure Lab Management (logged in as my own personal domain account, which is a TFS Admin, a local admin on both the TFS box and the SCVMM box, and which I had made a SCVMM admin). This is where problems started to occur.

In the Lab Management Config Wizard (launched from TFS Admin) I enter the machine name of our SCVMM machine and click the handy Test button.  What I expect to happen here is it will connect to SCVMM and add the TFS Service Account (in this case a domain account called TFSSERVICE) as a SCVMM Admin.  I get prompted for credentials which have SCVMM Admin rights, which is a little strange as I’m logged in as my domain account which is already a SCVMM admin.  I try entering the TFSLAB credentials and it just keeps prompting me over and over for credentials.  When I eventually hit Cancel to put a stop to that madness it shows an error and won’t let me continue with the Configuration Wizard:

“TF260078: Team Foundation Server could not connect to the System Center Virtual Machine Manager Server: lab.mydomain.local. More information for administrator: You cannot contact the Virtual Machine Manager server. The credentials provided have insufficient privileges on lab.mydomain.local. (Error ID: 1605)”

After some investigation I discovered that I can’t launch the SCVMM Admin Console under any user account other than TFSLAB (regardless of whether I’m trying to do it directly on the SCVMM server or elsewhere).  It gives me an error about insufficient privileges:

“You cannot contact the Virtual Machine Manager server.  The credentials provided have insufficient privileges on localhost.  Ensure that your account has access to the Virtual Machine Manager server localhost, and then try the operation again.  ID: 1605”

SCVMM-Error_thumb1

At this point I was confused as heck, as my user account was clearly a SCVMM admin and I couldn’t figure out what was going on.  I figured I’d probably screwed something up during the install so wiped the SCVMM server, and started from scratch. A day later and I ended up in the exact same spot, so it ruled out any obvious stupidity on my part.

After working with Microsoft support, and manually examining network trace logs, we discovered that the SCVMM server (running under its Service Account: TFSLAB) is requesting a specific permission from Active Directory and getting denied.  We found a relevant KB article: KDC_ERR_C_PRINCIPAL_UNKNOWN Returned in S4U2Self Request

Don’t ask me what exactly is going on here, because we’re getting into low-level stuff that is over my head.  But my understanding is that the SCVMM Service Account (TFSLAB) is trying to do something as a different account (DSMITH, the account I’m trying to log in to SCVMM Admin Console as) and AD isn’t allowing it to do something on behalf of the other user account.

The resolution suggested in that KB article ended up resolving my issues: we had to get a Domain Admin to add the TFSLAB account to the Windows Authorization Access Group.  I restarted the service, and now I can log in to the SCVMM Admin Console as any user that has been set up as a SCVMM Admin, and the TFS Lab Management Configuration Wizard works properly.

 

Summary: Your SCVMM Service Account needs to be added to the Windows Authorization Access Group in Active Directory by a Domain Admin.


Thursday, June 10, 2010 #

Thanks to all the comments and feedback from the last post I think I have a better understanding now of the benefits of CQRS (separate from the benefits of Event Sourcing). I’m going to try and sum it up here, and point out some areas where I could still use some advice:

CQRS Benefits

  • Sounds like the primary benefit of CQRS as an architecture is it allows you to create a simpler domain model by sucking out everything related to queries. I can definitely see the benefit to this, in general the domain logic related to commands is the high-value behavior in the software, but the logic required to service the queries would add a lot of low-value “noise” to the domain model that would dilute the high-value (command) behavior – sorting, paging, filtering, pre-fetch paths, etc. Also the most appropriate domain structure for implementing commands might not be the most optimal for implementing queries. To paraphrase Greg, this usually results in a domain model that is mediocre at both, piss-poor at one, or more likely piss-poor at both commands and queries.

  • Not only will you be able to simplify your domain model by pulling out all the query logic, but at least a handful of commands in most systems will probably be “pass-through” type commands with little to no logic that just generate events. If these can be implemented directly in the command-handler and never touch the domain model, this allows you to slim down the domain model even more.

  • Also, if you were to do event sourcing without CQRS, you no longer have a database containing the current state (only the domain model would) which makes it difficult (or impossible) to support ad-hoc querying and/or reporting that is common in most business software.

  • Of course CQRS provides some great scalability benefits. Beyond scalability, I have to assume that it also provides extremely low latency for most operations, especially if you have an asynchronous event bus.

  • I know Greg says that you get a 3x scaling (Commands, Queries, Client) of your ability to perform parallel development, but IMHO, it seems like it only provides 1.5x scaling since even without CQRS you’re going to have your client loosely coupled to your domain - which is still a great benefit to be able to realize.
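As a rough illustration of the split described in these points, here is a minimal sketch of a command side that emits events and a query side that keeps a denormalized table up to date from them. It is a toy in-memory example under my own assumptions: the OrderPlaced event, the EventBus, and the SalesReadModel are all hypothetical names, and the bus is synchronous for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    customer_id: str
    amount: int


class EventBus:
    """Toy synchronous bus that also acts as the append-only event log."""

    def __init__(self):
        self.log = []
        self.subscribers = []

    def publish(self, event):
        self.log.append(event)
        for handler in self.subscribers:
            handler(event)


class SalesReadModel:
    """Query side: a denormalized 'table' maintained by event handlers."""

    def __init__(self):
        self.total_by_customer = {}

    def handle(self, event):
        if isinstance(event, OrderPlaced):
            current = self.total_by_customer.get(event.customer_id, 0)
            self.total_by_customer[event.customer_id] = current + event.amount

    def total_sales(self, customer_id):
        # Queries never touch the domain model, only the prepared table.
        return self.total_by_customer.get(customer_id, 0)


def place_order(bus, customer_id, amount):
    """Command side: validate the intent, then record what happened."""
    if amount <= 0:
        raise ValueError("order amount must be positive")
    bus.publish(OrderPlaced(customer_id, amount))


bus = EventBus()
read_model = SalesReadModel()
bus.subscribers.append(read_model.handle)

place_order(bus, "c1", 40)
place_order(bus, "c1", 60)
place_order(bus, "c2", 10)
print(read_model.total_sales("c1"), read_model.total_sales("c2"))
```

Sorting, paging, and filtering concerns would all live next to SalesReadModel, leaving the command side free of them.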

Questions / Concerns

  1. If all the queries against an aggregate get pulled out to the Query layer, what if the only commands for that aggregate can be handled in a “pass-through” manner with the command handler directly generating events. Is it possible to have an aggregate that isn’t modeled in the domain model? Are there any issues or downsides to this?

  2. I know in the feedback from my previous posts it was suggested that having one domain model handling both commands and queries requires implementing a lot of traversals between objects that wouldn’t be necessary if it was only servicing commands. My question is, do you include traversals in your domain model based on the needs of the code, or based on the conceptual domain model? If none of my Commands require a Customer.Orders traversal, but the conceptual domain includes the concept of a set of orders belonging to a customer – should I model that in my domain model or not?

  3. I like the idea of using the Query side of the architecture as a place to put junior devs where the risk of them screwing something up has minimal impact. But I’m not sold on the idea that you can actually outsource it. Like I said in one of my comments on my previous post, the code to handle a query and generate DTO’s is going to be dead simple, but the code to process events and apply them to the tables on the query side is going to require a significant amount of domain knowledge to know which events to listen for to update each of the de-normalized tables (and what changes need to be made when each event is processed). I don’t know about everybody else, but having Indian/Russian/whatever outsourced developers have to do anything that requires significant domain knowledge has never been successful in my experience. And if you need to spec out for each new query which events to listen to and what to do with each one, well that’s probably going to be just as much work to document as it would be to just implement it.

  4. Greg made the point in a comment that doing an aggregate query like “Total Sales By Customer” is going to be inefficient if you use event sourcing but not CQRS. I don’t understand why that would be the case. I imagine in that case you’d simply have a method/property on the Customer object that calculated total sales for that customer by enumerating over the Orders collection. Then the application services layer would generate DTO’s off of the Customers collection that included say the CustomerID, CustomerName, TotalSales, or whatever the case may be. As long as you use a snapshotting implementation, I don’t see why that would be any more inefficient in a DDD+Event Sourcing implementation than in a typical DDD implementation.
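The approach described in question 4 (event sourcing without CQRS, with snapshots) could look something like the following sketch. Everything here is an illustrative assumption: the tuple-based events, the snapshot dictionary format, and the load function are invented for the example and don't come from any particular framework.

```python
class Customer:
    def __init__(self, customer_id):
        self.customer_id = customer_id
        self.orders = []   # order amounts
        self.version = 0   # number of events applied so far

    def apply(self, event):
        kind, amount = event
        if kind == "OrderPlaced":
            self.orders.append(amount)
        self.version += 1

    @property
    def total_sales(self):
        # The query is answered by the domain object itself.
        return sum(self.orders)


def load(customer_id, events, snapshot=None):
    """Rebuild a customer from a snapshot (if any) plus the later events."""
    customer = Customer(customer_id)
    if snapshot is not None:
        customer.orders = list(snapshot["orders"])
        customer.version = snapshot["version"]
    # Only replay events recorded after the snapshot was taken.
    for event in events[customer.version:]:
        customer.apply(event)
    return customer


events = [("OrderPlaced", 40), ("OrderPlaced", 60), ("OrderPlaced", 25)]
snapshot = {"orders": [40, 60], "version": 2}

customer = load("c1", events, snapshot)
print(customer.total_sales, customer.version)
```

With the snapshot, only one event is replayed per load. Note that the cost for a report across all customers still scales with the number of customers that must be loaded, which may be the inefficiency Greg had in mind.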

Like I mentioned in my last post I still have some questions about query logic that haven’t been answered yet, but before I start asking those I want to make sure I have a strong grasp on what benefits CQRS provides.  My main concern with the query logic was that I know I could just toss it all into the query side, but I was concerned that I would be losing the benefits of using CQRS in the first place if I did that.  I want to elaborate more on this though with some example situations in an upcoming post.


Tuesday, June 08, 2010 #

I’ve been doing a lot of learning on CQRS and Event Sourcing over the last little while and I have a number of questions that I haven’t been able to answer.

1. What is the benefit of CQRS when compared to a typical DDD architecture that uses Event Sourcing and properly captures intent and behavior via verb-based commands? (other than Scalability)

2. When using CQRS what do you do with complex query-based logic?

I’m going to elaborate on #1 in this blog post and I’ll do a follow-up post on #2.

I watched through Greg Young’s video on the business benefits of CQRS + Event Sourcing and first let me say that I thought it was an excellent presentation that really drives home a lot of the benefits to this approach to architecture (I watched it twice in a row I enjoyed it so much!). But it didn’t answer some of my questions fully (I wish I had been there to ask these of Greg in person!). So let me pick apart some of the points he makes and how they relate to my first question above.

  • I’m completely sold on the idea of event sourcing and have a clear understanding of the benefits that it brings to the table, so I’m not going to question that. But you can use event sourcing without going to a CQRS architecture, so my main question is around the benefits of CQRS + Event Sourcing vs Event Sourcing + Typical DDD architecture

Architectures Compared

Architecture with Event Sourcing + Commands on Left, CQRS on Right

  • Greg talks about how the stereotypical architecture doesn’t support DDD, but is that only because his diagram shows DTO’s coming up from the client? If we use the same diagram but allow the client to send commands, doesn’t that remove a lot of the arguments that Greg makes against the stereotypical architecture?
    • We can now introduce verbs into the system.
    • We can capture intent now (storing it still requires event sourcing, but you can implement event sourcing without doing CQRS)
    • We can create a rich domain model (as opposed to an anemic domain model)
  • Scalability is obviously a benefit that CQRS brings to the table, but like Greg says, very few of the systems we create truly need significant scalability

  • Greg talks about the ability to scale your development efforts. He says CQRS allows you to split the system into 3 parts (Client, Domain/Commands, Reads) and assign 3 teams of developers to work on them in parallel; letting you scale your development efforts by 3x with nearly linear gains. But in the stereotypical architecture don’t you already have 2 separate modules that you can split your dev efforts between: the client that sends commands/queries and receives DTO’s, and the Domain which accepts commands/queries and generates events/DTO’s? If this is true it’s not really a 3x scaling you achieve with CQRS but merely a 1.5x scaling (going from 2 parallel teams to 3), which while great doesn’t sound nearly as dramatic (“I can do it with 10 devs in 12 months – let me hire 5 more and we can have it done in 8 months”).

  • Making the Query side “stupid simple” such that you can assign junior developers (or even outsource it) sounds like a valid benefit, but I have some concerns over what you do with complex query-based logic/behavior. I’m going to go into more detail on this in a follow-up blog post shortly. He also seemed to focus on how “stupid-simple” it is doing queries against the de-normalized data store, but I imagine there is still significant complexity in the event handlers that interpret the events and apply them to the de-normalized tables.

  • It sounds like Greg suggests that because we’re doing CQRS that allows us to apply Event Sourcing when we otherwise wouldn’t be able to (~33:30 in the video). I don’t believe this is true. I don’t see why you wouldn’t be able to apply Event Sourcing without separating out the Commands and Queries. The queries would just operate against the domain model instead of the database. But you’d still get the benefits of Event Sourcing. Without CQRS the queries would only be able to operate against the current state rather than the event history, but even in CQRS the domain behaviors can only operate against the current state and I don’t see that being a big limiting factor. If some query needs to operate against something that is not captured by the current state you would just have to update the domain model to capture that information (no different than if that statement were made about a Command under CQRS).

Some of the benefits I do see being applicable are that your domain model might end up being simpler/smaller, since it only needs to represent the state needed to process commands and not worry about the reads (like the Deactivate Inventory Item and associated comment example that Greg provides). Some commands can also be handled in a Transaction Script style, with the command handler simply generating events and not touching the domain model. It also makes it easier for your senior developers to focus on the command behavior and ignore the queries, which is usually going to be a better use of their time. And of course there is scalability.

If anybody out there has any thoughts on this and can help educate me further, please either leave a comment or feel free to get in touch with me via email.


Monday, June 07, 2010 #

PrairieDevCon 2010 was an awesome time.  Learned a lot, and had some amazing conversations.  You guys even managed to convince me that NoSQL databases might actually be useful after all.

 

For those interested, here are my slide decks from my two sessions:

Agile In Action

Database Change Management With Visual Studio


Saturday, June 05, 2010 #

I’ve had a lot of discussions at the office lately about the drastically different sets of software engineering practices used on our various projects, whether what we are doing is appropriate, and what factors you should be considering when determining which practices are most appropriate in a given context. I wanted to write up my thoughts on this subject in a little more detail, so here we go:

If you compare any two software projects (specifically comparing their codebases) you’ll often see very different levels of maturity in the software engineering practices employed. By software engineering practices, I’m specifically referring to the quality of the code and the amount of technical debt present in the project.

Things such as Test Driven Development, Domain Driven Design, Behavior Driven Development, proper adherence to the SOLID principles, etc. are all practices that you would expect at the mature end of the spectrum. At the other end of the spectrum would be the quick-and-dirty solutions that are done using something like an Access Database, Excel Spreadsheet, or maybe some quick “drag-and-drop coding”. For this blog post I’m going to refer to this as the Software Engineering Maturity Spectrum (SEMS).

[Image: the Software Engineering Maturity Spectrum (SEMS)]

I believe there is a time and a place for projects at every part of that SEMS. The risks and costs associated with under-engineering solutions have been written about a million times over so I won’t bother going into them again here, but there are also (unnecessary) costs with over-engineering a solution. Sometimes putting in multiple layers, IoC containers, abstracting out the persistence, etc. is complete overkill if a one-time use Access database could solve the problem perfectly well.

A lot of software developers I talk to seem to automatically jump to the very right-hand side of this SEMS in everything they do. A common rationalization I hear is that it may seem like a small trivial application today, but these things always grow and stick around for many years, then you’re stuck maintaining a big ball of mud. I think this is a cop-out. Sure you can’t always anticipate how an application will be used or grow over its lifetime (can you ever??), but that doesn’t mean you can’t manage it and evolve the underlying software architecture as necessary (even if that means having to toss the code out and re-write it at some point…maybe even multiple times).

My thoughts are that we should be making a conscious decision, around the start of each project, about approximately where on the SEMS we want the project to exist. I believe this decision should be based on 3 factors:

1. Importance - How important to the business is this application? What is the impact if the application were to suddenly stop working?

2. Complexity - How complex is the application functionality?

3. Life-Expectancy - How long is this application expected to be in use? Is this a one-time use application, does it fill a short-term need, or is it more strategic and expected to be in use for many years to come?

Of course this isn’t an exact science. You can’t say that Project X should be at the 73% mark on the SEMS and expect that to be helpful. My point is not that you need to precisely figure out what point on the SEMS the project should be at and then translate that into some prescriptive set of practices and techniques you should be using. Rather, my point is that we need to be aware that there is a spectrum, and that not everything is going to be (or should be) at the edges of that spectrum. Indeed, a large number of projects should probably fall somewhere in the middle, and different projects should adopt different levels of software engineering practice and maturity based on the needs of that project.

To give an example of this way of thinking from my day job:

Every couple of years my company plans and hosts a large multi-day event where ~400 of our customers all fly in to one location for various activities. We have some staff whose job it is to organize the logistics of this event, which includes tracking which flights everybody is booked on, arranging for transportation to/from airports, arranging for hotel rooms, name tags, etc. The last time we arranged this event all these various pieces of data were tracked in separate spreadsheets, and reconciliation and cross-referencing of all the data was literally done by hand using printed copies of the spreadsheets and several people sitting around a table going down each list row by row. Obviously there is some room for improvement in how we are using software to manage the event’s logistics.

The next time this event occurs we plan to provide the event planning staff with a more intelligent tool (either an Excel spreadsheet or probably an Access database) that can track all the information in one location and make sure that the various pieces of data are properly linked together (so for example if a person cancels you only need to delete them from one place, and not a dozen separate lists). This solution would fall at or near the very left end of the SEMS, meaning that we will just quickly create something with very little attention paid to using mature software engineering practices. If we examine this project against the 3 criteria I listed above for determining its place within the SEMS we can see why:

  • Importance – If this application were to stop working the business doesn’t grind to a halt, revenue doesn’t stop, and in fact our customers wouldn’t even notice since it isn’t a customer facing application. The impact would simply be more work for our event planning staff as they revert back to the previous way of doing things (assuming we don’t have any data loss).

  • Complexity – The use cases for this project are pretty straightforward. It simply needs to manage several lists of data, and link them together appropriately. This is precisely the kind of task that Access (and/or Excel) can handle with minimal custom development required.

  • Life-Expectancy – For this specific project we’re only planning to create something to be used for the one event (we only hold these events every 2 years). If it works well this may change (see below).

Let’s assume we hack something out quickly and it works great when we plan the next event. We may decide that we want to make some tweaks to the tool and adopt it for planning all future events of this nature. In that case we should examine where the current application is on the SEMS, and make a conscious decision whether something needs to be done to move it further to the right based on the new objectives and goals for this application. This may mean scrapping the Access database and re-writing it as an actual web or Windows application. In this case, the life-expectancy changed, but let’s assume the importance and complexity didn’t change all that much. We can still probably get away with not adopting a lot of the so-called “best practices”. For example, we can probably still use some of the RAD tooling available and might have an Autonomous View style design that connects directly to the database and binds to typed datasets (we might even choose to simply leave it as an Access database and continue using it; this is a decision that needs to be made on a case-by-case basis).

At Anvil Digital we have aspirations to become a primarily product-based company. So let’s say we use this tool to plan a handful of events internally, and everybody loves it. Maybe a couple years down the road we decide we want to package the tool up and sell it as a product to some of our customers. In this case the project objectives/goals change quite drastically. Now the tool becomes a source of revenue, and the impact of it suddenly stopping working is significantly less acceptable. Also as we hold focus groups, and gather feedback from customers and potential customers there’s a pretty good chance the feature-set and complexity will have to grow considerably from when we were using it only internally for planning a small handful of events for one company.

In this fictional scenario I would expect the target on the SEMS to jump to the far right. Depending on how we implemented the previous release we may be able to refactor and evolve the existing codebase to introduce a more layered architecture, a robust set of automated tests, introduce a proper ORM and IoC container, etc. More likely in this example the jump along the SEMS would be so large we’d probably end up scrapping the current code and re-writing. Although, if it was a slow phased roll-out to only a handful of customers, where we collected feedback, made some tweaks, and then rolled out to a couple more customers, we may be able to slowly refactor and evolve the code over time rather than tossing it out and starting from scratch.

The key point I’m trying to get across is not that you should be throwing out your code and starting from scratch all the time, but rather that you should be aware of when and how the context and objectives around a project change, and periodically re-assess where the project currently falls on the SEMS and whether that needs to be adjusted based on changing needs.

Note: There is also the idea of “spectrum decay”. Since our industry is rapidly evolving, what we currently accept as mature software engineering practices (the right end of the SEMS) probably won’t be the same 3 years from now. If you have a project that you were to assess at somewhere around the 80% mark on the SEMS today, but don’t touch the code for 3 years and come back and re-assess its position, it will almost certainly have changed since the right end of the SEMS will have moved farther out (maybe the project is now only around 60% due to decay).

Developer Skills

Another important aspect to this whole discussion is the skill sets of your architects and lead developers. When talking about the progression of a developer’s skills from junior->intermediate->senior->… they generally start by only being able to write code that belongs on the left side of the SEMS, and as they gain more knowledge and skill they become capable of working at a higher and higher level along the SEMS. We all realize that the learning never stops, but eventually you’ll get to the point where you can comfortably develop at the right end of the SEMS (exactly which practices and techniques that translates to is constantly changing, but that’s not the point here).

A critical skill that I’d love to see more evidence of in our industry is the most senior guys not only being able to work at the right end of the SEMS, but more importantly being able to consciously work at any point along the SEMS as project needs dictate. An even more valuable skill would be the ability to make the conscious decision to move a project’s code further right on the SEMS (based on changing needs) and to do so in an incremental manner without having to start from scratch.

An exercise that I’m planning to go through with all of our projects here at Anvil in the near future is to map out where I believe each project currently falls within this SEMS, where I believe the project *should* be on the SEMS based on the business needs, and for those that don’t match up (i.e. most of them) come up with a plan to improve the situation.


Friday, November 06, 2009 #

My team is going to start using the Manual Testing functionality available in VS 2010 for one of our larger projects.  We started today to migrate some of our manual test scripts over to Test Cases/Test Plans in Test and Lab Manager.  We ran into a problem immediately that almost prevented us entirely from continuing to use the product.

If you have a Test Case with a lot of Test Steps, the scrolling in the Test Case editor in Test and Lab Manager is broken.  When you get enough Test Steps that you have to scroll vertically to see them all, you’re in trouble.  The problem is that the scroll bar keeps resetting itself to the top every time you click in the Test Steps grid.  So for example if you scroll down to see one of the Test Steps further down the list and click it to edit it, the scroll bar brings you right back to the top and you can’t see what you’re doing.  The same happens when you try to add new Test Steps to the end of the list; you can still add items, but you have to type “blind” since the scroll bar brings you back to viewing the top of your list.

[Screenshot: Test Case With Lots of Steps - Test and Lab Manager]

 

Luckily a quick phone call to Aaron Kowall provided me with a workaround.  Every Test Case is stored as a Work Item, so you can use the Work Item editor in Visual Studio to edit your Test Cases.  The scrolling issue is only present in Test and Lab Manager and not Visual Studio as far as I can tell.  Since I figured this out I’ve been creating and managing my Test Cases in T&L Mgr still, but for any Test Case with lots of Test Steps I jump over to VS to finish inputting the Test Steps or do any editing.

[Screenshot: Test Case With Lots of Steps - VS2010]


Wednesday, October 28, 2009 #

I was trying to create a Sysprep’d VPC image containing VSTS 2010 Beta 2 + TFS 2010 Beta 2 to distribute to the rest of my team to try out some of the new features.  Unfortunately, distributing a Sysprep’d image means that everybody will input a new computer name when they boot it up the first time.  TFS doesn’t like it when you go and change the computer name on it.  I found an MSDN article talking about how to move a TFS server from one domain to another that probably contains the proper steps to fix up TFS once you’ve set the computer name.  But that process looks fairly complex, and the whole point of making these VPC images was to make things simple for my team members who wanted to try out Beta 2.  I’m thinking that creating a VPC without TFS on there and having them install it themselves will be easier than having them try and fix a broken install of TFS due to a changing computer name. Looks like I’m going to be spending the evening creating a VPC image….again.


I just went through the process of creating a VHD for use by myself and some other members on my team.  I had a pre-built base image with Windows Server 2008 and SQL Server 2008 Enterprise that I used to begin the process (Note: I was hoping to use Windows Server 2008 R2 but it appears it only comes in 64-bit and VirtualPC doesn’t support 64-bit).

From there I installed Visual Studio 2010 Ultimate Beta 2 and Team Foundation Server 2010 Beta 2.  Everything went extremely smoothly except for one small gotcha right at the end.  After installing TFS I was going through the configuration wizard and trying to configure Team Build.  The first time I ran through the config wizard it gave me an error on the last step which was something like initializing the team build service.  The error message said something about not finding the user account, which was odd since I just used the local Administrator account and made sure to click the “Test” button in the config wizard to check that the username/password I entered was valid.  The exact error message was:

TF255070: Cannot register Team Foundation Build Service Host: User Account localhost\Administrator was not found

After a bit of searching and playing around I figured out how to fix the problem.  One of the steps in the TFSBuild configuration wizard is pointing it to the Team Collection to use.  When you browse for your TFS instance (since this is probably your first time doing so, you’ll have to go into the TFS Servers dialog and add the instance), make sure that you use the actual machine name and not localhost or an IP address.  My mistake was that I just entered localhost.  When I ran through the wizard a 2nd time (actually it was more like the 8th time before I figured this out) and used the actual machine name, everything else worked great.
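If you want to double check what to type into that dialog, the rule above can be sketched as a tiny helper. This is only an illustration: TfsHostNameCheck and its methods are hypothetical names of my own and not part of TFS. It relies on Environment.MachineName and IPAddress.TryParse from the .NET Framework.

```csharp
using System;
using System.Net;

// Hypothetical helper (not part of TFS): flags host names that the post
// above found problematic when registering the build service.
public static class TfsHostNameCheck
{
    // Returns true when the value is neither "localhost" nor an IP address.
    public static bool LooksLikeMachineName(string host)
    {
        if (string.IsNullOrWhiteSpace(host)) return false;
        if (host.Equals("localhost", StringComparison.OrdinalIgnoreCase)) return false;

        IPAddress ignored;
        return !IPAddress.TryParse(host, out ignored);
    }

    // The name of the machine this code is running on, which is what
    // you would enter if TFS is installed on the same box.
    public static string SuggestedHostName()
    {
        return Environment.MachineName;
    }
}
```

Running SuggestedHostName() on the VM itself gives the value to use in place of localhost.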

Next up I have to sysprep this and distribute it to my team.  I’m hoping that the Sysprep and distribution of the VM goes smoothly without causing any TFS issues.


Wednesday, September 30, 2009 #

So the other day at work we were doing an architecture/design review on a project to share knowledge gained with other project teams.  After making the code available, one of the other architects came to me with a specific piece of the code in hand asking us to justify why it seemed so complex (in this case the comment was mainly because the code used lambdas generously and he had limited exposure to them previously).  Here is the code in question:

Original Approach

public void Save(IJob job)
{
    this.AutoCommit(s => s.SaveOrUpdate(job));
}

// Runs the action inside a session and a transaction: commits on success,
// rolls back and rethrows on any exception.
public static void AutoCommit(this IDisplacementStorage storage, Action<IStorageSession> action)
{
    storage.AutoSession(s =>
    {
        using (var tx = s.BeginTransaction())
        {
            try
            {
                action(s);

                tx.Commit();
            }
            catch
            {
                tx.Rollback();

                throw;
            }
        }
    });
}

// Opens a session, runs the action against it, and always disposes the session.
public static void AutoSession(this IDisplacementStorage storage, Action<IStorageSession> action)
{
    using (var session = storage.OpenSession())
    {
        action(session);
    }
}
 
I wanted to compare this implementation to some simpler implementations.

Option 1

public void Save(IJob job)
{
    using (var session = this.OpenSession())
    {
        using (var tx = session.BeginTransaction())
        {

            try
            {
                session.SaveOrUpdate(job);
                tx.Commit();
            }
            catch
            {
                tx.Rollback();
                throw;
            }
        }
    }
}

 

Aside from clearly breaking the Single Responsibility Principle, the Option 1 approach is less code (8 lines in one method vs 11 lines in 3 methods) and also easier to understand/read.  However, the SRP problems become more evident once you start to flesh out the storage layer and need to implement more Save methods.  Let’s say we needed a Save(IZone), Save(IRoom), Save(IDiffuser), etc.  We could just copy/paste the code and change the argument type, but that would quickly result in much more code.  The Original Approach only requires 1 extra method containing one line of code for each additional Save method.  The Option 1 approach would require an extra method containing 8 lines of code for each additional Save method.  It also results in lots of code duplication between Save methods, so if you wanted to change the way you create sessions/transactions, or add logging, or change the error handling, or anything like that, you would have to make changes in many places as opposed to just one in the Original Approach.  An obvious answer to some of these concerns is to refactor out the common code.  See Option 2 below.

 

Option 2

public void Save(IJob job)
{
    SaveOrUpdate(job);
}

public void Save(IZone zone)
{
    SaveOrUpdate(zone);
}

public void SaveOrUpdate(object target)
{
    using (var session = this.OpenSession())
    {
        using (var tx = session.BeginTransaction())
        {

            try
            {
                session.SaveOrUpdate(target);
                tx.Commit();
            }
            catch
            {
                tx.Rollback();
                throw;
            }
        }
    }
}
 

The SRP violations are getting better now: the Save method clearly has only one responsibility, although SaveOrUpdate is still taking on multiple responsibilities (creating/disposing the session, creating/committing/rolling back the transaction, saving the object).  The Option 2 approach now has 9 lines of code spread across 2 methods – still better than the Original Approach in this respect.  Adding additional Save methods now simply requires adding a new method with a single line of code, which is the same as in the Original Approach.  However, what about deletes?  To handle that we could create a method similar to SaveOrUpdate but for deletes; however, that would be another 8 lines of code and an additional method.  What about more complex scenarios?  For example, let’s say it was a financial application and you had a TransferFunds method that needed to update 2 account balances within a transaction.  In that case, or in any other case that is more than just a simple save/delete on a single object, we’ll need to essentially copy/paste the code to create/dispose the session + transaction each time, along with the appropriate try/catch block.  A nicer way to handle arbitrary scenarios like this is, instead of hardcoding the action to be taken, to allow the caller to pass in an Action that contains the scenario-specific data-access code.  This might look something like Option 3 below.

 

Option 3

public void Save(IJob job)
{
    ExecuteInTransaction(s => s.SaveOrUpdate(job));
}

public void Delete(IJob job)
{
    ExecuteInTransaction(s => s.Delete(job));
}

public void TransferFunds(IAccount fromAccount, IAccount toAccount)
{
    ExecuteInTransaction(s => 
                              {
                                  s.SaveOrUpdate(fromAccount); 
                                  s.SaveOrUpdate(toAccount);
                              });
}

public void ExecuteInTransaction(Action<IStorageSession> action)
{
    using (var session = this.OpenSession())
    {
        using (var tx = session.BeginTransaction())
        {

            try
            {
                action(session);
                tx.Commit();
            }
            catch
            {
                tx.Rollback();
                throw;
            }
        }
    }
}

 

The lines of code required for just the save are the same as in Option 2 (9 lines of code spread across 2 methods), but now we have much more flexibility to handle other data-access scenarios such as deletes or more complex use cases with minimal code required.  The ExecuteInTransaction method arguably still violates the SRP because it is responsible for handling session creation/disposal as well as handling the transaction.  We could break this out to satisfy SRP, but a more practical justification is that you might want to execute data-access code outside of a transaction, or you might want to directly manage the transaction object rather than rely on the simple transaction handling baked into the ExecuteInTransaction method.  We could do this by refactoring the session handling code out into its own method.  We can follow the same pattern we used with the Action in the ExecuteInTransaction method to allow the caller to pass in arbitrary actions to be executed in the session.

 

Option 4

public void Save(IJob job)
{
    ExecuteInTransaction(s => s.SaveOrUpdate(job));
}

public void Delete(IJob job)
{
    ExecuteInTransaction(s => s.Delete(job));
}

public void TransferFunds(IAccount fromAccount, IAccount toAccount)
{
    ExecuteInTransaction(s => 
                              {
                                  s.SaveOrUpdate(fromAccount); 
                                  s.SaveOrUpdate(toAccount);
                              });
}

public void ExecuteInTransaction(Action<IStorageSession> action)
{
    ExecuteInSession(s =>
                        {
                            using (var tx = s.BeginTransaction())
                            {
                                try
                                {
                                    action(s);
                                    tx.Commit();
                                }
                                catch
                                {
                                    tx.Rollback();
                                    throw;
                                }
                            }
                        });
}

public void ExecuteInSession(Action<IStorageSession> action)
{
    using (var session = this.OpenSession())
    {
        action(session);
    }
}
 

If we rename ExecuteInTransaction -> AutoCommit and rename ExecuteInSession -> AutoSession we now have the exact code we started off with in the Original Approach (except that the Original Approach implemented them as extension methods whereas Option 4 is written as instance methods directly on IDisplacementStorage).
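One extra benefit of this shape is that it is easy to check in isolation. Below is a minimal sketch that exercises the Option 4 methods against in-memory fakes. The interface definitions, the Recording* classes, and the Storage class are all assumptions made for this illustration (the real IStorageSession and IDisplacementStorage from the project are not shown in this post); only ExecuteInTransaction and ExecuteInSession mirror the code above.

```csharp
using System;
using System.Collections.Generic;

// Assumed minimal shapes for the session/transaction abstractions.
public interface IStorageTransaction : IDisposable
{
    void Commit();
    void Rollback();
}

public interface IStorageSession : IDisposable
{
    IStorageTransaction BeginTransaction();
    void SaveOrUpdate(object entity);
    void Delete(object entity);
}

// Fake transaction that only records what was called.
public sealed class RecordingTransaction : IStorageTransaction
{
    private readonly List<string> _log;
    public RecordingTransaction(List<string> log) { _log = log; }
    public void Commit() { _log.Add("commit"); }
    public void Rollback() { _log.Add("rollback"); }
    public void Dispose() { }
}

// Fake session that records open/close and every data-access call.
public sealed class RecordingSession : IStorageSession
{
    private readonly List<string> _log;
    public RecordingSession(List<string> log) { _log = log; _log.Add("open"); }
    public IStorageTransaction BeginTransaction()
    {
        _log.Add("begin");
        return new RecordingTransaction(_log);
    }
    public void SaveOrUpdate(object entity) { _log.Add("save " + entity); }
    public void Delete(object entity) { _log.Add("delete " + entity); }
    public void Dispose() { _log.Add("close"); }
}

// Stand-in storage class hosting the two Option 4 helper methods.
public sealed class Storage
{
    private readonly List<string> _log = new List<string>();
    public List<string> Log { get { return _log; } }

    public IStorageSession OpenSession() { return new RecordingSession(_log); }

    public void ExecuteInTransaction(Action<IStorageSession> action)
    {
        ExecuteInSession(s =>
        {
            using (var tx = s.BeginTransaction())
            {
                try
                {
                    action(s);
                    tx.Commit();
                }
                catch
                {
                    tx.Rollback();
                    throw;
                }
            }
        });
    }

    public void ExecuteInSession(Action<IStorageSession> action)
    {
        using (var session = OpenSession())
        {
            action(session);
        }
    }
}
```

With these fakes, calling ExecuteInTransaction(s => s.SaveOrUpdate("job-1")) on a new Storage leaves the log as open, begin, save job-1, commit, close. If the action throws, the log ends with rollback, close and the exception still reaches the caller. Calling ExecuteInSession directly skips the begin/commit entries, which is the "outside of a transaction" case mentioned earlier.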
