A Curious Mind
#tastic

db4o

Just started working with db4o this week, and I must say that in the very small way that I am using it, it's a pretty slick little database. XCopy deployable, no install required. It's just a file, and it supposedly supports client/server too, but I am not sure how much I really care about that.

This kind of database is really nice for prototyping because all I have to do is throw the objects in there and they are persisted, bam! You query them back out using a .NET Predicate<T>, or you can query by example, but I am really enjoying the predicate stuff.
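To make the "throw objects in, query with a Predicate<T>" idea concrete, here is a minimal in-memory sketch. Note the assumptions: `TinyObjectStore` and `Prospect` are hypothetical names invented for illustration, and this is a stand-in for the pattern, not db4o's actual API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample type to persist.
public class Prospect
{
    public string Name { get; set; }
    public string City { get; set; }
}

// Hypothetical in-memory stand-in for an object database; NOT db4o's API.
public class TinyObjectStore
{
    private readonly List<object> _objects = new List<object>();

    // "Throw the object in": no mapping layer and no schema.
    public void Store(object item)
    {
        _objects.Add(item);
    }

    // Return every stored T that satisfies the predicate.
    public IList<T> Query<T>(Predicate<T> match)
    {
        var results = new List<T>();
        foreach (object candidate in _objects)
        {
            if (candidate is T && match((T)candidate))
            {
                results.Add((T)candidate);
            }
        }
        return results;
    }
}
```

With this shape, a query reads like ordinary C#: `store.Query<Prospect>(p => p.City == "Tulsa")`.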

Also, as I move to 'one db per application', this kind of database is becoming more and more attractive to me, because it's so lightweight. I still don't know what it's like to have one of these in production with an issue in it. So if you know, toss a comment my way.

-d

Currently playing in iTunes: Prayer of the Refugee by Rise Against

Mass Transit Announcement

So Chris and I have been working on a lightweight service bus similar to NSB called Mass Transit. When Chris and I originally encountered NSB it had a style that didn't really fit the way that I was approaching development at the time, and so I decided that, if for nothing else, I would start trying to build one myself to learn more about the concept of an ESB. The concept seemed simple at the outset, but I knew that if I was going to get this right I would have to borrow heavily from experts such as Udi Dahan and Greg Young, and from the many books I have now read.

It has been a very interesting road, and I am now happy to say that Mass Transit (MT) has reached a 0.1 release. It can send messages to and from endpoints, it has publish/subscribe capabilities, and it has a console/Windows service host. It's not much at the moment, but that is part of its goal (to be simple). One of the goals is that MT should be a very lightweight infrastructure that is easy to get started with but works as reliably as a light switch.
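For anyone new to the publish/subscribe idea, here is a minimal in-process sketch of the concept. It is an illustration under assumptions: `InProcessBus` and `OrderPlaced` are hypothetical names, and none of this is the Mass Transit API, which works across endpoints rather than inside a single process.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message type.
public class OrderPlaced
{
    public string OrderNumber { get; set; }
}

// Hypothetical in-process bus for illustration; not the Mass Transit API.
public class InProcessBus
{
    // Handlers are stored per message type.
    private readonly Dictionary<Type, List<Action<object>>> _subscribers =
        new Dictionary<Type, List<Action<object>>>();

    public void Subscribe<T>(Action<T> handler)
    {
        List<Action<object>> handlers;
        if (!_subscribers.TryGetValue(typeof(T), out handlers))
        {
            handlers = new List<Action<object>>();
            _subscribers[typeof(T)] = handlers;
        }
        handlers.Add(message => handler((T)message));
    }

    // Deliver the message to every handler subscribed to its type.
    // Publishing with no subscribers is a no-op.
    public void Publish<T>(T message)
    {
        List<Action<object>> handlers;
        if (_subscribers.TryGetValue(typeof(T), out handlers))
        {
            foreach (Action<object> handle in handlers)
            {
                handle(message);
            }
        }
    }
}
```

The publisher never learns who is listening, which is the loose coupling that makes a bus attractive in the first place.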

In the near future you will be seeing some API changes between 0.1 and 0.2 as Chris and I continue to make the system easy to use and extend (and play with different names of things). We will also continue to focus on enabling xcopy deployment scenarios, on the testability story, and making it as non-intrusive as possible.

Also, there are some samples of using Mass Transit in the SVN repo under 'samples'.

If you have any time, your review of the code or patches will be greatly appreciated. I have taken much from the OSS community and this is one of the ways I hope to contribute back.

Thanks,

-d

A Message Based Project Structure

Many of us have now become comfortable with how to best build a layered system with the typical UI, biz, and data layers. But as I started to write my first application that took messaging as a pretty central concept I found that I didn't really know how I wanted to parcel out my system. So here is a closer look at what I have been playing around with in this area.

Project Setup
  • Solution Name
    • Domain
    • Domain.Tests
    • Domain.Messages
    • Persistance
      • Domain.Persistance
      • Domain.Persistance.Tests

Ok, some notes: This is still very proto, but I wanted to get it out there. I have taken to putting the repository (à la DDD) interfaces in the domain itself, and then implementing them in Domain.Persistance, with tests that verify the implementations actually work. On the project that I am trying all of this out on I am also trying db4o, so this has been a great way to keep the db4o junk out of my domain (imagine that, heh).
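The split described above might look roughly like this. The type names (`Contact`, `IContactRepository`, `InMemoryContactRepository`) are assumptions made up for this sketch, and the in-memory class stands in for where a db4o-backed implementation would go.

```csharp
using System;
using System.Collections.Generic;

// --- Domain project: owns the contract, knows nothing about storage ---

public class Contact
{
    public Guid Id { get; set; }
    public string Address { get; set; }
}

public interface IContactRepository
{
    Contact Get(Guid id);
    void Save(Contact contact);
}

// --- Domain.Persistance project: implements the contract ---

// Stand-in implementation; a db4o-backed class would live here instead,
// so db4o types never leak into the Domain project.
public class InMemoryContactRepository : IContactRepository
{
    private readonly Dictionary<Guid, Contact> _contacts =
        new Dictionary<Guid, Contact>();

    // Returns null when no contact with that id has been saved.
    public Contact Get(Guid id)
    {
        Contact contact;
        return _contacts.TryGetValue(id, out contact) ? contact : null;
    }

    public void Save(Contact contact)
    {
        _contacts[contact.Id] = contact;
    }
}
```

Because the domain only sees `IContactRepository`, the tests in Domain.Persistance.Tests can exercise each implementation against the same contract.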

It's also important to note that I am currently struggling with how I want to implement the message consumption. I currently see two main ways to write message handlers. One is as a service layer that sits above the domain layer and manipulates it. This would map easily to web pages and I am fairly comfortable with this pattern, but I am not too sure how much I am really digging it. The other flavor says that your objects should consume the messages directly; however, I want more infrastructural support for this style than I can currently find. Most of the OSS projects seem to support the first method quite easily, and the second seems a bit more of a hack to make happen. I hope to fix that.
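A small sketch of the two handler styles, side by side. Everything here is hypothetical (`ChangeContactAddress`, `Consume`, `ChangeContactAddressHandler`); it only shows where the handling code lives in each flavor, not how any particular bus would wire it up.

```csharp
using System;

// Hypothetical message.
public class ChangeContactAddress
{
    public string NewAddress { get; set; }
}

public class Contact
{
    public string Address { get; private set; }

    // Ordinary domain behavior, used by the service-layer style.
    public void MoveTo(string address)
    {
        Address = address;
    }

    // Style 2: the domain object consumes the message directly.
    public void Consume(ChangeContactAddress message)
    {
        MoveTo(message.NewAddress);
    }
}

// Style 1: a service-layer handler sits above the domain and manipulates it.
public class ChangeContactAddressHandler
{
    private readonly Contact _contact;

    public ChangeContactAddressHandler(Contact contact)
    {
        _contact = contact;
    }

    public void Handle(ChangeContactAddress message)
    {
        _contact.MoveTo(message.NewAddress);
    }
}
```

In the second style the infrastructure has to find the right object instance and hand it the message, which is the support that seems to be missing today.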

Let me know what you are thinking.

-d

How does reversibility affect my designs

Well, let's see. I think one of the big things it does is encourage a lot of interfaces in my code. I think it encourages the use of an IoC container. I think a lot about my code and refactor little things often to get the names right. It has me looking into message-based systems as a way to build more loosely coupled systems. It has me looking at various parts of my application and wondering "is this really a service that could be off on its own, one that I can then just be a customer of?"

I tend to move out of analysis paralysis because I know that I can always change my mind. There are times where I hit a spot and I am like damn, that refactoring is going to take all day, but afterwards the code is almost always in a better position for the long run, and if it is painful it usually means I was doing something wrong from an SoC standpoint anyway.

It has me favor infrastructure that doesn't infect my domain, that way I can swap out infrastructure later, which is a huge benefit.

How's that sound?

Alt.Net TypeMock Dinner

After Alt.Net had wrapped up for the day on Sunday there were a few stragglers left in the hotel lobby talking about all manner of things. After hanging out for a while (which included a spontaneous group effort to get schema support added for SQLite in NHibernate, r3478), we eventually migrated over to the Claim Jumper for dinner.

There I was pleasantly surprised by Roy announcing that TypeMock would be sponsoring dinner! We proceeded to order dinner, and a pleasant conversation (2 actually) started up. Again, after MVP and Alt.Net many of the participants were pretty wiped out, but I came away with the goal of trying out TypeMock. And with the free license that Roy handed out at Alt.Net it will be quite a bit easier. :)

So Roy, I hope to be able to provide you the feedback you want very soon as I just started a little personal project where I can explore TypeMock in all of its glory.

-d

Spec#

I want Spec#

For more see these posts:

Roy's Blog

Greg's Blog

Berkeley DB

Evan Hoff has been digging into Berkeley DB lately and this stuff is pretty interesting. He sent me some good links that I thought I would share.

Google Case Study:
http://www.oracle.com/customers/snapshots/google-oracle-berkeley-db-casestudy.pdf

Flash Overview:
http://oukc.oracle.com/static05/opn/oracle9i_database/34313/050306_34313/index.htm

I am thinking that this could be a good data store for Mass Transit. No install for the client, which is a huge win, and it's fast as all get out. With the HA stuff, I can even mirror the DBs.

I also want to bring this up because this is a non-relational database. Until Evan showed me this I hadn't really thought about why I choose my database engines, which I tend to just treat as dead object storage. By exploring these options I can find better tools for me and my users to employ for the most benefit.

Now if I can just get my head around the Distributed Hashtables Greg and Chris keep telling me about.

-d

-ilities

Lately Evan Hoff and I have been discussing the concept of '-ilities', which we use to communicate the qualities of the software we develop. As a developer I have always valued certain qualities in the code that I write and in the software I develop, but I hadn't really thought about this concretely before. Some sample '-ilities' are scalability, availability, testability, and so on (basically anything with an 'ility' on the end of it). Some of the qualities I have been focusing on lately are reversibility (how can I write my code in such a way that I can reverse out a decision easily), decomposability (how can I structure my program so that it can be worked on by multiple teams) and testability.

By choosing these qualities I am making an important statement about what I am willing to trade off in favor of them. I can then take this to the customer and ask them how important they think each one is, and we can start to have a discussion about their software in a non-technical context.

I encourage you to do the same. If you would like to read more about this I would recommend reading the book Software Architecture in Practice, which contains a better and longer discussion on the topic.

NCAA 2008 Champions

It was such an awesome game. I just want to take a second and say thank you to the team for an awesome season.

I am in, or just outside of this photo: Photo of Mass

ActiveMQ

Does anyone know how to make ActiveMQ run as a service on Windows 2003?

Idempotence Part II

So since my last post, I have gotten some feedback on this topic that I want to share, and for file based stuff I really like the following idea. Thanks Udi.

If your message source is a flat file consider the path to the file 'C:\file_drops\expert.csv' and the last date of change '12-2-2007' as a way to tag the message so that it won't get processed twice.
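As a rough sketch of that tagging idea, the path and last-change date can be combined into a key and remembered. `FileDropLog` is a hypothetical name, the in-memory set is an assumption (a real system would need to persist the tags), and lower-casing the path assumes Windows-style case-insensitive paths.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: remembers which (path, last change) pairs were processed.
public class FileDropLog
{
    private readonly HashSet<string> _seen = new HashSet<string>();

    // Returns true the first time a given path and last-change date is seen,
    // and false on every later attempt with the same pair.
    public bool TryMarkProcessed(string path, DateTime lastChangedUtc)
    {
        // Assumption: paths are case-insensitive, as on Windows.
        string tag = path.ToLowerInvariant() + "|" + lastChangedUtc.Ticks;
        return _seen.Add(tag);
    }
}
```

The last-change date could come from `File.GetLastWriteTimeUtc(path)`. If the file is dropped again with new contents, its date changes, so it gets a new tag and is processed as a new message.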

Also, my definition of idempotence earlier was a little off, smack!, idempotence just means that the result of processing the message multiple times will be the same as if it were processed just once.
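A tiny illustration of that definition, using a made-up `Account` class: setting a value is idempotent, while incrementing a counter is not.

```csharp
// Hypothetical class used only to contrast the two kinds of operation.
public class Account
{
    public string Email { get; private set; }
    public int LoginCount { get; private set; }

    // Idempotent: applying this twice leaves the same state as applying it once.
    public void SetEmail(string email)
    {
        Email = email;
    }

    // Not idempotent: every repeat changes the result.
    public void RecordLogin()
    {
        LoginCount++;
    }
}
```

If a "set email" message is delivered twice, nothing bad happens. If a "record login" message is delivered twice, the count is wrong, so that kind of message needs something extra (such as a transaction id) to detect the repeat.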

Music To Code By

Usually when I am getting my code on I am going to listen to some high energy music.

 

  • Ray J
  • Flo Rida
  • Freeway
  • DJ Khaled
  • Rise Against
  • Rage Against the Machine
  • Static X
  • NERD
  • A Day to Remember

However, when I am doing 'work' I tend to take a more mellow approach. Last thing I need is to send an email to my boss while listening to 'Smack My Bitch Up' ;)

  • Matt Nasi
  • Clapton
  • Queen
  • 311
  • Ben Harper
  • Black Eyed Peas

For the music-holic in me the Zune Pass has been awesome. Now that I have mac'd out I am not too sure how excited I am to 'own' music again with iTunes. hmmm

Idempotence

So lately I have been thinking a lot about how I want to implement idempotence. Well, this week I had the pleasure of discussing the topic with Evan Hoff, who is also thinking about this. Today he emailed me the following 'acid' test for idempotence.

If the message was generated by a traveling salesman in his hotel room and got forwarded to the application 10 days later, what are the chances of success?

If processing that message causes data loss by overwriting newer values (as an example), you have an idempotence problem.

Based on our conversation, the first step was to decide how much idempotence we want. Every column, or every record, or groups of columns, etc.? To make this simple, let's go with idempotence of the record.

The next thing we discussed was keeping the message goal oriented, not 'CRUDy'. Hopefully that makes sense.

Based on those two things I came up with the following message, to update a sales prospect contact info.

 

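public class SalesContactInfo : IMessage
{
    // Identifies the contact to update; the handler reads it (Guid is an assumed type)
    public Guid ContactId { get; set; }

    public string Address { get; set; }
    public string Telephone { get; set; }
    //more as necessary

    // Setters added so the sender can actually populate these values
    public Guid TransactionId { get; set; }
    public Guid SourceId { get; set; }
    public DateTime DateSubmitted { get; set; }
}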

And the message handler

 

public void HandleSalesContactInfo(SalesContactInfo message)
{
    Contact contact = repository.Get(message.ContactId);
    AuditTrail trail = repository.Get<AuditTrail>(contact.Id);

    // Only apply the message if it is newer than the last recorded update
    if(trail.LastUpdate < message.DateSubmitted)
    {
        //update
    }
}

But what if the dates match? Then I start to run into trouble and we have to start employing some smarts about which system 'wins.' But I think if Udi were here he would say: send a message back to the salesman to confirm the info?

 

public void HandleSalesContactInfo(SalesContactInfo message)
{
    Contact contact = repository.Get(message.ContactId);
    AuditTrail trail = repository.Get<AuditTrail>(contact.Id);

    if(trail.LastUpdate < message.DateSubmitted)
    {
        //update
    }
    // Same timestamp from a different source: ask the sender to confirm
    else if(trail.LastUpdate == message.DateSubmitted && message.SourceId != trail.SourceId)
    {
        bus.Reply(new DoubleCheckInfo(message, contact, trail));
    }
}
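
To try the rule out without a repository or a bus, here is a self-contained sketch of the same decision. The names `ContactRecord`, `Outcome`, and `SalesContactInfoHandler` are invented for this example, the contact and its audit trail are folded into one record for brevity, and the bus reply is reduced to a return value.

```csharp
using System;

public enum Outcome { Applied, Ignored, NeedsConfirmation }

public class SalesContactInfo
{
    public string Address { get; set; }
    public Guid SourceId { get; set; }
    public DateTime DateSubmitted { get; set; }
}

// Hypothetical: contact data and audit trail folded into one record.
public class ContactRecord
{
    public string Address { get; set; }
    public DateTime LastUpdate { get; set; }
    public Guid LastSourceId { get; set; }
}

public static class SalesContactInfoHandler
{
    public static Outcome Handle(ContactRecord record, SalesContactInfo message)
    {
        // Newer than what we have: apply it and remember who sent it and when.
        if (record.LastUpdate < message.DateSubmitted)
        {
            record.Address = message.Address;
            record.LastUpdate = message.DateSubmitted;
            record.LastSourceId = message.SourceId;
            return Outcome.Applied;
        }

        // Same timestamp from a different source: someone has to confirm.
        if (record.LastUpdate == message.DateSubmitted
            && record.LastSourceId != message.SourceId)
        {
            return Outcome.NeedsConfirmation;
        }

        // Older message, or a replay of the one already applied.
        return Outcome.Ignored;
    }
}
```

Replaying the same message lands in the last branch, so the record ends up the same as if the message had been processed once, and the salesman's ten-day-old message can no longer overwrite newer values.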
 
Thoughts?

Being the Architect and YAGNI

So, I continue to find ways to be an agile architect. I just finished getting two teams started on a rather large project at work. For about a month or so ahead of time I was spending 2-3 hours a day ramping up on various technologies that I had been researching for use on this project. I had read books, discussed concepts with colleagues, and of course pounded out some code. :)

Well I got the team started, and I found myself getting frustrated with the teams. They just didn't get it, it was so clear to me. Then it kinda hit me. I had spent about 60 hours (2 work weeks) learning about this and they didn't get it on the first day or two. Gee, I wonder why. So, now I am trying to figure out how to best communicate this knowledge in the fastest way possible.

Any ideas?

The best idea I have right now is to communicate the various concerns I have identified in the system. I have a pretty good idea of how I would implement it, but at this point it's all half-implemented code. I think it is going to be a better project if I back off of the code and let them find ways to handle the various concerns. To be fair, I am forcing some new tools on them, and I hope they make the work easier and not harder, but only time will show that to be true.

In closing, a big thanks to my team at the bank, and keep calling YAGNI on me, I am an architect after all. :)

Articles to Read about Databases

As I dig into the future of databases, I have found some articles that I want to share with a wider readership. The basic premise of my search is "RDBMSs were developed over 25 years ago, and we haven't come up with something better since?! I gotta look into that" and so starts my education on all of the new stuff coming from those wacky data guys. ;)

Shards

One thing that I have learned about is scaling out versus scaling up. I have found it to be a very interesting concept, in large part because of what we as software developers can do to make it easier. What I am most excited about is the Hibernate.Shards API. How sweet is it going to be if I can hide the shards concept behind the Hibernate API? Very.
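
The core of the shard idea is just a stable rule that maps a key to one of several databases. Here is a minimal sketch of such a rule; `ShardMap` and the shard names are made up for illustration and have nothing to do with the actual Hibernate.Shards API.

```csharp
using System;

// Hypothetical shard router: picks a database by customer id.
public class ShardMap
{
    private readonly string[] _shardNames;

    public ShardMap(params string[] shardNames)
    {
        if (shardNames == null || shardNames.Length == 0)
        {
            throw new ArgumentException("At least one shard is required.");
        }
        _shardNames = shardNames;
    }

    // The same id always maps to the same shard, so all of a customer's
    // rows can be found without asking every database.
    public string ShardFor(int customerId)
    {
        int count = _shardNames.Length;
        // Keep the index non-negative even for negative ids.
        int index = ((customerId % count) + count) % count;
        return _shardNames[index];
    }
}
```

A plain modulo like this reshuffles most keys when a shard is added, which is one reason a library that hides the routing is appealing.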

 Reads:

http://highscalability.com/unorthodox-approach-database-design-coming-shard

http://highscalability.com/tags/shard

http://www.rgoarchitects.com/nblog/2007/08/21/TheRDBMSIsDead.aspx

Column Store Databases

Ok, still getting my head around these bad boys, but the concept (I think) is that every column in a typical "row store" database is kept separate. The benefit here is on reads, and according to the literature (vendor and otherwise) they are very fast at reading. I first discovered this concept while reading about Google's BigTable. Very neat, if only I could figure out how to best use it.
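
One way to picture the difference is in memory: a row store keeps whole records together, while a column store keeps one list per column. This is only a toy model with made-up names (`SaleRow`, `SalesColumnStore`); real column stores add compression, indexing, and on-disk layout on top.

```csharp
using System.Collections.Generic;
using System.Linq;

// Row-store shape: each record holds all of its columns together.
public class SaleRow
{
    public string Region { get; set; }
    public decimal Amount { get; set; }
}

// Column-store shape: one list per column; a row is an index across the lists.
public class SalesColumnStore
{
    private readonly List<string> _region = new List<string>();
    private readonly List<decimal> _amount = new List<decimal>();

    public void Add(string region, decimal amount)
    {
        _region.Add(region);
        _amount.Add(amount);
    }

    // Reads only the Amount column; the Region values are never touched.
    public decimal TotalAmount()
    {
        return _amount.Sum();
    }

    // Rebuilding a full row means visiting every column list.
    public SaleRow RowAt(int index)
    {
        return new SaleRow { Region = _region[index], Amount = _amount[index] };
    }
}
```

An aggregate over one column scans a single contiguous list, which hints at why reads are the selling point, while reassembling or writing whole rows costs more.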

Reads:

http://www.databasecolumn.com/2007/09/one-size-fits-all.html

http://209.85.163.132/papers/bigtable-osdi06.pdf

http://en.wikipedia.org/wiki/Column-oriented_DBMS

Denormalization

A big topic for larger data sets seems to be the responsible denormalization of data. This isn't really a new concept; we have been doing it for reporting purposes for quite a while, but it seems to be coming back to me more and more often. One of the more interesting concepts was related by Mats Helander on storing an object in the db as an XML blob.
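
For a feel of what an XML blob could look like, here is a minimal sketch using the .NET `XmlSerializer`. The `CustomerSnapshot` type and the `XmlBlob` helper are assumptions for illustration and are not taken from the linked post; the resulting string would go into a single text column.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical object to store as a blob. XmlSerializer needs a public
// type with a parameterless constructor and public read/write members.
public class CustomerSnapshot
{
    public string Name { get; set; }
    public string City { get; set; }
}

public static class XmlBlob
{
    // Turn the object into one XML string suitable for a text column.
    public static string ToXml(CustomerSnapshot snapshot)
    {
        var serializer = new XmlSerializer(typeof(CustomerSnapshot));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, snapshot);
            return writer.ToString();
        }
    }

    // Rebuild the object from the stored XML.
    public static CustomerSnapshot FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(CustomerSnapshot));
        using (var reader = new StringReader(xml))
        {
            return (CustomerSnapshot)serializer.Deserialize(reader);
        }
    }
}
```

The trade-off is the usual one for denormalized data: the whole object loads in one read, but the database can no longer query or index the individual fields without extra work.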

Reads:

http://www.matshelander.com/wordpress/?p=66

Object Oriented DB: http://www.db4o.com/

 

BASE vs ACID

I can't remember what got me started on this, but I am at the very beginning of my learning curve here.

http://www.infoq.com/articles/pritchett-latency