Recently a performance bug came my way. A highly multithreaded application, which can run for hours depending on the amount of data it's processing, was observed with all its CPUs ramping up to 100% utilization while the amount of data processed per second dropped to nothing.
Ok, no big deal here. I've most likely got a state where all threads are stuck in a tight loop (most likely the same loop), and each thread is waiting on the other to set a flag that will allow them to exit the loop. Your basic deadlock issue. Pretty easy to fix, if I can reproduce the problem on my dev machine and use the debugger to tell me what the offending function is.
The problem is that I wasn’t able to reproduce it. Crap.
Ok…on to step two…
Looks like I'll have to find or build a tool that can give me the call stack of all the threads in a managed app. I started out trying to use System.Diagnostics.StackTrace and StackFrame. From a System.Threading.Thread instance I can create a StackTrace object and see what function the thread is in. But I can't get a list of all the System.Threading.Thread objects in an app. I have access to the Process.Threads collection, but that gives me a list of System.Diagnostics.ProcessThread objects, not System.Threading.Thread. Shoot…that’s not going to work.
Ok, next step is to look at creating a really lightweight debugger, from .Net's ICorDebug API, to basically break into the app and dump out the call stack of all the managed threads. I found a couple of examples and it didn’t look too bad, but the only issue is that ICorDebug is a COM API. So I'd have to do all that fun C++ COM stuff…Ick. And I need the tool yesterday.
After digging around a bit more I found out that the Visual Studio debugger team wrote a very nice managed wrapper around ICorDebug, called MDbg. Sweet!
There is a bunch of info about it here.
After digging a bit further, I found that someone wrote a handy little tool called Managed Stack Explorer. Oh geez! The gods are smiling at me! That’s exactly what I need.
This little tool shows all managed apps running on your server. When you pick an app, it shows all threads in the process. When you click the thread, it shows you the call stack for that thread. Simple and nice.
With this tool, I was able to find the offending non-threadsafe function in about 5 minutes. Fixed, done, yipee.
But this post isn't about someone's tool, or my bug fixing adventures. No, it's about coming across one of the most useful APIs I've seen in a long time! A simple and well designed .Net wrapper around ICorDebug, giving .Net developers full access to the CLR debugger. I'm very excited about the idea of a managed wrapper around ICorDebug. There are so many diagnostic tools that could be created with this. I'm looking forward to digging around in the API!
Although most of the topics I've written about are pretty random, I'll try to focus in on a much narrower (yet incredibly broad) topic: multi core vs many core processing, parallel processing, and the paradigm shift that we software engineers are on the leading edge of having to face.
In short, Intel, AMD, and other hardware manufacturers are telling anyone who listens that programmers need to change the way they think about designing end-user software. End-user software needs to take advantage of multiple cores. And this doesn't mean spinning up a background thread to do some compute intensive request, so that our UI remains responsive. It means designing all compute intensive algorithms to scale to multiple processors.
Intel goes on to say that designing for 2, 4, or 8 processors is way too short-sighted. We need to design our software to scale out to N processors, where N could be 16, 64, or 512.
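To make "scale out to N processors" a bit more concrete, here is a minimal sketch of the idea: split the work into one chunk per core instead of hard coding a thread count. The ParallelSum class and its Sum method are made up for illustration, and summing an array is just a stand-in for any compute intensive loop.

```csharp
using System;
using System.Threading;

public static class ParallelSum
{
    // Sums data by giving each worker thread one contiguous chunk.
    // 'workers' is a parameter so the same code runs on 2 cores or 512.
    public static long Sum(int[] data, int workers)
    {
        if (workers < 1) throw new ArgumentOutOfRangeException("workers");

        long[] partial = new long[workers];
        Thread[] threads = new Thread[workers];
        int chunk = (data.Length + workers - 1) / workers; // ceiling division

        for (int w = 0; w < workers; w++)
        {
            int id = w; // private copy for the closure
            threads[w] = new Thread(() =>
            {
                int start = id * chunk;
                int end = Math.Min(start + chunk, data.Length);
                long local = 0;
                for (int i = start; i < end; i++) local += data[i];
                partial[id] = local; // each worker writes only its own slot
            });
            threads[w].Start();
        }

        long total = 0;
        for (int w = 0; w < workers; w++)
        {
            threads[w].Join();
            total += partial[w];
        }
        return total;
    }

    public static void Main()
    {
        int[] data = new int[1000000];
        for (int i = 0; i < data.Length; i++) data[i] = i % 10;

        // Let the machine decide how many workers to use.
        Console.WriteLine(Sum(data, Environment.ProcessorCount)); // 4500000
    }
}
```

The point is not the threading API (a thread pool or a task library would do the same job), but that the core count is an input to the algorithm rather than an assumption baked into it.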
Coding Horror has a great post from last year that demonstrates how well common end-user software takes advantage of multi core processors. The results are sad to say the least.
We can no longer just expect our software to get faster with the next chip release by Intel or AMD. What is worse, our software will most likely run slower on newer desktop and mobile chips.
The trend in processor manufacturing is to have slower, cooler, more efficient individual cores, and to pack more and more of them on a single chip. This means that end-user software that only uses 1 or 2 threads will actually run slower on newer processors.
This can be seen with Intel's new quad core mobile processor: QX9300. It has 4 cores, but each one runs at only 2.53 GHz. This is an amazing chip, but only for software that is actually designed to run across multiple cores.
To boil it down to a simplified problem statement: Software outlives hardware, and hardware ain't getting any faster. (more on that later)
Generally I'm not one to write a post that does nothing but highlight someone else's blog post...BUT...this one was important enough (IMHO) that I decided to break my own rule.
Are you building a Leatherman or a Samurai sword? (stupid linker isn't working)
http://petewarden.typepad.com/searchbrowser/2008/07/are-you-buildin.html
As programmers we always want to write new functionality...neat, new, COOL functionality. That's just what we do, and we love it.
But it's hard to keep in mind what our added functionality does to user efficiency. No matter what we think our job is all about, it's really about making the lives of our users easier and more efficient. That’s it…done…it's that simple.
This is easy to understand when writing a UI application. If a new feature causes the user to perform 5 extra steps with the UI, but those 5 extra steps only give a small return on efficiency (so small it wasn’t worth the time to perform the 5 new steps), then drop the feature, it's not worth it. If the feature is complex or confusing, and will cause the user to misuse it or skip it altogether, then drop the feature, it's not worth it.
Where this becomes harder to evaluate is in writing an SDK API. Like the above post states, we all want to write the ultimate architecture. The one that can do anything and everything. But "anything and everything" can quickly become a directionless mess, where you have several hundred classes with no obvious direction on how to weave them together into the next "Wonder Bread". What you end up with is a big mess that your users (other developers) will most likely just pass off as too complex and look for a simpler API.
The last part of the above post states it perfectly.
"You end up with a million features, which makes it very time-consuming to build, and even when it's done, the number of different gizmos on your Leatherman scare off potential users. You need to have a strong connection to your actual customers, and be hearing about exactly what they need to do. Then you need to design around that, ruthlessly jettisoning anything that distracts from them achieving their goals."
For grins I looked at my code that calls:
T tmp = new T();
in Reflector, to see if it could shed any light on the T instance creation badness. Well, it turns out that the C# compiler spits out code to call Activator.CreateInstance:
T tmp = Activator.CreateInstance<T>();
I kind of get why the C# compiler does this, because it doesn't know what T is at compile time. But at run time the JIT compiler DOES know. I'm surprised that the C# team didn't build in the smarts to have the JIT emit a direct call to the default constructor of whatever type T is.
I recently needed to change how an array lookup worked to make it more efficient, and decided to use the List<T>.BinarySearch to do the lookup. The class that contained this lookup had a generic parameter, and was constrained like so:
public class SortedNameList<T> where T : class, INameValueItem, new()
{...}
where the T of List<T> was the same as the class generic parameter.
In order to do the BinarySearch, List<T> required an input of type T to search against. Since I only had the value of the property that will be compared against (an int), I needed to create a new temp instance of T, set the value, and then pass it into BinarySearch().
My unit tests passed, all the functionality was good, and I was happy. Then I ran my app under a profiler to see how much faster my fancy BinarySearch was.
To my surprise, the time spent doing the binary search calls was almost exactly the same as a linear lookup (over 1.2 million searches)! What the heck? I know that creating a new temp object each lookup isn't very efficient, but it shouldn't make that much of a difference.
So after looking a bit deeper and doing some more performance tests, I found out that creating a new instance of a generic ("T tmp = new T()") is slooooow. How slow? How about 30X slower! WOW...I had no idea!
And it's not that it takes the CLR some time to figure out how to create a new T, where most of the time is spent on the first instance and the rest speed up. Nope, the time to create a new T is consistent, from the first instance to the millionth instance.
Good to know...don't do that in a high volume area.
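One way to keep new T() out of the hot path is to create the temp instance once and reuse it for every lookup. This is only a sketch under assumptions: the Value property on INameValueItem, the Find and Add methods, and the Item class are all invented here, since the real interface and class bodies aren't shown above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape; the real INameValueItem members are not shown in the post.
public interface INameValueItem
{
    int Value { get; set; }
}

public class SortedNameList<T> where T : class, INameValueItem, new()
{
    private sealed class ByValue : IComparer<T>
    {
        public int Compare(T x, T y) { return x.Value.CompareTo(y.Value); }
    }

    private readonly List<T> items = new List<T>();
    private readonly IComparer<T> comparer = new ByValue();

    // The only new T() for the lifetime of the list; every lookup reuses it.
    private readonly T probe = new T();

    public void Add(T item)
    {
        int index = items.BinarySearch(item, comparer);
        items.Insert(index < 0 ? ~index : index, item); // keep the list sorted
    }

    // Not thread-safe: the shared probe is mutated on every call.
    public T Find(int value)
    {
        probe.Value = value;
        int index = items.BinarySearch(probe, comparer);
        return index >= 0 ? items[index] : null;
    }
}

public class Item : INameValueItem
{
    public int Value { get; set; }
    public string Name { get; set; }
}

public static class ProbeDemo
{
    public static void Main()
    {
        var list = new SortedNameList<Item>();
        list.Add(new Item { Value = 7, Name = "seven" });
        list.Add(new Item { Value = 3, Name = "three" });

        Console.WriteLine(list.Find(3).Name);    // three
        Console.WriteLine(list.Find(5) == null); // True
    }
}
```

The trade-off is that a shared probe makes Find unsafe to call from several threads at once, so a multithreaded caller would need a probe per thread or a lock.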
I get a bit sick of checking for null on my IEnumerable objects before doing a foreach over them. In my opinion the CLR should check if the list is null, and if it is just exit out of the foreach iteration as if there were no items in it.
Well, I was goofing around with Extension Methods a bit and figured out how to get this kind of functionality (sort of).
Now unfortunately Extension Methods can't override an existing method on a type, so I can't just create a new GetEnumerator extension method (well, actually I can make one, but it won't get called). But I can create a new method that returns IEnumerable, and just call the foreach on it.
So in order to do this, first add this class to your code:
public static class MyExtensionMethods
{
    public static IEnumerable<T> Enum<T>(this IEnumerable<T> input)
    {
        if (input != null)
        {
            foreach (var t in input)
            {
                yield return t;
            }
        }
        else
        {
            yield break;
        }
    }
}
Now, anything that implements IEnumerable<T> will have the Enum method. Then all you have to do is call foreach on someClass.Enum(), even if someClass is null. Below is an example of how this works.
static void Main(string[] args)
{
    List<string> names = new List<string>()
        {"john", "kim", "jean", "brent"};

    //iterate names using stock enumerator
    foreach (string name in names)
        Console.WriteLine(name);

    //iterate names using extension method
    foreach (string name in names.Enum())
        Console.WriteLine(name);

    names = null;

    //oh man! I have to check for null...I hate that
    if (names != null)
        foreach (string name in names)
            Console.WriteLine(name);

    //Yea! I dont have to check for null anymore!
    foreach (string name in names.Enum())
        Console.WriteLine(name);
}
The extension method uses the "yield return" and "yield break" iterator syntax to let the foreach either spin over the IEnumerable if it's not null, or if it is null, "yield break" makes IEnumerator.MoveNext return false, which tells the foreach that there are no more items in the list so it should break out of the loop.
So, no more null checks!
<Update>
A reader commented that this could be optimized by using the static method Enumerable.Empty<T> (from System.Linq). This would save an object instance from being created by the yield return functionality. The new and improved Extension Method is as follows:
public static IEnumerable<T> Enum<T>(this IEnumerable<T> input)
{
    return input ?? Enumerable.Empty<T>();
}
I recently profiled a sproc that makes heavy use of the TSQL SUBSTRING function (hundreds of thousands of times) to see how it performs on a SQL 2005 database compared to a SQL 2000 database. Much to my surprise the SQL 2005 database performed worse...dramatically worse than SQL 2000.
After much researching it turns out the problem is that the column the text was stored in was an NTEXT, but SQL 2005 has deprecated NTEXT in favor of NVARCHAR(MAX). Now, you'd think that string functions on NTEXT would have the same performance on 2005 as they did on 2000, but that's not the case.
Ok, so NTEXT is old badness, and NVARCHAR(MAX) is new goodness. Then the next logical step would be to convert the column to be a NVARCHAR(MAX) data type, but here lies a little but very important gotcha.
By default NTEXT stores the text value in the LOB structure and the table structure just holds a pointer to the location in the LOB where the text lives.
Conversely, the default setting for NVARCHAR(MAX) is to store its text value in the table structure, unless the text is over 8,000 bytes at which point it behaves like an NTEXT and stores the text value in the LOB , and stores a pointer to the text in the table.
So, just to recap, the default settings for NTEXT and NVARCHAR(MAX) are completely opposite.
Now, what do you think will happen when you execute an ALTER COLUMN on a NTEXT column that changes the data type to a NVARCHAR(MAX)? Where do you think the data will be stored? In the LOB structure or the table structure?
Well, let's walk through an example. First create a table with one NTEXT column:
CREATE TABLE [dbo].[testTable](
[testText] [ntext] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
Next, put 20 rows in the table by running this insert 20 times:
INSERT INTO testTable SELECT 'hmmm...i wonder if this will work'
Then run a select query with STATISTICS IO turned on:
SET STATISTICS IO ON
SELECT * FROM testTable
SET STATISTICS IO OFF
Now, looking at the IO stats, we see there was only 1 logical read, but 60 LOB logical reads. This is pretty much as expected as NTEXT stores its text value in the LOB not the table:
Table 'testTable'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 60, lob physical reads 0, lob read-ahead reads 0.
Now, let's alter the column to be an NVARCHAR(MAX):
ALTER TABLE testTable ALTER COLUMN testText NVARCHAR(MAX) null
Now when we run the select query again with STATISTICS IO we still get a lot of LOB reads (though fewer than we did with NTEXT). So it's obvious that when SQL Server did the alter table, it didn't use the default NVARCHAR(MAX) setting of text in row, but kept the text in the LOB and still uses pointer lookups to get the text out of the LOB.
Table 'testTable'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 40, lob physical reads 0, lob read-ahead reads 0.
This is not as expected and can be devastating for performance if you don't catch it, since NVARCHAR(MAX) with text not in row actually performs WORSE than NTEXT when doing SUBSTRING calls.
So how do we fix this problem? It's actually fairly easy. After running your alter table, run an update statement setting the column value to itself, like so:
UPDATE testTable SET testText = testText
SQL Server moves the text from the LOB structure to the table (if less than 8,000 bytes). So when we run the select again with STATISTICS IO we get 0 LOB reads.
Table 'testTable'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
YEA! This is what we want.
Now, just for grins, what do you think happens if we change the NVARCHAR(MAX) back to NTEXT? Well it turns out that SQL Server moves the text back to the LOB structure. Completely backwards from what it did when converting NTEXT to NVARCHAR(MAX).
I was looking at Reflector addins the other day and ran across one that would be an amazing time saver.
It's an addin that generates the Reflection.Emit code!
Anyone who has ever spent any time with the Reflection.Emit namespace should immediately realize how wonderful this tool has the potential to be (as long as the generated code is of good quality of course).
Also, the way it integrates with Reflector is pretty slick. It adds a "Reflection.Emit" choice to the list of languages you want Reflector to display the code in. Then, in the left pane, when you click a module, class, method, property, whatever, it displays in the right pane the Reflection.Emit code that you would have to write to generate the thing you clicked on.
Simple...and amazing!
I recently spent 6 days writing Reflection.Emit code to generate two fairly complex methods. 2 days each for writing the code, and 1 day each for debugging it and making it actually work. I probably could have cut that down to 1 to 2 days using this tool.
I haven't yet compared the addin's generated Reflection.Emit code to the code I've written manually to validate its quality, but just playing around with it, the generated code looks pretty good.
It can be found here:
http://www.codeplex.com/reflectoraddins/Wiki/View.aspx?title=ReflectionEmitLanguage&referringTitle=Home
I've spent a lot of time lately thinking about instrumentation and how to integrate it into software projects.
As a performance engineer I tend to think about instrumentation from the point of view of someone who wants to record the details of what a system is doing, and then dig through the data and use it to figure out what is wrong.
But as I’ve been talking to people the past few months about instrumentation, I’ve come to realize that instrumentation means different things to different people. Some people think of instrumentation as a high level, lightweight set of metrics that are easy to consume, understand, and extrapolate performance deltas from; a management point of view. Other people, like me, think of it as recording low level details of what’s going on in the call stacks and SQL engine; a troubleshooter point of view. And then others think it's somewhere in between; everyone else.
Well, I think everyone is correct. There are different levels of instrumentation that are useful at different points in validating system health. There should be easy to consume and understand metrics for day to day health checks. There is medium level detail instrumentation that is used to figure out where a problem is, but takes a bit more effort to analyze. And if that isn’t enough to find and fix the problem, there is the dump-everything-to-file model that gives you all the data you need to understand what is going on in the system, but requires internal knowledge of the system and time to analyze the data. Also, each level builds upon the other, so there is as little duplicated effort as possible.
So I’ve tried to create an instrumentation model to demonstrate these different levels, the questions each level tries to answer, and when you move on to the next level.
The first level will give you the most early bang for your buck, and it’s an easy way to tell if you have a problem, with as little dev effort as possible. Then as you get the high level metrics in, you can start building in the mid level metrics, and so on. The main thing is to not try to build the entire instrumentation framework up front before you put anything in. Start putting high level metrics in early and use them in your automated testing.

I've started working on a collection of code analysis tools, that are open source and available for anyone to use. I've got a descriptive article located at CodeProject.com (http://www.codeproject.com/cs/algorithms/Not_Used_Analysis.asp), which includes the source code and binaries.
The three main tools that I have so far are the following:
- A “Not Used Finder”: searches through a list of assemblies and looks for any type, method or field that isn't ever used. Points out code that you should be able to remove.
- A visibility analysis: searches through a list of assemblies and looks at the visibility of methods and types and shows those that have a visibility higher than is required based on current usage.
- A Duplicate / Near Duplicate code analysis
I've recently finished part two of a series of articles on creating dynamic types with System.Reflection.Emit. Dynamic types are types, or classes, manually generated and inserted into an AppDomain at runtime, from within the program. The two articles are linked below.
Part 1:
http://www.codeproject.com/dotnet/Creating_Dynamic_Types.asp
Part 2:
http://www.codeproject.com/useritems/Creating_Dynamic_Types2.asp
I'm curious though. I've got ideas for two more articles, but don't know if people would be interested in them. The third article in this series would cover how to create an Aspect Oriented Programming framework via Reflection.Emit. And the fourth article would go over how to debug dynamic types. Would anyone be interested in these topics?
There is something I've been wanting to post about for a while, but I just didn’t have enough info to make it worth my while, or that of anyone who reads this.
We all know that .Net 2.0 came out with something wonderful called Generics. I've been profiling a lot of my 1.1 code, comparing it to my 2.0 code that utilizes Generics, and the one change that has given me the greatest, most dramatic performance benefits is to switch from IComparer to IComparer<T>.
I use IComparer a LOT. But there is a problem with its Compare() method. The following is the pattern I use on all my compare functions:
public int Compare(object x, object y)
{
    if (x == null || y == null)
        throw new ApplicationException("Invalid NGram Compare");
    NGram n1 = x as NGram;
    NGram n2 = y as NGram;
    return n1.Score.CompareTo(n2.Score);
}
It takes an object for both of its comparison operands. Then I have to cast each one to NGram before I can compare against my Score property. This extra step may not be much, but if you are sorting an array that holds 1.5 million NGram objects, this adds up to a LOT of operations.
Enter IComparer<T>! This is the Generics version of IComparer, and its Compare function would look like this for IComparer<NGram>.
public int Compare(NGram x, NGram y)
{
    if (x == null || y == null)
        throw new ApplicationException("Invalid NGram Compare");
    return x.Score.CompareTo(y.Score);
}
Notice that two NGram instances are passed into the Compare function, instead of two objects. This lets me save the casting operations. YEAH! It must be faster, right?
So like any good performance guy (or gal for that matter) I revved up my favorite code profiler and ran two tests: one with the generic comparer and one with the object comparer, and sorted an array of 1.5 million NGram objects a few times. And guess what I saw.
The generics comparer was slower than the object comparer. And not just a little bit slower either. But 61% slower!!! Oh my gosh…what the hell is going on here? Generics have to be faster! It's less code!
So, to get to the bottom of this mystery, I opened my assembly in Reflector and took a peek at the IL code for each function. I know IL doesn’t lie to me. What I saw was very interesting, and it had to do with checking an object to see if it is null.
This is a section of IL from the generics Compare function.
L_0000: ldarg.1
L_0001: ldnull
L_0002: call bool NGram::op_Equality(NGram, NGram)
L_0007: brtrue.s L_0012
L_0009: ldarg.2
L_000a: ldnull
L_000b: call bool NGram::op_Equality(NGram, NGram)
L_0010: brfalse.s L_001d
L_0012: //throw exception stuff
This is pretty much what I expected to see. Argument 1 is loaded onto the stack, then a null is loaded onto the stack. Then the NGram's operator equality function is called to see if the two are the same or not (is the ngram null). If it is, then it branches down to the throw new exception code. It then loads argument 2 onto the stack, and another null onto the stack. And does another NGram operator equality check. Like I said before, nothing really amazing here, and pretty much what I'd expect. The IL has to call the operator Equality function just to be sure that I didn’t override it in my NGram class.
Ok, now let's take a look at how nulls were checked in the object comparer's IL code to see what was so different that it would be 61% faster. This is what I saw:
L_000e: ldarg.1
L_000f: brfalse.s L_0014
L_0011: ldarg.2
L_0012: brtrue.s L_001f
L_0014: //throw exception stuff
Damn…that’s a lot less code! What's going on here? Well, it looks like the IL does something that the C# compiler won't let you get away with. In C++ a NULL, a 0, and FALSE are all the same thing. They are all 0. But the C# compiler doesn’t allow you to make this leap. Null, 0, and false are three distinctly different things. So what the IL code is doing is loading the first NGram instance onto the stack, then just doing a false equality check. Since null is the same as false (in IL) this works just fine.
The C# compiler knows that, when comparing null to an object that is cast all the way down to Object, it can just compare it to false and call it good. So why didn’t the C# compiler do this for the NGram null comparison? Because I could have overloaded the == operator in my NGram class, that’s why. And if I did, then it would need to call it. But doesn’t the C# compiler have enough info to check if I have overloaded it or not, and if not do a false comparison? Yes it does, but it looks like that’s one optimization it doesn’t do, unfortunately.
So to test this theory out, I took the null check out of my IComparer<NGram>.Compare function and re-ran my test code under the profiler, and this time the generic comparer without the null checks was 66% faster than the object comparer. Ahhhh, satisfaction at last.
This exercise reinforced something in my head. Even if you KNOW that thing a is faster than thing b, always profile it just to be sure. Yes, a generic typed comparer is much faster than a normal object comparer. But if I had just done the code change and called it good, I would have actually slowed my app down. Which is a bad thing.
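If you want to keep the null checks in the generic comparer, one option is to cast the arguments to object before comparing, which asks for the same plain reference check the object comparer got. The sketch below is an illustration, not the original code: this NGram is a stand-in with an overloaded == operator and an EqualityCalls counter, both added here so it is visible whether the operator runs. As far as I know, the C# compiler binds == to an overload based on the compile-time type of the operands, which is why the cast sidesteps it.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the post's NGram. The overloaded == and the counter are
// assumptions added for this example.
public sealed class NGram
{
    public static int EqualityCalls; // counts calls into the overloaded operator
    public double Score;

    public static bool operator ==(NGram a, NGram b)
    {
        EqualityCalls++;
        if (ReferenceEquals(a, null) || ReferenceEquals(b, null))
            return ReferenceEquals(a, b);
        return a.Score == b.Score;
    }

    public static bool operator !=(NGram a, NGram b) { return !(a == b); }
    public override bool Equals(object obj) { return this == (obj as NGram); }
    public override int GetHashCode() { return Score.GetHashCode(); }
}

public sealed class NGramScoreComparer : IComparer<NGram>
{
    public int Compare(NGram x, NGram y)
    {
        // Casting to object selects the built-in reference comparison,
        // so the overloaded NGram == operator is never called here.
        if ((object)x == null || (object)y == null)
            throw new ArgumentNullException((object)x == null ? "x" : "y");

        return x.Score.CompareTo(y.Score);
    }
}

public static class NullCheckDemo
{
    public static void Main()
    {
        var grams = new[]
        {
            new NGram { Score = 3 }, new NGram { Score = 1 }, new NGram { Score = 2 }
        };
        Array.Sort(grams, new NGramScoreComparer());

        Console.WriteLine(grams[0].Score);      // 1
        Console.WriteLine(NGram.EqualityCalls); // 0
    }
}
```

Whether this closes the whole gap is something only the profiler can answer, which is the moral of the story anyway.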
Every now and then I need to create a color object in my code, but don’t know exactly what color I want. So I created this little macro to popup the ColorDialog, then insert a little line of code for the color you picked.
Nothing magic or special here, just another useful macro. The only problem with it is the color dialog comes up behind Visual Studio so you have to Alt-Tab to see it. A bit annoying, I know. If anyone figures that one out I'll post the fix.
Imports EnvDTE
Imports System.Windows.Forms

Public Sub ColorPicker()
    'show the color dialog with the full custom color panel open
    Dim colorDlg As New ColorDialog
    colorDlg.AllowFullOpen = True
    colorDlg.AnyColor = True
    colorDlg.FullOpen = True
    colorDlg.SolidColorOnly = False
    Dim ret As DialogResult = colorDlg.ShowDialog()
    If ret = DialogResult.Cancel Then
        Return
    End If
    'build the line of C# code for the chosen color
    Dim color As System.Drawing.Color = colorDlg.Color
    Dim code As String
    If color.IsNamedColor Then
        code = "Color color = Color." + color.Name + ";"
    Else
        code = "Color color = Color.FromArgb(" + color.ToArgb().ToString() + ");"
    End If
    'insert the code at the top of the current selection
    Dim textSelection As TextSelection = DTE.ActiveDocument.Selection()
    Dim edit As EditPoint = textSelection.TopPoint.CreateEditPoint()
    edit.Insert(code)
End Sub
Ever since I discovered snippets in Visual Studio 2005, I've been using them like crazy.
Microsoft has a good list of pre-canned snippets (http://msdn.microsoft.com/vstudio/downloads/codesnippets/) in 13 different categories.
But the best category of snippets I've found thoroughly covers NUnit code templates. It's located here: http://www.codeproject.com/dotnet/UnitTestCodeSnips.asp.
Combine these NUnit code snippets with TestDriven.Net (http://www.testdriven.net/) and it really makes it easy to practice test driven development.
Yesterday I stumbled across a totally invaluable tool to help with unit testing your code in Visual Studio: TestDriven.Net (formerly known as NUnitAddIn). It's a Visual Studio addin that allows you to run your NUnit, MbUnit, Team System, and soon Zanebug unit tests by just right clicking on the test method, class or namespace and clicking the “Run Test(s)” menu item. You can run just one test, all the tests in a class or all the tests in the namespace. This is cool and all, but the best part is that it will allow you to run the test under debug. So you set your break point in the unit test, right click and pick “Test with...Debugger” and boom!, you've now got the process caught on your breakpoint. No more attaching the debugger to a running NUnit process. This is especially nice if you need native code support with the debugger, because when you detach the debugger from NUnit, NUnit would get closed.
Now this isn't a total replacement for using NUnit when TDD'ing. You still would want to run the entire suite of tests fairly often. It's when writing individual tests that this tool really shines.
And did I mention the best part? It's free!
Does anybody have any experience with either of these two types of mock objects? Over the past few months I have gone back and taken a new look at Test Driven Development and am starting to switch the way I think about writing code.
In writing some of my unit tests I've had a need for a mock object framework. In looking around, I've noticed that these two seem to be the two brightest stars in the .Net universe with respect to mock objects.
Can anyone compare and contrast them? Give some insight into using them?
So I've recently been looking for fun rock / metal covers of songs from the 60's / 70's / 80's.
My top 3 favorites are:
1. Deadsy's cover of Rush's “Tom Sawyer”
2. Korn's cover of Pink Floyd's “Another Brick in the Wall”
3. Metallica's cover of Bob Seger's “Turn the Page”
What are your favorites?
A few weeks ago I attended an AOP workshop at Microsoft. One of the AOP flavors that was presented requires you to implement an interface for every class that you want to apply aspects to. I find this fairly annoying and constricting. When I voiced my concern about having to create one interface for every class in my 600 class architecture, I was told by the majority of the people there, from both academia and the CLR team, that this is how you should design your framework anyway. That interfaces allow for the greatest extensibility.
Interestingly enough, I just started reading a book called Framework Design Guidelines, written by Krzysztof Cwalina and Brad Abrams, both heavy hitters at Microsoft. In chapter 4, Type Design Guidelines, they state that when designing a polymorphic hierarchy for reference types, in general, you should opt for using abstract classes vs interfaces. The book states that when applying an “Is A” relationship you should utilize an abstract class. And if you are applying a “Can Do” relationship to a class, then you use an interface (IDisposable, IEnumerable, IComparable). The main argument here is that interfaces should be immutable from version to version, but base classes can evolve with much greater ease. The only major down side to using abstract classes vs interfaces is that .Net only allows a class to inherit from 1 base class, but you can inherit from interfaces all day long.
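Here is a small sketch of how I read that guideline. The Shape, IScalable, and Circle types are made up for illustration and don't come from the book.

```csharp
using System;
using System.Globalization;

// "Is A": a Circle is a Shape, so the shared contract lives in an abstract class.
public abstract class Shape
{
    public abstract double Area();

    // Imagine this member was added in version 2. Existing subclasses keep
    // compiling because the base class supplies the implementation.
    public virtual string Describe()
    {
        return GetType().Name + " with area " +
               Area().ToString("F2", CultureInfo.InvariantCulture);
    }
}

// "Can Do": a capability that unrelated types can opt into.
public interface IScalable
{
    void Scale(double factor);
}

public sealed class Circle : Shape, IScalable
{
    private double radius;

    public Circle(double radius) { this.radius = radius; }

    public override double Area() { return Math.PI * radius * radius; }

    public void Scale(double factor) { radius *= factor; }
}

public static class TypeDesignDemo
{
    public static void Main()
    {
        var circle = new Circle(1);
        circle.Scale(2);
        Console.WriteLine(circle.Describe()); // Circle with area 12.57
    }
}
```

Adding Describe to an interface instead would have forced every existing implementer to change, which is the versioning argument the book makes.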
My main programming mantra states that there is no silver bullet. There is no “One” tool. Each tool has a purpose, and should only be used for that purpose and that purpose alone. When designing a system, look at your requirements and use the tools that fit the situation appropriately. Don't try to force the use of a tool just because you used it before. Try to understand what the tool's use is for. But it seems people are looking for the Matrix version of a programming tool. The One...
When I hear people saying “Every class should have an explicitly defined interface” it really makes me wonder where this comes from. I have to think these guys are throwbacks from the COM days, who didn't really understand why COM did this. They just understood that if it was a class, it had an interface, and they didn't need to know why.
Now, the really interesting thing is that both the authors for the Framework Design Guidelines book are program managers for the CLR team, yet there were people in their team spouting the interface mantra.
Someone on GeeksWithBlogs posted this link, but I think it was so important that it deserves a second showing. It's the 8 fallacies of distributed computing. As anyone who has looked through my blog knows, I'm not a big fan of the Web Service storm that's blowing through the programming world. It just doesn't make sense to blindly make massive distributed architectures inside your own firewalls. But companies are doing it.
I previously posted that I had attended a workshop hosted by Microsoft Research on the topic of Aspect Oriented Programming. I’m a firm believer in the core purpose of AOP: to modularize systems more effectively. But I’m still not sold on the AOP implementation of this directive.
But I do find AOP intriguing and have developed an implementation that uses dynamic runtime weaving (more about this below). It works pretty well, and doesn’t degrade the application performance too badly.
Anyway, for those who want to learn some more about AOP, here are some ramblings about AOP that I pulled out of last week’s workshop:
Aspect Oriented Programming Overview
Problem with regular development practices:
With traditional programming techniques, most code is highly coupled and complex, and there is often a lot of repetitive "administrative" code, which makes the system hard to maintain. Or worse, in order to make the code less complex, such administrative code is left out entirely.
Generally accepted good design practices state that a function should do one thing, and one thing only. But often a function must do many additional tasks (aspects): logging, error handling, thread safety, etc. This other "stuff" is often boilerplate code that is copy/pasted all over the system.
What is AOP
AOP tries to fix this problem by separating these aspects out of the function into some sort of external construct. The method is then decorated with these aspects in order to tell the runtime environment (or other external process) that they should be applied to the method at runtime.
This way, the function's source code only does the one thing it's supposed to do.
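A minimal sketch of that separation, written in Python rather than .Net purely for brevity. The names (`logged`, `total_cents`, `log`) are made up for illustration, and a decorator is only one possible way to attach an aspect:

```python
import functools

log = []  # stands in for a real logging sink (assumption for this sketch)

def logged(func):
    """Logging aspect: lives outside the function it is applied to."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.append("enter " + func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            log.append("exit " + func.__name__)
    return wrapper

@logged  # the method is "decorated" with the aspect
def total_cents(unit_cents, quantity):
    # The body does its one job and knows nothing about logging.
    return unit_cents * quantity

result = total_cents(250, 4)
```

After the call, `result` holds the product and `log` holds the enter and exit entries, even though `total_cents` never mentions logging.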
The concepts of what has become AOP started back in 1987. Its goal is to modularize systems more effectively, which is nothing new.
Main AOP Terminology
- Aspects: features of a program that do not relate to the core functionality of the program directly, but are needed for proper program execution
- Cross cutting concerns: same thing as an aspect
- Join point: point within a program where the aspect can be applied. When the process in an application arrives at the join point of the program, the aspect is executed.
- Point cut descriptor: one place in source that defines all places where join points are applied to the source.
- Attributes: each one defines a single join point for an aspect in the source. This is the standard way in .Net to decorate classes, interfaces, and methods with aspects
- Weaving: the act of inserting calls to the aspect into the main program. There are many ways to implement this.
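To make the terms a bit more concrete, here is a small Python sketch of a point cut descriptor: one call that names a pattern, and every method matching the pattern becomes a join point. The helper names (`apply_pointcut`, `_advise`) and the `Customer` class are hypothetical, and the in-place patching shown is just one simple form of weaving:

```python
import fnmatch
import functools

def _advise(method, advice):
    """Wrap one method so the advice runs when the join point is reached."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        advice(method.__name__)
        return method(self, *args, **kwargs)
    return wrapper

def apply_pointcut(cls, pattern, advice):
    """Point cut: a single place that selects every matching join point."""
    for name, member in list(vars(cls).items()):
        if callable(member) and fnmatch.fnmatch(name, pattern):
            setattr(cls, name, _advise(member, advice))  # the "weaving" step
    return cls

class Customer:
    def __init__(self):
        self.name = ""
        self.email = ""
    def set_name(self, value):
        self.name = value
    def set_email(self, value):
        self.email = value
    def get_name(self):
        return self.name

hits = []
apply_pointcut(Customer, "set_*", hits.append)

c = Customer()
c.set_name("Ann")
c.get_name()              # not matched by the point cut
c.set_email("ann@example.com")
```

Only the two setters end up in `hits`; the getter is untouched because the pattern does not select it.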
Examples of how AOP can be used to solve some of these problems
- Trace output
- Logging
- Checking for error conditions and acting accordingly
- Dynamically generated asserts for method arguments
- Transaction control
- Exception handling (maybe)
- Thread safety and coordination
- State change and response mechanism
- Singleton pattern mechanism
- Business rule engine implementation (allows business rule logic to change dynamically without redeploying binaries)
- Encryption / decryption
- Generate execution metrics
- Custom security policy enforcement
- Method pre/post processors (much like proxy classes)
- Dynamically override a method (could use for deploying support patches. Just deploy one aspect with one method. The rest of the class stays the same)
- Object instance pool management
- Onsite client error debugging
- Plug in architecture that customers could use to plug in their custom functionality to our product.
Almost all real world commercial implementations of AOP are in Java.
All (most?) .Net implementations of AOP are either designed through research groups or academia.
Simplified example of AOP:
Take the following 3 simple classes.
Point: X, Y
Line: Point1(), Point2(), UpdateUI()
Figure: Line1(), Line2(), Line3(), Line4(), UpdateUI()
When any code calls the X or Y property of the Point class, or the Point1 or Point2 property of the Line class, it should call the UpdateUI method. Or the X, Y, Point1 and Point2 properties could call UpdateUI themselves (as well as the Line1-4 properties of Figure). But then you have duplicate code dispersed throughout the classes that does the same thing. Not only that, but these properties shouldn’t really know about a UI at all.
The UI update logic could be pulled out into an aspect, which is then applied to the class. At some point (depending on the type of AOP you're using) the AOP weaver would interrogate the attributes and determine whether any aspects should be executed.
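A problem to watch out for is unnecessarily duplicated calls to the aspects. For example, what if the developer updates all 4 lines of class Figure? UpdateUI could get called 28 times: each line update fires once for the line, once for each of its two points, and once for each point's X and Y, which is 7 calls per line, times 4 lines. The AOP engine should have the ability to specify aspect execution rules along the call stack.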
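The duplicated-call problem is easy to reproduce. The Python sketch below is only an illustration under assumptions: setter methods stand in for the properties, a counter stands in for UpdateUI, and the `outermost_only` flag is one hypothetical "rule along the call stack" (fire the aspect only when the outermost advised call returns):

```python
import functools

ui_updates = 0   # how many times the stand-in UpdateUI ran
_depth = 0       # nesting depth of advised calls currently on the stack

def update_ui():
    global ui_updates
    ui_updates += 1

def updates_ui(outermost_only):
    """Aspect: call update_ui() after the advised method returns."""
    def decorate(method):
        @functools.wraps(method)
        def wrapper(*args, **kwargs):
            global _depth
            _depth += 1
            try:
                result = method(*args, **kwargs)
            finally:
                _depth -= 1
            # Call-stack rule: optionally skip nested (inner) join points.
            if not outermost_only or _depth == 0:
                update_ui()
            return result
        return wrapper
    return decorate

def build_figure(outermost_only):
    advice = updates_ui(outermost_only)

    class Point:
        def __init__(self):
            self.x = self.y = 0
        @advice
        def set_x(self, v):
            self.x = v
        @advice
        def set_y(self, v):
            self.y = v

    class Line:
        def __init__(self):
            self.p1, self.p2 = Point(), Point()
        @advice
        def set_point1(self, x, y):
            self.p1.set_x(x)
            self.p1.set_y(y)
        @advice
        def set_point2(self, x, y):
            self.p2.set_x(x)
            self.p2.set_y(y)

    class Figure:
        def __init__(self):
            self.lines = [Line() for _ in range(4)]
        @advice
        def set_line(self, i, x1, y1, x2, y2):
            self.lines[i].set_point1(x1, y1)
            self.lines[i].set_point2(x2, y2)

    return Figure()

def count_updates(outermost_only):
    """Update all 4 lines of a Figure and report how often the UI refreshed."""
    global ui_updates
    ui_updates = 0
    figure = build_figure(outermost_only)
    for i in range(4):
        figure.set_line(i, 0, 0, 10, 10)
    return ui_updates

naive = count_updates(outermost_only=False)
guarded = count_updates(outermost_only=True)
```

Without the rule, the four line updates trigger 28 refreshes (7 per line). With the outermost-only rule, the count drops to 4, one per top-level call.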
.Net'ish examples of limited AOP:
FileIOPermissionAttribute. This does a stack walk to check for file IO permission. You could put the code in each method, or just apply the attribute at the top of the method. It keeps the method from getting cluttered.
Problem with .Net and attributes:
Even though .Net has built-in AOP style constructs, they are hard coded into the runtime. If you define your own attributes, you must also define how and when those attributes are interrogated and used. .Net does not have a generic way to hook into the runtime to say "When you come across this attribute, do this".
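The same gap can be shown with a rough Python analogy (not .Net code): a marker that only attaches metadata does nothing at call time until you write the code that interrogates it. The names `requires`, `invoke` and `read_config` are invented for this sketch:

```python
def requires(permission):
    """Attribute analogue: attaches metadata only, changes no behavior."""
    def decorate(func):
        func.required_permission = permission
        return func
    return decorate

@requires("file_io")
def read_config():
    return "config contents"

def invoke(func, granted):
    """Hand-written interrogator: the part the runtime will not do for you."""
    needed = getattr(func, "required_permission", None)
    if needed is not None and needed not in granted:
        raise PermissionError("missing permission: " + needed)
    return func()

# Calling the function directly ignores the marker entirely.
direct = read_config()

# Only a caller that goes through invoke() gets the check.
checked = invoke(read_config, granted={"file_io"})
```

Calling `invoke(read_config, granted=set())` raises `PermissionError`, while a direct call to `read_config()` still succeeds, which is exactly the "who interrogates the attribute?" problem.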
Different AOP implementation techniques
There are 4 main implementation techniques for AOP
1. Static source weaving: The programming language is extended to include constructs for defining aspects and where they get applied. Before the language compiler gets executed, the AOP engine/compiler inspects the source code for aspects. It then either inlines the aspect code into the method it's applied to, or inlines a call to an aspect instance. This altered source is then run through the language compiler and the resulting exe or dll is generated. This has the best runtime performance, since the compiler has a chance to apply optimizations. This is how AspectJ works.
2. Static byte weaving: The source is compiled through the normal language compiler. Then an AOP weaver loads the dll/exe into memory. It looks at the aspect mapping mechanism (this could be an xml file, a new language file like IDL, or attributes) and injects byte code into the dll/exe in the appropriate places to execute the aspect. The dll or exe is then saved off to file.
3. Dynamic weaving: The dll or exe is compiled through the normal language compiler. At runtime, there is some mechanism that emits code into memory to execute the aspects at the appropriate place. There are several techniques to do this. One approach is to utilize class factories in order to create any instance of a class that has aspects applied to it. The factory would emit into memory a new class that inherits from the class that is being requested. Any method that has an aspect associated with it would get overridden, and contain calls to the aspects in the appropriate place.
One nice thing about this approach is that you can design an AOP engine that is able to dynamically bind, unbind and change bindings on aspects to join points.
This means you can change application functionality without ever shutting down the application, which is important for long running server applications such as a database or web server.
4. Actually extend the runtime environment and the type loader to create classes with the aspects embedded in them. This is fairly complex as it requires you to extend the CLR and JIT compiler. This is the approach that JBoss uses with Java.
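The class factory flavor of dynamic weaving (item 3) can be sketched in a few lines of Python, where `type()` plays the role that runtime code emission would play in .Net. This is an illustration under assumptions, not any particular framework's API; `weave_class`, `Account` and `audit_aspect` are made-up names:

```python
def weave_class(base, bindings):
    """Class factory: emit, at runtime, a subclass of `base` whose bound
    methods are overridden to call their aspects before the original."""
    def make_override(name, aspects):
        def override(self, *args, **kwargs):
            for aspect in aspects:          # list is read live on every call
                aspect(name, args)
            return getattr(base, name)(self, *args, **kwargs)
        return override

    members = {name: make_override(name, aspects)
               for name, aspects in bindings.items()}
    return type("Woven" + base.__name__, (base,), members)

class Account:
    def __init__(self):
        self.balance = 0
    def deposit(self, amount):
        self.balance += amount
        return self.balance

audit = []

def audit_aspect(method_name, args):
    audit.append((method_name, args))

# Join point -> aspects. Kept mutable so bindings can change while running.
bindings = {"deposit": [audit_aspect]}

WovenAccount = weave_class(Account, bindings)
acct = WovenAccount()

acct.deposit(10)               # aspect runs
bindings["deposit"].clear()    # unbind the aspect without restarting
acct.deposit(5)                # aspect no longer runs
```

Because callers must ask the factory for `WovenAccount` instead of creating `Account` directly, this also shows the main cost of the approach: instances have to come from the factory.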
All of these have their good and bad points. Static source weaving gives you processor performance, but at the cost of no IDE support (in .Net) for the new language constructs that define the aspect bindings. Static byte weaving gives you better IDE support, but can break some compiler optimizations because the aspects are woven post build (you have to turn off compiler inlining). Dynamic weaving forces you to use class factories and code against interfaces for any class that has an aspect.
A fifth way (for .Net) to create an aspect weaver that I didn’t hear presented could follow these steps:
- Post build, parse the binary assembly and recreate the entire assembly in a CodeDom object graph.
- Anywhere the weaver found an aspect binding, it would inject the call to the aspect into the code via the CodeDom API.
- Then use the C# compiler API to compile the CodeDom back to an assembly.
- With this approach, you would have full IDE support because you would just use attributes or an external aspect mapping file. You'd have full compiler optimizations because the woven code would be compiled again. You don’t have to use factories and interfaces either. Also, the CodeDom has debug symbol support, so you could update the debug symbols for the woven aspects.
- The hard part would be to write a generic assembly to CodeDom mapping tool.
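The parse, inject, recompile shape of these steps can be sketched with Python's `ast` module. This is only an analogy: it starts from source text rather than a compiled assembly, so it skips the hard part (mapping a binary back to an object graph), and the names `EntryWeaver`, `weave` and `log_entry` are invented:

```python
import ast

SOURCE = '''
def transfer(amount):
    return amount * 2

def untouched(amount):
    return amount
'''

calls = []

def log_entry(name):
    """The aspect whose call gets injected into woven functions."""
    calls.append(name)

class EntryWeaver(ast.NodeTransformer):
    """Walk the object graph and inject an aspect call at each binding."""
    def __init__(self, targets):
        self.targets = targets

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        if node.name in self.targets:
            call = ast.Expr(value=ast.Call(
                func=ast.Name(id="log_entry", ctx=ast.Load()),
                args=[ast.Constant(value=node.name)],
                keywords=[]))
            node.body.insert(0, call)   # aspect runs first in the method
        return node

def weave(source, targets, namespace):
    tree = ast.parse(source)                    # step 1: build the graph
    tree = EntryWeaver(targets).visit(tree)     # step 2: inject aspect calls
    ast.fix_missing_locations(tree)
    code = compile(tree, "<woven>", "exec")     # step 3: compile it again
    exec(code, namespace)
    return namespace

ns = weave(SOURCE, {"transfer"}, {"log_entry": log_entry})
woven_result = ns["transfer"](21)
plain_result = ns["untouched"](7)
```

Since the woven tree goes back through the normal compiler, the injected call is treated like any other statement, which is the property the steps above are after.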
Where are aspects executed in the context of a method:
Aspects can get applied in three different ways: at the beginning of the method, at the end of the method, or instead of the method. This is usually specified on the attribute that decorates the method. There is some cool work on creating an aspect definition construct to define where specifically in a method to apply an aspect. For example, inside a loop, or in the else block of an if / else statement.
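A small Python sketch of the three placements, using decorators as a stand-in for attributes. The names `before`, `after` and `instead_of` are hypothetical, not taken from any real AOP library:

```python
import functools

trace = []

def before(advice):
    """Run the aspect at the beginning of the method."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            advice(func.__name__)
            return func(*args, **kwargs)
        return wrapper
    return decorate

def after(advice):
    """Run the aspect at the end of the method."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            advice(func.__name__)
            return result
        return wrapper
    return decorate

def instead_of(replacement):
    """Run the aspect instead of the method; the original body never runs."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return replacement(*args, **kwargs)
        return wrapper
    return decorate

@before(lambda name: trace.append("before " + name))
@after(lambda name: trace.append("after " + name))
def save(record):
    trace.append("save " + record)
    return True

@instead_of(lambda record: "skipped " + record)
def send_email(record):
    trace.append("email " + record)
    return "sent"

saved = save("r1")
mailed = send_email("r1")
```

After both calls, `trace` shows the before and after advice wrapped around `save`, and nothing at all from the body of `send_email`.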
How are aspects bound to their target method:
There seem to be four camps on how to tag a method with an aspect. The first applies attributes directly to the method, class or interface (the generally accepted .Net approach). The second adds a whole new extension to the language (the AspectJ approach). The third uses an external file, such as an xml mapping file or a new language file, much like how IDL was used with COM. The fourth camp uses method naming conventions to tie the aspect to the method (DoSomething_WithLogging_WithThreadSafety).
I like a combination of using attributes and an xml file. I think it’s important to have the visual cue when you’re coding that this method has an aspect applied to it. But I like the dynamic flexibility that an external file could give you. At runtime you could change the file to turn aspects on or off, change when they are executed, or swap one aspect for another.
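A rough Python sketch of that combination. The decorator gives the visual cue in the source, while a mapping decides at call time which of the declared aspects actually run. Here a JSON string and an in-memory dict stand in for the external file (an assumption to keep the example self-contained), and all names are invented:

```python
import functools
import json

events = []

# Registry of available aspects (hypothetical names).
ASPECTS = {
    "logging": lambda method: events.append("log:" + method),
    "metrics": lambda method: events.append("metric:" + method),
}

# Stand-in for the external mapping file: method name -> enabled aspects.
config = json.loads('{"place_order": ["logging"]}')

def aspects(*declared):
    """Visual cue in the source: this method may have these aspects applied."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Consult the mapping on every call so edits take effect live.
            enabled = config.get(func.__name__, [])
            for name in declared:
                if name in enabled:
                    ASPECTS[name](func.__name__)
            return func(*args, **kwargs)
        return wrapper
    return decorate

@aspects("logging", "metrics")
def place_order(order_id):
    return "placed " + str(order_id)

first = place_order(1)                 # only logging is enabled
config["place_order"] = ["metrics"]    # simulate editing the file at runtime
second = place_order(2)                # now only metrics runs
```

The method body never changes; swapping `logging` for `metrics` is purely a configuration edit.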
Other directions that use AOP:
There is another area of AOP that some people are looking at. Instead of using aspects to separate concerns from the main functionality of a program, they are using aspects to extend existing dlls. For instance, if you have a 3rd party dll, you could use static byte weaving to add fields, properties, methods and interfaces to the existing binary.
Problems with AOP:
- All approaches have problems with debugger and IDE support.
- One of the main ideas of AOP is that aspects are hidden from most developers. The trouble is that this separates the aspects from the developers, which can lead to problems. What if the developer doesn’t know, or forgets, that there is an aspect that will apply thread safety checking, or serialization tags?
- Also, what about aspect priorities when multiple aspects are applied to the same method? Which one is executed first? Are there any ordering considerations?
- It’s easier to see these issues when attributes are used to decorate methods with the aspects.
- How do you find your failure points? Is it the original source, the aspect source, or the AOP weaver?
Future of AOP in .Net and Microsoft:
Currently there seem to be about 30 different implementations of AOP available in various languages in .Net.
There doesn't seem to be any leading force within Microsoft or the CLR team that is driving the integration of AOP into the CLR.
Reading:
Aspect Oriented Software Development, Addison Wesley
Other stuff:
It was interesting to see that academic and pure research groups focus totally on "getting it to work", with very little regard for getting it to work easily. They didn’t seem to realize that if the developer has to jump through lots of hoops in order to get the desired functionality, then most people won't use it.
How much attention does Aspect Oriented Programming have in the .Net community? Are people using it? Implementing it? Do they even know what it is?
This last Monday I was fortunate enough to attend a workshop hosted by Microsoft Research on Aspect Oriented Programming. The workshop was mostly attended by researchers from different universities in the US and Europe (though Brazil was also represented).
I've worked on an implementation of AOP in .Net on and off over the past year, and it was interesting to see that several universities were working on implementations that were along the same lines as my own.
I was surprised to see how many different ways there were to implement AOP. The four predominant ways were:
- Static source weaving: this is where a tool weaves the aspects into the methods in the actual source code. Then the normal compiler runs on the code
- Static byte weaving: this is where a post compile process modifies the compiled dll or exe by injecting the aspects into the actual byte code of the dll or exe
- Dynamic runtime weaving: this is where class factories are used to create instances of types that have aspects applied to them. The factory uses System.Reflection.Emit to create a new type that inherits from the requested type. This new type has the aspects woven into it.
- Modify the runtime / type loader: this is much more complex. This is where the type loader of the runtime is changed to do the dynamic weaving. This has the added benefit of not having to use class factories, but it is very complex to implement.
One thing that was interesting to see was the lack of any guiding body on AOP within the .Net community. The Java world seems to have settled (mostly) on AspectJ or JBoss. There are now several large scale enterprise software implementations in Java that utilize AOP with either of these two AOP implementations.
An interesting comment was raised during a discussion as to why the .Net community hasn’t embraced AOP like the Java community has. In the Java arena, when someone creates something new, cool, and useful, companies aren't that afraid to integrate that new thing into their development process.
But in the Microsoft world everybody waits to see what Microsoft will do. Since the development environment and processes are so tightly integrated into MS products (even more so with MS Team System), companies don’t want to head off in one direction when there is a chance that MS might go in the other direction. So if Microsoft doesn’t ever feel the need to implement AOP, most likely it will never grow into the mainstream.
One thing I finally figured out, with the help of the guy that wrote Cecil, is that Cecil won't load incremental builds. What does that mean? You know how Visual Studio has two build options: Build and Rebuild. Well, when you use Build, only the methods that changed get recompiled. This causes some weirdness in the assembly's metadata tables that Cecil can't (won't) handle.
How do you get around this? Rebuild is the answer! :-) Rebuild compiles all your code and creates proper metadata tables that Cecil can happily parse.
The tool I referred to in my last post about code uniqueness was FxCop. I'm sure most people have heard of FxCop, but for those that haven't: FxCop provides a way to validate your compiled assemblies against a list of about 200 canned rules. These rules are anything from design guidelines, security checks, performance checks, to globalization standards. FxCop also provides a way for you to create your own rules and hook them into its validation process. Behind the scenes FxCop uses what Microsoft calls an Introspection engine to parse apart the IL opcodes and metadata tables in your assembly and it builds an object graph that represents all interconnections between Namespaces, Types, Members, Fields...all the way down to the individual IL opcodes. This provides an incredibly powerful way to inspect an assembly.
I had some ideas for some tools that needed to go through the detailed internal structure of an assembly, and the object graph that FxCop provided fit the bill perfectly. But there was no easy way to get at it without writing a custom rule to gather the information and running the rule through the FxCop UI. This quickly became a hassle.
So next I spent many hours reading through the FxCop assemblies with Reflector in order to figure out how I could use the underlying introspection engine without the FxCop UI. But the FxCop team tightly integrated the engine into their two different exes (one WinForm and one cmd line tool). So if you want to utilize this object graph you have to either put up with their UI, or do what I tried to do: call into the introspection engine directly. Well, after many hours of poking around I did get it to work, but only for one assembly at a time, and I could only run the analysis once. If I tried it again, the engine blew chunks. Not to mention about 100 other hacks and hoops I had to jump through in order to actually get the object graph that represents the internals of my assemblies.
Enter Cecil! Cecil is a Mono project that a friend turned me on to that basically does exactly what FxCop's introspection engine does...AND MORE! I've become a huge Cecil fan. In fact I love Cecil! With very little effort and NO hoops or hacks, Cecil returned to me an object graph that represented the internals of my assembly. And the API representing the assembly object graph very much mirrors the Cci (Common Compiler Infrastructure) API that FxCop uses to model the assembly internals. So it took no time at all to migrate the tools I wrote against FxCop's assemblies and change them to use Cecil.
The only caveat that I've found to using Cecil is that it does blow chunks on some of the assemblies that I have tried to feed into it. I suspect it's because the offending assemblies aren't CLS compliant. I'll have to figure that one out.
Recently I found a tool that will parse one or more .Net assemblies and return an object graph of the underlying structure: Assemblies, Namespaces, Types, Members...all the way down to the IL opcodes in each method.
This opens a lot of doors to writing tools that can look at the overall architecture of your application, across many assemblies. One tool I wrote creates an MD5 hash of the contents of each method, based on the IL opcodes and the values they are operating on. From this I can then figure out which functions are exact duplicates of other functions in your code (Steve McConnell once said that copy/pasted code is a design flaw).
So I ran the tool on one of my applications and came up with a lot of duplicate functions. But after looking at them I realized almost all were one line property getters, that were returning a private field. The other culprit was default constructors that get generated for you by the compiler if you don't have one, or if you have an empty one. Once I weeded these culprits out of the analysis, I came up with very few duplicated functions.
Then, just for fun, I sicced my tool on Microsoft's System.dll to see how original their programmers were. Turned out I found 88 duplicate functions. That means there were 176 functions that shared code with another function.
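The hashing idea translates to other bytecode platforms too. As an illustration only (this is not the tool described above, and it works on Python bytecode rather than IL), the sketch below fingerprints each function from its opcodes and the constants and names they operate on, then groups functions with identical fingerprints. The helper names are made up:

```python
import hashlib
from collections import defaultdict

def body_fingerprint(func):
    """MD5 over a function's bytecode plus the values it operates on."""
    code = func.__code__
    digest = hashlib.md5()
    digest.update(code.co_code)                    # the opcodes
    digest.update(repr(code.co_consts).encode())   # literal operands
    digest.update(repr(code.co_names).encode())    # referenced names
    return digest.hexdigest()

def find_duplicates(funcs):
    """Return groups of function names whose bodies hash identically."""
    groups = defaultdict(list)
    for func in funcs:
        groups[body_fingerprint(func)].append(func.__name__)
    return [names for names in groups.values() if len(names) > 1]

def area_a(w, h):
    return w * h

def area_b(x, y):      # same body, different parameter names
    return x * y

def perimeter(w, h):
    return 2 * (w + h)

duplicates = find_duplicates([area_a, area_b, perimeter])
```

Parameter names are deliberately left out of the hash, so `area_a` and `area_b` land in the same group while `perimeter` does not. A real tool would also want to filter out trivial bodies such as one-line getters, as described above.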