Charles Young

  Home  |   Contact  |   Syndication    |   Login
  144 Posts | 55 Stories | 362 Comments | 376 Trackbacks

News

MVP - Microsoft Most Valuable Professional

Twitter












Article Categories

Archives

Post Categories

Image Galleries

Alternative Feeds

BizTalk Bloggers

BizTalk Sites

CEP Bloggers

CMS Bloggers

Fun

Other Bloggers

Rules Bloggers

SharePoint Bloggers

Utilities

WF Bloggers

I recently had reason to revisit the exact mechanisms Microsoft use when you assert facts to the Microsoft Rules Engine.   I was discussing stuff on-line with a fellow rules enthusiast when a terrible thought occurred to me.   Can the MS BRE always uniquely identify each different fact, or is there a chance that sometimes it might confuse two facts with each other?

For a couple of days, I was convinced that I had stumbled on a significant bug.   Indeed, at one point, I thought there was such a serious problem that I would have to recommend to my company that we desist from any further use of Microsoft’s rules engine.   Melodramatic, huh!    Fortunately, after a bit more research, I discovered that I was quite wrong.   The MS BRE does not suffer from a terminal flaw and can be trusted to always distinguish correctly between all your facts…except when…well, I’ll get onto that later.   It’s worth recording my suspicions, mistakes and eventual enlightenment.   There is something useful to learn, here, about the inner workings of the engine, and also about Microsoft’s implementation of the Hashtable class in .NET.   

The first point to make is that Microsoft uses the term ‘fact’ rather loosely in MS BRE.   A fact is a data item ‘asserted’ to the working memory of a rules engine.   The working memory, itself, is simply some mechanism for storing and retrieving facts, and is typically implemented as a set of node-specific collections (a rule set is represented as a node graph at runtime).   Microsoft makes extensive use of the non-generic Hashtable class in the MS BRE to implement node memories.   In most similar engines, a fact is always a simple data tuple – i.e., a collection of attribute values.   The attributes are usually named, though sometimes un-named and ordered.   It is also almost always the case that each tuple contains some kind to type identifier.    Many engines provide a metadata definition language which rule developers use to define tuple templates.   These definitions are called ‘deftemplates’ in several engines.

Microsoft does not support the direct equivalent of deftemplates, and ‘facts’ do not always convert neatly to single tuples.   The MS BRE philosophy is to apply set-based rule processing directly to data stored in .NET objects.   You can therefore assert any POCO (Plain Old CLR Object) to the engine as a fact.   In this case, a POCO is roughly equivalent to a tuple because it provides named properties (MS BRE does not access fields directly).   Of course, it also provides normal methods as well, and rules can exploit this by calling methods in order to obtain data values, evaluate predicates or invoke custom actions.

Microsoft provides four ‘Typed Fact’ classes for wrapping specific types of .Net object and asserting them to the engine.   These all implement the ITypedFact interface and are as follows:

TypedXmlDocument: wraps an XML document or node
TypedDataTable: wraps an ADO.NET DataTable
TypedDataRow: wraps an ADO.NET
DataRow
DataConnection: wraps an ADO.NET DataSet and either SQL Server or OLE DB .NET data adapters, connections and transactions.   A DataConnection object is used to access a specific table in an external data source via the wrapped data connection.

A TypedDataRow is the closest equivalent to a data tuple.   The other Typed Fact classes are used to instantiate what Microsoft calls ‘facts’, but are, in reality, sources of multiple facts.    Once asserted, the engine uses information embedded in the rules in the currently executing policy to internally select and assert the required facts from these fact sources.   For example, it uses XPaths to select node sets from XML documents, and then uses additional XPaths to access individual ‘fields’ within these nodes.

The important thing to note here is that you only ever assert .NET objects to the engine.   In some cases the engine automatically asserts additional .NET objects internally.   MS BRE must be able to distinguish between all these objects unambiguously.   It must be able to tell when two objects represent two different facts and when they reference the same fact.   The only way this can be done generically across all asserted facts is to use some notion of object identity.

Unfortunately, object identity is a rather slippery subject in .NET.   There is a good reason for this.   In un-managed code, a compiler will normally distinguish between two different objects by using their references.   Each object reference maps directly to a memory address at which the object state is located.   The exact layout of the state varies from compiler to compiler and may depend of whether the object has a vTable or equivalent.   The important point, however, is that once created, an object does not generally move.   Hence, references stay valid over time.

Not so in the CLR (the .NET Common Language Runtime).   The runtime environment provides automatic garbage collection and memory management.   One consequence of this is that objects move.   They cannot be guaranteed to remain at the same memory location for any significant period of time.   Hence, you cannot depend on an underlying memory reference to provide object identity except in a very limited sense.   If you want to test two objects to see if they are identical, you can test their underlying references at a point in time (or, at least, an approximation to a point in time).   If the references are equal, the objects are identical.   Microsoft supports this via the static ReferenceEquals() method of SystemObject.

In itself, this is simply not adequate to meet MS BREs needs.   MS BRE stores fact objects in its working memory over a period of time, and needs to be able to distinguish between existing and new facts unambiguously.   Each time you assert a fact to the engine, MS BRE needs to know if this is a new fact of which it has prior knowledge, or if you are reasserting an existing fact.   Hence the engine needs some reliable form of identity that persists over time.    We might consider using the values of a field in each fact as unique identifiers.    A POCO, though, can contain any data, and there is no guarantee that two entirely different POCOs might not contain identical data.   In any case, you would have to impose some kind of contract on your POCOs…meaning that they are not really POCOs anymore from the engine’s perspective.

Most .NET developers are aware of the GetHashCode() method.    GetHashCode() is inherited from System.Object, and therefore is a member of all .NET objects.   It is also callable on value types.    Microsoft provides two default implementations of this method for reference and value types.   For value types (structs, in C#), the hash code is provided by calling GetHashCode() of the first field in the value type.   Hence, not all fields in a struct are equal.   The first one has a special role.   For reference types, the default implementation in .NET 2.0 is a randomly-generated 32 bit signed integer.

The hash code is often confused for an object identifier.   In many cases, it is almost good enough to be used as such, but not quite.   Hash codes, according to Microsoft’s guidance, should be associated with the ‘value’ of an object.   This is the value which is tested for equality when you use the Equals() method.   If two objects evaluate as equal, they should both return the same hash code.   Of course, the ‘value’ of an object may change over time.   For example, a string object’s value is the text associated with that object, and the hash code is derived algorithmically from that text.   If you change the value of a string, the hash code generally changes as well.   The default implementation of GetHashCode() generates a number randomly which is then associated with the object by the runtime and returned on future calls.   Hence, it is almost like an object reference.   Almost, but not quite.   The fundamental problem is that there is a small chance that two entirely different objects could, by chance, have identical hash codes.   The default implementation of GetHashCode() does not guarantee uniqueness of hash codes.

One of the reasons I suspected there might be bug in MS BRE is that the implementation of GethashCode() in System.Object was different in .NET 1.x.   In the earlier version of the framework, Microsoft returned the index of the object’s sync block as a 32-bit positive integer.   A sync block is a structure used to manage thread synchronisation, and the index value is incremented each time a sync block is created.   If you called GetHashCode(), and there was no sync block, the object would create one internally in order to obtain a hash code.   Because the indexes were incremented, and because (I believe) index numbers were recycled only when objects were garbage collected, there was no chance that two hash codes would collide.     However, this was never part of the hash code contract, and Microsoft, from day one, reserved the right to change this behaviour.

No doubt you can begin to see why I entertained suspicions about MS BRE.   MS BRE was originally written for .NET 1.x, and apart from a few bug fixes and minor enhancements, it has hardly changed in the .NET 2.0 version.   When I used Reflector to inspect how Microsoft handles fact identity, I realised that the approach is based on calls to GetHashCode().   It therefore occurred to me that Microsoft might have incorrectly depended on the coincidental nature of the default implementation of GetHashCode in .NET 1.x, and not realised that this behaviour had changed in .Net 2.0.   I suspected that, because hash codes can collide in .NET 2.0, facts might sometimes be confused.   The engine might think you had reasserted an existing fact when you had actually asserted a new fact.

I needn’t have worried.   The issue is robustly dealt with within MS BRE by using the Hashtable class.   I had a reasonable idea of how this class works, but there was one detail of which I was not aware, although in hindsight it appears blindingly obvious that this is the only sensible way in which Microsoft could implement this class.   I’m sure many developers are aware of this feature, although I haven’t yet come across any detailed discussion or documentation that describes the exact behaviour.

When you assert a fact to the engine, its type is tested, and it is handed off to a ‘class node’.   This is a type-specific node object in the Rete.   A Rete is a node graph that provides an efficient and executable run-time representation of a rule set.    Class nodes first test to see if a fact is already stored in working memory.   A class node implements its working memory as a simple hashtable, and it uses the object itself as the key value.   It tests the hashtable to see if the object already exists in the collection.   If the hashtable determines that the fact is not in the memory, it wraps it in an instance of a class called ‘WorkingMemoryElement’ (WME), and then adds the WME to the hashtable.   Incidentally, the WorkingMemoryElement class overrides GetHashCode() in order to delegate to the GetHashCode() method of the fact it wraps.    If the hashtable already contains the fact, the engine assumes that you are re-asserting an existing fact.   It retracts the existing fact, and then asserts the new version of the fact.

Now, as I say, it is blindingly obvious with the benefit of hindsight that this must be implemented as a robust mechanism.   After all, imagine what life would be like if, when you exploit a .NET 2.0 hashtable that uses objects as keys, it sometimes fails to distinguish between the objects it stores.   This would be terrible, and we would all have heard about it.   It would break huge amounts of code causing strange, arbitrary and infrequent exceptions.   The trouble is that I knew just enough to know that, when you use an object as a key, the hashtable calls GetHashCode() internally.   You can see why I was worried.

Hashtables compute an internal hash code by calling GetHashCode() and ensuring that the negation bit is set to 0.   This is necessary because the negation bit (the first high-order bit in the 32-bit integer) is used for collision detection purposes.   Hashtables detect hash code collisions and use an approach called ‘rehashing’ to deal with this.

When the MS BRE tests for an existing fact, the hashtable calls GetHashCode() on the fact and computes the internal hash code.   It then looks this hashcode up in the collection to see if it is already used for a given fact.   If it isn’t, the hashtable knows unambiguously that this is a new fact.   However, if it does find the hash value, it cannot be certain if it already contains that fact, or if a collision has occurred.   This brings us to the feature of which I was unaware.   Very simply, if it finds an existing item with the hash code, the hashtable performs an additional test.   It tests the equality of the fact object, using the instance Equals() method, against the object it already contains (the one with the same hash value).   For performance purposes, it only performs this additional test when needed.

The default implementation of the Equals() method, inherited by all reference types from System.Object, internally does a ReferenceEquals() check to discover, unambiguously, if the two objects are actually identical.   So, if your facts use the default implementation of Equals(), the hashtable will always correctly distinguish one object from another.   The ‘TypedFact’  wrapper classes all use the default implementation, and your POCO facts will also use the default implementation unless you have overridden it, or are using a class that overrides this method.

In conclusion, therefore, we can see that MS BRE safely handles fact identity as long as you never assert .NET objects that have a poor or incorrect implementation of Equals().   For a long time, I have wondered if I ought to recommend overriding GetHashCode() and Equals() on fact classes.   My conclusion is that I should definitely recommend that, unless you have a very good reason to do so, you should NOT override these methods.   For safety, use the default implementations, or existing overrides on classes you trust (like ‘System.String’, for example).    If you do have a reason to provide overrides, make sure you fully understand the characteristics of the code that you implement and ensure that you code conforms rigorously to the ‘rules’ specified by Microsoft.   If you don’t, you may well live to regret it.   And, unlike me, don’t worry…Microsoft developers really do have some idea (generally) of what they are doing J

Previously posted at http://blog.solidsoft.com/blogs/charles_young_mvp_biztalk/default.aspx


posted on Wednesday, April 04, 2007 10:29 PM

Feedback

# re: MS BRE: Fact Identity in the Microsoft Rules Engine, or how the author chased a non-existent bug 4/8/2007 11:00 PM Peter Lin
This is an academic question, but does .NET 2.0 recalculate every time GetHashCode() is called? I would have thought it does it the same way as Java. Only when the object changes "should" the hashCode change. For example, in Jamocha, when modify function is invoked in the action of the rule, the rule engine does this.

1. retract the object
2. alter the object
3. assert modified object
The reason is that if you modify the object before doing the retract, it might not find the object correctly under all situations. For a RETE rule engine this turns out to be very important for consistency and performance. If the rule engine were to assume hashCode is not trust worthy, the only other way around that would be to do an exhaustive value comparison of the new fact being asserted against all existing facts. Obviously, that would kill performance :)

# re: MS BRE: Fact Identity in the Microsoft Rules Engine, or how the author chased a non-existent bug 4/10/2007 12:21 AM Charles Young
To quote MSDN:

"The GetHashCode method for an object must consistently return the same hash code as long as there is no modification to the object state that determines the return value of the object's Equals method. Note that this is true only for the current execution of an application, and that a different hash code can be returned if the application is run again."

Well done, Peter. You have highlighted a genuine problem in MS BRE. I did some testing, and sure enough, if you override the GetHashCode() method on a custom class and implement it so that it returns a different hash code when the object's state is changed, and if you then write a rule that changes the object's state and reasserts, the engine does not realise that the object has been modified, and instead creates a new WME which holds a second reference to the same object. Hence, a whole load of redundant WMEs can build up in memory.

It is difficult to see how this can be easily fixed. The trouble is that the engine needs to locate a WME based on the hash code of the object. Because MS BRE consumes custom objects so readily (see my latest post on side effects), there isn't much coice than to rely on hash codes. So, as you rightly say, the only obvious way to avoid this problem would be to implement a hugely costly value comparison.

I had already concluded that it is best not to override the GetHashCode() method. I will take this opportunity to strengthen that recommendation. Do not UNDER ANY CIRCUMSTANCES override GetHashCode() in order link hash codes to object state. If you use the Microsoft's default implementation, you will be OK.

Thanks again for highlighting this. It's a shame Microsoft hasn't documented this issue anywhere.



# re: MS BRE: Fact Identity in the Microsoft Rules Engine, or how the author chased a non-existent bug 4/10/2007 12:32 AM Charles Young
...and thinking about it a bit more, I guess the answer would be to implement the same approach as you have in Jamocha. I would imagine this would involve doing a bit more up-front analysis of rule actions at translation time to determine situations where an object is changed and reasserted. Microsoft could flag this up and cache the original hash codes before any actions are executed, and then use those hash code later to retract the relevant WMEs. Excellent. I shall send an email to a contact I have at MS to highlight this issue. There might be time to get a fix in place before BizTalk R2 is released.

# re: MS BRE: Fact Identity in the Microsoft Rules Engine, or how the author chased a non-existent bug 4/10/2007 2:03 AM Peter Lin
In java, if a developer overrides the hashCode, they also have to override the equals method. But even with that advice, many java developer have a hard time when they start using rule engines. If you search JESS mailing list, you'll see that ernest patiently answers that same question again and again :)

It's an old problem, but overriding the hashCode can work correctly if the developer does so carefully. Of course, developer never make mistakes :P

# re: MS BRE: Fact Identity in the Microsoft Rules Engine, or how the author chased a non-existent bug 4/10/2007 7:24 AM Charles Young
Yes, its exactly the same approach in .NET (see the quote I provided from MSDN which links hash codes to Equals()). And yes, you need to take great care when overriding these methods.

I have reported the issue to Microsoft, and look forward to their acknowledgment.

# re: MS BRE: Fact Identity in the Microsoft Rules Engine, or how the author chased a non-existent bug 4/10/2007 11:09 PM Peter Lin
On a side note. Last year I had a conversation with some people about the learning curve for rule technology. Some people buy the sales pitch and think "I can give any developer a rule engine and they can write rules."

Of course in an ideal world that's true. But in reality, using rules properly takes a lot of patience and experience. All these tiny subtles add up. Building a performant application using rules is rather challenging :) Most of the commercial vendors do not sound these alarms, since sales guys will say anything to get their commission. Glad the discussion was fruitful.

# re: MS BRE: Fact Identity in the Microsoft Rules Engine, or how the author chased a non-existent bug 4/22/2009 12:31 AM Charles Young
I should just record here that I did eventually get some feedback from Microsoft on this problem. Not very encouraging, I'm afraid. They acknowledged that the bug exists, but said that I am the only person who has ever reported it, and hence it is currently not a high-enough priority to be fixed. I need to check BTS 2009 (shortly to be released) to see if they have revisited this issue. While I agree that very few customers are likely to ever run the risk of encountering this problem, the consequences, should it ever arise in real life, could be severe. Be very careful if overriding GetHashCode.

Post A Comment
Title:
Name:
Email:
Website:
Comment:
Verification: