Charles Young

For almost two years now, I've been intending to write an article about the mysterious 'side effects' flag used in Microsoft Business Rule Engine policies.  Microsoft documents this feature (see http://msdn2.microsoft.com/en-us/library/aa559124.aspx), and describes very briefly how to control it.   The mystery that surrounds this flag arises because it is represented by an attribute named 'sideeffects' in Microsoft's BRL (Business Rule Language) although it actually controls a caching mechanism, and because Microsoft has not provided access to the flag in their Rules Composer, thereby giving the impression that it is not a 'first-class' feature of rule definitions.

I will discuss why I think it is not exposed within the Rules Composer later.   For the time being, let's investigate what this flag does, how it is used and why you might want to take control of side effects in your rules.   In doing so, we will explore some of the deeper concepts within MS BRE and understand a bit more about the issues that face rule developers.

Predicates, Functions and Bindings
The sideeffects flag is an optional Boolean attribute defined on three complex types in the BRL schema. These are the classBinding, databaseBinding and xmlBinding types. They are used to define bindings between the <classMember>, <datarowmember> or <xmldocumentmember> elements of a rule set and corresponding runtime type members. A bound .NET class member can be a public field, property or method. XML document members are elements or attributes referenced using XPaths. Data row members are named data columns.

The <classMember>, <datarowmember> or <xmldocumentmember> elements are used to define bindings for 'custom' predicates and functions.   Predicates and functions provide a higher-level abstraction within BRL.   They are absolutely central to understanding Microsoft’s approach to defining and executing rules in MS BRE, and can be defined as follows:

  • Predicates
    A predicate is a logical expression used in rule conditions.   Specifically, a predicate is always an immediate child of <if>, <and>, <or> or <not>.   When evaluated, it returns a Boolean value.   As well as custom predicates, MS BRE provides a library of pre-defined predicates. 
  • Functions
    A function is similar to a predicate, but can return a value of any type, including nulls.   It can be used either as an immediate child of the <then> element (i.e., as a rule action) or passed as an argument to a predicate or a function.   Again, as well as custom functions, MS BRE provides a library of pre-defined functions.1

To support the generation of BRL using the Rules Composer, Microsoft provides vocabulary definitions for all their pre-defined predicates, and for all but one (CreateObject) of their built-in functions. The pre-defined predicates and functions can also be accessed through context menu selections on the right side of the Rules Composer interface. The CreateObject function is supported by dragging and dropping .NET instance constructors onto the argument placeholder of an 'assert' action.

To summarise, the 'sideeffects' attribute can be optionally specified as part of a binding between a predicate or function, on one side, and a .NET class member, XML node or a data row column on the other.   The ‘sideeffects’ attribute is always defined at the member level, and never at the type level.   You cannot switch side effects on or off for an entire type or fact.

Domain-Specific Rule Language Extensions
The ability to define custom predicates and functions is an important feature of MS BRE.   It provides a powerful and clean mechanism for extending the basic rule language to encompass specific business or solution domains.   Bear in mind, while reading the rest of this article, that this is what predicates and functions are all about in MS BRE.   They are the basic building blocks from which rules are constructed.    They are grouped and arranged using logical expressions (if, then, and, or and not) together with constants and object references.   These constitute the essential elements of any rule defined in BRL.

Custom predicates and functions allow the rule developer to bind rule building blocks to underlying custom code implementations.   When supported intelligently with appropriate vocabulary definitions, they can be used to build rules that capture business-orientated semantics far more clearly and precisely than might otherwise be the case.   The combination of predicates, functions, constants, references and vocabularies allow you to build policies that more strongly align technology to business requirements.   For example, you can encapsulate complex business logic within custom code, and then expose that business logic directly to your rule definitions as a library of predicates and functions.    You can add a semantic layer over this library using a vocabulary so that rules can be expressed using defined business terms and natural language expressions.  From this perspective, you can see that a policy (an executable rule set) is an externalised and versioned definition of abstract rule logic and semantics which you can use to 'inject' concrete custom code implementations of business logic into your solution.   An MS BRE policy is a sophisticated way of implementing and managing 'dependency injection' patterns.

MS BRE is by no means unique in supporting domain-specific extensions to its rule language. For example, take a look at JBoss Rules (formerly known as ‘Drools’). This is an open source Java-based engine (a .NET port has recently been released). JBoss provides a facility for defining domain-specific rule language (DSL) definitions. This feature provides similar facilities to those available in MS BRE. ‘Predicate expressions’ can be used in conditional patterns. These are Java expressions which can be considered the equivalent of custom predicates in MS BRE. Java fragments can be embedded into the right-hand side of rules as the equivalent of custom actions. Facts are asserted as JavaBeans, and data is accessed through method calls (this includes support for JavaBean ‘property’ accessor methods). Over all this, the DSL feature allows rule developers to provide a semantic layer that maps template definitions onto the underlying rule language in order to allow rules to be expressed in a business-orientated or domain-specific fashion. The only significant difference is that JBoss Rules uses Java code embedded in rule sets rather than functional bindings. MS BRE’s functional binding approach provides a mechanism for building rule predicates and functions from .NET code, but also supports bindings directly onto XML and ADO.NET data sources.

Functional Programming and side effects
The terminology of predicates and functions was not chosen arbitrarily.  It indicates that development of a rule set is really a specialised form of functional programming which directly maps onto concepts drawn from predicate logic.   The term 'side effect' is a key concept in functional programming.   A Microsoft employee once told me that the MS BRE originated in work done by Microsoft Research.   I have never verified this statement, but if true, it is no surprise.   Microsoft Research has a long history of advocating functional, rather than imperative, models of programming, and was responsible for creating the F# language. 

In pure functional programming, a function never produces side effects.   A side effect occurs when shared program state is modified and accessed by different procedures within a program.   When one procedure executes, it can change program state.  This, in turn, can affect and change the outcome of another procedure.   Side effects are central to imperative languages like C#.   This is why, in imperative languages, you have to pay a great deal of attention to the exact sequence in which your code is executed (the ‘how’ rather than the ‘why’), and why forms of parallelism (e.g., multi-threaded execution) are generally hard to implement.   A pure functional programming language avoids side effects, allowing a more declarative approach when expressing logic and supporting the creation of code which is more easily verifiable.   The order of execution does not have to be controlled explicitly in the same way as an imperative language, making it easier to exploit parallel processing mechanisms.   Of course, C# and VB.NET are increasingly absorbing concepts from the functional world.   One of the drivers for this is the mid-long term goal of providing far better support for parallelism.
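The distinction can be made concrete with a few lines of code. What follows is only an illustrative sketch in Python (chosen for brevity; it is not BRL or .NET code), and the names `impure_next_id` and `pure_add` are invented for the example.

```python
# Shared mutable state: the raw material of side effects.
counter = {"calls": 0}


def impure_next_id() -> int:
    """Mutates shared state, so every call alters what the next call returns."""
    counter["calls"] += 1
    return counter["calls"]


def pure_add(a: int, b: int) -> int:
    """Depends only on its arguments; order and number of calls are irrelevant."""
    return a + b


first = impure_next_id()   # 1
second = impure_next_id()  # 2: same call, different answer
total_a = pure_add(2, 3)   # 5
total_b = pure_add(2, 3)   # 5: same call, same answer, every time
```

Because `pure_add` always gives the same answer for the same arguments, its result could safely be cached or computed on another thread. Neither is true of `impure_next_id`.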

It turns out that if a programming language were to adhere simplistically to the concept of pure functionality, it would be fairly useless. Many functional languages are therefore purposely 'impure' (compare this to C# 3.0, which offers an impure hybrid of OO and functional models). Some, such as Haskell, safeguard their purity by utilising 'monadic' patterns. These allow functions to avoid side effects by providing their callers with patterns of work that they should undertake in order to access or change state appropriately at the right time (LINQ, in C# 3.0, is based on underlying monadic patterns).

The existence of a Boolean 'sideeffects' flag on bindings for custom predicates and functions indicates that Microsoft’s BRL is not a pure functional language.   Even with side effects disallowed, BRL is not purely functional, though it approximates more closely to the model.  If you write custom predicates and functions within a non-functional language like C# or VB.NET, you are mixing two different programming models, and there has to be some compromise.   The ‘sideeffects’ flag gives you some control at a fine-grained level over this compromise, weighting the behaviour of the engine towards one or other of the two worlds.   Because you can set the flag differently on different members, you can exploit both models side by side more effectively.

MS BRE extends its support for the concepts of functional programming by allowing functions to be passed as arguments to a function.   This is a central concept in functional programming.   However, MS BRE does not support the functional model completely.   Specifically, there is no support for returning functions from functions.    This is generally considered to be a core feature of functional programming, but MS BRE only supports the return of .NET types.   You could write helper methods to return delegates.   However, the engine does not know what to do with them.   For example, consider the scenario where one helper method takes an integer argument and a second helper method returns a delegate that, when invoked, returns an integer.   You cannot assign the second method to the argument of the first method.   The Rule Composer will report a type mismatch.

Understanding Bindings
Before describing the implementation of side effect control in depth, it is important to discuss the bindings between functions (including predicates, which for our purposes can be thought of as a specialised function) and members more fully. It is easy to see how bindings work when invoking custom code. A function can be bound to a public field, property or method of a .NET class. Because a .NET method can return System.Void, functions bound to class members may not always return a value to the engine. This represents a necessary compromise between the functional world and the imperative world. It is true that VB distinguishes between 'Functions' and 'Subs' in terms of return values, but C-type languages such as C# and Java do not. They use the special 'void' type to represent procedures that do not return a value. When a function is bound to an XML document node or an ADO.NET data row, you might think that there is always a data value. However, that is not the case. Remember that we are discussing bindings between rule engine functions and data members. When we want to obtain a bound member's value, the engine needs to invoke a function that returns that value. However, when we want to change a bound member's value, the engine also needs to invoke a function to do this work. Just like a binding to a method that returns void, this function will not return a value.

Understanding MS BRE ‘Facts’
There is one last point I need to make before discussing the implementation of side effect control.  It is all very well defining bindings between rule engine functions and type members.   However, at runtime, we need some way to provide the engine with instances of the types to which the functions are bound.   The first version of MS BRE provided only one mechanism for doing this.   You assert your instances as 'facts'.   This highlights a peculiarity of MS BRE.   Most rule engines view a fact as a relational data tuple.   You assert tuples to the engine, and the engine evaluates the data provided by the facts using rule conditions.   Often, there is some metadata description language that is used to define tuple structure.   Some engines extend the model by supporting the assertion of 'shadow' facts.   Shadow facts augment their metadata definitions with bindings to members of a class.   In addition, they copy data at runtime from bound objects to a local cache.   Object instances are used as backing stores for what the engine perceives as relational tuples.

Although MS BRE retains the terminology of 'facts', the term really indicates something subtly different.  At the level of assertion, a fact is an object, and therefore provides both data and functionality.   The engine binds functions and predicates to the members of a fact.    Hence, if you define a rule condition using a custom predicate, you have to assert a fact to the engine to provide the code for that predicate.   If you want the engine to perform some custom action, you again must first assert an object to provide that functionality.   In both these cases, an MS BRE approach extends the normal concept of a fact.  Of course, if you want to provide some data that will be evaluated by rule conditions, you again assert objects in order to provide the data.  Data is provided to the engine via functions used as arguments to predicates or actions.   When using a function as an argument, MS BRE will invoke the bound member and return the data to the engine for evaluation or use.

In the case of XML documents and ADO.NET data rows, you wrap your fact objects in a 'TypedFact' wrapper before assertion. The wrapper object provides the necessary methods to get or set data values. Each TypedFact class provides a number of strongly typed methods for 'get' and 'set' actions for each supported data type. In this case, the binding definitions are a little less explicit than for POCOs (Plain Old CLR Objects). The engine selects the appropriate method at runtime based on the type of data to be returned. When you assert a wrapped XML document, or a wrapped ADO.NET data table, there is a further complication. Both these fact types act as sources of multiple internal facts. An XML document, for example, may contain many different elements and attributes, only some of which may be required by the engine. In this case, the engine uses XPaths defined in each binding to select the required data and internally asserts each selected element or attribute as a fact in its own right. Hence, if you assert a single wrapped XML document containing multiple <employee> elements and write a rule that tests some attribute or nested element of an <employee> element, the engine will use XPaths to select each employee element and assert it as a separate typed fact. The engine does not shred or copy the underlying XML, however. If a rule action changes the value of an employee attribute, the engine performs the change on the XML document you originally asserted.
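The 'select, don't copy' behaviour is easy to picture with any XML library. The sketch below uses Python's standard xml.etree.ElementTree purely as an analogy. It does not use the MS BRE TypedFact classes, and the <staff> document and its attribute names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# The single document that would be wrapped and asserted (sample data).
doc = ET.fromstring(
    "<staff>"
    "<employee name='Ann' grade='1'/>"
    "<employee name='Bob' grade='2'/>"
    "</staff>"
)

# A selector path picks out each <employee>; each result plays the role of
# a separate internal fact.
employee_facts = doc.findall("./employee")

# A 'rule action' changes an attribute on one of the selected facts...
employee_facts[1].set("grade", "3")

# ...and the change is visible in the document originally supplied, because
# the selected elements are references into that document, not copies.
updated_grade = doc.find("./employee[@name='Bob']").get("grade")
```

After the 'action' runs, `updated_grade` is "3", even though the change was made through the selected element rather than through `doc` directly.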

Once data has been returned from the bound class members and passed to the Rete (the node network responsible for performing rule evaluation), it can be regarded as having been implicitly mapped to a relational model.   The Rete does not care where the data came from or what binding types were used.    It works directly with predicates and functions, but ultimately reduces everything to a relational view of data tuples.   As we shall see, bindings are used to invoke code via a dispatch method that views each fact as being composed of attributes.   In this regard, the Rete implementation in MS BRE is no different to any other Rete implementation.   It works with the concept of tuples as a collection of attribute values.

As well as instance methods, MS BRE allows bindings to static class members. If you are creating custom predicate or action code, it is often natural to implement your methods statically. This highlights a peculiarity of MS BRE. Using the fact assertion mechanism, MS BRE, by default, requires the assertion of objects as facts even if the engine only binds functions to static methods of the object's class. The reason for this is that predicates and functions are invoked from Rete nodes, or from activated rule instances on the agenda. A node will only invoke custom predicates or functions when activated by the arrival of a 'working memory element' (WME). As we will see, a WME in MS BRE is a container for an asserted 'fact' object. You cannot assert a type. You must first create instances of types (or .NET values, which will be boxed), and assert those objects. This is true even if a Rete node elects to use a binding to a static member of an object's class when it is activated. Having to assert objects in order to invoke static members feels rather unnatural to rules developers, and can be confusing. It also means that you cannot use static classes, as supported by C# 2.0. In the updated version of the engine supplied with BizTalk Server 2006, you can optionally turn on a feature that allows direct invocation of static methods without the need to assert an object. I shall discuss this feature in more depth once we have explored caching.

Side Effects and Caching
Armed with a detailed understanding of BRL’s functional nature, and the way it binds functions to facts, let's now discuss the implementation within MS BRE that services the side effects attribute of a binding. When a fact is asserted to the engine, MS BRE obtains an instance of the internal WorkingMemoryElement class from a pool and adds the fact object to it. Pooling is an important optimisation that minimises garbage collection when the engine is dealing with large numbers of facts. A WorkingMemoryElement (WME) is exactly as it sounds - an element that is held in the 'working memory' of the engine. The WME class is central to the implementation of function-member bindings. It provides a method called GetAttributeValue(). This acts as a dispatch mechanism for invoking methods (including property accessor methods) or accessing fields on the fact object. When the engine invokes this method, it passes the value of the 'sideeffects' attribute of the function binding that is currently being used to access a member of the fact stored in the WME. This value is used to control a very simple caching mechanism. If the member is being accessed for the first time during invocation of a policy, the WME will invoke the method or property accessor. If the 'sideeffects' attribute is set to 'false', the WME caches the returned value in an object array before returning it to the engine. Then, if the engine subsequently invokes GetAttributeValue(), it returns the value directly from the cache rather than dispatching to the bound member. If the fact is re-asserted at any time, the engine invalidates the WME’s cache. The next call to GetAttributeValue() for a given member will cause the member to be invoked, and the new value will be cached.
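To make the mechanism easier to follow, here is a toy model of it in Python. It is emphatically not the real .NET WorkingMemoryElement class. The class and method names merely echo the description above, the dictionary stands in for the engine's object array, and the `Order` fact is invented for the example. It models only the behaviour described in this paragraph.

```python
class WorkingMemoryElement:
    """Toy stand-in for the per-fact cache; not the engine's implementation."""

    def __init__(self, fact):
        self.fact = fact
        self._cache = {}  # the article describes an object array; a dict is simpler here

    def get_attribute_value(self, member: str, side_effects: bool):
        # With side effects disallowed, a previously cached value is reused.
        if not side_effects and member in self._cache:
            return self._cache[member]
        value = getattr(self.fact, member)()  # dispatch to the bound member
        if not side_effects:
            self._cache[member] = value
        return value

    def reassert(self):
        # Re-asserting the fact invalidates the cache.
        self._cache.clear()


class Order:
    """Hypothetical fact whose 'total' changes each time it is really invoked."""

    def __init__(self):
        self.calls = 0

    def total(self):
        self.calls += 1
        return 100 * self.calls


wme = WorkingMemoryElement(Order())
a = wme.get_attribute_value("total", side_effects=False)  # invokes the member
b = wme.get_attribute_value("total", side_effects=False)  # served from cache
wme.reassert()
c = wme.get_attribute_value("total", side_effects=False)  # invoked again
```

Here `a` and `b` are both 100 and `Order.total` has run only once by that point. After the re-assertion, `c` is 200.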

So far, so good.   However, several issues arise.   The root problem for MS BRE is twofold.   First, by binding to non-functional class members, the functional model is unavoidably compromised, and cannot be pure.   Second, the use of a cache cannot provide a true equivalent to pure functionality.   A pure function can be executed repeatedly with no side effects.   In MS BRE, the cache is used to prevent second and subsequent executions.   Indeed, the problem goes deeper.   Consider a rule action.   The semantics of an action do not sit well with the concept of pure functionality.   Almost by definition, an action changes state (it is possible to implement actions that don’t change state, but difficult to see what use such an action would be).  True, they may only change state in an external system (e.g., by performing a web service call), but even this can be considered a side effect.   MS BRE handles this impedance mismatch very simply.  At translation time, the engine sets a flag to ensure that caching is never used, regardless of the setting of the ‘sideeffects’ flag in the action binding.   However, the ‘sideeffects’ flag may still be honoured on arguments passed to an action.

Unlike actions, the ‘sideeffects’ flag is honoured on predicates.   This is surprisingly safe.   Remember that each WME stores a separate cache, and that the cache is invalidated when the WME is re-asserted.   A predicate defines a conditional pattern in a rule, and, for any one conditional pattern, it is generally invoked from a single node in the Rete.   As WMEs pass through the Rete network, they will therefore be evaluated just once during any match phase by any one predicate.   In most scenarios, the only way the WME will be re-evaluated by the same predicate code is if it is re-asserted, and at this point the cache will have been invalidated.   Hence, although the ‘sideeffects’ flag is honoured in predicates, it is of no consequence.   Due to the way a Rete network is constructed, this holds true even if the same conditional pattern is used in many different rules.   The only way I have found to get the engine to evaluate a WME multiple times using the same predicate in a single match cycle is to create multiple identical conditional patterns in a single rule.   This is a fairly unusual occurrence in real-world rule development, but if it does occur, and ‘sideeffects’ is set to false, the custom predicate code will only be called once.   Subsequent calls will be serviced from the cache.

This still leaves a problem.   Predicates can take arguments, and those arguments may, themselves, be custom functions.   What if you switch caching on for the predicate, but leave it off for the argument functions?   The rule, simply, is that the ‘sideeffects’ flag is honoured on the predicate.   However, this highlights an important subtlety.   When the ‘sideeffects’ flag is true for a particular class member binding, and that member is invoked through GetAttributeValue(), the WME always invalidates its entire cache.   This is quite startling, and underlines the fact that the functional model in BRL is impure.   If just a single binding to a class member allows side effects, other functions or predicates that are marked to use the cache may, in effect, always invoke their bound code.  

Consider the scenario where a single helper class provides a predicate method and two function methods. The predicate and one of the functions are configured to disallow side effects, whilst the other function is configured to allow them. In a rule condition, you pass the two functions as arguments to the predicate. Whenever the function that allows side effects is evaluated, it invalidates the entire cache for the WME. If the other function or the predicate is ever evaluated a second time for the same WME in the same match cycle, it will always invoke its bound member.
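A small simulation shows the effect. Again, this is a hedged Python model of the behaviour described in this article, not engine code. `Wme`, `Helper`, `cached_fn` and `volatile_fn` are hypothetical names, and for brevity only the two functions are modelled, not the predicate.

```python
class Wme:
    """Toy WME: a member that allows side effects clears the whole cache."""

    def __init__(self, fact):
        self.fact = fact
        self.cache = {}

    def get_attribute_value(self, member, side_effects):
        if side_effects:
            self.cache.clear()  # the entire cache is dropped, not just one entry
            return getattr(self.fact, member)()
        if member not in self.cache:
            self.cache[member] = getattr(self.fact, member)()
        return self.cache[member]


class Helper:
    """Hypothetical helper class; records which bound members really ran."""

    def __init__(self):
        self.log = []

    def cached_fn(self):
        self.log.append("cached_fn")
        return 1

    def volatile_fn(self):
        self.log.append("volatile_fn")
        return 2


# Two evaluations of the same WME, mixing both kinds of binding.
helper = Helper()
wme = Wme(helper)
for _ in range(2):
    wme.get_attribute_value("cached_fn", side_effects=False)
    wme.get_attribute_value("volatile_fn", side_effects=True)
mixed_log = list(helper.log)

# For contrast: the cached function on its own is invoked only once.
quiet = Helper()
quiet_wme = Wme(quiet)
for _ in range(2):
    quiet_wme.get_attribute_value("cached_fn", side_effects=False)
```

In the mixed case `cached_fn` runs twice, because the intervening call to `volatile_fn` has emptied the cache. On its own it runs once.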

This is actually a reasonably safe situation.   However, there is another danger. Consider the following scenario.   You create a custom predicate in one class and a custom function in another.   You pass your custom function as an argument to your custom predicate.   You set the predicate to disallow side-effects, and set the function to allow them.   In this scenario, the function will invalidate the cache in one WME, but the predicate will be invoked on another.   Hence, even though the engine will always evaluate the argument function, it won’t necessarily invoke the bound member for the predicate.   As well as performing redundant work, this could be the source of very subtle bugs.   Admittedly, this is a very rare situation, but leads to a recommendation.   If you are planning to control side effects manually, it is safest practice to group custom predicates and functions that will be used together within the same class.
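The following Python sketch models this two-class scenario. It is a simplification built on one stated assumption: the toy cache is keyed by member only, so a cached predicate result is reused even when its argument has changed. I have not verified how the real engine keys its cache, and the `Clock` and `Checker` classes are invented for illustration.

```python
class Wme:
    """Toy WME; ASSUMPTION: cache entries are keyed by member name only."""

    def __init__(self, fact):
        self.fact = fact
        self.cache = {}

    def get_attribute_value(self, member, side_effects, *args):
        if side_effects:
            self.cache.clear()  # only THIS WME's cache is invalidated
            return getattr(self.fact, member)(*args)
        if member not in self.cache:
            self.cache[member] = getattr(self.fact, member)(*args)
        return self.cache[member]


class Clock:
    """Hypothetical class supplying the custom function (side effects allowed)."""

    def __init__(self):
        self.ticks = 0

    def tick(self):
        self.ticks += 1
        return self.ticks


class Checker:
    """Hypothetical class supplying the custom predicate (side effects disallowed)."""

    def __init__(self):
        self.calls = 0

    def is_even(self, n):
        self.calls += 1
        return n % 2 == 0


clock_wme = Wme(Clock())
checker_wme = Wme(Checker())


def evaluate_condition():
    n = clock_wme.get_attribute_value("tick", True)            # always invoked
    return checker_wme.get_attribute_value("is_even", False, n)  # may be cached


first = evaluate_condition()   # tick() gives 1, is_even(1) is False, result cached
second = evaluate_condition()  # tick() gives 2, but the cached False is reused
```

The function runs twice, the predicate's bound member runs once, and the second result is stale. Put both members on the same class (and therefore the same WME) and the function's invalidation would have cleared the predicate's cached value as well.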

The return value of the GetAttributeValue() method represents the value returned by the BRL-defined function to the engine.   If a function invokes a bound member that returns void, the WME dutifully caches a null value.

Static Support
I mentioned above that the .NET 2.0 version of MS BRE that ships with BizTalk Server 2006 has a new feature which allows the engine to invoke static members of a class without the need to assert an object of that class.   This feature is disabled by default, and you change a registry setting to enable it.   When this option is switched on, you can implement custom predicate and action code but avoid the need to have to assert objects to the engine solely for the purpose of invoking your custom code.  The registry DWORD value is:

     HKEY_LOCAL_MACHINE\Software\Microsoft\BusinessRules\3.0\StaticSupport

An alternative approach is to provide application-level configuration settings to set this value.   For example, in BizTalk Server, you can add additional configuration settings to the BTSNTSvc.exe.config file.   Here is the additional configuration required:

StaticSupport configuration

If you set StaticSupport to 1, the engine will invoke the static member directly without requiring an object to be asserted.   The method will be invoked each time a rule is evaluated or fired, depending on the location of the static method binding within the rule.   The engine invokes the method directly without calling GetAttributeValue() on a WME.   This is the equivalent to invoking GetAttributeValue() with side effects on.   You cannot cache the result value.

If you set StaticSupport to 2 (or greater), and if certain other requirements are met, the engine will invoke the static member just once at translation time (i.e., when the RuleSetToRete component generates a Rete from the rule set).   This is roughly equivalent to switching off side effects, but without any ability to invalidate the cache.   Instead of caching a value in a WME, the engine converts the value returned by the static member into a constant which it then uses during rule processing.   The engine will only perform this translation-time evaluation if all arguments provided to the static method are constants, and if the static method is bound to a predicate or an argument function.   This rule means that, in accordance with the general behaviour of the engine, translation-time evaluation is never used directly for rule actions, but may be used for functions passed as arguments to a user function.

I have a few issues with this new functionality. First, the MS BRE will raise a translation-time error if a rule does not reference a fact, either in its conditions or in its actions. Put another way, a rule will never be evaluated unless at least one of its run-time conditions matches a fact. You should understand that a rule may have more run-time conditions than are shown in the Conditions pane of the Rules Composer. If you have a custom function that you use as an action, there will be an additional run-time condition for the bound fact even if that fact is not used in any of your conditions. This extra condition is implicit and matches every instance of the fact type.

If you set the StaticSupport to 1 or greater, you may find that your policy is broken.   If you have any rules which only reference static class members in their conditions and actions, the rule cannot be compiled into the Rete, and a translation-time exception occurs.   If you set StaticSupport to 2 or greater, you could face additional problems.   In this case, you suppress side-effects for static members that meet the criteria defined above.   This, again, could easily break a previously working policy.   Imagine a static helper method that is used to provide some kind of timestamp value every time a fact is asserted.   When you set StaticSupport to 2, it is invoked once at translation time, and its return value is turned into a constant.   Each timestamp value is now identical!   You have broken your policy.
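The timestamp problem can be sketched in a few lines. This is a Python illustration of the idea of translation-time evaluation, not a model of the RuleSetToRete component. The `translate` function and its `static_support` parameter are invented for the example, and a simple counter stands in for a real timestamp so that the outcome is deterministic. The helper takes no arguments, so the 'all arguments are constants' criterion is trivially met.

```python
import itertools

_ticks = itertools.count(1)


def timestamp() -> int:
    """Stand-in for a static helper expected to yield a fresh value per call."""
    return next(_ticks)


def translate(static_fn, static_support: int):
    """Return the callable a toy 'engine' would use at run time."""
    if static_support >= 2:
        constant = static_fn()   # invoked once, at translation time
        return lambda: constant  # every later use sees the same constant
    return static_fn             # invoked afresh on every evaluation


per_call = translate(timestamp, 1)
live = [per_call(), per_call()]   # two distinct values

folded = translate(timestamp, 2)
frozen = [folded(), folded()]     # the same value twice
```

With the setting at 1 each evaluation gets its own value. At 2 the value is fixed when the rule set is translated, which is exactly how a previously working policy ends up stamping every fact identically.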

This would not be an issue if you could set the StaticSupport flag at the level of an individual policy.    If we think of a policy as a functional programme, we can think of the StaticSupport flag as a compiler directive.   It ought to be specified as part of the policy itself.   I have no idea why Microsoft did not implement the flag in this fashion.   However, by setting it at the machine level, or even the application level, policies become brittle.   Imagine a BizTalk Server installation in which several policies are deployed to multiple servers via the Rule Engine Update Service.   You would have to ensure that the registry or BizTalk config file was configured identically on each machine, and that every policy you deploy (including future policies not yet created) is designed to work correctly with those settings.   If you get this wrong, you run the risk of breaking policies.   Worse still, the problems may be quite subtle and hard to detect or diagnose.

Given the above issues, I cannot recommend that you use the new functionality in a production environment, and I am glad that it is switched off by default. If you are tempted, think long and hard about the consequences. Although asserting an instance of a class just to invoke static members of that class seems peculiar, I will continue to consider this to be the superior model in MS BRE until such time as the StaticSupport flag is supported at policy level.

You can control side effects for any class member binding by editing your BRL directly.   Alternatively, you could use the API to control side effects programmatically.    If you are using the Rule Composer to create and edit policies, you will need to use the Rule Engine Deployment Wizard to export and save your BRL to a file.   Edit the file, and then use the same wizard to import your policy back to the SQL Server repository.   For other member binding types, your edits will make no difference.  Although the BRL schema defines the ‘sideeffects’ attribute on <datarowmember> and <xmldocumentmember> elements, the attribute setting is ignored in the current version of the engine.   Instead, the engine uses hard-wired values which are, incidentally, reflected in the settings of the ‘sideeffects’ attributes generated by the Rule Composer.   When getting data values, the binding sets the value to 'false'.   Hence, caching is always switched on for XML and ADO.NET data row values.   However, when setting values, the ‘sideeffects’ attribute is set to 'true'.   This ensures that the appropriate method in the ‘TypedFact’ wrapper is invoked.   Note, however, that the cache is not invalidated.  This may seem strange, but is actually compliant with the clear semantics of 'assert'.   You can change a value as many times as you want in XML or ADO.NET data.   However, until you re-assert the fact to the engine, MS BRE will continue to use the original value when evaluating rules.    If you do change a value without re-asserting the fact to the engine, you will find that, when you inspect your asserted XML document or ADO.NET data set after the rules engine has completed its work, the changes are reflected in your object.   This is semantically correct.   The engine has executed a function with side effects.

Side Effects and Truth Maintenance
For bindings to members of POCOs, the default value of the ‘sideeffects’ attribute is 'true'.   This ensures that class members are always invoked, and that return values are not cached.   Switching side effects on by default is not ideal, but then neither is switching them off by default.   There are two problems when side effects are allowed.   The first is performance, which I will discuss later.  Obtaining values from cache is more efficient than using late-bound member invocations.   The second problem has to do with truth maintenance.  
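The difference between the two settings can be pictured with a toy model. This is a sketch in Python, not the MS BRE implementation: the Binding and Order classes are hypothetical names, and the cache here is a plain dictionary keyed on object identity, which is an assumption made purely for illustration.

```python
class Binding:
    """Toy binding to a zero-argument 'get' member of a fact object."""

    def __init__(self, getter, sideeffects=True):
        self.getter = getter
        self.sideeffects = sideeffects
        self._cache = {}  # id(fact) -> cached return value

    def get(self, fact):
        if self.sideeffects:
            return self.getter(fact)  # side effects allowed: always invoke
        key = id(fact)
        if key not in self._cache:
            self._cache[key] = self.getter(fact)  # invoke once, then reuse
        return self._cache[key]


class Order:
    def __init__(self, total):
        self.total = total
        self.calls = 0  # counts how often the member is really invoked

    def get_total(self):
        self.calls += 1
        return self.total


uncached_order = Order(100)
cached_order = Order(100)
live = Binding(Order.get_total, sideeffects=True)
snapshot = Binding(Order.get_total, sideeffects=False)

for _ in range(3):
    live.get(uncached_order)
    snapshot.get(cached_order)

print(uncached_order.calls, cached_order.calls)  # 3 1
```

With side effects allowed, the member is invoked on every evaluation. With side effects disallowed, it is invoked once and the cached value is reused.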

Truth maintenance is a big subject in its own right.   MS BRE currently has no support for more advanced forms of truth maintenance that track truth dependencies between facts.  Consider a scenario where a rule tests one fact and, if it has a certain value, asserts a second fact.   If the value of the first fact subsequently changes, and the fact is re-asserted, the second fact will remain in the working memory.   It will not automatically be retracted, even though the condition under which it was asserted has changed.   This can be a problem if the 'truth' of the second fact depends on the value of the first.   In MS BRE you must always control these types of truth dependency explicitly.   There is no mechanism that does the work for you automatically. 
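The scenario can be sketched in a few lines of Python. The WorkingMemory class, the 'discount' fact and the gold/silver statuses are all hypothetical, and the 'rules' are ordinary functions rather than anything resembling BRL. The point is only that the dependent fact lingers unless a rule is written to retract it.

```python
class WorkingMemory:
    """Toy working memory holding facts as simple strings."""

    def __init__(self):
        self.facts = set()

    def assert_fact(self, fact):
        self.facts.add(fact)

    def retract(self, fact):
        self.facts.discard(fact)


def run_rules(wm, customer_status, explicit_retraction):
    """Evaluate two toy rules against the customer's current status."""
    if customer_status == "gold":
        wm.assert_fact("discount")   # rule 1: gold status asserts a discount
    elif explicit_retraction:
        wm.retract("discount")       # rule 2: hand-written clean-up rule


wm = WorkingMemory()
run_rules(wm, "gold", explicit_retraction=False)
run_rules(wm, "silver", explicit_retraction=False)  # status changed, re-asserted
stale = "discount" in wm.facts       # True: nothing retracted the discount

run_rules(wm, "silver", explicit_retraction=True)
cleaned = "discount" in wm.facts     # False: the explicit rule removed it
print(stale, cleaned)
```

An engine with dependency-tracking truth maintenance would do the equivalent of rule 2 automatically. In MS BRE you write it yourself.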

A more basic form of truth maintenance is to preserve the attribute values of a fact until that fact is re-asserted.   A fact represents a truth assertion, and that assertion should logically remain immutable until the point of re-assertion.   Re-assertion tells the engine that the attribute values of the fact may have changed, and the engine can then re-evaluate the fact against the rule conditions.

This basic form of truth maintenance can potentially be compromised by switching on side effects for 'get' class member invocations.   What if an object asserted to the engine is accessed by a second thread external to the rules engine instance?   In this case, the attribute values of the fact may be changed without the engine being aware.   The values of the fact are not, therefore, guaranteed to be immutable, and any change could cause havoc in terms of rule logic.   Engines that consume objects as facts often support the optional use of an interface which, when implemented on a class, allows objects of that class to notify the engine when their state is changed externally.  For example, in Java-based engines, this is often implemented using the java.beans PropertyChangeListener mechanism.   .NET 2.0 introduced a System.ComponentModel interface called INotifyPropertyChanged.   This serves the same purpose as the Java mechanism, but is not currently supported by MS BRE.
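The notification idea can be illustrated with a small observer-style sketch in Python. It is single-threaded and purely illustrative: ObservableFact and ToyEngine are hypothetical names, and MS BRE does not work this way. The toy engine takes a snapshot of a fact's values on assertion and, when the fact reports a change, treats the notification as a re-assertion.

```python
class ObservableFact:
    """Fact that reports property changes to registered listeners."""

    def __init__(self, **values):
        self._values = dict(values)
        self._listeners = []

    def add_listener(self, callback):
        self._listeners.append(callback)

    def snapshot(self):
        return dict(self._values)

    def set(self, name, value):
        old = self._values.get(name)
        self._values[name] = value
        if old != value:
            for callback in self._listeners:
                callback(self, name)  # tell every listener what changed


class ToyEngine:
    """Caches fact values on assertion and refreshes them when notified."""

    def __init__(self):
        self._snapshots = {}
        self.reasserts = 0

    def assert_fact(self, fact):
        if id(fact) not in self._snapshots:
            fact.add_listener(self._on_changed)  # subscribe only once
        self._snapshots[id(fact)] = fact.snapshot()

    def _on_changed(self, fact, name):
        self.reasserts += 1
        self.assert_fact(fact)  # treat the notification as a re-assertion

    def value(self, fact, name):
        return self._snapshots[id(fact)][name]


engine = ToyEngine()
order = ObservableFact(total=100)
engine.assert_fact(order)
order.set("total", 250)  # e.g. changed by code outside the engine
print(engine.value(order, "total"), engine.reasserts)  # 250 1
```

Without the listener, the engine's snapshot would still hold 100 until the fact was explicitly re-asserted.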

At present, MS BRE only ships with Microsoft BizTalk Server, and is therefore normally invoked within the context of a BizTalk orchestration.   By happy coincidence, BizTalk orchestrations impose tight rules with regard to thread synchronisation, especially when accessing code in instances of non-serialisable classes such as the MS BRE RuleEngine class.  This greatly reduces the danger of compromising truth maintenance.  However, developers should still be aware of the potential dangers of allowing side effects on their class member bindings.   The problem never occurs with other binding types because side effects are always disallowed for 'gets'.

Trading Risks
You may ask why Microsoft switches side effects on by default for class member bindings.   I suspect that the answer is that it represents a trade-off between different risks.   On the one hand, there is the risk of compromising truth maintenance, as discussed above.   However, if side effects are switched off, the implications may not be obvious to the rule developer.   Given that MS BRE is so richly extensible through the creation of custom code, developers are likely to pack all kinds of business logic into custom functions.  This is especially true when creating predicates and rule actions, but may also be true when implementing function code that will be invoked as predicate or function arguments.   

If side effects were switched off by default, the benefits gained by avoiding potential truth maintenance problems would be offset by the possibility, and even likelihood, of developers introducing a whole range of bugs into their rule sets because they write code based on the assumption that it will always be called.   I believe it to be fairly self-evident that this would pose a much higher risk than the truth maintenance problem.  As we have seen, side effects are always allowed for actions, though not necessarily for action arguments.  In choosing a suitable default for predicates and argument functions, the designers of MS BRE have had to weigh one risk against another.   This second risk is a direct consequence of the impedance mismatch between the functional model employed by MS BRE and the imperative model that underpins mainstream .NET languages such as C# and VB.NET.   There is no easy answer to this, short of re-training developers to use Haskell, or some other functional language, to implement their custom predicate and function logic.    There are, incidentally, Haskell compilers and interpreters available for .NET, though I have no idea if it is a practical proposition to use them to create custom predicates and functions  for use in MS BRE.   There is, in fact, quite rich support on the .NET platform for a variety of functional languages, even though the CLR offers somewhat partial (but growing) support for the functional model.

On balance, I think that Microsoft has made the right choice with regard to switching side effects on for POCOs, although I am aware that the truth maintenance issue may lead some readers to draw different conclusions.   The good news is that a knowledgeable rules developer can control side effects at the level of individual class member bindings in order to fine-tune their rules and steer the best possible path between the two worlds of functional and imperative programming.   It’s a pity you can't do this through the Rule Composer.   I can understand why the Rule Composer doesn't expose the side effects flag.  The UI attempts to make rules accessible to a wider audience than developers.  However, there is already a lot of technical content within the Rule Composer which is difficult for a non-technical analyst or domain expert to understand.   For example, consider the need to augment rules with XPaths to select facts and access fields.   On balance, I think it would be better to expose the side effects flag and, at the same time, provide better documentation so that developers have a chance of understanding the issues.

Could the model be improved?   Possibly.   For example, Microsoft could consider allowing rule developers to create metadata definitions that specify which bound members are to be used as arguments to predicates and rules.   They could then switch side effects off wherever these members are only used as arguments and not used as predicates or actions.   I'm not convinced by this idea, though.   It would add a layer of complexity and, I suspect, increase confusion.   Although it would help to minimise the impact of the truth maintenance issue, it could increase the risk of bugs being introduced based on the assumption that side effects apply.   The best answer is probably to leave the model as it is, but to provide much clearer guidance to developers, and also possibly some UI feedback that warns them of potential issues.

In summary, and taking all the subtleties into account, we can see that the ‘sideeffects’ flag does not support a pure functional model, and does not even guarantee that values are always sourced from cache on second and subsequent calls.    Developers should treat the flag rather like a hint.   When set to ‘false’ on class member bindings, it tells the engine to avoid side effects wherever reasonable to do so by sourcing values from cache.   The exact behaviour at run time is dictated by a number of considerations, and could conceivably change in subtle ways from version to version of the engine.  Only switch side effects off for custom predicates and functions where you are sure that it is safe to do so, and try to avoid writing code that depends on caching.
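The 'hint' behaviour can be summarised as a small decision function. This Python sketch covers only the three considerations spelled out in this article (actions, the flag itself, and whether a value has been cached yet). The function name and parameters are hypothetical, and the real engine takes other factors into account.

```python
def must_invoke(sideeffects, is_action, has_cached_value):
    """Decide whether a toy engine calls the bound member or uses the cache."""
    if is_action:
        return True               # actions are always executed
    if sideeffects:
        return True               # side effects allowed: never use the cache
    return not has_cached_value   # first call still has to populate the cache


# Flag set to 'false', member used as an argument, value already cached:
print(must_invoke(sideeffects=False, is_action=False, has_cached_value=True))
# Flag set to 'false', but the member is used as a rule action:
print(must_invoke(sideeffects=False, is_action=True, has_cached_value=True))
```

Only the first of the two calls results in a cache read. In every other combination the bound member is invoked, which is why switching the flag off never eliminates side effects entirely.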

Caching and Performance
Finally, I need to say a little more about performance.   Some time ago, I did a little testing on the performance differences between allowing and disallowing side effects.   As I recall, my conclusion at the time was that although switching side effects off was more efficient, it didn’t make a huge difference.   In many cases, this is probably true.   If, for example, you are invoking policies with a small number of simple decision rules and a small number of facts, and if the engine does not perform intensive forward chaining and combinatorial evaluation, then the performance improvement is likely to be of little consequence.   This is especially true if you are invoking the engine in a low-throughput scenario or if latency requirements are not challenging.   It is remarkable how often this is the case in BizTalk Server programming.   The rules engine is typically exploited within orchestrations, and orchestrations are often used in scenarios where the main driver is to implement robust automated business processes.   High throughput or low latency may not be significant drivers, and the rules engine may be used primarily to control simple decision points within the orchestration flow.   Hence, it may simply not be worth considering manually controlling side effects in order to squeeze maximum performance from the engine.

I was prompted to write this article after some on-line discussion with a fellow rules enthusiast who pointed out that caching was bound to lead to significant performance improvement in scenarios where the engine operates under stress.   I responded by experimenting with an implementation of a variation of a well-known benchmark for rules engines called 'Miss Manners'.   I created the variation some time ago, and interestingly, when I revisited my rule set I found I had switched side effects off, suggesting that I had consciously optimised the rule set.   It is not a pure implementation because the benchmark depends on features which are not currently implemented in MS BRE.   It is, however, a close approximation which, whilst not useful for performance comparisons with other engines, keeps to the spirit of Miss Manners by stressing the engine to a similar degree to 'pure' implementations.   Given that, at the back of my mind, I thought that controlling side effects did not have a dramatic impact on performance, I was surprised by the results.   Miss Manners tends to exaggerate apparently small differences in internal performance, and my tests show a very large variance between allowing and disallowing side effects.   Here are the results for various runs of the test with different numbers of guests.

[Figure: caching in Miss Manners]

I hope this provides some food for thought.   You cannot assume, from this, that by switching off side effects you will always get a comparable performance increase.   However, the results do indicate that intelligent control of side effects may provide one means of improving performance in scenarios where you do need to optimise the engine as much as possible.

Conclusion
In conclusion, we have seen that Microsoft’s BRL implements a functional model that is different to the imperative model which most developers are familiar with.   We have also seen how bindings are used to allow custom predicates and functions to be used extensively in MS BRE rule development.   The functional model, combined with vocabularies, provides a clean mechanism for implementing domain-specific extensions to Microsoft’s Business Rule Language.

The impedance mismatch between the functional and imperative models gives rise to the need to control side effects on class member bindings.   By default, Microsoft always allows side effects on class member bindings, and rule actions always support side effects regardless of configuration.  The default configuration for argument functions poses some risk concerning truth maintenance, and is less than optimal.   However, these concerns are outweighed, in my opinion, by the risks of introducing bugs due to developers assuming the presence of side effects in their imperative code.   We have seen that rule developers can, however, control side effects on class member bindings by manually editing their rule sets.   Care needs to be taken when doing this to ensure that side effects are exploited where necessary, but disallowed where appropriate.   This allows performance to be optimised thanks to the use of caching.   In low-stress scenarios, the optimisation may not be significant enough to make it worthwhile, but in high-stress scenarios, the performance improvement may prove invaluable.

 


 

1 Predicates and Functions in BRL can collectively be considered to be 'functors'.   A functor is a 'function object'; i.e. an object which can be invoked as if it were a function.   At runtime, the engine invokes a functor using a binding to dispatch to the member of a .NET object.  A basic categorisation of functors recognises predicates, functions and procedures.   Predicates return Boolean values, functions return any type, and procedures return nothing.   BRL effectively overloads the term 'function' to mean both functions and procedures.   Functors are supported in many languages, libraries and run-time environments.   For example, .NET implements functors as delegates, and allows them to be named or anonymous.   Anonymous delegates can capture variables from their enclosing scope, and so act as closures.
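For readers who have not met the term, the three kinds of functor and a closure can be sketched in Python, where any object with a __call__ method can be invoked like a function. The class and function names below are hypothetical and have nothing to do with BRL or .NET delegates. They only illustrate the categorisation.

```python
class GreaterThan:
    """Predicate functor: returns a Boolean."""

    def __init__(self, threshold):
        self.threshold = threshold

    def __call__(self, value):
        return value > self.threshold


class Scale:
    """Function functor: returns a value of any type."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        return value * self.factor


class Log:
    """Procedure functor: returns nothing and exists for its side effect."""

    def __init__(self):
        self.lines = []

    def __call__(self, message):
        self.lines.append(message)


def make_counter():
    """Return a closure that captures and updates 'count'."""
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


is_large = GreaterThan(10)
triple = Scale(3)
log = Log()
counter = make_counter()

print(is_large(11), triple(4), log("rule fired"), counter(), counter())
# True 12 None 1 2
```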


posted on Monday, April 9, 2007 9:48 AM

Feedback

# re: MS BRE: Controlling rule side effects 4/10/2007 11:17 PM Peter Lin
This comment is probably going to be boring and academic, but you might find it interesting. In many rule engines that do value caching (aka shadow facts), test patterns that invoke a function access the attribute directly. The reason is actually straightforward. Assuming a rule engine provides a base interface for developers to implement functions, that code is beyond the control of the rule engine. Therefore, from a design perspective, it's safer to always assume a function will access an object's method directly.

The only case where it isn't always safe is distributed RETE. By distributed, I mean a partitioned working memory across multiple rule engines. The distributed RETE approach I describe in my patent filing distributes just the betaNode indexes. When a function is invoked, there's no guarantee the object will be within the same working memory. The object could be in another working memory in a different VM on a separate system.

I should note that there are engines that implement the distribute RETE described in my patent filing :)

# re: MS BRE: Controlling rule side effects 4/11/2007 3:04 AM Peter Lin
oops typo. mean to say.

there are NO engines that implement distributed RETE. I really should proof read, which I rarely do until after I hit submit.

# re: MS BRE: Controlling rule side effects 4/11/2007 7:24 AM Charles Young
Interesting. Microsoft's BRL model (i.e., at the level of the rule language) doesn't support the concept of facts and attributes - just functions and predicates. So, to obtain the value of an 'attribute' of a 'fact', the rule language defines a function and uses it as an argument to a predicate or an action (which is a function). Of course, you can also use constants as arguments. This is a great model for supporting domain-specific extensions to the rule language, but when binding functions and predicates to class members (as opposed to XML nodes or datarow columns), you are constantly faced with the side effects dilemma. In a sense, Microsoft is adopting the same approach you describe. By switching side effects on by default for all class member bindings, they are saying to the developer that they are responsible for their own custom code and the side effects it exhibits - the code they create and bind to BRL functions is beyond the control of the engine. Switching side effects off simply hooks in caching for the values returned from BRL functions and predicates. This doesn't really eliminate side effects, as a bound member must always be invoked at least once to populate the cache. In fact, as I stated, the sideeffects flag is really a hint. When 'off', the engine ultimately decides if it needs to invoke the bound member or retrieve a value from cache. It always invokes the member for an 'action', for example.

Distributing Rete would appeal, I would think, to very high throughput scenarios, such as a web search engine. Interestingly, Google uses a functional model (MapReduce) to process very large data sets in a distributed fashion. Over the next few years, I suspect the main emphasis will be on supporting multi-core processors which, of course, have the benefit of a shared memory.

# re: MS BRE: Controlling rule side effects 4/11/2007 10:43 PM Peter Lin
I discovered an interesting performance quirk with jdk1.5.0 JVM and binaries compiled with jdk1.4.2, which I think is related to multi-core like Core Duo.

makes me think CPU affinity is going to become more important in the future. I stumbled on the solution for distributed RETE while trying to solve performance and scalability barriers for real-time systems like order management systems. I spent over a year searching the literature to see if anyone had done it. To my surprise, no one else had considered the approach. I spent over 3 years thinking about the problem of extremely large systems that need inferencing. One night I slapped my head and realized I'd stupidly missed a simple solution.

# re: MS BRE: Controlling rule side effects 4/11/2007 11:38 PM Charles Young
Good for you. Your patent could be worth a lot of money.

Microsoft, and now Sun, are both scrambling to add better support for the functional model to their OO languages. Personally, I think they, and other language vendors, are going to have to re-think things from the ground up to allow Moore's Law to be honoured in future years. Multi-core highlights just how bad our existing mainstream programming models are at parallelism. The other driver towards functional models is, of course, the rise of declarative programming, model-driven development, DSLs, software factories, etc.

# re: MS BRE: Controlling rule side effects 4/12/2007 10:32 PM Peter Lin
the development is definitely interesting, but I'm skeptical that declarative programming will become mainstream. LISP is a great language, but still many people dislike it. Imperative programming still seems to be the preferred approach for a significant percentage of the population.

it's pure luck no one else thought of distributing betaNode indexes to do distributed RETE. being lucky can be better than being smart :)

# Controlling rule side effects 2/27/2008 11:03 AM Romiko
Hi Folks,

We need to develop custom predicates because we would like to handle null elements in XML. However, I have never seen a sample of a custom predicate being developed, and the documentation is vague. I know we need to use the UserPredicate class, but if anyone has developed one, I would be thankful if you could provide me with a sample.

The bane of manipulating XPATH like [Field!=""] in the selector is that for every GET you do this in, the SET that relies on the GET has to be filtered as well. So imagine how large the vocabulary becomes: multiple set operations for the same XML field, but each with a different filter. This is a direct result of the author's comment below:
"In the case of XML documents and ADO.NET data rows, you wrap your fact objects in a 'TypedFact' wrapper before assertion. The wrapper object provides the necessary methods to get or set data values. Each TypedFact class provides a number of strongly typed methods for 'get' and 'set' actions for each supported data type. In this case, the binding definitions are a little less explicit than for POCOs (Plain Old CLR Objects). The engine selects the appropriate method at runtime based on the type of data to be returned. When you assert a wrapped XML Document, or a wrapped ADO.NET data table, there is a further complication. Both these fact types act as sources of multiple internal facts. An XML document, for example, may contain many different elements and attributes, only some of which may be required by the engine. In this case, the engine uses XPaths defined in each binding to select the required data and internally asserts each selected element or attribute as a fact in its own right. Hence, if you assert a single wrapped XML document containing multiple <employee> elements and write a rule that tests some attribute or nested element of an <employee> element, the engine will use XPaths to select each employee element and assert it as a separate typed fact. The engine does not shred or copy the underlying XML, however. If a rule action changes the value of an employee attribute, the engine performs the change on the XML document you originally asserted."

