Charles Young

  Home  |   Contact  |   Syndication    |   Login
  201 Posts | 64 Stories | 521 Comments | 373 Trackbacks



Alternative Feeds

BizTalk Bloggers

BizTalk Sites

CEP Bloggers

CMS Bloggers


Other Bloggers

Rules Bloggers

SharePoint Bloggers


WF Bloggers

Wednesday, February 9, 2011 #

Over Christmas I got to play a bit with the W3C RIF PRD and came across a few issues which I thought I would record for posterity. Specifically, I was working on a grammar for the presentation syntax using a GLR grammar parser tool (I was using the current CTP of ‘M’ (MGrammer) and Intellipad – I do so hope the MS guys don’t kill off M and Intellipad now they have dropped the other parts of SQL Server Modelling). I realise that the presentation syntax is non-normative and that any issues with it do not therefore compromise the standard. However, presentation syntax is useful in its own right, and it would be great to iron out any issues in a future revision of the standard.
The main issues are actually not to do with the grammar at all, but rather with the ‘running example’ in the RIF PRD recommendation. I started with the code provided in Example 9.1. There are several discrepancies when compared with the EBNF rules documented in the standard. Broadly the problems can be categorised as follows:
·      Parenthesis mismatch – the wrong number of parentheses are used in various places. For example, in GoldRule, the RHS of the rule (the ‘Then’) is nested in the LHS (‘the If’). In NewCustomerAndWidgetRule, the RHS is orphaned from the LHS. Together with additional incorrect parenthesis, this leads to orphanage of UnknownStatusRule from the entire Document.

·      Invalid use of parenthesis in ‘Forall’ constructs. Parenthesis should not be used to enclose formulae.

Removal of the invalid parenthesis gave me a feeling of inconsistency when comparing formulae in Forall to formulae in If. The use of parenthesis is not actually inconsistent in these two context, but in an If construct it ‘feels’ as if you are enclosing formulae in parenthesis in a LISP-like fashion. In reality, the parenthesis is simply being used to group subordinate syntax elements. The fact that an If construct can contain only a single formula as an immediate child adds to this feeling of inconsistency.

·      Invalid representation of compact URIs (CURIEs) in the context of Frame productions. In several places the URIs are not qualified with a namespace prefix (‘ex1:’). This conflicts with the definition of CURIEs in the RIF Datatypes and Built-Ins 1.0 document. Here are the productions:

CURIE          ::= PNAME_LN
                 | PNAME_NS
PNAME_NS       ::= PN_PREFIX? ':'
PN_LOCAL       ::= ( PN_CHARS_U | [0-9] ) ((PN_CHARS|'.')* PN_CHARS)?
                 | '-' | [0-9] | #x00B7
                 | [#x0300-#x036F] | [#x203F-#x2040]
                 | '_'
PN_CHARS_BASE ::= [A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6]
                 | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF]
                 | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF]
                 | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD]
                 | [#x10000-#xEFFFF]

The more I look at CURIEs, the more my head hurts! The RIF specification allows prefixes and colons without local names, which surprised me. However, the
CURIE Syntax 1.0 working group note specifically states that this form is supported…and then promptly provides a syntactic definition that seems to preclude it! However, on (much) deeper inspection, it appears that ‘ex1:’ (for example) is allowed, but would really represent a ‘fragment’ of the ‘reference’, rather than a prefix! Ouch! This is so completely ambiguous that it surely calls into question the whole CURIE specification.   In any case, RIF does not allow local names without a prefix.

·      Missing ‘External’ specifiers for built-in functions and predicates. 

The EBNF specification enforces this for terms within frames, but does not appear to enforce (what I believe is) the correct use of External on built-in predicates. In any case, the running example only specifies ‘External’ once on the predicate in UnknownStatusRule. External() is required in several other places.

·      The List used on the LHS of UnknownStatusRule is comma-delimited. This is not supported by the EBNF definition. Similarly, the argument list of pred:list-contains is illegally comma-delimited.

·      Unnecessary use of conjunction around a single formula in DiscountRule. This is strictly legal in the EBNF, but redundant.
All the above issues concern the presentation syntax used in the running example. There are a few minor issues with the grammar itself. Note that Michael Kiefer stated in his paper “Rule Interchange Format: The Framework” that:
“The presentation syntax of RIF … is an abstract syntax and, as such, it omits certain details that might be important for unambiguous parsing.”
·      The grammar cannot differentiate unambiguously between strategies and priorities on groups. A processor is forced to resolve this by detecting the use of IRIs and integers.
This could easily be fixed in the grammar.
·      The grammar cannot unambiguously parse the ‘->’ operator in frames. Specifically, ‘-’ characters are allowed in PN_LOCAL names and hence a parser cannot determine if ‘status->’ is (‘status’ ‘->’) or (‘status-’ ‘>’).   One way to fix this is to amend the PN_LOCAL production as follows:

PN_LOCAL ::= ( PN_CHARS_U | [0-9] ) ((PN_CHARS|'.')* ((PN_CHARS)-('-')))?

However, unilaterally changing the definition of this production, which is defined in the
SPARQL Query Language for RDF specification, makes me uncomfortable.

·      I assume that the presentation syntax is case-sensitive. I couldn’t find this stated anywhere in the documentation, but function/predicate names do appear to be documented as being case-sensitive.

·      The EBNF does not specify whitespace handling. A couple of productions (RULE and ACTION_BLOCK) are crafted to enforce the use of whitespace. This is not necessary. It seems inconsistent with the rest of the specification and can cause parsing issues. In addition, the Const production exhibits whitespaces issues. The intention may have been to disallow the use of whitespace around ‘^^’, but any direct implementation of the EBNF will probably allow whitespace between ‘^^’ and the SYMSPACE.

Of course, I am being a little nit-picking about all this. On the whole, the EBNF translated very smoothly and directly to ‘M’ (MGrammar) and proved to be fairly complete. I have encountered far worse issues when translating other EBNF specifications into usable grammars.   I can’t imagine there would be any difficulty in implementing the same grammar in Antlr, COCO/R, gppg, XText, Bison, etc.
A general observation, which repeats a point made above, is that the use of parenthesis in the presentation syntax can feel inconsistent and un-intuitive.   It isn’t actually inconsistent, but I think the presentation syntax could be improved by adopting braces, rather than parenthesis, to delimit subordinate syntax elements in a similar way to so many programming languages. The familiarity of braces would communicate the structure of the syntax more clearly to people like me. 

If braces were adopted, parentheses could be retained around ‘var (frame | ‘new()’) constructs in action blocks. This use of parenthesis feels very LISP-like, and I think that this is my issue. It’s as if the presentation syntax represents the deformed love-child of LISP and C. In some places (specifically, action blocks), parenthesis is used in a LISP-like fashion. In other places it is used like braces in C. I find this quite confusing.
Here is a corrected version of the running example (Example 9.1) in compliant presentation syntax:
 Prefix( ex1 <> )
 (* ex1:CheckoutRuleset *)
 Group rif:forwardChaining ( 
   (* ex1:GoldRule *)
   Group 10 (
     Forall ?customer such that And(?customer # ex1:Customer
       (Forall ?shoppingCart such that ?customer[ex1:shoppingCart->?shoppingCart]
          (If Exists ?value (And(?shoppingCart[ex1:value->?value]
                                 External(pred:numeric-greater-than-or-equal(?value 2000))))
           Then Do(Modify(?customer[ex1:status->"Gold"])))))
   (* ex1:DiscountRule *)
   Group (
     Forall ?customer such that ?customer # ex1:Customer
       (If Or( ?customer[ex1:status->"Silver"]
        Then Do ((?s ?customer[ex1:shoppingCart-> ?s])
                 (?v ?s[ex1:value->?v])
                 Modify(?s [ex1:value->External(func:numeric-multiply (?v 0.95))]))))
   (* ex1:NewCustomerAndWidgetRule *)
   Group (
     Forall ?customer such that And(?customer # ex1:Customer
                                    ?customer[ex1:status->"New"] )
       (If Exists ?shoppingCart ?item
                       ?item # ex1:Widget ) )
        Then Do( (?s ?customer[ex1:shoppingCart->?s])
                 (?val ?s[ex1:value->?val])
                 (?voucher ?customer[ex1:voucher->?voucher])
                 Modify(?s[ex1:value->External(func:numeric-multiply(?val 0.90))]))))
   (* ex1:UnknownStatusRule *)
   Group (
     Forall ?customer such that ?customer # ex1:Customer
       (If Not(Exists ?status
                           External(pred:list-contains(List("New" "Bronze" "Silver" "Gold") ?status)) )))
        Then Do( Execute(act:print(External(func:concat("New customer: " ?customer))))
I hope that helps someone out there :-)

Jesus Rodriguez has blogged recently on Tellago Devlabs' release of an open source RESTful API for BizTalk Server Business Rules.   This is an excellent addition to the BizTalk ecosystem and I congratulate Tellago on their work.
The Microsoft BRE was originally designed to be used as an embedded library in .NET applications. This is reflected in the implementation of the Rules Engine Update (REU) Service which is a TCP/IP service that is hosted by a Windows service running locally on each BizTalk box. The job of the REU is to distribute rules, managed and held in a central database repository, across the various servers in a BizTalk group.   The engine is therefore distributed on each box, rather than exploited behind a central rules service.
This model is all very well, but proves quite restrictive in enterprise environments. The problem is that the BRE can only run legally on licensed BizTalk boxes. Increasingly we need to deliver rules capabilities across a more widely distributed environment. For example, in the project I am working on currently, we need to surface decisioning capabilities for use within WF workflow services running under AppFabric on non-BTS boxes. The BRE does not, currently, offer any centralised rule service facilities out of the box, and hence you have to roll your own (and then run your rules services on BTS boxes which has raised a few eyebrows on my current project, as all other WCF services run on a dedicated server farm ).
Tellago's API addresses this by providing a RESTful API for querying the rules repository and executing rule sets against XML passed in the request payload. As Jesus points out in his post, using a RESTful approach hugely increases the reach of BRE-based decisioning, allowing simple invocation from code written in dynamic languages, mobile devices, etc.
We developed our own SOAP-based general-purpose rules service to handle scenarios such as the one we face on my current project. SOAP is arguably better suited to enterprise service bus environments (please don't 'flame' me - I refuse to engage in the RESTFul vs. SOAP war). For example, on my current project we use claims based authorisation across the entire service bus and use WIF and WS-Federation for this purpose.   We have extended this to the rules service. I can't release the code for commercial reasons :-( but this approach allows us to legally extend the reach of BRE far beyond the confines of the BizTalk boxes on which it runs and to provide general purpose decisioning capabilities on the bus.
So, well done Tellago.   I haven't had a chance to play with the API yet, but am looking forward to doing so.