Geeks With Blogs
Jeff Ferguson Irritating other people since 1967

Consider the following simple calculator authored in yacc:

%{  
    #include <stdio.h>   
    int yylex(void);   
    void yyerror(char *);   
%}   
  
%token INTEGER   
%left '+' '-'   /* left-associative; resolves the shift/reduce conflicts in expr */
  
%%   
   
program:   
        program expr '\n'         { printf("%d\n", $2); }   
        |    
        ;   
  
expr:   
        INTEGER   
        | expr '+' expr           { $$ = $1 + $3; }   
        | expr '-' expr           { $$ = $1 - $3; }   
        ;   
  
%%   
  
void yyerror(char *s) {   
    fprintf(stderr, "%s\n", s);   
}   
  
int main(void) {   
    yyparse();   
    return 0;   
}   

A notable feature of this grammar is that its syntax rules are augmented with C code (semantic actions) that runs whenever the parser recognizes the corresponding rule in the input stream. This code can interpret the input and, if necessary, perform a calculation and hand the result back to the parser as the value of the rule, in the place where a node of the abstract syntax tree (AST) would normally go. Consider, for example, the following fragment from the yacc grammar above:

expr '+' expr           { $$ = $1 + $3; } 

The yacc engine makes the value of each matched symbol available through positional variables, as follows:

  • $1: the value of the first expression
  • $2: the ‘+’ token
  • $3: the value of the second expression

An additional variable, called $$, holds the result of the rule as a whole. In this fragment, the action (roughly the equivalent of an anonymous delegate) is C code that adds the contents of $1 and $3 and places the result in $$. With this, some work can be done “in place” not only to construct the AST, but also to evaluate it.
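
To make the $ variables a little less magical, here is a minimal C sketch of what reducing that rule amounts to. It is not the code yacc generates: value_stack, push, reduce_add and add_example are names invented for this illustration, and a real yacc parser tracks parser states alongside the values.

```c
#include <assert.h>

/* Hypothetical stand-in for yacc's value stack. These names are
   illustrative only and do not appear in yacc's generated parser. */
#define STACK_MAX 16

static int value_stack[STACK_MAX];
static int depth = 0;

static void push(int v) {
    assert(depth < STACK_MAX);
    value_stack[depth++] = v;
}

/* Emulates reducing `expr '+' expr` with the action { $$ = $1 + $3; }.
   The three right-hand-side values sit on top of the stack; the action
   reads them, and its result replaces all three. */
static int reduce_add(void) {
    assert(depth >= 3);
    int d1 = value_stack[depth - 3];  /* $1: value of the first expr  */
    /* value_stack[depth - 2] is $2, the '+' token; the action ignores it */
    int d3 = value_stack[depth - 1];  /* $3: value of the second expr */
    int result = d1 + d3;             /* $$ = $1 + $3                 */
    depth -= 3;                       /* pop the right-hand side      */
    push(result);                     /* $$ becomes the new top value */
    return result;
}

/* Sets up the stack as it would look after matching `left + right`,
   then performs the reduction. */
static int add_example(int left, int right) {
    depth = 0;
    push(left);
    push('+');
    push(right);
    return reduce_add();
}
```

Calling add_example(2, 3) leaves a single value on the sketch's stack, which is what an enclosing rule would then see as its own $1 or $3.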

This background information is necessary to understand the following: MGrammar has no such capability today. Anonymous methods cannot be attached to MGrammar syntax productions, and “in place” interpretation of the AST in this manner is not possible.

However, this doesn’t mean that such a feature will never appear in MGrammar. In discussing this issue in the MSDN Forums for “Oslo”, Microsoft’s Paul Vick made this comment:

I thought I'd interject that we're considering ways to evaluate expressions directly in grammars. It's a pretty useful feature, and we'd love to provide the full M expression capabilities on the RHS of a grammar production. So stay tuned!

Disclaimer: This does not represent an official commitment from Microsoft. Don’t read too much into this. It might happen. It might not. Don’t storm Paul’s office with torches and pitchforks if it doesn’t appear. But Microsoft is thinking about it, and that’s all we can ask.

Posted on Wednesday, April 1, 2009 7:47 AM


Comments on this post: Expression Evaluation Coming to MGrammar?

# re: Expression Evaluation Coming to MGrammar?
The February CTP release notes also included the following intriguing statement...

"Any production in a token can now have a code action or a graph action (formerly known as term construction)! You can now specify a return type for a token definition in the case of code actions, similar to a syntax definition."

'Code actions' sound like they are what you are describing - some kind of embedded code expressions on RHS of productions. Currently, only MGraph 'actions' are supported. I suspect more work has been done on implementing this than MS is currently prepared to talk about.
Left by Charles Young on Apr 02, 2009 7:56 AM

# re: Expression Evaluation Coming to MGrammar?
Thank you for the tip, Charles!

I was most interested in the "return type for a token definition" phrase. I will most likely need to do some research in this area. Just by reading that description, and without doing any additional work, I am guessing (and it's ONLY a guess) that we will be able to add non-string types into the object graph. For example, if I have a token that matches a whole number (say, one defined as ("0".."9")+), the matched lexeme is entered into the AST as a string. Today, that leaves me to perform the necessary conversion to an integer. Perhaps the note references support for a type specification, which would allow me to specify that MGraph needs to do the work of converting the string lexeme into an integer, with the resulting integer value placed into the AST.

Thanks for reading!
Left by Jeff Ferguson on Apr 02, 2009 8:03 AM



Copyright © Jeff Ferguson | Powered by: GeeksWithBlogs.net