Monday, March 29, 2004

Matching Parenthesis Pairs Using Regular Expressions

Matching nested parenthesis pairs is a well-known hard problem for regular expressions. Put another way, regular expressions do not typically support counting occurrences. .NET has a little-known regex construct for doing just that, called the “balancing group definition” and written (?<name1-name2>subexpression):

Balancing group definition. Deletes the definition of the previously defined group name2 and stores in group name1 the interval between the previously defined name2 group and the current group. If no group name2 is defined, the match backtracks. Because deleting the last definition of name2 reveals the previous definition of name2, this construct allows the stack of captures for group name2 to be used as a counter for keeping track of nested constructs such as parentheses. In this construct, name1 is optional. You can use single quotes instead of angle brackets; for example, (?'name1-name2').
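To make that description concrete, here is a minimal sketch of how the capture stack can serve as a nesting counter. The group names open and inner, the class name, and the helper methods are my own choices for illustration, not anything from the documentation. Each "(" pushes a capture onto open, each ")" pops one (and fails if there is nothing to pop), and the final conditional rejects the match if any capture is still left on the stack.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BalancedParens
{
    // Pattern walk-through (group names "open" and "inner" are hypothetical):
    //   [^()]              any character that is not a parenthesis
    //   (?<open>\()        "(" pushes a capture onto the "open" stack
    //   (?<inner-open>\))  ")" pops "open" and stores the text between the
    //                      popped "(" and this ")" in "inner"; if "open" is
    //                      empty, this alternative fails and the engine backtracks
    //   (?(open)(?!))      at the end, fail if "open" still holds a capture
    private static readonly Regex Balanced = new Regex(
        @"^(?:[^()]|(?<open>\()|(?<inner-open>\)))*(?(open)(?!))$");

    // True when every "(" in the input has a matching ")".
    public static bool IsBalanced(string input)
    {
        return Balanced.IsMatch(input);
    }

    // Returns the text inside each matched pair, in the order the pairs close.
    public static string[] InnerTexts(string input)
    {
        Match m = Balanced.Match(input);
        if (!m.Success)
        {
            return new string[0];
        }
        return m.Groups["inner"].Captures
            .Cast<Capture>()
            .Select(c => c.Value)
            .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(IsBalanced("(a(b)c)"));  // True
        Console.WriteLine(IsBalanced("(a(b)c"));   // False: one "(" is never closed
        Console.WriteLine(IsBalanced("a)b("));     // False: ")" arrives with nothing to pop
        Console.WriteLine(string.Join(" | ", InnerTexts("(a(b)c)")));  // b | a(b)c
    }
}
```

Note that the name1 part (inner here) is optional: writing (?<-open>\)) keeps the counting behaviour and simply discards the text between the parentheses.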

I found great information on using the balancing group definition here:

http://www.oreilly.com/catalog/regex2/chapter/index.html

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpgenref/html/cpcongroupingconstructs.asp?frame=true

In my opinion this construct greatly enhances the utility of regular expressions.

Copyright © Eron Wright