Matching Parenthesis Pairs Using Regular Expressions

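Matching nested parenthesis pairs is a well-known hard problem for regular expressions. Put another way, regular expressions do not typically support counting occurrences, so they cannot check that every opening parenthesis has a matching close at arbitrary depth. .NET has a little-known Regex construct for doing just that, called the “balancing group definition”, written (?<name1-name2>...). MSDN describes it as follows: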

Balancing group definition. Deletes the definition of the previously defined group name2 and stores in group name1 the interval between the previously defined name2 group and the current group. If no group name2 is defined, the match backtracks. Because deleting the last definition of name2 reveals the previous definition of name2, this construct allows the stack of captures for group name2 to be used as a counter for keeping track of nested constructs such as parentheses. In this construct, name1 is optional. You can use single quotes instead of angle brackets; for example, (?'name1-name2').
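The counting behaviour described above can be sketched outside of .NET. Python's `re` module has no balancing groups, so the following is only an emulation of the idea, not the .NET construct itself: an integer `depth` stands in for the stack of captures, incrementing where a balancing pattern would push a capture and decrementing where it would pop one. The function names `match_balanced` and `first_balanced` are hypothetical, chosen for this illustration.

```python
def match_balanced(text, start):
    """Return the end index (exclusive) of the balanced group that opens
    at text[start], or None if that parenthesis is never closed."""
    if start >= len(text) or text[start] != "(":
        return None
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == "(":
            depth += 1      # analogous to pushing a capture onto the group's stack
        elif ch == ")":
            depth -= 1      # analogous to popping the most recent capture
            if depth == 0:
                return i + 1
    return None             # captures would remain on the stack, so no match


def first_balanced(text):
    """Return the leftmost balanced parenthesized substring, or None."""
    for start, ch in enumerate(text):
        if ch == "(":
            end = match_balanced(text, start)
            if end is not None:
                return text[start:end]
    return None


print(first_balanced("before (nope (yes (here) okay) after"))
# prints: (yes (here) okay)
```

The first opening parenthesis in the sample string is never closed, so the search moves on and reports the first group whose depth returns to zero.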

I found great information on using the balancing group definition here:

http://www.oreilly.com/catalog/regex2/chapter/index.html

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpgenref/html/cpcongroupingconstructs.asp?frame=true

In my opinion this construct greatly enhances the utility of regular expressions.

 


Feedback

# re: Matching Parenthesis Pairs Using Regular Expressions

The short description from MSDN doesn't really explain this to me -- can you give a better example? 6/25/2004 4:58 PM | Doug Moore

# re: Matching Parenthesis Pairs Using Regular Expressions

Do you have a specific example? I don't have the book, and the sample chapter does not specifically deal with the "balancing group definition". I would like to match XML tags, which is similar to the problem with parentheses, except that XML tags are of course more than one character long and the start tag can have attributes.

Kevin 8/27/2004 5:22 PM | Kevin Burton

# re: Matching Parenthesis Pairs Using Regular Expressions

See chapter 9, pg 430 in the O'Reilly book (online, see the link). I think that the example can be applied to XML matching. 8/28/2004 3:28 PM | Eron Wright

# re: Matching Parenthesis Pairs Using Regular Expressions

For a discussion and a few examples, take a look at my post (http://weblogs.asp.net/whaggard/archive/2005/02/20/377025.aspx). 2/24/2005 12:36 PM | Wes Haggard

# re: Matching Parenthesis Pairs Using Regular Expressions

How can I get rid of a comma and a parenthesis inside a tag?
e.g.

<p(,)> this is the function, calculate() </p(,)>


after replace it should look like:

<p> this is the function, calculate() </p> 4/22/2005 3:19 PM | Don

# re: Matching Parenthesis Pairs Using Regular Expressions

Based on all the links from this page, the following code should return a match, but it doesn't. I am using the 1.1 framework. What am I doing wrong?

Dim m As Match = Regex.Match( _
"before (nope (yes (here) okay) after", _
"\( " & _
" (?> " & _
" [ˆ()]+ " & _
" | " & _
" \( (?<DEPTH>) " & _
" | " & _
" \) (?<-DEPTH>) " & _
" )+ " & _
" (?(DEPTH)(?!)) " & _
"\)", _
RegexOptions.IgnorePatternWhitespace) 10/21/2005 2:05 PM | Wyatt Haase
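
One thing worth checking in the pattern as it appears above, offered as a guess rather than a confirmed diagnosis: the character class seems to contain `ˆ` (U+02C6, MODIFIER LETTER CIRCUMFLEX ACCENT) rather than the ASCII caret `^` (U+005E). Only the ASCII caret negates a character class, so `[ˆ()]+` would match runs of `ˆ`, `(` and `)` instead of runs of non-parenthesis characters. Character classes behave the same way in Python's `re` module, so the difference can be sketched there; the variable names are hypothetical.

```python
import re

text = "before (nope (yes (here) okay) after"

lookalike = "[\u02c6()]+"   # U+02C6 is an ordinary character: a positive class
negated = "[^()]+"          # U+005E at the start of a class negates it

lookalike_hits = re.findall(lookalike, text)
negated_hits = re.findall(negated, text)

print(lookalike_hits)  # only the parentheses themselves
print(negated_hits)    # the runs of text between parentheses
```

If that is the cause, retyping the caret by hand in the pattern would be the first thing to try.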

# Examples of using balancing groups

This post covers some interesting uses for balancing groups: Fun With .NET's Regex Balancing Groups 1/25/2008 5:16 PM | Steve

# re: Matching Parenthesis Pairs Using Regular Expressions

.NET has equipped its Regex engine with a push-down (capture) stack with implicit push and pop operations: (?<group>xxx) pushes, while (?<-group>xxx) pops. The construct is no longer technically a regular expression (finite automaton) but something more like a PDA (push-down automaton), which has unbounded stack memory and can recognize more complex patterns, such as those described by context-free grammars (CFGs). 3/8/2010 3:42 AM | Hassan
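
The push-down idea in the comment above can be sketched with an explicit stack. This is a minimal illustration in Python, not .NET's implementation: the names `PAIRS`, `OPENERS`, and `is_balanced` are hypothetical. Unlike a single counter, a stack remembers which kind of bracket was opened, so it can reject input such as `(]`.

```python
PAIRS = {")": "(", "]": "[", "}": "{"}   # closer -> expected opener
OPENERS = set(PAIRS.values())


def is_balanced(text):
    """Return True if every bracket in text is closed in the right order."""
    stack = []
    for ch in text:
        if ch in OPENERS:
            stack.append(ch)                      # push on an opener
        elif ch in PAIRS:
            # pop on a closer; fail if nothing is open or the kinds differ
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack                              # leftovers mean unclosed openers


print(is_balanced("f(a[0], {x: (1)})"))  # prints: True
print(is_balanced("(]"))                 # prints: False
```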

Copyright © Eron Wright