I am a Microsoft Certified Application Developer MCAD Chartered Member (C# .Net) and born in Bangladesh.
I work for Ocean Informatics Pty Ltd as a Senior Developer - Analyst.
I am also co-founder and core developer of Pageflakes (acquired by LiveUniverse) www.pageflakes.com
and most recently created SmartCodeGenerator

Making Regex more readable

Compare the following code statements defining the same regular expression in .NET:

static readonly Regex ParameterReference = new Regex(@"(?<empty>\<\>)|\<(?<parameter>[^\<\>]+)\>|(?<open>\<[^\<\>]*(?!\>))",
 RegexOptions.Compiled | RegexOptions.IgnorePatternWhitespace);


static readonly Regex ParameterReference = new Regex(@"
  # Matches invalid empty brackets #
  (?<empty>\<\>)|
  # Matches a valid parameter reference #
  \<(?<parameter>[^\<\>]+)\>|
  # Matches opened brackets that are not properly closed #
  (?<open>\<[^\<\>]*(?!\>))",
 RegexOptions.Compiled | RegexOptions.IgnorePatternWhitespace);

While the former is still understandable to a fairly regex-aware developer, the latter is far more explicit about the purpose of each part. The ability to place comments inside the expression is enabled by RegexOptions.IgnorePatternWhitespace, which ignores unescaped whitespace in the pattern and treats everything from an unescaped # to the end of the line as a comment. Developers don't use it nearly enough. For this pretty simple expression it may seem unnecessary, but imagine a regex-based parser that processes (CodeSmith-like) template files:

static Regex CodeExpression = new Regex(@"
  # First match the full directives #
  <\#\s*@\s+(?<directive>\w*)(?<attributes>.*?)\#\/>(?:\W*\n)?|
  # Match open tag #
  (?<open><\#)|
  # Match close tag #
  (?<close>\#\/>)|
  # This is a simple expression that is output as-is to output.Write(<output>); #
  (?:=)(?<output>.*?)(?<badmultiple>;.*?)?(?=\#\/>)|
  # Anything before or after a code tag #
  (?<code>.*?)(?=<\#|\#\/>)|
  # Finally, match everything else that is written as-is #
  (?<snippet>.*[\r\n]*)",
 RegexOptions.IgnorePatternWhitespace | RegexOptions.Compiled | RegexOptions.Singleline);

It's pretty obvious that leaving such complex expressions uncommented makes them almost unreadable to anyone except the guy who wrote them (and, after some time, even to him!). Bottom line: ALWAYS comment your expressions in-line!!!
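
To make the point concrete, here is a minimal sketch that puts the commented ParameterReference expression to work and uses its named groups to tell the three cases apart. The class name ParameterReferenceDemo, the Classify helper and the sample inputs are made up for illustration and are not part of the original post.

```csharp
using System;
using System.Text.RegularExpressions;

static class ParameterReferenceDemo
{
    // The commented pattern from the post. Each comment runs from '#'
    // to the end of the line; the closing '#' is purely cosmetic.
    static readonly Regex ParameterReference = new Regex(@"
      # Matches invalid empty brackets #
      (?<empty>\<\>)|
      # Matches a valid parameter reference #
      \<(?<parameter>[^\<\>]+)\>|
      # Matches opened brackets that are not properly closed #
      (?<open>\<[^\<\>]*(?!\>))",
      RegexOptions.Compiled | RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: reports which named group produced the first match.
    public static string Classify(string input)
    {
        Match m = ParameterReference.Match(input);
        if (!m.Success) return "none";
        if (m.Groups["empty"].Success) return "empty";
        if (m.Groups["parameter"].Success)
            return "parameter:" + m.Groups["parameter"].Value;
        return "open";
    }

    static void Main()
    {
        Console.WriteLine(Classify("Hello <name>")); // parameter:name
        Console.WriteLine(Classify("Hello <>"));     // empty
        Console.WriteLine(Classify("Hello <name"));  // open
        Console.WriteLine(Classify("Hello world"));  // none
    }
}
```

One thing to keep in mind with RegexOptions.IgnorePatternWhitespace: a literal # or a literal space in the pattern has to be escaped (or placed inside a character class), which is why the template expression above writes \# for its tag delimiters.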

Source: http://weblogs.asp.net/cazzu/archive/2004/02/10/70621.aspx

Posted on Wednesday, October 12, 2005 9:00 PM
