Jeff Ferguson

Irritating other people since 1967


News

I am a Principal Consultant with Magenic (www.magenic.com).

Archives

November 2008 Entries

As I mentioned back in this post, the first phase of work needed for Sotue to recognize data in input streams is to build a state machine that input characters move through as they are read. If the state machine ends up in what is called an accepting state, then the input characters match a pattern. To review, Sotue’s process for building these state machines is as follows: Construct a non-deterministic finite automaton (NFA) from a regular expression. Convert the NFA into a deterministic...
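To make the idea of an accepting state concrete, here is a minimal sketch in Python of a hand-built deterministic state machine that recognizes one or more decimal digits. It is not Sotue's generated output; the state names and the `step` and `matches` functions are made up for illustration.

```python
# Hypothetical hand-built state machine for "one or more digits".
# These states and names are illustrative only, not what Sotue generates.
START, IN_NUMBER, DEAD = 0, 1, 2
ACCEPTING = {IN_NUMBER}

def step(state, ch):
    """Return the next state after reading one input character."""
    if state != DEAD and ch in "0123456789":
        return IN_NUMBER
    # Any non-digit sends the machine to a dead state it never leaves.
    return DEAD

def matches(text):
    """Feed every character through the machine, then check for acceptance."""
    state = START
    for ch in text:
        state = step(state, ch)
    return state in ACCEPTING

print(matches("2008"))  # True
print(matches("20a8"))  # False
print(matches(""))      # False
```

The input matches only if the machine is sitting in an accepting state once the last character has been read, which is why the empty string is rejected here.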

In my post on adding closure operator support for regular expressions input to Sotue, I showed the unoptimized NFA generated by Sotue for a regular expression built to match a number. Since the regular expression used only the OR operator, the generated NFA contained some 40 states. While the state machine was perfectly valid, asking a developer to write a regular expression such as (0|1|2|3|4|5|6|7|8|9)+ seems like a bit too much. The de facto standard for specifying ranges of characters...
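As a rough illustration of why a range syntax is attractive, the sketch below uses Python's `re` module as a stand-in for Sotue and compares the alternation-only pattern with a bracketed range. It assumes the range notation in question is the common `[0-9]` form; the `agree` helper and the sample strings are made up for the comparison.

```python
import re

# The alternation-only pattern from the post, and an assumed range equivalent.
verbose = re.compile(r"(0|1|2|3|4|5|6|7|8|9)+")
ranged = re.compile(r"[0-9]+")

def agree(text):
    """True when both patterns give the same full-match verdict for text."""
    return bool(verbose.fullmatch(text)) == bool(ranged.fullmatch(text))

samples = ["7", "2008", "", "12a", "a12"]
for s in samples:
    # Show the input, whether the range form matches it, and whether both agree.
    print(repr(s), bool(ranged.fullmatch(s)), agree(s))
```

Both patterns accept and reject the same sample strings; the range form is just much shorter to write.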