Life in my own company

It's all up to me.

Pluralization

Wow, who knew that the English language is so messed up!

All I needed was the ability to input a word and get back its plural, but that turned into most of a day's project implementing something that works “MOST” of the time. Sheesh!

I'd post the code, but it belongs to my company, so I can't.  However, here's how it was implemented:

I found an algorithmic solution for pluralization that seems to work very well at http://www.csse.monash.edu.au/~damian/papers/HTML/Plurals.html.

I also found an implementation of these rules that actually does even more than the rules specify, but it's implemented in Perl, which wasn't helpful to me. I did borrow most of the word lists, though. It can be found here: http://search.cpan.org/dist/Lingua-EN-Inflect/

Once you have that information, creating the code is actually pretty straightforward, especially if you only inflect the last word that you're passed and ignore classical inflections of words, which I did.
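Here's a minimal sketch of the "only inflect the last word" idea. It's in Python rather than .NET, and it is not the company code: the names `pluralize_phrase` and `naive_plural` are made up for illustration, and `naive_plural` is a deliberately dumb stand-in for a real single-word pluralizer.

```python
def naive_plural(word):
    # Hypothetical stand-in: a real pluralizer would apply the rule lists.
    return word + "s"


def pluralize_phrase(phrase, pluralize_word=naive_plural):
    """Pluralize only the last word of a phrase, leaving the rest untouched."""
    # rpartition splits at the LAST space; for a single word, head and sep
    # are empty strings and the whole input ends up in `last`.
    head, sep, last = phrase.rstrip().rpartition(" ")
    return head + sep + pluralize_word(last)


print(pluralize_phrase("ice cream cone"))
print(pluralize_phrase("cat"))
```

The shortcut gets compounds like "attorney general" wrong, since their plural changes a word other than the last one. That is part of why the result only works most of the time.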

Regular expressions are critical to solving this, and the rules have regular expressions written into them that .NET's regular expression engine can use directly.
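To give a feel for how regex-driven rules fit together, here's a rough sketch in Python (again, not the actual .NET code). The word lists and patterns below are tiny illustrative samples I'm assuming for the example, not the full lists from the paper or from Lingua::EN::Inflect, and the sketch expects lowercase input. Patterns this simple should carry over to .NET's Regex class with little change, apart from the replacement syntax (`$1` instead of `\1`).

```python
import re

# Illustrative samples only; the real lists are much longer.
UNINFLECTED = {"sheep", "deer", "series", "species", "fish"}
IRREGULAR = {"child": "children", "man": "men", "mouse": "mice", "foot": "feet"}

# (pattern, replacement) pairs, tried in order; the first match wins.
SUFFIX_RULES = [
    (re.compile(r"(ch|sh|s|x)$"), r"\1es"),                      # church -> churches
    (re.compile(r"([^aeiou])y$"), r"\1ies"),                     # city -> cities
    (re.compile(r"(kni|li|wi)fe$"), r"\1ves"),                   # knife -> knives
    (re.compile(r"(cal|el|hal|lea|loa|thie|wol)f$"), r"\1ves"),  # shelf -> shelves
]


def pluralize_word(word):
    """Return the plural of a single lowercase word (works most of the time)."""
    if word in UNINFLECTED:
        return word
    if word in IRREGULAR:
        return IRREGULAR[word]
    for pattern, replacement in SUFFIX_RULES:
        if pattern.search(word):
            return pattern.sub(replacement, word)
    return word + "s"  # default rule: just add -s


for w in ["church", "city", "day", "knife", "child", "sheep", "cat"]:
    print(w, "->", pluralize_word(w))
```

Checking the exception lists before the suffix rules matters: otherwise "fish" would hit the -sh rule and come back as "fishes".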

Cool stuff!  Have fun storming the castle!

Robert

Posted on Thursday, August 24, 2006 11:30 AM | Filed Under [ Work ]
