Re: Regexes are slow (or, why I advocate String::Index)

by halley (Prior)
on May 17, 2004 at 13:59 UTC

in reply to Regexes are slow (or, why I advocate String::Index)

My first reaction: choosing index() for anything that even remotely smells like natural language processing is probably premature optimization. NLP code is organic code; organic programming features like patterns or templates will benefit the developer.

Sure, index() is faster than s///. But only for the things that index() can solve.
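To make that boundary concrete, here is a minimal sketch. The sample text and variable names are made up for illustration; index() and the regex operators are core Perl. A fixed substring is index() territory, while a case-insensitive, whole-word match with an optional plural is a one-line pattern that index() alone cannot express.

```perl
use strict;
use warnings;

# Made-up sample text, purely for illustration.
my $text = "The cat sat on the mat. Cats like mats.";

# Fixed substring: index() handles this directly and returns the offset.
my $pos = index($text, "cat");

# Case-insensitive, whole word, optional plural: simple as a regex,
# awkward to build out of index() calls.
my @hits = $text =~ /\b(cats?)\b/gi;

print "index found 'cat' at offset $pos\n";
print "regex found: @hits\n";
```

The index() call is fine as long as the question stays "where is this exact string?". The moment the question grows a word boundary or an alternate form, you are back to a pattern.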

With much of natural language processing, you're probably going to try a LOT of alternative forms, and grammars, and minor adjustments until you get it right. Regexen may be slower to run, but they're faster to develop in all but the simplest of cases.

If you develop your code with regexen, and end up realizing that a few of your lines could "benefit" from a simple index() replacement, go ahead and replace them. I doubt that you'll replace even 1% of your whole NLP code in a typical project, but you'll spend a lot of time hunting for those lines and verifying that the replacements didn't break anything. And if you later realize you need to tweak the NLP again, you might have to undo your little optimizations.
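A rough sketch of what that verification might look like, assuming a hypothetical check for the word "not". The helper names and sample strings are invented for illustration. The index() version agrees with the original pattern on every sample, but stops agreeing as soon as the pattern is tweaked to require a whole word and ignore case.

```perl
use strict;
use warnings;

# Hypothetical helpers: three ways of asking "does this mention 'not'?"
sub has_not_regex { my ($s) = @_; return $s =~ /not/      ? 1 : 0 }
sub has_not_index { my ($s) = @_; return index($s, "not") >= 0 ? 1 : 0 }
sub has_not_tweak { my ($s) = @_; return $s =~ /\bnot\b/i ? 1 : 0 }

# Invented sample inputs, including edge cases.
my @samples = ("I do not know", "nothing here", "Notable", "", "knot");

# The index() replacement should match the original regex everywhere.
my @before = grep { has_not_regex($_) != has_not_index($_) } @samples;

# After the pattern is tweaked, the index() version no longer matches it.
my @after = grep { has_not_tweak($_) != has_not_index($_) } @samples;

print "mismatches vs. original pattern: ", scalar(@before), "\n";
print "mismatches vs. tweaked pattern:  ", scalar(@after), "\n";
```

A check like this is cheap for one line, but it has to be repeated for every line you convert, and again every time the grammar changes.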

[ e d @ h a l l e y . c c ]
