PerlMonks  

Re: What's the best way to do a pattern search like this?

by CharlesClarkson (Curate)
on Jul 20, 2001 at 10:58 UTC (#98343)


in reply to What's the best way to do a pattern search like this?

Some things to ponder:

How should the algorithm handle hyphenated words? Should pre-paid become pre and paid or remain pre-paid? Will any words wrap to the next line using a hyphen?
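To make the hyphen question concrete, here is a minimal sketch of both choices. It is written in Python purely for illustration; the regexes themselves should carry over to Perl largely unchanged. The helper names (`words`, `join_wrapped`) and the definition of a word as a run of ASCII letters and digits are assumptions of mine, not anything taken from the original question.

```python
import re

# Assumption: a "word" is a run of ASCII letters/digits, optionally
# joined by single internal hyphens (so "pre-paid" is one token).
WORD_KEEP = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")

# Alternative: hyphens act as separators ("pre-paid" -> "pre", "paid").
WORD_SPLIT = re.compile(r"[A-Za-z0-9]+")

# A hyphen at the end of a line, directly between two letters,
# is treated here as a line-wrap hyphen.
WRAP = re.compile(r"(?<=[A-Za-z])-[ \t]*\n[ \t]*(?=[A-Za-z])")


def join_wrapped(text):
    """Rejoin words that were split across lines with a trailing hyphen."""
    return WRAP.sub("", text)


def words(text, keep_hyphens=True):
    """Return the words in text, keeping or splitting hyphenated words."""
    pattern = WORD_KEEP if keep_hyphens else WORD_SPLIT
    return pattern.findall(text)
```

With this sketch, `words("a pre-paid card")` gives `['a', 'pre-paid', 'card']`, while `words("a pre-paid card", keep_hyphens=False)` gives `['a', 'pre', 'paid', 'card']`. Note that `join_wrapped` cannot tell a wrapped word from a genuinely hyphenated one that happens to break at the hyphen: `pre-` followed by `paid` on the next line becomes `prepaid`.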

Are there any slang or shortcut words in the file? How should b4 be handled?

Is the file short or long? Should the algorithm read the entire file into memory or would it be better to process each line?
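As a rough illustration of the memory question, the sketch below counts words from any iterable of lines, so the same function works on an open file handle (one line in memory at a time) or on a list built by reading the whole file. It is Python for illustration only; `count_words` and the letters-and-digits word pattern are hypothetical choices, not part of the original question.

```python
import re
from collections import Counter

# Assumption: words are runs of ASCII letters/digits, compared case-insensitively.
WORD = re.compile(r"[A-Za-z0-9]+")


def count_words(lines):
    """Count words from an iterable of lines, consuming one line at a time."""
    counts = Counter()
    for line in lines:
        counts.update(w.lower() for w in WORD.findall(line))
    return counts
```

Passing an open file handle streams the file line by line; passing `fh.read().splitlines()` loads everything first. The trade-off is that a purely line-by-line approach never sees two lines together, so it cannot rejoin a word that wraps onto the next line with a hyphen.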

How might you handle dates: 500 A.D., c. 1500 bc?

And what about other abbreviations: Mr., Jr., Ave., etc., e.g.?
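One possible approach to dates and abbreviations, sketched here in Python for illustration, is to try a list of known abbreviations before falling back to a plain word pattern. The list below is deliberately tiny and entirely an assumption; a real one would have to come from the data. The names `ABBREVIATIONS`, `TOKEN`, and `tokens` are hypothetical.

```python
import re

# Assumed, deliberately short list of abbreviations to keep intact.
ABBREVIATIONS = ["A.D.", "e.g.", "etc.", "Mr.", "Jr.", "Ave.", "c."]

# Longest first, so "etc." is tried before the shorter "c.".
_alts = "|".join(
    re.escape(a) for a in sorted(ABBREVIATIONS, key=len, reverse=True)
)

# Try a known abbreviation first, then fall back to a run of letters/digits.
TOKEN = re.compile(r"(?:" + _alts + r")|[A-Za-z0-9]+")


def tokens(text):
    """Split text into words, keeping listed abbreviations in one piece."""
    return TOKEN.findall(text)
```

For example, `tokens("Mr. Smith, c. 1500 bc.")` gives `['Mr.', 'Smith', 'c.', '1500', 'bc']`. The match is case-sensitive, so `a.d.` would be split into `a` and `d`, and an abbreviation at the end of a sentence swallows the full stop. Whether either behavior is acceptable depends on the file.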


HTH,
Charles K. Clarkson

