
Re: Fuzzy Searching: Optimizing Algorithm Selection

by tachyon (Chancellor)
on Nov 11, 2004 at 03:56 UTC (#406906)

in reply to Fuzzy Searching: Optimizing Algorithm Selection

agrep (approximate grep) probably does what you want. The algorithms are outlined on the site, and you can get the source code there too. A variation built around this code may well be as fast as it gets. Here is a PostScript research paper on it.




Replies are listed 'Best First'.
Re^2: Fuzzy Searching: Optimizing Algorithm Selection
by perlcapt (Pilgrim) on Nov 11, 2004 at 04:06 UTC
    I also recommend agrep. The only caveat is its restrictions on free use in commercial applications. I don't believe there is anything more efficient or better suited. The link that tachyon and I point to has links to other libraries and applications.
Re^2: Fuzzy Searching: Optimizing Algorithm Selection
by BrowserUk (Pope) on Nov 11, 2004 at 04:33 UTC

    From a fairly quick perusal of the options, I don't think agrep will help much, except maybe as a pre-filter.

    • It won't report where in a line a match was found.
    • It stops matching against a given line when it finds the first match.
    • If you supply a file of things to match, it doesn't tell you which one matched.

    Maybe I missed some things in amongst the six help 'screens'?
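
    To make the three gaps above concrete (match position, every match on a line, and which pattern matched), here is a minimal Python sketch of a matcher that reports all three. It uses a plain dynamic-programming approximate substring search, not agrep's own algorithm, and it will be much slower than agrep. The names `fuzzy_find` and `search_lines` are hypothetical, made up for this illustration.

```python
def fuzzy_find(pattern, text, max_errors):
    """Yield (end_index, errors) for each position in text where some
    substring ending there is within max_errors edits of pattern."""
    m = len(pattern)
    # col[i] = fewest edits needed to match pattern[:i] against a
    # substring of text that ends at the current position.
    col = list(range(m + 1))
    for j, ch in enumerate(text):
        prev_diag = col[0]  # col[0] stays 0: a match may start anywhere
        for i in range(1, m + 1):
            old = col[i]
            cost = 0 if pattern[i - 1] == ch else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         old + 1,           # extra character in text
                         col[i - 1] + 1)    # pattern character missing
            prev_diag = old
        if col[m] <= max_errors:
            yield j + 1, col[m]


def search_lines(patterns, lines, max_errors):
    """Yield (line_number, pattern, end_index, errors) for every hit,
    so the caller knows which pattern matched and where."""
    for line_no, line in enumerate(lines, start=1):
        for pat in patterns:
            for end, errs in fuzzy_find(pat, line, max_errors):
                yield line_no, pat, end, errs


if __name__ == "__main__":
    for hit in search_lines(["abc", "xyz"], ["xxabcxx", "..xyz"], 0):
        print(hit)
```

    Note that it reports end positions only; recovering the start of each match would need a traceback or a second pass over the reversed strings. One way to combine the two is to use agrep as the pre-filter and run something like this only on the lines it lets through.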

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
Re^2: Fuzzy Searching: Optimizing Algorithm Selection
by halley (Prior) on Nov 12, 2004 at 19:10 UTC
    I went to the University of Arizona, and as an undergraduate, I would sit in on master's- and postdoc-level classes where folks were discussing stuff like agrep.

    Somewhere in my files I have the follow-up to this paper, which allows for affine weighting for various symbols. For example, you might say that vowels are more interchangeable than consonants, if you're looking for fuzzy matches in the pronunciation problem space. I think the author of that paper went on into bioinformatics in a big way after that.
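
    As a rough illustration of the idea of per-symbol weighting (not the method from that follow-up paper, which is not reproduced here), this is a small Python sketch of an edit distance in which swapping one vowel for another costs less than any other substitution. The 0.5 and 1.0 costs and the names `sub_cost` and `weighted_distance` are assumptions chosen for the example.

```python
VOWELS = set("aeiou")


def sub_cost(a, b):
    """Hypothetical substitution costs: vowel-for-vowel swaps are cheap."""
    if a == b:
        return 0.0
    if a in VOWELS and b in VOWELS:
        return 0.5
    return 1.0


def weighted_distance(s, t, indel=1.0):
    """Edit distance between s and t using sub_cost for substitutions
    and a flat indel cost for insertions and deletions."""
    prev = [j * indel for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        cur = [i * indel]
        for j, b in enumerate(t, start=1):
            cur.append(min(prev[j - 1] + sub_cost(a, b),  # substitute
                           prev[j] + indel,               # delete from s
                           cur[j - 1] + indel))           # insert into s
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(weighted_distance("pan", "pen"))  # vowel swap
    print(weighted_distance("pan", "pat"))  # consonant swap
```

    With these costs, "pan" is closer to "pen" than to "pat", which is the kind of bias you would want when matching on pronunciation.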

    [ e d @ h a l l e y . c c ]
