Re: Finding a _Similar_ Substring? (Fuzzy Searching?)

by BrowserUk (Pope)
on May 21, 2004 at 03:21 UTC (#355168)

in reply to Finding a _Similar_ Substring? (Fuzzy Searching?)

Depending upon how loose you want the criteria to be, you might get away with something like this.

    my $term = 'P100';

    ## my $re = qr[@{[ join '\W*', split '', $term ]}];
    # Improved slightly.
    my $re = qr[@{[ join '\W*', map "\Q$_\E", split '', $term ]}]x;

    for(
        'P100', 'P-100', 'P 100', 'P1 00',
        'the P 100 is very similar in style to the P-101 & P102.'.
        'The P-100 is a generation behind the P1000'
    ) {
        # "\n" added so each match prints on its own line when run as a script
        print "Matched $1\n" while m[\b($re)\b]g;
    }

Output:

    Matched P100
    Matched P-100
    Matched P 100
    Matched P1 00
    Matched P 100
    Matched P-100
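To make the difference between the commented-out pattern and the live one easier to see, here is a minimal sketch that builds both as plain strings and prints them. The sub names `loose_pattern_unquoted` and `loose_pattern_quoted` and the sample term `P.1` are made up for illustration and are not part of the code above. `quotemeta` is used in place of `"\Q$_\E"`, which should be equivalent here.

```perl
use strict;
use warnings;

# Hypothetical helper: join the characters of the term with \W*,
# leaving regex metacharacters in the term unescaped.
sub loose_pattern_unquoted {
    my ($term) = @_;
    return join '\W*', split //, $term;
}

# Hypothetical helper: same idea, but each character is escaped first,
# so a '.' or '+' in the term is matched literally.
sub loose_pattern_quoted {
    my ($term) = @_;
    return join '\W*', map { quotemeta($_) } split //, $term;
}

print loose_pattern_unquoted('P100'), "\n";   # P\W*1\W*0\W*0
print loose_pattern_quoted('P100'),   "\n";   # P\W*1\W*0\W*0
print loose_pattern_unquoted('P.1'),  "\n";   # P\W*.\W*1
print loose_pattern_quoted('P.1'),    "\n";   # P\W*\.\W*1

# The unescaped '.' matches any character, so the unquoted form is too loose.
my $unquoted = loose_pattern_unquoted('P.1');
my $quoted   = loose_pattern_quoted('P.1');
print "unquoted matches PX1\n" if 'PX1' =~ /^$unquoted$/;
print "quoted matches PX1\n"   if 'PX1' =~ /^$quoted$/;
print "quoted matches P.1\n"   if 'P.1' =~ /^$quoted$/;
```

For a term like 'P100' that contains only word characters, the two forms come out identical. The escaping only matters once the term itself contains characters that are special in a regex.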

You could also add /i if you want case insensitivity.
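A minimal sketch of the /i variant, assuming the same construction as above but with the pattern string built first and compiled separately for readability. The sample inputs and the `@found` array are made up for illustration.

```perl
use strict;
use warnings;

my $term = 'P100';

# Build the loose pattern as a string, then compile it case-insensitively.
my $pattern = join '\W*', map { quotemeta($_) } split //, $term;
my $re      = qr/$pattern/i;

my @found;
for my $text ( 'p100', 'p-100', 'P 100', 'p1000' ) {
    # \b on both sides keeps 'p1000' from matching as 'p100' plus a stray 0.
    push @found, $1 while $text =~ m[\b($re)\b]g;
}

print "Matched $_\n" for @found;
```

With these inputs, the lower-case forms match as well, while 'p1000' is still rejected by the trailing \b.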

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail

Replies are listed 'Best First'.
Re: Re: Finding a _Similar_ Substring? (Fuzzy Searching?)
by rjahrman (Scribe) on May 21, 2004 at 04:03 UTC
    What exactly are you doing in the regexes at the top? What's the difference between the first and second one?
