PerlMonks  

1 mismatch string matching

by Anonymous Monk
on Sep 21, 2003 at 03:07 UTC (#292956=perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello folks, I would like to compare two strings (a pattern and a text), allowing one mismatched character. For example, if the pattern is "match" and the text is "for the watch to babble and to talk is most tolerable", then "match" matches "watch" with one mismatched character. Any suggestions about this? Thank you in advance.

Re: 1 mismatch string matching
by kvale (Monsignor) on Sep 21, 2003 at 03:43 UTC
    You can perform approximate matching using the String::Approx module.

    If you want to roll your own, try

    $what_matched = $1 if $string =~ /\b(\watch|m\wtch|ma\wch|mat\wh|matc\w)\b/;
    Replace the \w with . or [a-z], etc., depending on which mismatched characters you want to accept.
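    To make the effect of that choice concrete, here is a minimal sketch that runs the same alternation with three different wildcards. The helper name hits_for_match and the sample text are assumptions made for illustration; they are not part of the original reply.

```perl
use strict;
use warnings;

# Hypothetical helper: return every one-mismatch hit for "match" in $text,
# using $wild as the single-character wildcard (\w, ., [a-z], ...).
sub hits_for_match {
    my ($text, $wild) = @_;
    my $re = qr/\b(${wild}atch|m${wild}tch|ma${wild}ch|mat${wild}h|matc${wild})\b/;
    return $text =~ /$re/g;    # list context: all captured hits
}

# Made-up sample text with a lowercase hit, a capitalised hit and a
# hit whose odd character is punctuation.
my $text = 'for the watch to babble, Match or ma-ch';

print join(',', hits_for_match($text, '\w')),    "\n";   # watch,Match
print join(',', hits_for_match($text, '.')),     "\n";   # watch,Match,ma-ch
print join(',', hits_for_match($text, '[a-z]')), "\n";   # watch
```

    Note that \w and . also accept the original letter, so an exact "match" is reported as well.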

    -Mark

Re: 1 mismatch string matching
by tachyon (Chancellor) on Sep 21, 2003 at 08:22 UTC

    You don't say what you want this for, which would be enlightening. What you describe is roughly a Levenshtein edit distance of 1, but then again perhaps not: edit distance also counts insertions and deletions, whereas you only mention a mismatched character. These links provide answers/code/modules that may be relevant.
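    If only substitutions should count (a Hamming distance rather than a full edit distance), a sliding window is one way to do it. The following is a rough sketch under that assumption; the sub names mismatches and find_with_mismatches are made up for illustration, and the XOR trick assumes plain byte strings (no characters above 255).

```perl
use strict;
use warnings;

# Count positions at which two equal-length strings differ.
# XOR-ing the strings yields "\0" wherever the characters agree,
# so counting the non-"\0" characters counts the mismatches.
sub mismatches {
    my ($x, $y) = @_;
    return ( $x ^ $y ) =~ tr/\0//c;
}

# Hypothetical helper: slide a window of length($pattern) across $text and
# return [offset, substring] pairs with at most $max substituted characters.
sub find_with_mismatches {
    my ($pattern, $text, $max) = @_;
    my $len = length $pattern;
    my @hits;
    for my $pos ( 0 .. length($text) - $len ) {
        my $window = substr $text, $pos, $len;
        push @hits, [ $pos, $window ] if mismatches($pattern, $window) <= $max;
    }
    return @hits;
}

my $text = 'for the watch to babble and to talk is most tolerable';
for my $hit ( find_with_mismatches('match', $text, 1) ) {
    printf "offset %d: %s\n", @$hit;    # offset 8: watch
}
```

    Unlike an edit distance of 1, this will not find "mach" (one deletion) or "matsch" (one insertion), which may or may not be what is wanted.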

    Here is a snippet of code that lets you generate a dynamic regex to match your targets. This assumes what you need is as simple as stated.

    my $word   = 'match';
    my $string = 'Hatch a plan to watch this RE match';
    my @match_re;
    for my $offset ( 0 .. length($word) - 1 ) {
        my $possibility = $word;
        # replace one char with . to match anything
        substr $possibility, $offset, 1, '.';
        push @match_re, $possibility;
    }
    my $match_re = join '|', @match_re;
    print "The match RE is qr/$match_re/\n";
    $match_re = qr/($match_re)/;
    my @matches = $string =~ m/$match_re/g;
    print "Found @matches\n" if @matches;

    __DATA__
    The match RE is qr/.atch|m.tch|ma.ch|mat.h|matc./
    Found Hatch watch match

    Note that I am substituting a . into each possibility, which in the form given will match any character except a newline. You may want [A-Za-z] or \w or whatever suits your data.
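    As a rough sketch of that variation, the same idea can be wrapped in a sub that takes the wildcard as a parameter and escapes the literal characters, so a pattern containing regex metacharacters stays literal. The sub name one_mismatch_re and its default wildcard are assumptions made for illustration, not part of the snippet above.

```perl
use strict;
use warnings;

# Hypothetical helper: build a regex that matches $word with any one
# position replaced by $wild (default '.'); the other characters are
# escaped with quotemeta so they are matched literally.
sub one_mismatch_re {
    my ($word, $wild) = @_;
    $wild = '.' unless defined $wild;
    my @alternatives;
    for my $offset ( 0 .. length($word) - 1 ) {
        my @chars = map { quotemeta($_) } split //, $word;
        $chars[$offset] = $wild;
        push @alternatives, join '', @chars;
    }
    my $alternation = join '|', @alternatives;
    return qr/($alternation)/;
}

my $re      = one_mismatch_re('match', '[A-Za-z]');
my @matches = 'Hatch a plan to watch this RE match' =~ /$re/g;
print "Found @matches\n";    # Found Hatch watch match
```

    Add \b anchors around the group if hits inside longer words should be rejected.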

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
