<?xml version="1.0" encoding="windows-1252"?>
<node id="812616" title="Re: Pattern searching allowing for mis-matches..." created="2009-12-13 16:16:01" updated="2009-12-13 16:16:01">
<type id="11">
note</type>
<author id="33341">
Albannach</author>
<data>
<field name="doctext">
Another option I often use for this sort of thing is [cpan://Text::Levenshtein]. If it is available for your platform, use the XS version ([cpan://Text::LevenshteinXS]) instead, as the speed difference is significant. Below I've assumed each candidate match must be the same length as the search string, but that is not required, since an insertion or deletion also counts toward the distance. Accommodating variable-length strings is left as an exercise for the reader. ;-)
&lt;code&gt;use strict;
use warnings;
use Text::Levenshtein qw(distance);

my $text = 'TGATTGAA';
my $search = 'TGAT';
my $fuzz = 1; # how far off a match can we be

# slide a window the same length as $search across $text
for my $start (0 .. length($text) - length($search)) {
  my $chunk = substr($text, $start, length $search);
  print "checking for $search from position $start: $chunk: ";
  my $dist = distance($search, $chunk); # edit distance between pattern and window
  if($dist == 0) {
    print "Match!\n";
  }elsif($dist &lt;= $fuzz) {
    print "Close enough\n";
  }else{
    print "nope\n";
  }
}&lt;/code&gt;
&lt;p&gt;
&lt;pre&gt;checking for TGAT from position 0: TGAT: Match!
checking for TGAT from position 1: GATT: nope
checking for TGAT from position 2: ATTG: nope
checking for TGAT from position 3: TTGA: nope
checking for TGAT from position 4: TGAA: Close enough&lt;/pre&gt;
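&lt;p&gt;One possible take on the variable-length exercise, as a rough sketch rather than a tuned solution: try every window whose length is within $fuzz of the search string's length. The helper name fuzzy_windows is made up for this example, and the small pure-Perl distance() is included only so the snippet runs without CPAN; importing distance from Text::Levenshtein (or the XS version) should work in its place.&lt;/p&gt;
&lt;code&gt;use strict;
use warnings;
use List::Util qw(min max);

# Plain-Perl Levenshtein distance (two-row dynamic programming).
# Stand-in for Text::Levenshtein's distance(); an assumption made
# so the sketch is self-contained.
sub distance {
  my ($s, $t) = @_;
  my @prev = (0 .. length $t);
  for my $i (1 .. length $s) {
    my @cur = ($i);
    for my $j (1 .. length $t) {
      my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
      push @cur, min($prev[$j] + 1, $cur[$j - 1] + 1, $prev[$j - 1] + $cost);
    }
    @prev = @cur;
  }
  return $prev[-1];
}

# Hypothetical helper: returns [start, length, distance] for every
# window of $text that is within $fuzz edits of $search.
sub fuzzy_windows {
  my ($text, $search, $fuzz) = @_;
  my @hits;
  my $shortest = max(1, length($search) - $fuzz);
  for my $len ($shortest .. length($search) + $fuzz) {
    for my $start (0 .. length($text) - $len) {
      my $dist = distance($search, substr($text, $start, $len));
      push @hits, [$start, $len, $dist] if $dist &lt;= $fuzz;
    }
  }
  return @hits;
}

my $text = 'TGATTGAA';
for my $hit (fuzzy_windows($text, 'TGAT', 1)) {
  my ($start, $len, $dist) = @$hit;
  printf "pos %d len %d dist %d: %s\n", $start, $len, $dist, substr($text, $start, $len);
}&lt;/code&gt;
&lt;p&gt;Note that overlapping windows are all reported, so a single good match usually shows up several times (shorter and longer windows around it); keeping only the lowest-distance hit per region is a further refinement.&lt;/p&gt;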


&lt;div class="pmsig"&gt;&lt;div class="pmsig-33341"&gt;
&lt;p&gt;--&lt;br&gt;
I'd like to be able to assign to an luser
&lt;/div&gt;&lt;/div&gt;</field>
<field name="root_node">
812599</field>
<field name="parent_node">
812599</field>
</data>
</node>
