Re: approximate regular expression

in reply to approximate regular expression

If you would like a non-regex, brute-force method, here is one:

#! C:/Perl/bin/perl

use strict;
use warnings;

my  $pattern = "JEJE";
my  $string = "EJKJUJHJDJEJEJEDEJOJOJJJAHJHJSHJEFEJUJEJUJKIJS";

my  @pattern_list = split //, $pattern;
my  $pattern_length = @pattern_list;

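# Slide a window the length of the pattern across the string.
for my $x ( 0..((length $string) - $pattern_length) ){
    my $test_string = substr $string, $x, $pattern_length;
    my @result_array = split //, $test_string;
    # Score = number of positions where the window and the pattern agree.
    my $score = 0;
    for my $y ( 0..$#pattern_list ){
        $score++ if $pattern_list[$y] eq $result_array[$y];
    }
    # Report windows that match the pattern in at least two positions.
    if( $score > 1 ){
        print "String: $test_string, position: $x, score: $score\n";
    }
}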

Results

String: JKJU, position: 1, score: 2
String: JUJH, position: 3, score: 2
String: JHJD, position: 5, score: 2
String: JDJE, position: 7, score: 3
String: JEJE, position: 9, score: 4
String: JEJE, position: 11, score: 4
String: JEDE, position: 13, score: 3
String: DEJO, position: 15, score: 2
String: JOJO, position: 17, score: 2
String: JOJJ, position: 19, score: 2
String: JJJA, position: 21, score: 2
String: JHJS, position: 26, score: 2
String: SHJE, position: 29, score: 2
String: JEFE, position: 31, score: 3
String: FEJU, position: 33, score: 2
String: JUJE, position: 35, score: 3
String: JEJU, position: 37, score: 3
String: JUJK, position: 39, score: 2

Re^2: approximate regular expression by Marshall (Canon) on Mar 23, 2012 at 06:12 UTC
Yes, split() is certainly "brute force". If you have benchmarked this, you know that it is a very expensive operation. @array = split(//, $some_var) is especially expensive, and your code does it many times: once for every window position. Going "with the flow" of the language is (usually) going to execute faster and, in general, be "better", meaning easier to understand.
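
One possible reading of "going with the flow" here, offered as a sketch and not as Marshall's own code: skip the per-window split() and compare the strings directly. The version below assumes plain single-byte characters and relies on Perl's string XOR, which yields a NUL byte wherever two strings agree; tr/\0// then counts those bytes. The names match_score, $min_score and @hits are made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the positions at which two equal-length strings hold the same byte.
# String XOR produces "\0" wherever the bytes agree; tr/\0// counts them.
# (Assumes the default string semantics of ^, i.e. no "use feature 'bitwise'".)
sub match_score {
    my ( $left, $right ) = @_;
    my $xor = $left ^ $right;
    return $xor =~ tr/\0//;
}

my $pattern   = "JEJE";
my $string    = "EJKJUJHJDJEJEJEDEJOJOJJJAHJHJSHJEFEJUJEJUJKIJS";
my $min_score = 2;    # hypothetical threshold, same cut-off as "score > 1"

# Slide a window across the string; no arrays are built per window.
my @hits;
for my $pos ( 0 .. length($string) - length($pattern) ) {
    my $window = substr $string, $pos, length $pattern;
    my $score  = match_score( $pattern, $window );
    push @hits, [ $window, $pos, $score ] if $score >= $min_score;
}

printf "String: %s, position: %d, score: %d\n", @$_ for @hits;
```

The scoring rule is unchanged, so this should report the same windows as the split-based loop; whether it is actually faster on a given input is something to confirm with the Benchmark module.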
Re^3: approximate regular expression by jandrew (Chaplain) on Mar 23, 2012 at 22:00 UTC
Marshall, thank you for your feedback. Honestly, I don't have a good handle on what Perl "with the flow" really means. I guess I was responding to jrblas's request regarding fuzzy regexes. By that I mean that, from what I have read, fuzzy regexes mostly land in the TODO bucket of the regex wizards. I say that as a regex weakling, so there may be something out there that I don't know about. Specifically, Marpa seems to promise some alternatives, but that is even farther beyond my current grasp.

With that said, I have to confess to laziness in calculating the match score. As a guess, the original question appears to fall in the bio-perl realm, which upon further study would also benefit from regex look-around add-ons. So I offer the following in penance.

Read more... (4 kB)

In Section Seekers of Perl Wisdom