Re: approximate regular expression

by jandrew (Chaplain)
on Mar 22, 2012 at 23:37 UTC (#961125)

in reply to approximate regular expression

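If you would like a non-regex, brute-force method, here is one. It slides a window the length of the pattern along the string, counts the positions where the window and the pattern agree, and prints every window that scores more than 1: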

#! C:/Perl/bin/perl
use strict;
use warnings;

my $pattern = "JEJE";
my $string  = "EJKJUJHJDJEJEJEDEJOJOJJJAHJHJSHJEFEJUJEJUJKIJS";
my @pattern_list   = split //, $pattern;
my $pattern_length = @pattern_list;

for my $x ( 0..((length $string) - $pattern_length) ){
    my $test_string  = substr $string, $x, $pattern_length;
    my @result_array = split //, $test_string;
    my $score = 0;
    for my $y ( 0..$#pattern_list ){
        $score++ if $pattern_list[$y] eq $result_array[$y];
    }
    if( $score > 1 ){
        print "String: $test_string, position: $x, score: $score\n";
    }
}


String: JKJU, position: 1, score: 2
String: JUJH, position: 3, score: 2
String: JHJD, position: 5, score: 2
String: JDJE, position: 7, score: 3
String: JEJE, position: 9, score: 4
String: JEJE, position: 11, score: 4
String: JEDE, position: 13, score: 3
String: DEJO, position: 15, score: 2
String: JOJO, position: 17, score: 2
String: JOJJ, position: 19, score: 2
String: JJJA, position: 21, score: 2
String: JHJS, position: 26, score: 2
String: SHJE, position: 29, score: 2
String: JEFE, position: 31, score: 3
String: FEJU, position: 33, score: 2
String: JUJE, position: 35, score: 3
String: JEJU, position: 37, score: 3
String: JUJK, position: 39, score: 2

Re^2: approximate regular expression
by Marshall (Abbot) on Mar 23, 2012 at 06:12 UTC
    Yes, split() is certainly "brute force".
    If you have benchmarked this, you know that it is a very "expensive" operation.
    @array = split(//, $some_var) is especially "expensive", and your code does it once for every window position.

    Going "with the flow" of the language will (usually) execute faster and, in general, "be better", meaning easier to understand.
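One way to avoid the repeated split() calls is sketched below. It is an illustration under stated assumptions, not code posted in this thread. It relies on Perl's bitwise string XOR: XORing two strings yields a NUL byte wherever the characters are identical, and a counting-only tr/\0// tallies those bytes. The helper names match_score and scan_windows are hypothetical, and the sketch assumes plain single-byte (ASCII) data and windows of the same length as the pattern.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: number of positions at which two equal-length
# ASCII strings hold the same character. No split() involved.
sub match_score {
    my ($pattern, $window) = @_;
    my $xor = $pattern ^ $window;   # identical bytes XOR to "\0"
    return ( $xor =~ tr/\0// );     # counting-only tr: number of NUL bytes
}

# Hypothetical helper: slide a window over $string and return
# [window, position, score] for every window scoring above $min.
sub scan_windows {
    my ($pattern, $string, $min) = @_;
    my $len = length $pattern;
    my @hits;
    for my $pos ( 0 .. length($string) - $len ) {
        my $window = substr $string, $pos, $len;
        my $score  = match_score($pattern, $window);
        push @hits, [ $window, $pos, $score ] if $score > $min;
    }
    return @hits;
}

my $pattern = "JEJE";
my $string  = "EJKJUJHJDJEJEJEDEJOJOJJJAHJHJSHJEFEJUJEJUJKIJS";

for my $hit ( scan_windows($pattern, $string, 1) ) {
    my ($window, $pos, $score) = @$hit;
    print "String: $window, position: $pos, score: $score\n";
}
```

The loop still visits every window, so this remains a brute-force scan. The only change is that the per-character comparison is pushed into two built-in string operations instead of two arrays and an inner loop. Whether that is actually faster on a given input is something to check with the Benchmark module rather than assume.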

      Marshall, thank you for your feedback.

      Honestly, I don't have a good handle on what Perl "with the flow" really means. I guess I was responding to jrblas's request regarding fuzzy regexes. By that I mean that, from what I have read, fuzzy regexes mostly land in the TODO bucket of the regex wizards. I say that as a regex weakling, so there may be something out there that I don't know about. Specifically, Marpa seems to promise some alternatives, but that is even farther beyond my current grasp.

      With that said, I have to confess to laziness in calculating the match score. As a guess, the original question appears to fall in the bio-perl realm, which, upon further study, would also benefit from regex look-around add-ons. So I offer the following in penance.
