PerlMonks  

Re: approximate regular expression

by jandrew (Hermit)
on Mar 22, 2012 at 23:37 UTC (#961125)


in reply to approximate regular expression

If you would like a non-regex, brute-force method, here is one:

#! C:/Perl/bin/perl
use strict;
use warnings;

my $pattern = "JEJE";
my $string  = "EJKJUJHJDJEJEJEDEJOJOJJJAHJHJSHJEFEJUJEJUJKIJS";
my @pattern_list   = split //, $pattern;
my $pattern_length = @pattern_list;

for my $x ( 0..((length $string) - $pattern_length) ){
    my $test_string  = substr $string, $x, $pattern_length;
    my @result_array = split //, $test_string;
    my $score = 0;
    for my $y ( 0..$#pattern_list ){
        $score++ if $pattern_list[$y] eq $result_array[$y];
    }
    if( $score > 1 ){
        print "String: $test_string, position: $x, score: $score\n";
    }
}

Results

String: JKJU, position: 1, score: 2
String: JUJH, position: 3, score: 2
String: JHJD, position: 5, score: 2
String: JDJE, position: 7, score: 3
String: JEJE, position: 9, score: 4
String: JEJE, position: 11, score: 4
String: JEDE, position: 13, score: 3
String: DEJO, position: 15, score: 2
String: JOJO, position: 17, score: 2
String: JOJJ, position: 19, score: 2
String: JJJA, position: 21, score: 2
String: JHJS, position: 26, score: 2
String: SHJE, position: 29, score: 2
String: JEFE, position: 31, score: 3
String: FEJU, position: 33, score: 2
String: JUJE, position: 35, score: 3
String: JEJU, position: 37, score: 3
String: JUJK, position: 39, score: 2


Re^2: approximate regular expression
by Marshall (Prior) on Mar 23, 2012 at 06:12 UTC
    Yes, split() is certainly "brute force".
    If you have benchmarked this, you know that it is a very expensive operation.
    @array = split(//, $some_var) is costly, and your code calls it once for every position in the string.

    Going "with the flow" of the language will usually execute faster and will, in general, be better, meaning easier to understand.
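One possible reading of the split() complaint, offered as a sketch and not as the approach Marshall had in mind: score each window with Perl's string XOR instead of splitting it into characters. XORing two equal-length strings gives a "\0" byte wherever the characters agree, and tr/\0// counts those bytes. The names match_score and fuzzy_scan are hypothetical, and the code assumes plain ASCII input and that the "bitwise" feature is not enabled (under that feature the string form is spelled ^.).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count the positions at which two equal-length
# strings hold the same character. String XOR leaves "\0" wherever the
# bytes agree; tr/\0// (no replacement list) just counts them.
# Assumes ASCII data and that "use feature 'bitwise'" is NOT in effect.
sub match_score {
    my ($window, $pattern) = @_;
    my $diff = $window ^ $pattern;
    return ( $diff =~ tr/\0// );
}

# Hypothetical helper: slide the pattern along the string and return
# [substring, position, score] for each window scoring above $min_score.
sub fuzzy_scan {
    my ($string, $pattern, $min_score) = @_;
    my $len = length $pattern;
    my @hits;
    for my $pos ( 0 .. length($string) - $len ) {
        my $window = substr $string, $pos, $len;
        my $score  = match_score( $window, $pattern );
        push @hits, [ $window, $pos, $score ] if $score > $min_score;
    }
    return @hits;
}

my $pattern = "JEJE";
my $string  = "EJKJUJHJDJEJEJEDEJOJOJJJAHJHJSHJEFEJUJEJUJKIJS";

# Same report format as the split-based version in the parent node.
for my $hit ( fuzzy_scan( $string, $pattern, 1 ) ) {
    printf "String: %s, position: %d, score: %d\n", @$hit;
}
```

With the same pattern, string, and threshold as the parent node, this should report the same windows as the split-based loop. Whether it is actually faster on a given input would need to be measured, for example with the core Benchmark module.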

      Marshall, thank you for your feedback.

      Honestly, I don't have a good handle on what Perl "with the flow" really means. I guess I was responding to jrblas's request regarding fuzzy regexes. By that I mean that, from what I have read, fuzzy regexes mostly land in the TODO bucket of the regex wizards. I say that as a regex weakling, so there may be something out there that I don't know about. Specifically, Marpa seems to promise some alternatives, but that is even further beyond my current grasp.

      With that said, I have to confess to laziness in calculating the match score. As a guess, the original question appears to fall in the bio-perl realm, which upon further study would also benefit from regex look-around add-ons. So I offer the following in penance.
