Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

Re^3: approximate regular expression

by jandrew (Hermit)
on Mar 23, 2012 at 22:00 UTC ( #961321=note: print w/ replies, xml ) Need Help??


in reply to Re^2: approximate regular expression
in thread approximate regular expression

Marshall thank you for your feedback

Honestly I don't have a good handle on what perl "with the flow" really means. I guess I was responding to jrblas's request regarding fuzzy regex's. And by that I mean that fuzzy regex's mostly land in the TODO bucket of the regex wizards from what I have read. I do say that as a regex weakling so there may be something out there that I don't know about. Specifically Marpa seems to promise some alternatives but that is even farther beyond my current grasp.

With that said I have to confess to laziness in calculating the match score. As a guess the original question appears to fall in the bio-perl realm which upon further study would also benefit from regex Look-Around add-ons. So I offer the following in penance.

#! C:/Perl/bin/perl use strict; use warnings; use Smart::Comments '###'; $| = 1; my $pattern = "JEJE"; my $string = "EJKJUJHJDJEJEJEDEJOJOJJJAHJHJSHJEFEJUJEJUJKIJS"; my $test_regex = _build_test( $pattern ); ### $test_regex for my $x ( 0..((length $string) - (length$pattern)) ){ my $test_string = substr( $string, $x, length$pattern ); my $score = _test_for_score( 1, $test_string, $test_regex ); #### $test_string #### <where> - returned: $score if( $score and $score >= 2 ){ print "String: $test_string, position: $x, score: $score\n"; } } sub _test_for_score{ my ( $min_score, $test_string, $test_regex ) = @_; my $score = undef; my $x = length $test_string; for my $regex ( reverse @$test_regex ){ #### $x #### $regex if( $test_string =~ /$regex/ ){ $score = $x; last; } $x--; } return $score; } sub _build_test{ my ( $pattern ) = @_; my $test_reg; my $pattern_reg = [ split //, $pattern ]; for my $substitutions ( 0..(length $pattern) ){ my $return_reg = _build_score_reg( $substitutions, $pattern_re +g ); my $string = join '|', @$return_reg; # add the look ahead / look behind assertions to $string here +... $test_reg->[ (length $pattern) - $substitutions ] = qr/$string +/; #### $test_reg } return $test_reg; } # This is the fuzzy regex build part sub _build_score_reg{ my ( $score, $pattern_reg, $finsished_pos ) = @_; $finsished_pos ||= 0; #### <where> - reached _build_score_reg #### <where> - score : $score #### <where> - pattern : $pattern_reg #### <where> - complete: $finsished_pos my $sub_reg; if( $score == 0 ){ $sub_reg->[0] = join '', @$pattern_reg; }else{ for my $x ( $finsished_pos..$#$pattern_reg ){ #### <where> - running position: $x last if ( $#$pattern_reg - $score + 1 ) < $x; my $copy_reg = [ @$pattern_reg ];# To not overwrite the pa +ssed reference $copy_reg->[$x] = '.'; #### $copy_reg #### $finsished_pos my $recursive_reg = _build_score_reg( $score - 1, $copy_re +g, $x + 1 ); #### $recursive_reg push @$sub_reg, ( join '|', @$recursive_reg ); #### $sub_reg } } #### $sub_reg return $sub_reg; }
With the Results
### $test_regex: [ ### qr/(?-xism:....)/, ### qr/(?-xism:...E|..J.|.E..|J...)/, ### qr/(?-xism:..JE|.E.E|.EJ.|J..E|J.J.|JE..)/, ### qr/(?-xism:.EJE|J.JE|JE.E|JEJ.)/, ### qr/(?-xism:JEJE)/ ### ] String: JKJU, position: 1, score: 2 String: JUJH, position: 3, score: 2 String: JHJD, position: 5, score: 2 String: JDJE, position: 7, score: 3 String: JEJE, position: 9, score: 4 String: JEJE, position: 11, score: 4 String: JEDE, position: 13, score: 3 String: DEJO, position: 15, score: 2 String: JOJO, position: 17, score: 2 String: JOJJ, position: 19, score: 2 String: JJJA, position: 21, score: 2 String: JHJS, position: 26, score: 2 String: SHJE, position: 29, score: 2 String: JEFE, position: 31, score: 3 String: FEJU, position: 33, score: 2 String: JUJE, position: 35, score: 3 String: JEJU, position: 37, score: 3 String: JUJK, position: 39, score: 2


Comment on Re^3: approximate regular expression
Select or Download Code

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://961321]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (13)
As of 2015-07-01 19:56 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (19 votes), past polls