Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?
 
PerlMonks  

Re: Match and Extract String with Regex

by ikegami (Patriarch)
on Nov 17, 2007 at 01:47 UTC ( [id://651363]=note: print w/replies, xml ) Need Help??


in reply to Match and Extract String with Regex

Are you asking to find the Longest Common Subsequence (LCS) of the strings? Searching for that should reveal nodes on the topic.

Or if all you want to do is check if $str1 is in $str2, then index will do the trick.

if (index($str2, $str1) >= 0) { ... }

Replies are listed 'Best First'.
Re^2: Match and Extract String with Regex
by erroneousBollock (Curate) on Nov 17, 2007 at 09:25 UTC
    Searching for (Longest Common Subsequence) should reveal nodes on the topic.
    ... only if they understand how/why Longest Common Subsequence is a generalisation of Longest Common Substring.

    While it's an interesting area in which to educate oneself, String::LCSS does exactly what you need.

    -David

      String::LCSS does exactly what you need.

      I don't want to bash String::LCSS, but the implementation seems to be the naive O(n^3) algorithm instead of the O(mn) dynamic programming solution (http://en.wikipedia.org/wiki/Longest_common_substring_problem). A quick and dirty (and not thoroughly tested) implementation is much faster (although probably buggy).

      sub lcss2 { my ($s, $t) = @_; my $z = 0; my $m = length $s; my $n = length $t; my @S = (undef, split(//, $s)); my @T = (undef, split(//, $t)); my @L; my @ret; for my $i ( 1 .. $m ) { for my $j ( 1 .. $n ) { if ($S[$i] eq $T[$j]) { $L[$i-1][$j-1] ||= 0; $L[$i][$j] = $L[$i-1][$j-1] + 1; if ($L[$i][$j] > $z) { $z = $L[$i][$j]; @ret = (); } if ($L[$i][$j] == $z) { push @ret,substr($s, ($i-$z), $z); } } } } # warn Dumper \@L; return join '*', @ret; }
      my $s1 = '6'x 200 . 'zyzxx'; my $s2 = '5'x 200 . 'abczyzefg'; my $count = 1; timethese($count, { 'String::LCSS' => sub { String::LCSS::lcss( $s1, $s2 ) }, 'dynprog' => sub { lcss2( $s1, $s2 )}, });
      Update: Took the opportunity to learn XS and wrote String::LCSS_XS.
        Ah, well spotted.

        I don't suppose you'd like to file a bug? ;)

        -David

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://651363]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others browsing the Monastery: (3)
As of 2024-03-29 15:00 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found