Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Re^3: Match and Extract String with Regex

by lima1 (Curate)
on Nov 17, 2007 at 16:58 UTC ( #651423=note: print w/ replies, xml ) Need Help??


in reply to Re^2: Match and Extract String with Regex
in thread Match and Extract String with Regex

String::LCSS does exactly what you need.

I don't want to bash String::LCSS, but the implementation seems to be the naive O(n^3) algorithm instead of the O(mn) dynamic programming solution (http://en.wikipedia.org/wiki/Longest_common_substring_problem). A quick and dirty (and not thoroughly tested) implementation is much faster (although probably buggy).

sub lcss2 { my ($s, $t) = @_; my $z = 0; my $m = length $s; my $n = length $t; my @S = (undef, split(//, $s)); my @T = (undef, split(//, $t)); my @L; my @ret; for my $i ( 1 .. $m ) { for my $j ( 1 .. $n ) { if ($S[$i] eq $T[$j]) { $L[$i-1][$j-1] ||= 0; $L[$i][$j] = $L[$i-1][$j-1] + 1; if ($L[$i][$j] > $z) { $z = $L[$i][$j]; @ret = (); } if ($L[$i][$j] == $z) { push @ret,substr($s, ($i-$z), $z); } } } } # warn Dumper \@L; return join '*', @ret; }
my $s1 = '6'x 200 . 'zyzxx'; my $s2 = '5'x 200 . 'abczyzefg'; my $count = 1; timethese($count, { 'String::LCSS' => sub { String::LCSS::lcss( $s1, $s2 ) }, 'dynprog' => sub { lcss2( $s1, $s2 )}, });
Update: Took the opportunity to learn XS and wrote String::LCSS_XS.


Comment on Re^3: Match and Extract String with Regex
Select or Download Code
Re^4: Match and Extract String with Regex
by erroneousBollock (Curate) on Nov 18, 2007 at 04:10 UTC
    Ah, well spotted.

    I don't suppose you'd like to file a bug? ;)

    -David

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://651423]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others pondering the Monastery: (6)
As of 2014-09-21 22:47 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    How do you remember the number of days in each month?











    Results (176 votes), past polls