Beefy Boxes and Bandwidth Generously Provided by pair Networks DiBona
There's more than one way to do things
 
PerlMonks  

Re^2: Longest Common Subsequence

by Limbic~Region (Chancellor)
on May 16, 2006 at 19:47 UTC ( #549876=note: print w/ replies, xml ) Need Help??


in reply to Re: Longest Common Subsequence
in thread Longest Common Subsequence

If you understood the mental process I went through to create the Longest Common Subsequence, it should not be hard to see how to adapt it to the Longest Common Substring.

Once we have a list of character positions across strings, we can create ordered piles where each mapping has a difference of 1 in each position with its adjacent neighbors. You can get rid of the need for looping to find mappings greater because you can predict the values they should contain. Then you just need to find the largest pile and that represents the LCS (S = substring).

Arrays are replaced with hashes. Picking a hash key at random, you place it in a new pile and then predict the next mapping. If that key exists you place it on the same pile. You repeat this process until you have no more mappings. You then start back at the beginning of the pile and look for smaller items. Once no more items fit in that pile, you move on to the next hash key.

use Algorithm::Loops 'NestedLoops'; use List::Util 'reduce'; my @str = map {chomp; $_} <DATA>; print LCS(@str), "\n"; sub LCS{ my @str = @_; my @pos; for my $i (0 .. $#str) { my $line = $str[$i]; for (0 .. length($line) - 1) { my $char= substr($line, $_, 1); push @{$pos[$i]{$char}}, $_; } } my $sh_str = reduce {length($a) < length($b) ? $a : $b} @str; my %map; CHAR: for my $char (split //, $sh_str) { my @loop; for (0 .. $#pos) { next CHAR if ! $pos[$_]{$char}; push @loop, $pos[$_]{$char}; } my $next = NestedLoops([@loop]); while (my @char_map = $next->()) { my $key = join '-', @char_map; $map{$key} = $char; } } my @pile; for my $seq (keys %map) { push @pile, $map{$seq}; for (1 .. 2) { my $dir = $_ % 2 ? 1 : -1; my @offset = split /-/, $seq; $_ += $dir for @offset; my $next = join '-', @offset; while (exists $map{$next}) { $pile[-1] = $dir > 0 ? $pile[-1] . $map{$next} : $map{ +$next} . $pile[-1]; $_ += $dir for @offset; $next = join '-', @offset; } } } return reduce {length($a) > length($b) ? $a : $b} @pile; } __DATA__ qwertyJoyzhnac Jyzshuaqwertyb Joyzqwertybbc Yqwertyblah

Cheers - L~R


Comment on Re^2: Longest Common Subsequence
Download Code

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://549876]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others perusing the Monastery: (5)
As of 2014-04-20 01:58 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    April first is:







    Results (485 votes), past polls