perlmeditation
Limbic~Region
All,
<br />
A few days ago, I posted a [id://548346|challenge] to find a hidden message. The message can be revealed by determining the [wp://Longest_common_subsequence|longest common subsequence] of all the lines in the data. I was unable to find an implementation that handled more than two strings so I created my own. I originally imposed several constraints but was hoping someone would provide an efficient (non-brute force) method of handling the general case. The general case being:
<ul>
<li>Allows for more than 2 strings</li>
<li>Allows strings to be variable length</li>
<li>Allows chars to be repeated within a string</li>
<li>Allows chars to be absent from some strings</li>
</ul>
Here is that code that I ultimately came up with to solve the general case:
<READMORE>
<CODE>
#!/usr/bin/perl
use strict;
use warnings;
use Algorithm::Loops 'NestedLoops';
use List::Util 'reduce';
use constant NODE => 0;
use constant PATH => 0;
use constant CNT => 1;
use constant DEPTH => 1;
use constant LAST => 2;
my @str = map {chomp; $_} <DATA>;
print LCS(@str), "\n";
# Longest Common Subsequence in strings
sub LCS{
my @str = @_;
# Map pos of chars in each str
my @pos;
for my $i (0 .. $#str) {
my $line = $str[$i];
for (0 .. length($line) - 1) {
my $char= substr($line, $_, 1);
push @{$pos[$i]{$char}}, $_;
}
}
# Find the shortest string
my $sh_str = reduce {length($a) < length($b) ? $a : $b} @str;
# Coord map & lookup of char across lines
# Create permutations if char is duplicate
# Skip any where char isn't in every line
my (%lookup, @order);
CHAR:
for my $char (split //, $sh_str) {
my @map;
for (0 .. $#pos) {
next CHAR if ! $pos[$_]{$char};
push @map, $pos[$_]{$char};
}
my $next = NestedLoops([@map]);
while (my @char_map = $next->()) {
my $ref = [@char_map];
$lookup{$ref} = $char;
push @order, $ref;
}
}
# Predetermine which char mappings are greater than others
my %greater;
for my $i (0 .. $#order - 1) {
for my $j ($i + 1 .. $#order) {
my $gt = is_greater(@order[$i, $j]) or next;
my ($lg, $sm) = $gt == 1 ? ($i, $j) : ($j, $i);
++$greater{$order[$sm]}[CNT];
push @{$greater{$order[$sm]}[NODE]}, "$order[$lg]";
}
}
# A max depth watermark and a path representing that depth
my ($max, $path) = (0, '');
# Work queue
my @work = map [$_, 1, $_], keys %greater;
while (@work) {
my $item = pop @work;
my ($cur_depth, $route, $last_node) = @{$item}[DEPTH, PATH, LAST];
($max, $path) = ($cur_depth, $route) if $cur_depth > $max;
my $left = $greater{$last_node}[CNT];
next if ! $left || $cur_depth + $left <= $max;
push @work, map ["$route:$_", $cur_depth + 1, $_], @{$greater{$last_node}[NODE]};
}
my $hidden_msg = join '', map $lookup{$_}, split /:/, $path;
return $hidden_msg;
}
# All vals in 1 ref > corresponding vals in other ref
sub is_greater {
my ($ref1, $ref2) = @_;
my $cmp = $ref1->[0] <=> $ref2->[0] or return;
($ref1->[$_] <=> $ref2->[$_]) == $cmp || return for 1 .. $#$ref1;
return $cmp;
}
</CODE>
</READMORE>
<p>
Do you know of a more efficient way to do this? Are there ways to improve this approach? Do you know how dynamic programming might help (Some implementations for 2 strings I found used DP)? Do you know how this is useful or how can the approach be adapted and applied to be useful?
</p>
<div class="pmsig"><div class="pmsig-180961">
<p>
Cheers - [Limbic~Region|L~R]
</p>
</div></div>