PerlMonks  

Re^2: Longest Common SubSequence Not Working Correctly

by blokhead (Monsignor)
on Nov 13, 2007 at 20:01 UTC (#650578)


in reply to Re: Longest Common SubSequence Not Working Correctly
in thread Longest Common SubSequence Not Working Correctly

Don't let the name of the subroutine fool you. The "brute force" algorithm is not really "brute forcing" the problem. A brute force approach would be to consider every possible subsequence of the strings, taking O(2^min(x,y)) time.

In fact, the "brute force" algorithm is doing the same thing as the recursive algorithm (i.e., doing the dynamic programming solution), but iteratively. It uses a standard trick for making a recursive memoizing dynamic programming algorithm iterative. Since the two algorithms solve the problem in essentially the same way, but the iterative one doesn't have the overhead of subroutine calls (which are slow in Perl), it is no surprise that the iterative one is faster.

Usually it's easier and more intuitive to write a dynamic programming problem in terms of recursive calls. However, it's necessary to memoize the result of each recursive call, because several other subproblems might use the result of this subproblem in their computation.

Now imagine a table that holds all of these memoized results. What happens to this table while the recursive algorithm is running? The table is gradually filling up. How does it fill up? Well, in this case, to compute the value of the subproblem ($a,$b), I need to get the solutions for at most these three subproblems:

($a,substr($b,0,-1)), (substr($a,0,-1),$b), (substr($a,0,-1),substr($b,0,-1))
In other words, I need to have those 3 cells in the table filled in before I can fill in this cell.
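To make the recursion concrete, here is a minimal memoized sketch. It is not the code from the parent node: the name lcs_memo, the %memo hash, and the variables $s and $t (used instead of $a and $b, which Perl reserves for sort) are all my own choices, and the key scheme assumes the inputs contain no NUL bytes.

```perl
use strict;
use warnings;

my %memo;    # the "table": maps a pair of strings to one of its LCSs

# Hypothetical helper name; returns one longest common subsequence.
sub lcs_memo {
    my ($s, $t) = @_;
    return "" if $s eq "" || $t eq "";

    my $key = join "\0", $s, $t;    # assumes inputs contain no NUL bytes
    return $memo{$key} if exists $memo{$key};

    my $s_head = substr $s, 0, -1;  # $s without its last character
    my $t_head = substr $t, 0, -1;  # $t without its last character

    my $best;
    if (substr($s, -1) eq substr($t, -1)) {
        # Last characters match: only the diagonal subproblem is needed.
        $best = lcs_memo($s_head, $t_head) . substr($s, -1);
    }
    else {
        # Otherwise take the longer answer of the two remaining subproblems.
        my $p = lcs_memo($s, $t_head);
        my $q = lcs_memo($s_head, $t);
        $best = length($p) >= length($q) ? $p : $q;
    }
    return $memo{$key} = $best;
}

print lcs_memo("AGGTAB", "GXTXAYB"), "\n";
```

Each distinct pair of prefixes is solved once and then served from %memo, which is what keeps the recursion from redoing the same work.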

So suppose I now do things iteratively instead of recursively, and just concentrate on filling up the table. I'll visit the table's cells in an order such that I visit the cell ($a,$b) only after I have visited the three cells above. That way, to fill up the cell ($a,$b), I just check those 3 other cells, do some local comparisons, and I'm done. Finally, the last cell in the table is generally the answer to the "main" subproblem, and I can return that. That's exactly what this "brute force" algorithm is doing.
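The fill order can be sketched as below. This is a common variation rather than the exact code from the thread: it pads the table with an extra row and column of zeros (standing for empty prefixes) so that no bounds checks are needed, and the name lcs_length_table is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical name; returns the LCS length of two strings.
sub lcs_length_table {
    my ($s, $t) = @_;
    my ($n, $m) = (length $s, length $t);

    # $v[$i][$j] = LCS length of the first $i chars of $s and the first $j of $t.
    # Row 0 and column 0 stand for empty prefixes, so they stay 0.
    my @v = map { [ (0) x ($m + 1) ] } 0 .. $n;

    # Row by row, left to right: the three neighbouring cells are always ready.
    for my $i (1 .. $n) {
        for my $j (1 .. $m) {
            if (substr($s, $i - 1, 1) eq substr($t, $j - 1, 1)) {
                $v[$i][$j] = $v[$i - 1][$j - 1] + 1;
            }
            else {
                my ($up, $left) = ($v[$i - 1][$j], $v[$i][$j - 1]);
                $v[$i][$j] = $up > $left ? $up : $left;
            }
        }
    }
    return $v[$n][$m];    # the last cell answers the full problem
}

print lcs_length_table("AGGTAB", "GXTXAYB"), "\n";
```

No recursion and no hash lookups are involved, only two nested loops over the table.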

blokhead


Re^3: Longest Common SubSequence Not Working Correctly
by Anonymous Monk on Nov 14, 2007 at 05:16 UTC
    Thanks for the explanation. How would I modify the above brute force code to be truly brute force? Also, the code only prints out the length of the sequence but does not print out the characters of the sequence. I tried putting in some print statements in between but it does not seem to work correctly. Any help?
      The lcsbruteforce algorithm maintains this big table of solutions to subproblems. In this example, it's maintaining just the length of the LCS. Just change it to maintain the actual subsequence itself:
      sub lcsbruteforce {
          my ($x, $y) = @_;
          return "" unless length($x) && length($y);   # nothing in common with an empty string
          my (@v, $cx, $cy, $left, $above);
          for my $xi (0 .. length($x) - 1) {
              $cx = substr $x, $xi, 1;
              for my $yi (0 .. length($y) - 1) {
                  $cy = substr $y, $yi, 1;
                  if ($cx eq $cy) {
                      # was: $v[$xi][$yi] = 1 + (($xi && $yi) ? $v[$xi - 1][$yi - 1] : 0);
                      $v[$xi][$yi] = ($xi && $yi ? $v[$xi - 1][$yi - 1] : "") . $cx;
                  }
                  else {
                      # was: $left  = ($xi && $v[$xi - 1][$yi]) || 0;
                      # was: $above = ($xi && $v[$xi][$yi - 1]) || 0;
                      # was: $v[$xi][$yi] = ($left > $above) ? $left : $above;
                      # A ternary (rather than || "") keeps a stored "0" from being discarded.
                      $left  = $xi ? $v[$xi - 1][$yi] : "";
                      $above = $yi ? $v[$xi][$yi - 1] : "";   # guard on $yi here, not $xi
                      $v[$xi][$yi] = length($left) > length($above) ? $left : $above;
                  }
              }
          }
          return $v[length($x) - 1][length($y) - 1];
      }
      To change it to an actual brute force algorithm? That would be pretty strange. The brute force algorithm is:
      $best = "";
      for every subsequence $s of $x:
          if $s is also a subsequence of $y:
              $best = $s if length($s) > length($best);
      return $best;
      Of course, the part where you get all subsequences and check for subsequence-ness is a pain. You can probably generate all subsequences using Algorithm::Loops, and perhaps use some regex stuff to check whether a string was a subsequence of another.
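For what it's worth, here is a rough sketch of that exhaustive approach using only core Perl. It swaps in a bitmask loop instead of Algorithm::Loops to enumerate subsequences, and builds a regex of the form a.*b.*c for the subsequence test. The names is_subsequence and lcs_exhaustive are made up for illustration, and this is only sensible for short strings, since it tries all 2^n subsequences of the shorter input.

```perl
use strict;
use warnings;

# True if $s is a subsequence of $t: "abc" becomes the regex a.*b.*c.
sub is_subsequence {
    my ($s, $t) = @_;
    return 1 if $s eq "";    # also avoids Perl's special case for empty patterns
    my $pattern = join '.*', map { quotemeta($_) } split //, $s;
    return $t =~ /$pattern/s ? 1 : 0;
}

sub lcs_exhaustive {
    my ($x, $y) = @_;
    ($x, $y) = ($y, $x) if length($x) > length($y);   # enumerate the shorter string
    my @chars = split //, $x;
    my $n     = @chars;
    my $best  = "";
    for my $mask (0 .. (1 << $n) - 1) {
        # Bit $i of $mask decides whether character $i is kept.
        my $s = join "", map { $chars[$_] } grep { $mask & (1 << $_) } 0 .. $n - 1;
        next if length($s) <= length($best);          # cannot improve on $best
        $best = $s if is_subsequence($s, $y);
    }
    return $best;
}

print lcs_exhaustive("AGGTAB", "GXTXAYB"), "\n";
```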

      blokhead

Re^3: Longest Common SubSequence Not Working Correctly
by Anonymous Monk on Nov 15, 2007 at 00:49 UTC
    Thank you Blokhead. I still have lots to learn about Perl. I have another question for you. The sub lcs takes 2 parameters ($a, $b) but inside the loop where the recursive call is made, the program did a call with

    return lcs($a, $b) . $az;

    What does the . $az do? I tried printing out $a and $b but did not see any differences. The new string is one character shorter. When I remove $az from the return, the program produces the wrong result. Thanks again for your help.
