Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

Puzzle: Longest Increasing Sequence

by TedPride (Priest)
on Apr 16, 2006 at 06:52 UTC ( #543621=perlquestion: print w/ replies, xml ) Need Help??
TedPride has asked for the wisdom of the Perl Monks concerning the following question:

Your job, should you choose to accept it, is to come up with an algorithm to find the longest increasing sequence of numbers in a given array in O(n lg n) time. For instance:

A = {8, 6, 5, 1, 9, 3, 7, 4, 2, 10}
Longest increasing sequence = {1, 3, 4, 10}

A = {3, 10, 6, 1, 5, 7, 8, 2, 4, 9}
Longest increasing sequence = {1, 5, 7, 8, 9}

If more than one longest sequence is possible, any of the available longest sequences may be returned.

Have fun!

Comment on Puzzle: Longest Increasing Sequence
Re: Puzzle: Longest Increasing Sequence
by gaal (Parson) on Apr 16, 2006 at 08:25 UTC
    Isn't this O(n)? Scan the list and maintain the tuple (start index, length) for your best candidate so far, as well as for the subsequence you are looking at now. If you are currently looking at your best candidate, keep incremening both length field as long as the sequence is monotonic. Reset the 'current' tuple or install the 'best' tuple as appropriate.

      This works if you are looking for the longest contiguous ascending sequence.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

      The problem is, you could have as many as i (where i = 1 to n) "best candidates" at any time. Consider — if the list looks like this:

      9 7 5 3 1 x
      (i.e. you've processed all but the last number) you won't know until you look at the last number which of the preceding numbers (if any) are in the solution. x could be 10, or it could be 2, for example.

      So you need to keep all the increasing subsequences seen so far; and that could be as many as n; and that means the algorithm is at least O(n log n).

      We're building the house of the future together.
        Well, no, you only need to keep at most one increasing subsequence for each length. In your example, you can forget about the 9, 7, 5, and 3 entirely. Any increasing subsequence beginning with them will be just as long if they begin with the 1 instead.
        Yes, as BrowserUk++ points out, I was wrongly assuming contiguous subsequences in the problem statement.
Re: Puzzle: Longest Increasing Sequence
by ikegami (Pope) on Apr 16, 2006 at 08:49 UTC
    This is just a conversion to Perl of this pseudocode. It finds a Longuest Ascending Sequence in O(N log N). The sequence it happens to find is the first sequence when sorted numerically.
    # Minimal Longest Ascending Subsequence sub mlas { my $n_las; # Length of output sequence my @las; # Output sequence my @terminals; # Array of terminals my @backptrs; # Array of back pointers $terminals[1] = 0; $backptrs[0] = -1; $n_las = 1; for my $i (1..$#_) { my $low = 1; my $high = $n_las; while ($low <= $high) { my $mid = int(($low + $high) / 2); if ($_[$i] <= $_[$terminals[$mid]]) { $high = $mid - 1; } else { $low = $mid + 1; } } $terminals[$low] = $i; if ($low <= 1) { $backptrs[$i] = -1; } else { $backptrs[$i] = $terminals[$low - 1]; } if ($low > $n_las) { $n_las++; } } for ( my $i = $terminals[$n_las]; $i != -1; $i = $backptrs[$i] ) { unshift(@las, $_[$i]); } return @las; } { my @a = (8, 6, 5, 1, 9, 3, 7, 4, 2, 10); my @mlas = mlas(@a); local $, = ", "; local $\ = "\n"; print(@mlas); # 1, 3, 4, 10 } { my @a = (3, 10, 6, 1, 5, 7, 8, 2, 4, 9); my @mlas = mlas(@a); local $, = ", "; local $\ = "\n"; print(@mlas); # 1, 5, 7, 8, 9 }

    The page mentions a secant search would be faster than a binary search although the O() of would still be the same The O() would be slightly better, as noted by jdalbec.

      Actually, O(n*log(n)**((sqrt(5)-1)/2)) is strictly less than O(n*log(n)). The difference probably won't be noticeable for reasonable values of n, though, since log(n) grows so slowly anyway.
        oh yeah. I thought it was *, not **
      Just for kicks, here is the same implemented in Ruby. I noticed that the original page has a bug you fixed - towards the end of the main loop, there is:
      if ($low > $n_las) { $n_las++; }
      While in the pseudocode it's "if (low .ge. n_as) n_as += 1" and I presume .ge. means "greater or equal" ?!

      Anyways, here's my implementation. It is notably cleaner than yours, but twice slower:

      def find_lis_greedy(seq) n = seq.length terminals = [0] * (n + 1) backptrs = [-1] + [0] * (n - 1) lis = [] n_lis = 1 1.upto(n - 1) do |i| low = 1 high = n_lis while low <= high mid = (low + high) / 2 if seq[i] <= seq[terminals[mid]] high = mid - 1 else low = mid + 1 end end terminals[low] = i if low <= 1 backptrs[i] = -1 else backptrs[i] = terminals[low - 1] end n_lis += 1 if low > n_lis end lis[n_lis - 1] = seq[terminals[n_lis]] temp = terminals[n_lis] (n_lis - 2).downto(0) do |i| temp = backptrs[temp] lis[i] = seq[temp] end return lis end

        It is indeed a bug fix, not a typo.

        The other thing I did was to clean up the code that builds the return value by using unshift. I no longer needed a counting loop, so I removed the counter variable ("i") and replaced the counting loop with a while loop (implemented as a for loop). "temp" was a bad name — even i, j, etc would be better since temp holds an index — so I renamed it to "i".

      It finds a Longuest Ascending Sequence in O(N log N). The sequence it happens to find is the first sequence when sorted numerically.

      I implemented a variant of this that will find all longest increasing sequences in the list.

      It returns an iterator over all of the LAS'es in the list. The code contains an explanation of how it works.

      ---
      $world=~s/war/peace/g

Re: Puzzle: Longest Increasing Sequence
by kvale (Monsignor) on Apr 16, 2006 at 08:50 UTC
    Hmm, sounds like a homework problem.

    I'll give you a hint. Look up Patience Sorting.

    -Mark

Re: Puzzle: Longest Increasing Sequence
by BrowserUk (Pope) on Apr 16, 2006 at 11:05 UTC


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      that's exponential, unfortunately. Try it on a sorted array, and you'll end up going through all possible subsequences.

      Update: not to mention that you'll run out of memory on a list longer than 20 elements..

Re: Puzzle: Longest Increasing Sequence
by demerphq (Chancellor) on Apr 16, 2006 at 21:59 UTC

    Heres my go. Originally I had something that was performing between 2N-1 and N*((N+1)/2), but kaif provided the insight I needed to get it to N log N when he pointed out that you only need to store one path of a given length at any one time.

    BTW, I wanted a single routine that could return either the indexes or the actual values, so this is a little clunkier than it need be if you didnt care about the indexes.

    Update: I was wrong, the worse case run time for this routine is (N+1)/2*N, which is O(N2).

    ---
    $world=~s/war/peace/g

Re: Puzzle: Longest Increasing Sequence
by TedPride (Priest) on Apr 16, 2006 at 22:02 UTC
    It was a homework problem assigned to my dad by his Advanced Algorithms instructor (5000-level course), but the instructor only required an O(n^2) algorithm. I, on the other hand, was sure there was SOME way to do it in O(n lg n), so I lay there thinking about the problem for several nights (what does this say about me?) and finally figured out how to do it in a relatively efficient manner. Basically, you store the 1-sequence, 2-sequence, 3-sequence, etc. with the lowest last element. The last elements are therefore arranged in order from the 1-sequence to the x-sequence, since a longer sequence has a larger smallest end item. So for all items in the original array (n), you do a find in the sequence array for the largest item that is smaller than or equal to the item (lg s, where s is the maximum sequence length found so far, or lg n for the worst-case input), link your current item to that sequence and assign it to the sequence one level up, for a total of O(n lg n) time.

    So, with an input of 3,5,2,7,12,1...

    3: No items to compare to, so put in position 1:
    3

    5: Larger than 3, so added to the sequence in position 1 and moved to position 2:
    3 | 3,5

    2: No items smaller, so replaces the squence in position 1:
    2 | 3,5

    7: Larger than 5, so added to sequence in position 2 and put in position 3:
    2 | 3,5 | 3,5,7

    12: Added and put in position 4:
    2 | 3,5 | 3,5,7 | 3,5,7,12

    1: All items larger, so put in position 1:
    1 | 3,5 | 3,5,7 | 3,5,7,12

    The longest sequence is 3,5,7,12. Now, you might say that since you have to move each sequence up one as you go along, it doesn't matter that it takes only lg n time to find the proper sequence, since it could theoretically take n time to move the sequence with worst-case input. HOWEVER, if you use nodes and links rather than moving the entire sequence, each move only takes constant time, so you succeed in your objective of O(n lg n) or less time, and the most storage space you'll need is n nodes and n sequence storage.

    The main issues to think about are:

    How do you find the largest key smaller than (or equal to, if allowing non-unique keys) a certain key in a sorted array in O(lg n) time?

    How do you keep track of which nodes are linked to and which are no longer needed? I know Perl automatically deallocates structures that are no longer linked to, but if you were programming this in something like C++, you'd need to keep track.

    If you can solve these two problems, the basic algorithm itself is quite easy.

      This algorithm will not give us a O(nlogn) running time, because you are copying one of the subsequences on each step. So, it will give you O(n^2). To have O(nlogn) you should do binary search through the last elements of each subsequence and insert it respectively...

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://543621]
Approved by wfsp
Front-paged by wfsp
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (6)
As of 2014-10-25 20:38 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    For retirement, I am banking on:










    Results (148 votes), past polls