Re: Puzzle: Longest Increasing Sequence
by ikegami (Pope) on Apr 16, 2006 at 08:49 UTC

This is just a conversion to Perl of this pseudocode. It finds a Longuest Ascending Sequence in O(N log N). The sequence it happens to find is the first sequence when sorted numerically.
# Minimal Longest Ascending Subsequence
sub mlas {
my $n_las; # Length of output sequence
my @las; # Output sequence
my @terminals; # Array of terminals
my @backptrs; # Array of back pointers
$terminals[1] = 0;
$backptrs[0] = 1;
$n_las = 1;
for my $i (1..$#_) {
my $low = 1;
my $high = $n_las;
while ($low <= $high) {
my $mid = int(($low + $high) / 2);
if ($_[$i] <= $_[$terminals[$mid]]) {
$high = $mid  1;
} else {
$low = $mid + 1;
}
}
$terminals[$low] = $i;
if ($low <= 1) {
$backptrs[$i] = 1;
} else {
$backptrs[$i] = $terminals[$low  1];
}
if ($low > $n_las) {
$n_las++;
}
}
for (
my $i = $terminals[$n_las];
$i != 1;
$i = $backptrs[$i]
) {
unshift(@las, $_[$i]);
}
return @las;
}
{
my @a = (8, 6, 5, 1, 9, 3, 7, 4, 2, 10);
my @mlas = mlas(@a);
local $, = ", ";
local $\ = "\n";
print(@mlas); # 1, 3, 4, 10
}
{
my @a = (3, 10, 6, 1, 5, 7, 8, 2, 4, 9);
my @mlas = mlas(@a);
local $, = ", ";
local $\ = "\n";
print(@mlas); # 1, 5, 7, 8, 9
}
The page mentions a secant search would be faster than a binary search although the O() of would still be the same The O() would be slightly better, as noted by jdalbec.  [reply] [d/l] 

Actually, O(n*log(n)**((sqrt(5)1)/2)) is strictly less than O(n*log(n)). The difference probably won't be noticeable for reasonable values of n, though, since log(n) grows so slowly anyway.
 [reply] 

oh yeah. I thought it was *, not **
 [reply] 

Just for kicks, here is the same implemented in Ruby. I noticed that the original page has a bug you fixed  towards the end of the main loop, there is:
if ($low > $n_las) {
$n_las++;
}
While in the pseudocode it's "if (low .ge. n_as) n_as += 1" and I presume .ge. means "greater or equal" ?!
Anyways, here's my implementation. It is notably cleaner than yours, but twice slower:
def find_lis_greedy(seq)
n = seq.length
terminals = [0] * (n + 1)
backptrs = [1] + [0] * (n  1)
lis = []
n_lis = 1
1.upto(n  1) do i
low = 1
high = n_lis
while low <= high
mid = (low + high) / 2
if seq[i] <= seq[terminals[mid]]
high = mid  1
else
low = mid + 1
end
end
terminals[low] = i
if low <= 1
backptrs[i] = 1
else
backptrs[i] = terminals[low  1]
end
n_lis += 1 if low > n_lis
end
lis[n_lis  1] = seq[terminals[n_lis]]
temp = terminals[n_lis]
(n_lis  2).downto(0) do i
temp = backptrs[temp]
lis[i] = seq[temp]
end
return lis
end
 [reply] [d/l] [select] 

It is indeed a bug fix, not a typo.
The other thing I did was to clean up the code that builds the return value by using unshift. I no longer needed a counting loop, so I removed the counter variable ("i") and replaced the counting loop with a while loop (implemented as a for loop). "temp" was a bad name — even i, j, etc would be better since temp holds an index — so I renamed it to "i".
 [reply] [d/l] 

It finds a Longuest Ascending Sequence in O(N log N). The sequence it happens to find is the first sequence when sorted numerically.
I implemented a variant of this that will find all longest increasing sequences in the list.
It returns an iterator over all of the LAS'es in the list. The code contains an explanation of how it works.

$world=~s/war/peace/g
 [reply] [d/l] 
Re: Puzzle: Longest Increasing Sequence
by BrowserUk (Pope) on Apr 16, 2006 at 11:05 UTC

Examine what is said, not who speaks  Silence betokens consent  Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco.  Rule 1 has a caveat!  Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
 [reply] [d/l] 

 [reply] 
Re: Puzzle: Longest Increasing Sequence
by kvale (Monsignor) on Apr 16, 2006 at 08:50 UTC

 [reply] 
Re: Puzzle: Longest Increasing Sequence
by TedPride (Priest) on Apr 16, 2006 at 22:02 UTC

It was a homework problem assigned to my dad by his Advanced Algorithms instructor (5000level course), but the instructor only required an O(n^2) algorithm. I, on the other hand, was sure there was SOME way to do it in O(n lg n), so I lay there thinking about the problem for several nights (what does this say about me?) and finally figured out how to do it in a relatively efficient manner. Basically, you store the 1sequence, 2sequence, 3sequence, etc. with the lowest last element. The last elements are therefore arranged in order from the 1sequence to the xsequence, since a longer sequence has a larger smallest end item. So for all items in the original array (n), you do a find in the sequence array for the largest item that is smaller than or equal to the item (lg s, where s is the maximum sequence length found so far, or lg n for the worstcase input), link your current item to that sequence and assign it to the sequence one level up, for a total of O(n lg n) time.
So, with an input of 3,5,2,7,12,1...
3: No items to compare to, so put in position 1:
3
5: Larger than 3, so added to the sequence in position 1 and moved to position 2:
3  3,5
2: No items smaller, so replaces the squence in position 1:
2  3,5
7: Larger than 5, so added to sequence in position 2 and put in position 3:
2  3,5  3,5,7
12: Added and put in position 4:
2  3,5  3,5,7  3,5,7,12
1: All items larger, so put in position 1:
1  3,5  3,5,7  3,5,7,12
The longest sequence is 3,5,7,12. Now, you might say that since you have to move each sequence up one as you go along, it doesn't matter that it takes only lg n time to find the proper sequence, since it could theoretically take n time to move the sequence with worstcase input. HOWEVER, if you use nodes and links rather than moving the entire sequence, each move only takes constant time, so you succeed in your objective of O(n lg n) or less time, and the most storage space you'll need is n nodes and n sequence storage.
The main issues to think about are:
How do you find the largest key smaller than (or equal to, if allowing nonunique keys) a certain key in a sorted array in O(lg n) time?
How do you keep track of which nodes are linked to and which are no longer needed? I know Perl automatically deallocates structures that are no longer linked to, but if you were programming this in something like C++, you'd need to keep track.
If you can solve these two problems, the basic algorithm itself is quite easy.  [reply] 

This algorithm will not give us a O(nlogn) running time, because you are copying one of the subsequences on each step. So, it will give you O(n^2). To have O(nlogn) you should do binary search through the last elements of each subsequence and insert it respectively...
 [reply] 
Re: Puzzle: Longest Increasing Sequence
by gaal (Parson) on Apr 16, 2006 at 08:25 UTC

Isn't this O(n)? Scan the list and maintain the tuple (start index, length) for your best candidate so far, as well as for the subsequence you are looking at now. If you are currently looking at your best candidate, keep incremening both length field as long as the sequence is monotonic. Reset the 'current' tuple or install the 'best' tuple as appropriate.  [reply] 

 [reply] [d/l] 

9 7 5 3 1 x
(i.e. you've processed all but the last number) you won't know until you look at the last number which of the preceding numbers (if any) are in the solution.
x could be 10, or it could be 2, for example.
So you need to keep all the increasing subsequences seen so far; and that could be as many as n; and that means the algorithm is at least O(n log n).
We're building the house of the future together.
 [reply] [d/l] 

Well, no, you only need to keep at most one increasing subsequence for each length.
In your example, you can forget about the 9, 7, 5, and 3 entirely. Any increasing subsequence beginning with them will be just as long if they begin with the 1 instead.
 [reply] 

Yes, as BrowserUk++ points out, I was wrongly assuming contiguous subsequences in the problem statement.
 [reply] 
Re: Puzzle: Longest Increasing Sequence
by demerphq (Chancellor) on Apr 16, 2006 at 21:59 UTC

Heres my go. Originally I had something that was performing between 2N1 and N*((N+1)/2), but kaif provided the insight I needed to get it to N log N when he pointed out that you only need to store one path of a given length at any one time.
BTW, I wanted a single routine that could return either the indexes or the actual values, so this is a little clunkier than it need be if you didnt care about the indexes.
Update: I was wrong, the worse case run time for this routine is (N+1)/2*N, which is O(N^{2}).

$world=~s/war/peace/g
 [reply] [d/l] 