PerlMonks  

Re: Patience Sorting To Find Longest Increasing Subsequence

by bart (Canon)
on May 05, 2006 at 07:51 UTC (#547607)


in reply to Patience Sorting To Find Longest Increasing Subsequence

Your code doesn't do the complete patience sorting. It does more or less sort, but you're left with some sorted piles of cards after it finishes (typically 11 piles for a deck of 52 cards, I read). You still need to do the second step: pick the lowest card from every pile, but you don't know what pile that is, except when you just start, when it's the leftmost one. It would be some form of merge sort.

I found one of the original scientific papers in PDF format (30 pages, 277k), in year 1999, item 2.

On to your questions.

Do the algorithms still work if the original list sorted in ascending order is non-contiguous (! 1..N)?
Of course they do. All the sorting ever does is compare two items; apart from that, their actual values are never used.

Do the algorithms still work if the original list contains duplicates?
Yes, but depending on what you do when two cards compare equal, your sort might or might not be a stable sort. You would get a stable sort if you treat the current card as the larger one when they are equal.
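To make the tie rule concrete, here is a minimal Perl sketch of the dealing step with a linear placement search. It is not the code from the root meditation: `deal_piles` and the `$ties_right` flag are names made up for illustration. With `$ties_right` true, a card equal to a pile's top is treated as the larger one and moves on to the right.

```perl
use strict;
use warnings;

# Deal cards into patience piles. Each pile is an array ref whose last
# element is the top card. $ties_right is a hypothetical flag: when true,
# a card equal to a pile's top counts as the larger one and moves on.
sub deal_piles {
    my ($cards, $ties_right) = @_;
    my @piles;
    CARD: for my $card (@$cards) {
        for my $pile (@piles) {
            my $top = $pile->[-1];
            if ($ties_right ? $card < $top : $card <= $top) {
                push @$pile, $card;      # leftmost pile that accepts the card
                next CARD;
            }
        }
        push @piles, [$card];            # no pile accepts it: start a new one
    }
    return \@piles;
}

my @cards = (3, 1, 3, 2, 2);
printf "ties on top: %d piles, ties to the right: %d piles\n",
    scalar @{ deal_piles(\@cards, 0) }, scalar @{ deal_piles(\@cards, 1) };
```

For this sample deck the sketch reports 2 piles and 3 piles. With ties placed on top, the pile count equals the length of the longest strictly increasing subsequence; with ties sent to the right, it equals the length of the longest non-decreasing one, so the choice matters for the LIS question too.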

Are there ways to maintain the same speed and decrease the memory consumption?
If so, I haven't found one.

Would there be a benefit in replacing the linear card placement search with a binary one, or would the overhead outweigh the benefit?
No idea... In practice, I have often found binary search to be slower than linear search for few items, and sometimes not even that few: I've seen linear search being faster for a search in 500 items.

But we're still stuck with an incomplete sorting.

Perhaps binary search, or rather a binary tree, could be beneficial to complete the sorting: when you have constructed the piles, put (just) the top cards in the binary tree, remove the lowest value from it and from its pile, and insert the next card from the same pile that it came from. Repeat until all piles are depleted and the tree is empty.
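The finishing step described above might look like the sketch below. A binary min-heap is swapped in here for the binary tree the post mentions (an assumption; any structure with cheap remove-lowest and insert would do), and `merge_piles` is a hypothetical name. Each pile is assumed to be an array ref with its top, smallest card last.

```perl
use strict;
use warnings;

# Merge patience piles into one sorted list using a binary min-heap of
# [top card, pile index] pairs. All names here are illustrative.
sub merge_piles {
    my @piles = map { [@$_] } @{ $_[0] };    # copy so the caller's piles survive
    my @heap;

    my $push = sub {
        my ($entry) = @_;
        push @heap, $entry;
        my $i = $#heap;
        while ($i > 0) {                      # sift up
            my $parent = int(($i - 1) / 2);
            last if $heap[$parent][0] <= $heap[$i][0];
            @heap[$parent, $i] = @heap[$i, $parent];
            $i = $parent;
        }
    };

    my $pop = sub {
        my $min  = $heap[0];
        my $last = pop @heap;
        if (@heap) {
            $heap[0] = $last;
            my $i = 0;
            while (1) {                       # sift down
                my $small = $i;
                for my $child (2 * $i + 1, 2 * $i + 2) {
                    $small = $child
                        if $child < @heap && $heap[$child][0] < $heap[$small][0];
                }
                last if $small == $i;
                @heap[$small, $i] = @heap[$i, $small];
                $i = $small;
            }
        }
        return $min;
    };

    # Seed the heap with the top card of every non-empty pile.
    $push->([pop @{ $piles[$_] }, $_]) for grep { @{ $piles[$_] } } 0 .. $#piles;

    my @sorted;
    while (@heap) {
        my ($card, $idx) = @{ $pop->() };
        push @sorted, $card;
        # Replace the removed card with the next one from the same pile.
        $push->([pop @{ $piles[$idx] }, $idx]) if @{ $piles[$idx] };
    }
    return \@sorted;
}

print "@{ merge_piles([[9, 4, 1], [7, 5], [8]]) }\n";    # 1 4 5 7 8 9
```

Each card costs one heap removal and at most one insertion, so with P piles the merge needs on the order of N log P comparisons.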


Re^2: Patience Sorting To Find Longest Increasing Subsequence
by Limbic~Region (Chancellor) on May 05, 2006 at 13:02 UTC
    bart,
    Your code doesn't do the complete patience sorting. It does more or less sort, but you're left with some sorted piles of cards after it finishes (typically 11 piles for a deck of 52 cards, I read). You still need to do the second step: pick the lowest card from every pile, but you don't know what pile that is, except when you just start, when it's the leftmost one. It would be some form of merge sort.

    No additional sorting is required to find the longest increasing subsequence. As far as picking the lowest card from every pile - that is easy, it is the top card in each pile.

    print "$_->[-1][VAL]\n" for @pile;

    I think you tripped over your wording. I believe you intended to say that at the end of the game, the cards are still not in complete order. While you can get the 1st card for free (the top card on the leftmost pile), finding the second card still requires scanning the top card of each pile for the lowest one.

    As I said, this is unnecessary for finding the longest increasing subsequence, which is what this meditation was about. The straightforward approach to finish the sorting would indeed be a merge sort. To make it even more efficient, the piles could be part of a balanced btree. You always select from the leftmost pile and then re-insert that pile into the tree based on the card underneath.

    The questions to ponder were from the perspective of still finding the longest increasing subsequence.

    As for the binary search: I have 2 versions here to get to the partial sort (sufficient to find the longest increasing subsequence), so benchmarking should be trivial. As for completing the sort, I mentioned above one possible way of making it more efficient than a merge sort, but I am not planning on taking it that far. IOW, it is left as an exercise for the reader. Here is the merge sort though, so you have a complete answer to your original question in the CB:

    Cheers - L~R

Re^2: Patience Sorting To Find Longest Increasing Subsequence
by Limbic~Region (Chancellor) on May 05, 2006 at 15:51 UTC
    bart,
    With regard to your follow-up in the CB concerning efficiency: without resorting to complex data structures, the partial patience sort can be done in O(N Log N), assuming the binary search. Finding the LIS costs at most an additional N in the worst case, so O(N Log N + N). The question you posed was whether the merge sort, O(N^2), was the most efficient way to finish the sort.
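As a rough illustration of the partial sort plus LIS recovery, here is a minimal sketch. It assumes a strictly increasing subsequence is wanted, keeps only the top card of each pile, and records for every card a back pointer to the top of the pile on its left. The names `lis`, `@top` and `@prev` are illustrative and not taken from the root node's code.

```perl
use strict;
use warnings;

# Longest strictly increasing subsequence via patience dealing.
# @top holds, per pile, the index (into @seq) of that pile's top card.
# @prev holds, per card, the index of the card it chains back to.
sub lis {
    my @seq = @_;
    my (@top, @prev);
    for my $i (0 .. $#seq) {
        my ($lo, $hi) = (0, scalar @top);
        while ($lo < $hi) {           # leftmost pile whose top is >= $seq[$i]
            my $mid = int(($lo + $hi) / 2);
            if ($seq[ $top[$mid] ] < $seq[$i]) { $lo = $mid + 1 }
            else                               { $hi = $mid }
        }
        $prev[$i] = $lo > 0 ? $top[$lo - 1] : undef;
        $top[$lo] = $i;               # place on that pile, or start a new one
    }
    my @out;
    for (my $i = $top[-1]; defined $i; $i = $prev[$i]) {
        unshift @out, $seq[$i];       # walk the back pointers right to left
    }
    return @out;
}

print join(' ', lis(3, 1, 4, 1, 5, 9, 2, 6)), "\n";    # 1 4 5 6
```

The binary search does about log N comparisons per card, and walking the back pointers visits at most N cards, which is where the N Log N plus N estimate comes from.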

    If you take advantage of the fact that the top cards are already in ascending order, you can select the top card of the leftmost pile and then move that pile to keep the new top card in ascending order. To find the new location using a binary search you have O(Log N). To insert in the middle is O(N). Since you have to do this for N items, you end up with O(N^2 Log N). A merge sort is only O(N^2) worst case so no, I don't think so.
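A minimal sketch of the move-the-pile approach, assuming each pile is an array ref with its top card last; `finish_by_splice` is a hypothetical name. Note that in this sketch the binary search and the splice run one after the other for each card, so their costs add rather than multiply.

```perl
use strict;
use warnings;

# Finish a patience sort by repeatedly taking the top of the leftmost pile
# and re-inserting that pile where its new top keeps the tops ascending.
sub finish_by_splice {
    my @piles = map { [@$_] } @{ $_[0] };    # work on copies
    my @sorted;
    while (@piles) {
        my $pile = shift @piles;              # leftmost pile has the lowest top
        push @sorted, pop @$pile;
        next unless @$pile;                   # pile exhausted, drop it
        my ($lo, $hi) = (0, scalar @piles);
        while ($lo < $hi) {                   # binary search for the new slot
            my $mid = int(($lo + $hi) / 2);
            if ($piles[$mid][-1] < $pile->[-1]) { $lo = $mid + 1 }
            else                                { $hi = $mid }
        }
        splice @piles, $lo, 0, $pile;         # shifts the piles to the right
    }
    return \@sorted;
}

print "@{ finish_by_splice([[9, 4, 1], [7, 5], [8]]) }\n";    # 1 4 5 7 8 9
```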

    As I said in the other reply, using a different data structure could make the finishing of the sort more efficient, but it also adds a great deal more complexity. You are welcome to use a van Emde Boas tree, which claims to be able to do the whole thing in O(N Log N), but that is an exercise left for the reader.

    Cheers - L~R

      A merge sort is only O(N^2) worst case so no,

      No, a merge sort is worst case O(n log n).

      You are welcome to use a van Emde Boas tree, which claims to be able to do the whole thing in O(N Log N)

      Actually the paper by Bespamyatnikh & Segal contains a proof that you can do it in O(N log log N) time. I haven't verified it, though.

      But I doubt that a vEB-based algorithm would in practice beat a simpler algorithm for patience sorting, unless you were dealing with a deck of tens of thousands or even millions or billions of cards. The overhead of maintaining a vEB tree is prohibitive for small datasets: the cost of doing binary operations on the keys, maintaining the vEB tree, and so on would most likely outweigh that of a simpler, less efficient algorithm.

      As bart said, sometimes a binary search is not as fast as a linear scan, even though one is O(log N) and the other is O(N). The reason, of course, is that big-O notation glosses over issues like cost per operation and focuses only on the dominant factor. So if an operation takes 4 units of work in a binary search and 1 unit in a linear search, then binary search only wins when 4 * log N < N, so for lists of 16 elements or fewer (taking the log as base 2) there would be no win to a binary search. And I'd bet that in fact the ratio is probably something like 20:1 and not 4:1. Apply this kind of logic to a deck of 52 cards, and IMO it's pretty clear that vEB trees are not the way to go for this, regardless of the proof.
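The break-even arithmetic can be checked with a few lines of Perl. The sketch assumes the logarithm is base 2 and uses the purely illustrative cost ratios from the post (4:1 and 20:1); `break_even` is a made-up helper name.

```perl
use strict;
use warnings;

# Smallest list length N >= 2 at which $cost * log2(N) < N, i.e. where a
# binary search step costing $cost units starts to beat a 1-unit linear step.
sub break_even {
    my ($cost) = @_;
    my $n = 2;
    $n++ until $cost * log($n) / log(2) < $n;
    return $n;
}

printf "cost %2d:1 -> binary search wins from N = %d\n", $_, break_even($_)
    for 4, 20;
```

Under those assumptions it reports N = 17 for the 4:1 ratio and N = 144 for 20:1, so a 52-card deck sits well below the second threshold.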

      ---
      $world=~s/war/peace/g

        demerphq,
        I had taken bart's word for the merge sort in the CB. I later told him the math was wrong (in the CB) but didn't change it because I knew the math was also wrong for his desired method of finishing the sorting. The thing neither of us considered is that the problem space decreases with each pass. In any case, regardless of the accuracy of the math - the merge sort is still the most efficient given the data structure.

        Actually the paper by Bespamyatnikh & Segal contains a proof that you can do it in O(N log log N) time.
        What is the "it" that is O(N log log N) time though? The partial sort needed to obtain the LIS, obtaining the LIS itself, or completing the patience sort? What I understood from the paper, which I admittedly only read far enough to know that it was over my head, was that the O(N log log N) was not for a complete sort which Wikipedia agrees with.

        As far as the binary search is concerned, I have provided implementations to get to the partial sort using both methods, so benchmarking shouldn't be hard. Additionally, implementing a binary search & splice approach to bench against the merge sort is also straightforward.

        Cheers - L~R
