Re: Heap sorting in perl

by pg (Canon)
on Apr 05, 2003 at 15:16 UTC


in reply to Heap sorting in perl

A separate sort step does not seem necessary.

You can do this:
  1. Take the first N elements of the unsorted array and form a new array.
  2. Sort the new array. This is fast, assuming N is much smaller than the size of the original array.
  3. Go through the remaining elements of the original array. If an element is smaller than the largest element of the new array and is not already in the new array, insert it into the sorted array and remove the largest element.
  4. This algorithm goes through the entire original array, but only to compare. Most of the time no extra work (inserting) is needed beyond the comparison.
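
The steps above can be sketched in Perl roughly as follows. This is only one possible reading of the description, not code from the original post: the name smallest_n is hypothetical, the values are assumed to be numbers, and a binary search is used to find the insertion point (the post does not say how to insert). Note that, as described, the duplicate check in step 3 only applies to elements seen after the first N.

```perl
use strict;
use warnings;

# Return the $n smallest values of @$data in ascending order.
# smallest_n is a hypothetical helper name, not from the thread.
sub smallest_n {
    my ($n, $data) = @_;
    return [] if $n <= 0;
    my @rest = @$data;    # work on a copy so the caller's array is untouched

    # Steps 1 and 2: take the first $n elements and sort them.
    my @small = sort { $a <=> $b } splice(@rest, 0, $n);

    # Step 3: scan the remaining elements.
    for my $x (@rest) {
        next if $x >= $small[-1];    # not smaller than the current largest

        # Binary search for the first position whose value is >= $x.
        my ($lo, $hi) = (0, $#small);
        while ($lo < $hi) {
            my $mid = int(($lo + $hi) / 2);
            if ($small[$mid] < $x) { $lo = $mid + 1 } else { $hi = $mid }
        }
        next if $small[$lo] == $x;   # already present, so skip it

        splice(@small, $lo, 0, $x);  # insert at the sorted position
        pop @small;                  # drop the old largest
    }
    return \@small;
}

print join(" ", @{ smallest_n(3, [5, 1, 9, 3, 7, 2, 8]) }), "\n";
```

Most elements fail the first comparison against $small[-1] and cost nothing more, which is the point made in step 4.
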
(struck out per adrianh's comment) If you are willing to spend a little more effort, I suggest this improvement:

In step 3, instead of going through the entire original array, you can chop the original array into pieces, say 10 * N elements each (tweak the 10; it could be 5, it could be 50...). Sort each piece (we are sorting some smaller arrays), then take only the first N elements of each sorted piece and go through them.

Replies are listed 'Best First'.
Re: Re: Heap sorting in perl
by blakem (Monsignor) on Apr 05, 2003 at 19:58 UTC
    A heap is a specialized data structure that can be thought of as an "automatically sorted array", given a somewhat scaled-down definition of "sorted". Sorted in this case means that the largest element is always easy to find and remove, and that inserting new elements is also easy.

    Given that, our steps to find the smallest M elements would be:

    1. Build a heap from our first M items
    2. Compare next item to largest element in heap
    3. Replace largest with new if new < largest
    4. Repeat steps 2 and 3 for all remaining items

    As you can see, this is very similar to the strategy you proposed. In fact, you could view heaps as a data structure designed specifically to implement this algorithm efficiently.
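
The four steps above might look something like this in Perl. It is a minimal sketch under some assumptions: the heap is hand-rolled as a binary max-heap stored in a plain array (no CPAN heap module is used), the values are numbers, and the names sift_down and smallest_m are made up for illustration.

```perl
use strict;
use warnings;

# Restore the max-heap property by sinking the item at index $i.
sub sift_down {
    my ($heap, $i) = @_;
    my $size = @$heap;
    while (1) {
        my ($l, $r, $big) = (2 * $i + 1, 2 * $i + 2, $i);
        $big = $l if $l < $size && $heap->[$l] > $heap->[$big];
        $big = $r if $r < $size && $heap->[$r] > $heap->[$big];
        last if $big == $i;                      # both children are smaller
        @$heap[$i, $big] = @$heap[$big, $i];     # swap with the larger child
        $i = $big;
    }
}

# Return the $m smallest items of @$data, in heap order (not sorted).
sub smallest_m {
    my ($m, $data) = @_;
    return [] if $m <= 0;
    my @items = @$data;    # copy so the caller's array is untouched

    # Step 1: build a max-heap from the first $m items.
    my @heap = splice(@items, 0, $m);
    sift_down(\@heap, $_) for reverse 0 .. int(@heap / 2) - 1;

    # Steps 2 to 4: the largest kept item always sits at $heap[0].
    for my $x (@items) {
        next unless $x < $heap[0];   # step 2: compare with the largest
        $heap[0] = $x;               # step 3: replace the largest
        sift_down(\@heap, 0);        # re-establish the heap
    }
    return \@heap;
}

my $result = smallest_m(3, [5, 1, 9, 3, 7, 2, 8]);
print join(" ", sort { $a <=> $b } @$result), "\n";
```

Each replacement costs at most about log2(M) swaps, compared with shifting up to M elements when inserting into a sorted array.
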

    -Blake

Re^2: Heap sorting in perl
by adrianh (Chancellor) on Apr 05, 2003 at 15:44 UTC
    In step 3, instead of going through the entire original array, you can chop the original array into pieces, say 10 * N elements each (tweak the 10; it could be 5, it could be 50...). Sort each piece (we are sorting some smaller arrays), then take only the first N elements of each sorted piece and go through them.

    How is this an improvement (he asks curiously ;-)? I don't see how creating and sorting several subsets of the data can be cheaper in time or space than a single pass through everything.

      adrianh is obviously right about this.
