Beefy Boxes and Bandwidth Generously Provided by pair Networks
Your skill will accomplish
what the force of many cannot

Re: Heap sorting in perl

by Abigail-II (Bishop)
on Apr 07, 2003 at 00:15 UTC ( #248501=note: print w/replies, xml ) Need Help??

in reply to Re: Re: Heap sorting in perl
in thread Heap sorting in perl

The original question didn't specify how the data would be available, so I'm just going to assume it fits in memory. Note that a heapsort only requires a constant amount of memory additional to the data to be sorted.

If the data doesn't fit in memory, then that's a different problem. One can always tie the used array to whatever solution will keep the data to be worked on out of memory.


Replies are listed 'Best First'.
Re: Re: Heap sorting in perl
by Anonymous Monk on Apr 07, 2003 at 03:40 UTC
      Yeah, whatever. Here's a modification of the program. Give the number of elements to extract as a command line argument, and the elements to extract from on STDIN. It assumes that there are enough elements on STDIN.

      #!/usr/bin/perl use strict; use warnings; my @heap; my $M = @ARGV ? shift : 10; sub heapify; sub heapify { my $idx = shift; my $max = $idx; for my $try (2 * $idx + 1, 2 * $idx + 2) { $max = $try if $try < @heap && $heap [$try] > $heap [$max] } return if $max == $idx; @heap [$idx, $max] = @heap [$max, $idx]; heapify $max; } sub extract_max { return unless @heap; my $min = $heap [0]; my $tmp = pop @heap; if (@heap) { $heap [0] = $tmp; heapify 0; } return $min; } # First load the heap with initial data. for (1 .. $M) { push @heap => scalar <>; } chomp @heap; # # Heapify it. # for (my $i = int (@heap / 2); $i --;) { heapify $i; } # # Deal with the rest. # while (<>) { chomp; next if $_ >= $heap [0]; # It's too large. $heap [0] = $_; heapify 0; } my @result; push @result => extract_max while @heap; @result = reverse @result; print "@result\n"; __END__

      The running time of this is O (N log k), using O (k) memory.


        Every so often I really wish that I had learned more about algorithms. This is one of those times.


      In said note, blakem writes:
      Heap sorting is attractive because selecting the smallest M elements from a dataset of size N (N being much larger than M) requires storing only M elements in memory at any one time.
      That means that he can very well store the entire heap, the M elements, in memory.

      Makeshifts last the longest.

        Abigail's original solution as I read it requires that the entire dataset of size N be passed into the function smallest(). This requires that you have enough memory for N things up front. Which is the original poster's problem, he doesn't. He only comfortably has room for M things, but is fetching through a much larger dataset of N.

        Abigail's follow-up solution resolves this limitation admirably well...

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://248501]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (3)
As of 2021-12-01 03:50 GMT
Find Nodes?
    Voting Booth?

    No recent polls found