in reply to Heap sorting in perl

#!/usr/bin/perl use strict; use warnings; my @heap; sub heapify; sub heapify { my $idx = shift; my $min = $idx; for my $try (2 * $idx + 1, 2 * $idx + 2) { $min = $try if $try < @heap && $heap [$try] < $heap [$min] } return if $min == $idx; @heap [$idx, $min] = @heap [$min, $idx]; heapify $min; } sub extract_min { return unless @heap; my $min = $heap [0]; my $tmp = pop @heap; if (@heap) { $heap [0] = $tmp; heapify 0; } return $min; } # # First argument the number of smallest elements to be returned. # Then the list to find the elements in. # sub smallest { my $k = shift; @heap = @_; for (my $i = int (@heap / 2); $i --;) { heapify $i; } my @result; push @result => extract_min while @heap && $k --; @result; } __END__

The running time of this is O (N + k log N), where N is the size of the set, and k the number of elements to be returned.


Replies are listed 'Best First'.
Re: Re: Heap sorting in perl
by Anonymous Monk on Apr 06, 2003 at 23:59 UTC

    Doesn't this require the whole list to be in memory? And doesn't that create a problem with the "from a very large dataset"?

      The original question didn't specify how the data would be available, so I'm just going to assume it fits in memory. Note that a heapsort only requires a constant amount of memory additional to the data to be sorted.

      If the data doesn't fit in memory, then that's a different problem. One can always tie the used array to whatever solution will keep the data to be worked on out of memory.