PerlMonks  

Re^2: sorting array of arrays reference

by kimlid2810 (Acolyte)
on Oct 25, 2013 at 20:23 UTC ( [id://1059752]=note )


in reply to Re: sorting array of arrays reference
in thread sorting array of arrays reference

how is this auxiliary array going to be maintained?

Replies are listed 'Best First'.
Re^3: sorting array of arrays reference
by Laurent_R (Canon) on Oct 26, 2013 at 14:16 UTC

    OK, a bit more time now, this is one possible way of doing it:

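    use strict;
    use warnings;
    use Data::Dumper;

    my @masterArray = (
        ["this", "that", 12563,  "something", "else"],
        ["this", "that", 10,     "something", "else"],
        ["this", "that", 1,      "something", "else"],
        ["this", "that", 125638, "something", "else"],
        ["this", "that", 300000, "something", "else"],
    );

    # Seed the auxiliary array with the first three records,
    # sorted in descending order on the third column.
    my @top3 = sort { $b->[2] <=> $a->[2] } @masterArray[0..2];
    my $min_top = $top3[2][2];

    for my $sub_aref (@masterArray[3..$#masterArray]) {
        # Discard any record that cannot make it into the top three.
        # Compare the value in column 2, not the reference itself.
        next if $sub_aref->[2] <= $min_top;
        # Otherwise add it, re-sort the four candidates and keep the best three.
        @top3 = (sort { $b->[2] <=> $a->[2] } @top3, $sub_aref)[0..2];
        $min_top = $top3[2][2];
    }
    print Dumper @top3;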

    This yields the following result:

    $ perl subdiscard.pl
    $VAR1 = [
              'this',
              'that',
              300000,
              'something',
              'else'
            ];
    $VAR2 = [
              'this',
              'that',
              125638,
              'something',
              'else'
            ];
    $VAR3 = [
              'this',
              'that',
              12563,
              'something',
              'else'
            ];

    A more general solution might be like this:

    use strict;
    use warnings;
    use Data::Dumper;

    # Number of records to generate, taken from the command line.
    my $nb_elements = shift;
    chomp $nb_elements;

    # Populate the master array with random values in the third column.
    my @masterArray;
    push @masterArray, ["", "", int rand (1e7), ""] for 1..$nb_elements;
    # print Dumper \@masterArray;

    my @top3 = sort { $b->[2] <=> $a->[2] } @masterArray[0..2];
    my $min_top = $top3[2][2];

    $nb_elements--;    # index of the last record
    for my $sub_aref (@masterArray[3..$nb_elements]) {
        # Skip records that are not larger than the current third-best value.
        next if $sub_aref->[2] <= $min_top;
        @top3 = (sort { $b->[2] <=> $a->[2] } @top3, $sub_aref)[0..2];
        $min_top = $top3[2][2];
    }
    print Dumper \@top3;

    With one million records, the execution time is about 2.5 seconds:

    $ time perl subdiscard2.pl 1000000
    $VAR1 = [
              [
                '',
                '',
                9999996,
                ''
              ],
              [
                '',
                '',
                9999993,
                ''
              ],
              [
                '',
                '',
                9999990,
                ''
              ]
            ];

    real    0m2.497s
    user    0m2.386s
    sys     0m0.108s

    Sorting the original array and taking the first 3 elements takes about 3 times longer:

    $ time perl subdiscard3.pl 1000000
    $VAR1 = [
              [
                '',
                '',
                9999980,
                ''
              ],
              [
                '',
                '',
                9999955,
                ''
              ],
              [
                '',
                '',
                9999944,
                ''
              ]
            ];

    real    0m7.605s
    user    0m7.518s
    sys     0m0.093s

    In fact, most of the 2.5 seconds taken by the program above (more than 2.2 seconds) is spent populating the array with random values. The real difference between the algorithm presented above and a pure sort is therefore much larger than it appears, probably at least a factor of 10. I'll do a real benchmark later if I can find the time.
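    A benchmark of this kind could be sketched roughly as follows. This is not the code behind the timings above, only a minimal illustration that assumes the core Benchmark module and builds the array once, outside the timed code, so that population time is excluded. The helper names top_n_discard and top_n_sort, the array size and the one-second run time are all hypothetical choices made for the example.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical helper: keep a small sorted auxiliary array of the $n best
# records (by column $col) and discard anything that cannot enter it.
sub top_n_discard {
    my ($records, $n, $col) = @_;
    my $seed_last = $n <= @$records ? $n - 1 : $#{$records};
    my @top = sort { $b->[$col] <=> $a->[$col] } @{$records}[0 .. $seed_last];
    return @top if @$records <= $n;
    my $min_top = $top[-1][$col];
    for my $rec (@{$records}[$n .. $#{$records}]) {
        next if $rec->[$col] <= $min_top;
        @top = (sort { $b->[$col] <=> $a->[$col] } @top, $rec)[0 .. $n - 1];
        $min_top = $top[-1][$col];
    }
    return @top;
}

# Hypothetical baseline: sort everything, then take the first $n records.
sub top_n_sort {
    my ($records, $n, $col) = @_;
    my @sorted = sort { $b->[$col] <=> $a->[$col] } @$records;
    my $last = $n < @sorted ? $n - 1 : $#sorted;
    return @sorted[0 .. $last];
}

# Build the test data once, outside the timed code (size is an assumption).
my $nb_elements = 100_000;
my @masterArray;
push @masterArray, ["", "", int rand(1e7), ""] for 1 .. $nb_elements;

# Sanity check: both approaches must report the same top three values.
my @by_discard = map { $_->[2] } top_n_discard(\@masterArray, 3, 2);
my @by_sort    = map { $_->[2] } top_n_sort(\@masterArray, 3, 2);
die "results differ" unless "@by_discard" eq "@by_sort";

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    discard   => sub { my @t = top_n_discard(\@masterArray, 3, 2) },
    full_sort => sub { my @t = top_n_sort(\@masterArray, 3, 2) },
});
```

    The table printed by cmpthese gives iterations per second for each approach and their relative speed. The actual ratio will depend on the machine and on the array size.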

Re^3: sorting array of arrays reference
by Laurent_R (Canon) on Oct 25, 2013 at 21:06 UTC
    Nothing really complicated, but I just can't do it right now. No time.
