Re^4: efficient perl code to count, rank (merge sort)

by LanX (Sage)
on Jul 18, 2021 at 12:36 UTC ( #11135129=note )


in reply to Re^3: efficient perl code to count, rank (B-tree)(update)
in thread efficient perl code to count, rank

> (Well you could tie an AoA to a file representing a table. Sorting that array would go without much RAM but mean constant overhead with each access. This is certainly worth a shot, but as I said DBs have this already optimized.)

I don't think this will work, because of the way merge sort is implemented in Perl.

It first compares all adjacent pairs across the whole array, breadth-first, before merging any larger runs. On a file-backed array that would mean far too many disk operations, with no sensible caching option.

  DB<7> sort { print "$a,$b |"; $a <=> $b } reverse 1..16
16,15 |14,13 |12,11 |10,9 |8,7 |6,5 |4,3 |2,1 |15,13 |15,14 |11,9 |11,10 |13,9 |13,10 |13,11 |13,12 \
|7,5 |7,6 |3,1 |3,2 |5,1 |5,2 |5,3 |5,4 |9,1 |9,2 |9,3 |9,4 |9,5 |9,6 |9,7 |9,8 |
  DB<8>

Explained:

16,15 |14,13 |12,11 |10,9 |8,7 |6,5 |4,3 |2,1 |    # sort pairs
15,13 |15,14 |11,9 |11,10                          # sort 1st and 2nd 4-tuple
|13,9 |13,10 |13,11 |13,12                         # sort 1st 8-tuple
|7,5 |7,6 |3,1 |3,2                                # sort 3rd and 4th 4-tuple
|5,1 |5,2 |5,3 |5,4                                # sort 2nd 8-tuple
|9,1 |9,2 |9,3 |9,4 |9,5 |9,6 |9,7 |9,8 |          # sort 16-tuple
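For anyone who wants to reproduce the trace outside the debugger, here is a minimal standalone sketch. It collects the compared pairs in an array (the name `@trace` is just for illustration) instead of printing from inside the comparator. The exact order and number of comparisons depend on the sort implementation of your perl, so the output may differ from the session above.

```perl
use strict;
use warnings;

# Log every pair the built-in sort hands to the comparator.
my @trace;
my @sorted = sort { push @trace, "$a,$b"; $a <=> $b } reverse 1 .. 16;

# Print the pairs in the order they were compared, same layout as above.
print join(" |", @trace), " |\n";
printf "%d comparisons\n", scalar @trace;
```

Grouping the printed pairs by hand, as in the explanation above, shows which chunks of the array are touched at each stage.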

Caching or memoizing the data read from disk would be far more efficient if the chunks were chosen strictly depth-first.
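As a rough illustration of what "strictly depth first" could look like, here is a sketch of a plain top-down merge sort in pure Perl. This is not how the built-in sort works; `merge_sort_df` and `@trace` are hypothetical names, and the in-memory array merely stands in for a file-backed one. The point is that each chunk is sorted completely before its neighbour is touched.

```perl
use strict;
use warnings;

my @trace;    # comparison log, for illustration only

# Hypothetical helper: sorts $data->[$lo .. $hi] in place, depth-first.
sub merge_sort_df {
    my ($data, $lo, $hi) = @_;
    return if $lo >= $hi;
    my $mid = int(($lo + $hi) / 2);

    merge_sort_df($data, $lo, $mid);         # finish the left chunk completely ...
    merge_sort_df($data, $mid + 1, $hi);     # ... before touching the right one

    # Merge the two sorted halves back into the same range.
    my @left  = @{$data}[$lo .. $mid];
    my @right = @{$data}[$mid + 1 .. $hi];
    my $i = $lo;
    while (@left && @right) {
        push @trace, "$left[0],$right[0]";
        $data->[$i++] = $left[0] <= $right[0] ? shift @left : shift @right;
    }
    $data->[$i++] = $_ for @left, @right;    # copy whatever is left over
}

my @data = reverse 1 .. 16;
merge_sort_df(\@data, 0, $#data);

print join(" |", @trace), " |\n";
print "@data\n";
```

With this ordering the third comparison is already 15,13: the first 4-tuple is merged right after its two pairs, so a cache would only need to hold the chunk currently being worked on.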

FWIW, Wikipedia lists some divide-and-conquer approaches for merge sort:

https://en.wikipedia.org/wiki/Mergesort#Parallel_merge_sort

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery
