PerlMonks  

Re: fast, flexible, stable sort

by bart (Canon)
on Feb 12, 2004 at 20:46 UTC (#328658)


in reply to fast, flexible, stable sort

It doesn't work. Well, it does work, but only if the XFORM routine returns a string of the same length for every item. Otherwise, the sorting could turn out wrong.

I've made a very contrived example containing lots of null bytes, but with a sufficiently large number of array items you can get the same effect with other bytes as well.

Let me demonstrate the effect by sorting a number of variable-length strings as is, and with the packed index appended. As the output shows, the two sorted results are not in the same order at all.

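use Data::Dumper;
$Data::Dumper::Useqq = 1;    # show null bytes as "\0" escapes
my @data = map "\0" x $_, 0 .. 5;
# First list: plain sort. Second list: sort with the packed index appended.
print Dumper [ sort @data ], [ sort map pack("a*N", $data[$_], $_), 0 .. $#data ];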
Result:
$VAR1 = [
          "",
          "\0",
          "\0\0",
          "\0\0\0",
          "\0\0\0\0",
          "\0\0\0\0\0"
        ];
$VAR2 = [
          "\0\0\0\0",
          "\0\0\0\0\0\0\0\0\5",
          "\0\0\0\0\0\0\0\4",
          "\0\0\0\0\0\0\3",
          "\0\0\0\0\0\2",
          "\0\0\0\0\1"
        ];


Re^2: fast, flexible, stable sort (works?)
by tye (Cardinal) on Mar 08, 2004 at 21:08 UTC
    It doesn't work.

    Really?

    Well, it does work,

    Oh...

    but only if the XFORM routine returns a string of the same length for every item.

    "only"? (:

    Actually, it always works if the fields are fixed-length. It also always works if the fields don't contain "\0" characters and you don't have more than 16 million records. Those two cases cover almost every sort I do.
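A minimal sketch of the second case, assuming the keys are built with pack("a*N", ...) as in the demonstration above. The helper name sorted_indices and the sample words are hypothetical, not from the thread. The 16 million limit presumably comes from 2**24 = 16,777,216: for any index below that, the first byte of the packed "N" value is "\0", which sorts before every non-null byte and so acts as a field terminator.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the original indices
# of @items in sorted order, using a packed big-endian index as tie-breaker.
sub sorted_indices {
    my @items  = @_;
    my @packed = map pack("a*N", $items[$_], $_), 0 .. $#items;
    # The last four bytes of each sorted key hold the original index.
    return map unpack("N", substr($_, -4)), sort @packed;
}

my @words = qw(pear apple app banana apple ap);    # no "\0" bytes anywhere
my @order = sorted_indices(@words);
print "@order\n";            # 5 2 1 4 3 0
print "@words[@order]\n";    # ap app apple apple banana pear
```

The two "apple" entries come out as 1 then 4, their original relative order, which is the stability the packed index is meant to provide.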

    It often works when these guarantees don't apply.

    It is also fairly easy to fix it so it always works even if you have fields with lots of trailing null bytes. For example, escaping each field with s#([\00\01])#\01$1#g and then combining the fields with join "\0", is enough.
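The escaping idea can be sketched as follows, applied to the null-byte strings from the earlier demonstration. This is one reading of the suggestion, not the eventual module: the helper name make_key is hypothetical, only a single field is handled, and the substitution is written with \x00 and \x01 instead of the octal escapes.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): build a sort key from one
# variable-length field plus the record index.
sub make_key {
    my ($field, $index) = @_;
    # Prefix every \x00 or \x01 byte with \x01, so an unescaped \x00
    # can only ever be the terminator added by join below.
    (my $escaped = $field) =~ s/([\x00\x01])/\x01$1/g;
    return join "\0", $escaped, pack("N", $index);
}

my @data   = map "\0" x $_, 0 .. 5;    # "", "\0", "\0\0", ...
my @packed = map make_key($data[$_], $_), 0 .. $#data;
# Recover the original indices from the last four bytes of each key.
my @order  = map unpack("N", substr($_, -4)), sort @packed;
print "@order\n";    # 0 1 2 3 4 5
```

With the escaping in place, the keys sort in the same order as the plain strings, because a shorter field now ends in "\0" where a longer one continues with "\1" or a higher byte.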

    I noticed the potential for this problem quite a while ago and hoped to address it in the module based on this idea, but working on such a module hasn't made it to the top of my list yet. Thanks for motivating me to address the problem here. :)

    - tye        
