Re: Advanced Sorting - GRT - Guttman Rosler Transform

in reply to Advanced Sorting - GRT - Guttman Rosler Transform

Wow. I use this trick all the time; I am happy to finally learn it has a name.

I came up with (so I thought) the idea of packing the array elements into decodable strings, sorting those, then decoding them, in the midst of trying to optimize a particularly hairy Sybase query. I was able to get rid of all the table scans (non-indexed searches) except the last one (the ORDER BY clause). I started experimenting with sorting in Perl and then got sucked into trying to speed it up.

I suppose I've always thought of it as a variation on the ST: precalculating the sort keys using map, sorting based upon those keys, and finally, another map to extract the actual data again. I never realized that the ST specified explicitly creating anonymous arrays.
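For reference, here is a minimal sketch of the ST proper, anonymous arrays and all. The word list and the length-based key are made up purely for illustration; they are not from the original node.

```perl
use strict;
use warnings;

# Hypothetical input: sort words by length, then alphabetically.
my @words = qw(pear fig banana kiwi);

my @sorted =
    map  { $_->[1] }                                      # 3. extract the original data
    sort { $a->[0] <=> $b->[0] or $a->[1] cmp $b->[1] }   # 2. compare the precomputed keys
    map  { [ length($_), $_ ] }                           # 1. wrap each item as [key, data]
    @words;

print "@sorted\n";    # fig kiwi pear banana
```

The packed-string variant replaces each [key, data] array with a single string, so the middle step can be a plain sort with no comparison block.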

Thanks (and ++), demerphq, for giving this a name.

Update: I forgot to add that I've been using sprintf() rather than pack(); pack() would obviously be even faster (did I say, "Thanks," demerphq?), but in my case there were enough rows that the key generation phase was more or less insignificant compared with the actual sorting. The hardest part is inverting the portions of the key that must sort descending.

In other words, if, using dws's example, you had to sort ascending by the first column ('foo' in the example) and then descending by the second column (47 and 103, respectively), you would invert the second column by taking its 10's complement for the number of significant digits you have. E.g., if the name column has up to 10 characters and the value column stays below 10000 (at most four digits):

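# Keys are fixed-width strings: name left-justified in 10 chars, then the
# inverted value in 4 digits (assumes 1 <= value <= 9999).
my @sorted = map { [substr($_, 0, 10), 10000 - substr($_, 10, 4)] }   # name comes back space-padded
             sort
             map { sprintf("%-10.10s%04d", $_->[0], 10000 - $_->[1]) }
             @unsorted;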
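For comparison, here is a rough sketch of the same sort done with pack() instead of sprintf(). The sample rows and the $MAX constant are my own assumptions for illustration: the value is stored as a 32-bit big-endian unsigned integer ("N"), so it is inverted against 0xFFFFFFFF rather than against a power of ten, and it assumes non-negative integer values that fit in 32 bits.

```perl
use strict;
use warnings;

# Hypothetical sample data: [name, value] pairs.
my @unsorted = (
    [ 'foo', 47 ],
    [ 'bar', 5 ],
    [ 'foo', 103 ],
);

# Largest value that fits in pack's "N" (32-bit unsigned, big-endian).
my $MAX = 0xFFFFFFFF;

my @sorted =
    map  { my ($name, $inv) = unpack('A10 N', $_); [ $name, $MAX - $inv ] }   # decode
    sort                                                                       # plain string sort
    map  { pack('A10 N', $_->[0], $MAX - $_->[1]) }                            # encode
    @unsorted;

print join(' ', @$_), "\n" for @sorted;
# bar 5
# foo 103
# foo 47
```

"A10" pads the name with spaces on the way in and strips the trailing spaces on the way out, and big-endian byte order is what lets a string comparison agree with the numeric order.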

dmm

If you GIVE a man a fish you feed him for a day
But, TEACH him to fish and you feed him for a lifetime
