good chemistry is complicated,and a little bit messy -LW PerlMonks

### Re^2: pop sort strangeness

by Brovnik (Hermit)
 on Nov 15, 2004 at 16:18 UTC ( #407861=note: print w/replies, xml ) Need Help??

in reply to Re: pop sort strangeness

```my \$max = pop @{[ sort @vals ]};
is the answer I was looking for, thanks. I realise it is inefficient, just came across the error and didn't immediately twig the list/array issue.

Replies are listed 'Best First'.
Re^3: pop sort strangeness
by davido (Archbishop) on Nov 15, 2004 at 16:27 UTC

Indeed, it is very inefficient, by comparison to simply doing a linear search. First, you're sorting the list. That's O(N log N) for Perl's default sort type. Next, you're taking a copy of the list (yes, you're copying it) for the purpose of creating the anonymous array. That's about O(N). Next, you're dereferencing the array (a pretty quick action), and then you're popping something off the array (which is also pretty quick, but still is another step).

So what you've got is O( N + N log N ), when you could just have O(N). That's not so good. And as someone else already pointed out, sort @vals does a string sort, not a numeric sort, so 11 will be sorted next to 1 instead of next to 12. If you must use the sort routine, at least change it to doing a numeric sort:

```my \$max = pop @{[ sort { \$a <=> \$b } @vals ]};

And here's a linear search for max.

```my \$max = \$vals[0];
\$max = ( \$_ > \$max ) ? \$_ : \$max foreach @vals;

This version is a little terse, on purpose, to demonstrate that it's possible to do the linear search in just slightly more keystrokes than the sort pop method.

Dave

Just for reference, here's some timing results on my machine for the following code snippet...
```#!/usr/bin/perl -w

#Initialization
@vals = reverse "a".."daaaa";

#O(NlogN)
my \$max = pop @{[ sort @vals ]};

#O(N)
#my \$max = \$vals[0];
#\$max = ( \$_ gt \$max ) ? \$_ : \$max foreach @vals;

\$e=@vals;
print "elements=\$e max=\$max\n";
Commenting out everything but the initialization section (i.e. not finding any maximum at all), this snippet executes in about 4.4 seconds on my machine. Using the sort to find the max, it takes about 6.2 seconds. Finally, using the linear method, it executes in about 5.1 seconds. So, for the 1,846,183 elements in this example, the sort method is about 2.5 times slower (6.2-4.4)/(5.1-4.4) than the linear search method. But the test is still dominated by just generating the large array. So you might not notice the efficiency gain if your data set is smaller than 1.8 million elements.

-- All code is 100% tested and functional unless otherwise noted.
Here are my results, first with your "a" .. "daaa", and then with a much smaller dataset ("a".."daa", only 2731 values), and finally with just "a" .. "aa" (27 values).
```#!/usr/bin/perl
use strict;
use warnings;

use List::Util qw/ maxstr /;
use Benchmark qw/ cmpthese /;

my @vals = reverse "a" .. "daa";

cmpthese( -10, {
sorting => sub {my \$max = pop @{[sort @vals]}},
iterating => sub {
my \$max = \$vals[0];
\$max = ( \$_ gt \$max ) ? \$_ : \$max foreach @vals;
},
list_util => sub {my \$max = maxstr @vals}
});
__END__
Rate    sorting  iterating list_utils
sorting    0.659/s         --       -42%       -82%
iterating   1.13/s        72%         --       -69%
list_util   3.67/s       458%       224%         --

Rate    sorting  iterating list_utils
sorting     537/s         --       -32%       -80%
iterating   789/s        47%         --       -70%
list_util  2661/s       395%       237%         --

Rate    sorting  iterating list_utils
sorting     50180/s         --       -25%       -78%
iterating   66991/s        34%         --       -70%
list_utils 225804/s       350%       237%         --
So in all cases, using the built in List::Util proceedure was quite a bit more than twice as fast as either of the others. And it's also much simpler to use, so it seems like something of a no-brainer to me.

Create A New User
Node Status?
node history
Node Type: note [id://407861]
help
Chatterbox?
 [Corion]: LanX: The "good" approach here would be to use the appropriate DBI parameters to make the driver decode strings properly. But that will have a ripple-on effect of messing up all the places where manual decoding happens ;) [LanX]: which means albeit being broken UTF8 it'll be handled correctly [LanX]: and the problem only occurs since we changed the emails to base64 [LanX]: my main problem will be to cnvince my colleagues that our productive code is broken oO ... so in the end I will just make a workaround :-/ LanX hates UTF8 for causing knots in his brain and stomach [Corion]: LanX: Yes, that's the main problem - you have lots (and lots) of workarounds in various places and stages of the processing, and to clean that mess up requires action across the complete codebase. And it's almost impossible to do it piece-by-piece

How do I use this? | Other CB clients
Other Users?
Others contemplating the Monastery: (11)
As of 2017-01-16 14:04 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
Do you watch meteor showers?

Results (150 votes). Check out past polls.