Your skill will accomplishwhat the force of many cannot PerlMonks

### length() and the Schwartzian transform

by hazylife (Monk)
 on Feb 27, 2014 at 12:29 UTC Need Help??
hazylife has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks!

My question is about the following code example, taken as is from Wikipedia:

```@sorted = map  { \$_->[0] }
sort { \$a->[1] <=> \$b->[1] } # use numeric comparison
map  { [\$_, length(\$_)] }    # calculate the length of the s
+tring
@unsorted;

The above is pure Perl, not some generic pseudocode, and what worries me is that length() doesn't seem like a particularly good example for illustrating the concept. It has always been my understanding that in Perl, length(\$var) just returns whatever value is stored in the corresponding internal SV structure (Devel::Peek calls this field "CUR"; non-string/non-PV scalars don't have this attribute, more on them below). And if it is immediately available, why cache it?

What if we're "length()-sorting" an array of numeric scalars?

```#!/usr/bin/perl

use strict;
use Devel::Peek;

my @arr = (22, 5.555, 1, 4444, 333);

print Dump(\@arr, @arr+0), "\n\n\n";
# at this point, there are 4 IVs and one NV in our array;
# naturally, none of the elements have their length ("CUR") field set

my (\$retval, \$just_once);

my @sorted = sort {
# does this alter the internal structures of our actual
# array elements, turning IVs and NVs into PVIVs/PVNVs?
\$retval = length \$a <=> length \$b;

print Dump \@arr, @arr+0 if !\$just_once++;
# of course it does!

\$retval;
} @arr;

print "\nsorted = @sorted\n";

I might be missing something here, so please correct me if I'm wrong.

Replies are listed 'Best First'.
Re: length() and the Schwartzian transform
by RMGir (Prior) on Feb 27, 2014 at 13:12 UTC
Looks like you're right - plain sort beats the heck out of ST in this case (and would do the same to GRT, I'm sure) - length is just too trivial.

Although as the num-* cases below show, it turns out that length is more expensive for data that isn't pre-stringified, at least on the perl 5.14.2 I tested.

```#!/usr/bin/perl

use strict;
use warnings;
use Benchmark qw(:all :hireswallclock);

foreach my \$arraySize(10,100,1_000,10_000,100_000) {
my @numdata=map rand, 0..\$arraySize;
my @strdata=map {sprintf "%d",\$_} @numdata;

print "========== ARRAY SIZE: \$arraySize\n";

cmpthese(
-1,
{
'str-ST' => sub {
my @arr = @strdata;
@arr = map { \$_->[0] }
sort {\$a->[1]<=>\$b->[1]}
map { [\$_, length(\$_)]}
@arr;
},

'str-sort' => sub {
my @arr = @strdata;
@arr = sort {length(\$a) <=> length(\$b)} @arr;
},

'num-ST' => sub {
my @arr = @numdata;
@arr = map { \$_->[0] }
sort {\$a->[1]<=>\$b->[1]}
map { [\$_, length(\$_)]}
@arr;
},

'num-sort' => sub {
my @arr = @numdata;
@arr = sort {length(\$a) <=> length(\$b)} @arr;
},

}
);
}

Results:
```\$ perl benchmark-sort.pl
========== ARRAY SIZE: 10
Rate   num-ST num-sort   str-ST str-sort
num-ST    5119/s       --      -9%     -77%     -90%
num-sort  5598/s       9%       --     -75%     -89%
str-ST   22719/s     344%     306%       --     -55%
str-sort 50539/s     887%     803%     122%       --
========== ARRAY SIZE: 100
Rate   num-ST num-sort   str-ST str-sort
num-ST    534/s       --     -13%     -82%     -92%
num-sort  615/s      15%       --     -79%     -91%
str-ST   2904/s     444%     372%       --     -58%
str-sort 6854/s    1184%    1014%     136%       --
========== ARRAY SIZE: 1000
Rate   num-ST num-sort   str-ST str-sort
num-ST   51.8/s       --     -16%     -82%     -93%
num-sort 61.4/s      18%       --     -79%     -92%
str-ST    290/s     460%     373%       --     -61%
str-sort  741/s    1330%    1107%     155%       --
========== ARRAY SIZE: 10000
Rate   num-ST num-sort   str-ST str-sort
num-ST   5.13/s       --     -16%     -82%     -93%
num-sort 6.15/s      20%       --     -78%     -92%
str-ST   28.2/s     449%     358%       --     -62%
str-sort 73.7/s    1336%    1099%     162%       --
========== ARRAY SIZE: 100000
(warning: too few iterations for a reliable count)
(warning: too few iterations for a reliable count)
(warning: too few iterations for a reliable count)
s/iter   num-ST num-sort   str-ST str-sort
num-ST     1.93       --     -13%     -81%     -93%
num-sort   1.68      15%       --     -78%     -92%
str-ST    0.364     432%     363%       --     -61%
str-sort  0.142    1258%    1083%     155%       --
```

Mike

The following one-liner neatly illustrates why length is faster on strings than numbers...

```perl -MDevel::Peek -e'Dump(\$_) for "42", 42'
use Moops; class Cow :rw { has name => (default => 'Ermintrude') }; say Cow->new->name

Wow you're quick. Thanks for the benchmark.

I guess 'num-sort' is not that much faster than 'num-ST' because in both cases there's still one (but only one!) round of stringification being performed per sub call (explicit pre-stringification in num-ST and implicit, sort of in-place stringification in num-sort).

Re: length() and the Schwartzian transform
by Anonymous Monk on Mar 06, 2014 at 06:06 UTC
It seems to me that the point of the example is to show how ST works as simply as possible, NOT to show its speed or lack thereof ...
NOT to show its speed or lack thereof
My main issue with that example is not that it's slow, but that it's misleading. For string scalars, Perl's length returns a value that has already been calculated (and cached) elsewhere. It's just a dumb "accessor method" for that internally stored value, which makes caching it pointless.

Create A New User
Node Status?
node history
Node Type: perlquestion [id://1076388]
Approved by Corion
Front-paged by toolic
help
Chatterbox?
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others chanting in the Monastery: (3)
As of 2018-04-26 00:34 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
My travels bear the most uncanny semblance to ...

Results (94 votes). Check out past polls.

Notices?