http://www.perlmonks.org?node_id=522875


in reply to trim() magic

I tried to see if there were any knobs to twiddle on this one, using dragonchild’s benchmark cases.

The first thing I tried was a recursive call that passes $_ explicitly, so that it arrives aliased in @_. That gets rid of the ternary in the for list.

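sub trim {
    # no arguments: recurse with $_, which arrives aliased in @_
    return trim( $_ ) if not @_;
    # a value is wanted: copy @_ so the caller's variables are left alone
    @_ = @_ if defined wantarray;
    # trim the copies, or the caller's variables in void context
    for ( @_ ) { s/^\s+//, s/\s+$// }
    return wantarray ? @_ : $_[ 0 ] if defined wantarray;
}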
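The aliasing this relies on can be seen in isolation. Below is a minimal sketch; strip_in_place is a made-up name for illustration and is not one of the subs being benchmarked. The elements of @_ are aliases for the caller's variables, so passing $_ explicitly lets the for loop edit it directly.

```perl
use strict;
use warnings;

# Hypothetical helper, only to show the aliasing behaviour.
sub strip_in_place {
    # Each element of @_ is an alias for the caller's variable,
    # so the substitutions below change the caller's data.
    for ( @_ ) { s/^\s+//; s/\s+$//; }
    return;
}

my $name = '  asdf  ';
strip_in_place( $name );

$_ = '  qwer  ';
strip_in_place( $_ );    # the same mechanism trim( $_ ) depends on

print "[$name] [$_]\n";  # prints "[asdf] [qwer]"
```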

On my setup, this recursive version is about 15% slower for the “inplace replacement of implicit $_” case but ekes out a few percentage points in the other cases. More importantly, it let me switch from duplicate defined wantarray tests to a duplicate inner loop:

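sub trim2 {
    # no arguments: recurse with $_, which arrives aliased in @_
    return trim2( $_ ) if not @_;
    # a value is wanted: trim copies and leave the arguments alone
    return map { local $_ = $_; s/^\s+//, s/\s+$//; $_ } @_
        if defined wantarray;
    # void context: trim the caller's variables in place
    for ( @_ ) { s/^\s+//, s/\s+$// }
}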

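This gets back most of the lost speed in the “inplace replacement of implicit $_” case (it is still about 9% behind in the sample run below), performs roughly the same in the other void-context case, and is about 50% faster in three of the four value-returning cases (23% in the fourth), including the IMHO most important one: passing a scalar and assigning to one. One caveat: map returns an element count in scalar context, so as written, my $n = trim2( $x ) stores 1 rather than the trimmed string, and the scalar figures below were measured with that behavior.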
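The behaviour of trim2 in each calling context can be checked with a short script. This is only a sketch: trim2 is copied from above, and the variable names are mine.

```perl
use strict;
use warnings;

# trim2 as posted above.
sub trim2 {
    return trim2( $_ ) if not @_;
    return map { local $_ = $_; s/^\s+//, s/\s+$//; $_ } @_
        if defined wantarray;
    for ( @_ ) { s/^\s+//, s/\s+$// }
}

# Void context: the argument is trimmed in place.
my $in_place = '  asdf  ';
trim2( $in_place );

# List context: trimmed copies come back, the originals are untouched.
my @orig    = ( '  foo ', ' bar  ' );
my @trimmed = trim2( @orig );

# Scalar context: map evaluates to the number of elements it produced.
my $padded = '  asdf  ';
my $count  = trim2( $padded );

# List assignment to a single variable picks up the trimmed string.
my ( $first ) = trim2( $padded );

print "[$in_place] [@trimmed] [$count] [$first]\n";
# prints "[asdf] [foo bar] [1] [asdf]"
```

If a scalar caller should receive the first trimmed string instead, one option would be to collect the map result in an array and return wantarray ? @out : $out[ 0 ], which presumably gives back some of the speed gained by dropping that test.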

#!/usr/bin/perl
use strict;
use warnings;

use Benchmark qw( cmpthese timethese );

sub trim1 {
    @_ = $_ if not @_ and defined wantarray;
    @_ = @_ if defined wantarray;
    for ( @_ ? @_ : $_ ) { s/^\s+//, s/\s+$// }
    return wantarray ? @_ : $_[ 0 ] if defined wantarray;
}

sub trim2 {
    return trim2( $_ ) if not @_;
    return map { local $_ = $_; s/^\s+//, s/\s+$//; $_ } @_
        if defined wantarray;
    for ( @_ ) { s/^\s+//, s/\s+$// }
}

my $cpu = -1;

cmpthese timethese $cpu => {
    undef_default_1 => sub { $_ = ' asdf '; trim1(); },
    undef_default_2 => sub { $_ = ' asdf '; trim2(); },
};

cmpthese timethese $cpu => {
    scalar_default_1 => sub { $_ = ' asdf '; my $n = trim1(); },
    scalar_default_2 => sub { $_ = ' asdf '; my $n = trim2(); },
};

cmpthese timethese $cpu => {
    scalar_passed_1 => sub { my $x = ' asdf '; my $n = trim1( $x ); },
    scalar_passed_2 => sub { my $x = ' asdf '; my $n = trim2( $x ); },
};

cmpthese timethese $cpu => {
    list_default_1 => sub { $_ = ' asdf '; my @n = trim1(); },
    list_default_2 => sub { $_ = ' asdf '; my @n = trim2(); },
};

cmpthese timethese $cpu => {
    list_passed_1 => sub { my @l = ( ' asdf ', ' asdf ' ); my @n = trim1( @l ); },
    list_passed_2 => sub { my @l = ( ' asdf ', ' asdf ' ); my @n = trim2( @l ); },
};

cmpthese timethese $cpu => {
    undef_passed_1 => sub { my @l = ( ' asdf ', ' asdf ' ); trim1( @l ); },
    undef_passed_2 => sub { my @l = ( ' asdf ', ' asdf ' ); trim2( @l ); },
};

Sample run:

Benchmark: running undef_default_1, undef_default_2 for at least 1 CPU seconds...
undef_default_1:  0 wallclock secs ( 1.05 usr +  0.01 sys =  1.06 CPU) @ 190933.96/s (n=202390)
undef_default_2:  1 wallclock secs ( 1.04 usr +  0.00 sys =  1.04 CPU) @ 174121.15/s (n=181086)
                    Rate undef_default_2 undef_default_1
undef_default_2 174121/s              --             -9%
undef_default_1 190934/s             10%              --
Benchmark: running scalar_default_1, scalar_default_2 for at least 1 CPU seconds...
scalar_default_1:  1 wallclock secs ( 1.11 usr +  0.00 sys =  1.11 CPU) @ 103321.62/s (n=114687)
scalar_default_2:  1 wallclock secs ( 1.11 usr +  0.00 sys =  1.11 CPU) @ 154982.88/s (n=172031)
                     Rate scalar_default_1 scalar_default_2
scalar_default_1 103322/s               --             -33%
scalar_default_2 154983/s              50%               --
Benchmark: running scalar_passed_1, scalar_passed_2 for at least 1 CPU seconds...
scalar_passed_1:  1 wallclock secs ( 1.09 usr +  0.00 sys =  1.09 CPU) @ 121405.50/s (n=132332)
scalar_passed_2:  2 wallclock secs ( 1.04 usr +  0.00 sys =  1.04 CPU) @ 183794.23/s (n=191146)
                    Rate scalar_passed_1 scalar_passed_2
scalar_passed_1 121406/s              --            -34%
scalar_passed_2 183794/s             51%              --
Benchmark: running list_default_1, list_default_2 for at least 1 CPU seconds...
list_default_1:  2 wallclock secs ( 1.09 usr +  0.00 sys =  1.09 CPU) @ 98641.28/s (n=107519)
list_default_2:  2 wallclock secs ( 1.04 usr +  0.00 sys =  1.04 CPU) @ 150376.92/s (n=156392)
                   Rate list_default_1 list_default_2
list_default_1  98641/s             --           -34%
list_default_2 150377/s            52%             --
Benchmark: running list_passed_1, list_passed_2 for at least 1 CPU seconds...
list_passed_1:  1 wallclock secs ( 1.07 usr +  0.00 sys =  1.07 CPU) @ 80736.45/s (n=86388)
list_passed_2:  1 wallclock secs ( 1.08 usr +  0.00 sys =  1.08 CPU) @ 99554.63/s (n=107519)
                 Rate list_passed_1 list_passed_2
list_passed_1 80736/s            --          -19%
list_passed_2 99555/s           23%            --
Benchmark: running undef_passed_1, undef_passed_2 for at least 1 CPU seconds...
undef_passed_1:  1 wallclock secs ( 1.07 usr +  0.00 sys =  1.07 CPU) @ 133980.37/s (n=143359)
undef_passed_2:  1 wallclock secs ( 1.13 usr +  0.00 sys =  1.13 CPU) @ 138400.00/s (n=156392)
                   Rate undef_passed_1 undef_passed_2
undef_passed_1 133980/s             --            -3%
undef_passed_2 138400/s             3%             --
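
For reading the comparison tables: as far as I know, cmpthese prints each cell as how much faster the row is than the column, worked out from the two rates. The sketch below reproduces two cells of the scalar_default table from the rates reported above; pct_faster is a made-up helper and not part of Benchmark.

```perl
use strict;
use warnings;

# Hypothetical helper: percentage by which the row's rate exceeds
# the column's rate, rounded to a whole percent.
sub pct_faster {
    my ( $row_rate, $col_rate ) = @_;
    return sprintf '%.0f%%', 100 * $row_rate / $col_rate - 100;
}

# Rates taken from the scalar_default run above.
my $trim1_rate = 103321.62;
my $trim2_rate = 154982.88;

my $two_vs_one = pct_faster( $trim2_rate, $trim1_rate );
my $one_vs_two = pct_faster( $trim1_rate, $trim2_rate );

print "$two_vs_one $one_vs_two\n";    # prints "50% -33%"
```

Note that the two cells are not mirror images: being 50% faster one way round corresponds to being 33% slower the other way.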

Makeshifts last the longest.