Idiomatic optimizations

by Juerd (Abbot)
on Apr 29, 2002 at 16:41 UTC (#162863=perlmeditation)

We all optimize. Sometimes we do so needlessly; sometimes speed is of the utmost importance.

I think most of us would agree that readability, maintainability and speed of programming are often far more important than execution speed. But what if optimizations actually make things more readable, increase maintainability, and even let you code faster?

There are some simple tricks that have no down-side to them, and we often recommend them. There are simple optimizations like using $foo instead of "$foo", using $foo == 0 instead of $foo eq 0 (assuming $foo is numeric), etcetera. Then, there are the "doesn't really matter" optimizations, like using length $foo in boolean context instead of $foo ne '' (and of course, length with an implicit $_ is nicer compared to $_ ne ''). And of course, we have more complicated optimizations like the Schwartzian transform (map sort map).

Considering the alternatives, these optimizations make code more readable, maintainable and improve programming speed (although the more complicated ones might be a bit overwhelming the first time).
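A minimal sketch of the simpler idioms, for illustration only. The variable names and sample values (`$count`, `@fields`) are made up; the point is that `==` and `eq` can give different answers for the same data, and that `length` in boolean context keeps the string '0', which a plain truth test would drop.

```perl
use strict;
use warnings;

# Hypothetical sample value: numeric data that arrived as a string.
my $count = '0.0';

# == compares as numbers, eq compares as strings.
my $numeric_zero = ($count == 0) ? 1 : 0;    # 1: 0.0 is numerically zero
my $string_zero  = ($count eq 0) ? 1 : 0;    # 0: '0.0' is not the string '0'

# length in boolean context: true for any non-empty string, including '0'.
my @fields   = ('abc', '', '0');
my @nonempty = grep { length } @fields;      # keeps 'abc' and '0'

print "numeric: $numeric_zero, string: $string_zero, non-empty: @nonempty\n";
```

So the "right" operator is not only a speed matter: with numeric data, `eq` can silently give the wrong answer.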

Can we list these tricks? What are your favourite idiomatic optimizations?
For the overview, here's a summary of what I already mentioned:

  • $foo vs "$foo" (don't interpolate when not needed)
  • $foo == 0 vs $foo eq 0 (use the comparison operator that matches the data)
  • length vs $_ ne '' (scalars already "know" their length)
  • Schwartzian transform vs disassembling and re-assembling data
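
For readers who have not met the last item, here is a small sketch of the map sort map shape. The records and the `score_of` helper are hypothetical stand-ins for an expensive key computation; each key is computed once per record rather than once per comparison.

```perl
use strict;
use warnings;

# Hypothetical records: "name:score" strings to be sorted by score.
my @records = ('carol:17', 'alice:42', 'bob:5');

# Stand-in for an expensive key computation; called once per record.
sub score_of {
    my ($record) = @_;
    my (undef, $score) = split /:/, $record;
    return $score;
}

my @sorted =
    map  { $_->[1] }                  # 3. unwrap the original record
    sort { $a->[0] <=> $b->[0] }      # 2. compare the precomputed keys
    map  { [ score_of($_), $_ ] }     # 1. pair each record with its key
    @records;

print "@sorted\n";    # bob:5 carol:17 alice:42
```

Read it bottom-up: the last map runs first, the first map runs last.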

- Yes, I reinvent wheels.
- Spam: Visit eurotraQ.

Re: Idiomatic optimizations
by particle (Vicar) on Apr 29, 2002 at 17:10 UTC
    When printing, comma (,) is faster than concatenate (.): print $foo, ' -- ', $bar, $/; is faster than print $foo . ' -- ' . $bar . $/;

    ~Particle *accelerates*

      One caveat though: When printing a list, each element is evaluated in LIST context whereas concatenated expressions are evaluated in SCALAR context....
      % perl -le 'print "Today: ". localtime()'
      Today: Tue Apr 30 01:35:48 2002
      % perl -le 'print "Today: ", localtime()'
      Today: 5435130310221191
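
The same caveat as a sketch, using a hypothetical `context` sub that reports how it was called. `scalar` is the usual way to keep the comma form and still get scalar context.

```perl
use strict;
use warnings;

# Hypothetical sub that reports the context it was called in.
sub context {
    return wantarray ? 'list' : 'scalar';
}

my $joined = 'ctx: ' . context();            # concatenation: scalar context
my @listed = ('ctx: ', context());           # list, as in print LIST: list context
my @forced = ('ctx: ', scalar context());    # scalar() forces scalar context

print $joined, "\n";    # ctx: scalar
print @listed, "\n";    # ctx: list
print @forced, "\n";    # ctx: scalar

# The comma form of the localtime example, with scalar context restored.
print 'Today: ', scalar localtime(), "\n";
```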

      -Blake

Re: Idiomatic optimizations
by VSarkiss (Monsignor) on Apr 29, 2002 at 17:12 UTC

    Beyond performance improvement, some of these transformations usually render your code "more correct" (for some suitable definition of correct). Your first point, for example, about unnecessarily quoting variables: in my experience, it's a left-over bad habit of shell programmers. It can lead to errors in strange circumstances, particularly if $foo is an object, and the stringify operator does something you didn't expect, or you didn't realize the stringified version is not the same as the object itself.
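
A minimal sketch of the object case, assuming a made-up `Temperature` class that overloads stringification. Quoting the variable quietly replaces the object with a plain string.

```perl
use strict;
use warnings;

# Hypothetical class whose objects stringify to a label.
package Temperature;
use overload '""' => sub { $_[0]{degrees} . ' C' };

sub new {
    my ($class, $degrees) = @_;
    return bless { degrees => $degrees }, $class;
}

package main;

my $temp   = Temperature->new(21);
my $copy   = $temp;      # still a Temperature object
my $quoted = "$temp";    # now just the string '21 C'

print ref($copy)   ? "copy is an object\n"   : "copy is a plain string\n";
print ref($quoted) ? "quoted is an object\n" : "quoted is a plain string\n";
```

Any method call on `$quoted` would fail, which is the kind of strange error the unnecessary quotes can cause.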

    More than optimizations, I would categorize these as "Refactoring". Generally speaking, it means changing code without adding functionality, but improving it in some fashion, such as making it easier to maintain. Fowler has an entire book on the subject. Although it uses Java, some of the principles apply to other languages as well.

    I recall brother chromatic was working on a refactoring editor for Perl, based on using the B back-end compilers to generate a uniform code tree, then applying these types of changes. He referred to it in his journal on use perl, but I don't know its current state (though I'm sure it'll be thoroughly tested when it's released ;-).

Re: Idiomatic optimizations
by jlongino (Parson) on Apr 29, 2002 at 17:45 UTC
    Most of the optimizations listed can be found in the "Other Oddments" chapter of the Camel in the section "Efficiency". But one of my favorites (also in the Camel, but paraphrased using $x/$y/$z instead of $a/$b/$c):
    Use $foo = $x || $y || $z; This is much faster (and shorter to say) than:
        if ($x) { $foo = $x; }
        elsif ($y) { $foo = $y; }
        elsif ($z) { $foo = $z; }
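
One thing to keep in mind with this idiom, sketched with made-up values: `||` picks the first true value, so a legitimate 0 or empty string is skipped. The defined-or operator `//` (Perl 5.10 and later, so not available on the 5.6.1 discussed in this thread) only skips undef.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical inputs: the first two are defined but false.
my ($x, $y, $z) = (0, '', 'fallback');

my $first_true    = $x || $y || $z;    # 'fallback': skips 0 and ''
my $first_defined = $x // $y // $z;    # 0: only undef would be skipped

print "first true: $first_true, first defined: $first_defined\n";
```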

    --Jim

Re: Idiomatic optimizations
by dws (Chancellor) on Apr 29, 2002 at 17:57 UTC
    $foo vs "$foo" (don't interpolate when not needed)

    I look at this as removing a pessimization, rather than introducing an optimization.

Re: Idiomatic optimizations
by thelenm (Vicar) on Apr 29, 2002 at 18:00 UTC
    Ever since I read in Mastering Regular Expressions that perl makes a copy of the base string when doing a case-insensitive match, I've tried to use character classes instead of /i.

    ... Before submitting this post, though, I decided to actually benchmark some variations to see whether character classes were faster. To my surprise, it turns out that /i is about 50% faster in the test I used:

    use strict;
    use Benchmark qw(cmpthese);

    my $foo = "abcdefghijklmnopqrstuvwxyz"x500;
    my $re = "[Aa][Bb][Cc]";

    cmpthese(1000000, {
        'i'       => sub { $foo =~ /abc/ig },
        'chars'   => sub { $foo =~ /[Aa][Bb][Cc]/og },
        'charvar' => sub { $foo =~ /$re/og },
    });
    yielding these results on my machine:
    Benchmark: timing 1000000 iterations of chars, charvar, i...
         chars:  2 wallclock secs ( 1.97 usr +  0.00 sys =  1.97 CPU) @ 507614.21/s (n=1000000)
       charvar:  3 wallclock secs ( 2.04 usr + -0.01 sys =  2.03 CPU) @ 492610.84/s (n=1000000)
             i:  1 wallclock secs ( 1.31 usr +  0.00 sys =  1.31 CPU) @ 763358.78/s (n=1000000)
                Rate charvar   chars       i
    charvar 492611/s      --     -3%    -35%
    chars   507614/s      3%      --    -34%
    i       763359/s     55%     50%      --
    Results are similar for strings of various lengths. So was Mastering Regular Expressions incorrect, or has the problem just been fixed since it was written?
      The problem with //i isn't (wasn't?) that it's slower on small strings. It's that it uses twice the memory as an equivalent character class. And when you start matching against huge strings that can really make a difference. Try your example against a 50MB string and I think you'll see what I mean. If not you can justly castigate me for being too lazy to test my own assertions.

      Eagerly awaiting the second edition,
      -sam

        Always pass references, not data structures; hence the \ operator is an optimisation: sub(\@array) instead of sub(@array)
        Only import what you need from modules: use CGI qw(:standard);
        I also like shortcut operators:
        $i ||= 0; my $j = shift || 0;
        and the ? : operator instead of ifs:
        $i = $i ? 1 : 0;
        Is !~ an optimisation over just negating the result of =~? I dunno, but I think !~ looks better.
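
A short sketch of two of these points, with a hypothetical `total` helper. Passing a reference hands over one scalar instead of flattening the array into the argument list. The conditional operator needs care when combined with assignment, because ?: binds tighter than =.

```perl
use strict;
use warnings;

# Hypothetical helper: takes an array reference, so the caller's
# array is not flattened into a long argument list.
sub total {
    my ($numbers) = @_;
    my $sum = 0;
    $sum += $_ for @$numbers;
    return $sum;
}

my @values = (1, 2, 3);
my $sum    = total(\@values);    # 6

# ?: binds tighter than =, so "$i ? $i = 1 : $i = 0" parses as
# "($i ? $i = 1 : $i) = 0" and always leaves 0 in $i.
# Assigning the result of ?: avoids that.
my $i = 5;
$i = $i ? 1 : 0;                 # 1

print "sum: $sum, i: $i\n";
```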
        Nah, no castigation here. When I whipped up my test, I had thought that the alphabet x 500 was a pretty big string, but now that I'm thinking clearly that's not very big at all.

        To test out a really big string, I replicated Romeo and Juliet 500 times, read the whole thing into a string, then ran almost the same regular expressions. I removed /o from the 'chars' sub, which actually made it a little faster. The string was about 70 MB. Here is my new test code:

        use strict;
        use Benchmark qw(cmpthese);

        local $/ = undef;
        open IN, "romeo-and-juliet-500-times.txt";
        my $text = <IN>;
        close IN;

        # Ten iterations is enough with a 70 MB string!
        cmpthese(10, {
            'i'     => sub { $text =~ /abc/ig },
            'chars' => sub { $text =~ /[Aa][Bb][Cc]/g },
        });
        To my surprise (again!), the /i version ran in about 1/3 the time as the character-class version. Here is the output on my machine:
        Benchmark: timing 10 iterations of chars, i...
             chars: 40 wallclock secs (38.37 usr +  0.04 sys = 38.41 CPU) @  0.26/s (n=10)
                 i: 12 wallclock secs (11.43 usr +  0.01 sys = 11.44 CPU) @  0.87/s (n=10)
              s/iter chars     i
        chars   3.84    --  -70%
        i       1.14  236%    --
        I'm amazed. Am I not testing the right thing? Or has /i really been cleaned up in recent versions of Perl? I'm running 5.6.1.
Re: Idiomatic optimizations
by BlueLines (Hermit) on May 01, 2002 at 00:37 UTC
    If you really want to nitpick, try using single quotes instead of double quotes around static text:
    #!/usr/bin/perl -w
    use Benchmark qw(cmpthese);

    cmpthese (10000000, {
        single => sub { $foo = 'foo'},
        double => sub { $foo = "foo"}
    });
    gives me:
    [jon@valium jon]$ ./test.pl
    Benchmark: timing 10000000 iterations of double, single...
        double:  1 wallclock secs ( 1.91 usr +  0.00 sys =  1.91 CPU) @ 5235602.09/s (n=10000000)
        single:  2 wallclock secs ( 1.25 usr +  0.00 sys =  1.25 CPU) @ 8000000.00/s (n=10000000)
                Rate double single
    double 5235602/s     --   -35%
    single 8000000/s    53%     --
    50% improvement when strings aren't interpolated.

    BlueLines

    Disclaimer: This post may contain inaccurate information, be habit forming, cause atomic warfare between peaceful countries, speed up male pattern baldness, interfere with your cable reception, exile you from certain third world countries, ruin your marriage, and generally spoil your day. No batteries included, no strings attached, your mileage may vary.
      What perl are you using? Those should be identical at runtime, since interpolation is converted to concatenation by the tokenizer. For evidence, try:
      % perl -MO=Deparse -e '$foo=q(foo)'
      $foo = 'foo';
      -e syntax OK
      % perl -MO=Deparse -e '$foo=qq(foo)'
      $foo = 'foo';
      -e syntax OK
      I ran your benchmark program, and got +/- 3%. Try running it a few times.
        I was logged in remotely to my home machine when i ran this, which means that xscreensaver was running with 100% of my cpu (yay OpenGL). I ran the test again without xscreensaver and i got the same results you did. oh well.

        BlueLines

        I wonder if the results have anything to do with you using q instead of a single quote and qq instead of double quotes.
      That is very odd. I get absolutely no difference with Perl 5.6.1:
                  Rate single double
      single 3937008/s     --     0%
      double 3937008/s     0%     --

      -sam

      With 5.6.1...

      On an otherwise unloaded P4 1.g GHz

      1          Rate single double    2          Rate single double
      single 4219409/s     --    -7%    single 4273504/s     --   -18%
      double 4524887/s     7%     --    double 5208333/s    22%     --

      3          Rate single double    4          Rate single double
      single 4201681/s     --    -8%    single 4166667/s     --   -22%
      double 4566210/s     9%     --    double 5347594/s    28%     --

      On an otherwise reasonably loaded (0.31) sun4u

      1          Rate double single    2          Rate double single
      double 3079766/s     --   -18%    double 3054368/s     --   -11%
      single 3763643/s    22%     --    single 3419973/s    12%     --

      3          Rate double single    4          Rate double single
      double 2866972/s     --   -18%    double 3107520/s     --   -10%
      single 3495281/s    22%     --    single 3437607/s    11%     --

      --
      perl -pew "s/\b;([mnst])/'$1/g"

        On an otherwise unloaded P4 1.g GHz

        For certain very specific meanings of the word "unloaded", I'm sure. "foo" is optimized to 'foo' at compile time, so neither of them can possibly be faster. As you can see, the generated bytecode is equivalent:

        2;0 juerd@ouranos:~$ perl -MO=Concise -e'$foo = "foo"'
        6  <@> leave[t1] vKP/REFC ->(end)
        1     <0> enter ->2
        2     <;> nextstate(main 1 -e:1) v ->3
        5     <2> sassign vKS/2 ->6
        3        <$> const(PV "foo") s ->4
        -        <1> ex-rv2sv sKRM*/1 ->5
        4           <$> gvsv(*foo) s ->5
        -e syntax OK
        2;0 juerd@ouranos:~$ perl -MO=Concise -e'$foo = '\''bar'\'
        6  <@> leave[t1] vKP/REFC ->(end)
        1     <0> enter ->2
        2     <;> nextstate(main 1 -e:1) v ->3
        5     <2> sassign vKS/2 ->6
        3        <$> const(PV "bar") s ->4
        -        <1> ex-rv2sv sKRM*/1 ->5
        4           <$> gvsv(*foo) s ->5
        -e syntax OK
        Even "Hello, $name!" is optimized to 'Hello, ' . $name . '!'.
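
The same check can be scripted instead of run from the shell. This is only a sketch, using B::Deparse's `coderef2text` on two anonymous subs that differ in nothing but the quoting of a constant string; the exact deparsed text may vary between Perl versions, so the script compares the two results rather than assuming a particular output.

```perl
use strict;
use warnings;
use B::Deparse;

my $deparse = B::Deparse->new;

# Two subs that differ only in how the constant string is quoted.
my $single = $deparse->coderef2text(sub { my $s = 'foo'; return $s });
my $double = $deparse->coderef2text(sub { my $s = "foo"; return $s });

# If both compile to the same ops, the deparsed source is the same too.
print $single eq $double ? "identical\n" : "different\n";
print $single, "\n";
```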

        Note: I _do_ think people should use single quotes when not interpolating. It saves you a lot of hits on the backslash key, and avoids getting bitten by: "$You're kidding me" (eq. "$You::re kidding me"), "my@emailaddress.com" (intended: "my\@emailaddress.com").

        - Yes, I reinvent wheels.
        - Spam: Visit eurotraQ.
        
