http://www.perlmonks.org?node_id=163010


in reply to Re: Idiomatic optimizations
in thread Idiomatic optimizations

The problem with //i isn't (wasn't?) that it's slower on small strings. It's that it uses twice the memory of an equivalent character class, and when you start matching against huge strings that can really make a difference. Try your example against a 50MB string and I think you'll see what I mean. If not, you can justly castigate me for being too lazy to test my own assertions.

Eagerly awaiting the second edition,
-sam

Re: Re: Re: Idiomatic optimizations
by hakkr (Chaplain) on Apr 30, 2002 at 11:40 UTC
    Always pass references, not data structures; hence the \ operator is an optimisation: sub(\@array) instead of sub(@array)
    Only use what you need from modules: use CGI qw(:standard);
    Also, I like shortcut operators
    my $i ||=0 ; my $i =shift || 0;
    Also, I like the ?: operator instead of ifs
    $i?$i=1:$i=0;
    Is !~ an optimisation over just negating the result of =~? I dunno, but I think !~ looks better.
      Just a few remarks (though i've got a sneaking suspicion i'm correcting typos here):

      > my $i ||=0 ;

      here $i will always turn out to be 0 (because of the my operator), so my $i=0; is more efficient.
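A minimal sketch of the point above, assuming nothing beyond core Perl. The names $fresh and with_default are made up for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# A variable declared with my starts out undef, so ||= always
# takes the right-hand side here.
my $fresh ||= 0;

# The shift || 0 form is different: it supplies a default only
# when the caller passed nothing (or passed a false value).
sub with_default {
    my $n = shift || 0;
    return $n;
}

print "$fresh\n";               # prints 0
print with_default(), "\n";     # prints 0
print with_default(7), "\n";    # prints 7
```

So the first form is just a roundabout my $i = 0, while the second does real work.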

      > $i?$i=1:$i=0;
      How about $i=$i?1:0; - that's also a bit more readable. (at least to my eyes).

      Joost.

      Don't forget that ?: can get dangerous, not unlike juggling running chainsaws. It's a great show, but you're liable to injure yourself something fierce:
      $foo = $a? $b? $c : $d? $e : $f : $g : $h;
      Sometimes an if is more verbose, but undeniably precise.

      Instead of getting carried away with ?:, you can sometimes compact it using the regular logical operators || and &&. It really depends on what you're working with.
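As a rough sketch of what compacting with || and && can look like (the sub names here are hypothetical, and this only covers the two simplest shapes):

```perl
use strict;
use warnings;

# "$x ? $x : $y" picks the first true value, which is what || does.
sub pick_ternary { my ($x, $y) = @_; return $x ? $x : $y; }
sub pick_or      { my ($x, $y) = @_; return $x || $y; }

# "$x ? $y : $x" yields $y only when $x is true, which is what && does.
sub guard_ternary { my ($x, $y) = @_; return $x ? $y : $x; }
sub guard_and     { my ($x, $y) = @_; return $x && $y; }

print pick_or('', 'default'), "\n";     # prints default
print pick_or('set', 'default'), "\n";  # prints set
print guard_and(0, 'y'), "\n";          # prints 0
print guard_and(1, 'y'), "\n";          # prints y
```

Whether the short form is clearer depends on the reader; it only works when the condition and one of the results are the same expression.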
        Hmm, I sort of agree with your point, but I'm troubled by your fierce but syntactically incorrect example (ternary ops should always have the same number of '?' as ':').
        $a=1;$b=2;$c=3;$d=4;$f=5;$g=6;$h=7;
        $foo = $a? $b? $c : $d? $e : $f : $g : $h;
        print $foo;
        __END__
        syntax error at C:\temp\ternary.pl line 2, near "$g :"
        Execution of C:\temp\ternary.pl aborted due to compilation errors.
        I believe that you meant to say
        $foo = $a ? $b ? $c : $d ? $e : $f : $g;
        My personal rule of thumb is that ternary ops should not be nested but may be chained. Thus I would say that your example could be rewritten
        $foo = !$a ? $g : $b ? $c : $d ? $e : $f;
        And it's a little less troublesome. Even then I personally would add some whitespace so it would look like
        $foo = !$a ? $g
             :  $b ? $c
             :  $d ? $e
             :       $f;
        This and a bit of parentheses would also help your nested example
        $foo = $a ? ($b ? $c : ($d ? $e : $f)) : $g;
        I find that formatting ternaries like this makes them only marginally more difficult to read than if statements, but I still tend to avoid nested ternaries.
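A quick way to convince yourself that the nested and chained forms above agree is to try every combination of truth values. This is only a sketch: the subs nested and chained are hypothetical, the conditions are renamed $p, $q, $r to stay clear of sort's $a and $b, and the results are replaced by the literal letters of the variables they stand for.

```perl
use strict;
use warnings;

# Nested form: $a ? ($b ? $c : ($d ? $e : $f)) : $g
sub nested {
    my ($p, $q, $r) = @_;
    return $p ? ($q ? 'c' : ($r ? 'e' : 'f')) : 'g';
}

# Chained form: !$a ? $g : $b ? $c : $d ? $e : $f
sub chained {
    my ($p, $q, $r) = @_;
    return !$p ? 'g'
         :  $q ? 'c'
         :  $r ? 'e'
         :       'f';
}

# Compare the two over all eight combinations of true/false.
my $mismatches = 0;
for my $p (0, 1) {
    for my $q (0, 1) {
        for my $r (0, 1) {
            $mismatches++ if nested($p, $q, $r) ne chained($p, $q, $r);
        }
    }
}
print "mismatches: $mismatches\n";   # prints mismatches: 0
```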

        Yves / DeMerphq
        ---
        Writing a good benchmark isn't as easy as it might look.

      $i?$i=1:$i=0;

      Puh-lease, use some whitespace! Here are some alternatives:

      $i ? $i = 1 : $i = 0;
      $i = $i ? 1 : 0;
      $i = !!$i || 0;
      $foo = $foo ? 1 : 0; # Single-letter variable names:
                           # easy to type, hard to read
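One caveat about the first alternative, as far as perlop describes it: ?: binds tighter than =, and the conditional is assignable, so $i ? $i = 1 : $i = 0 parses as ($i ? $i = 1 : $i) = 0 and leaves $i at 0 either way. A small sketch (the sub names are made up for illustration):

```perl
use strict;
use warnings;

# Parsed as ($i ? ($i = 1) : $i) = 0, so the result is always 0.
sub unparenthesized {
    my $i = shift;
    $i ? $i = 1 : $i = 0;
    return $i;
}

# Parentheses around each assignment give the intended behaviour.
sub parenthesized {
    my $i = shift;
    $i ? ($i = 1) : ($i = 0);
    return $i;
}

# Assigning the result of the conditional sidesteps the problem.
sub assigned {
    my $i = shift;
    $i = $i ? 1 : 0;
    return $i;
}

print unparenthesized(5), "\n";   # prints 0
print parenthesized(5), "\n";     # prints 1
print assigned(5), "\n";          # prints 1
```

That is one more argument for the $i = $i ? 1 : 0 spelling.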

      Always pass references, not data structures

      Most references are the root of data structures, so I think you meant "Always pass references instead of flattened hashes or lists". Note that you can't use this if the sub in question doesn't expect it.
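To illustrate the difference between flattening and passing references, here is a minimal sketch. The subs count_flat and sizes are hypothetical names, not from any post in this thread.

```perl
use strict;
use warnings;

# Receives one flattened list: the boundary between the two
# arrays is gone, and the elements are copied into @copy.
sub count_flat {
    my @copy = @_;
    return scalar @copy;
}

# Receives two array references: nothing is copied, and each
# array keeps its identity. The sub has to expect references.
sub sizes {
    my ($first, $second) = @_;
    return (scalar @$first, scalar @$second);
}

my @x = (1 .. 5);
my @y = (6 .. 8);

print "flattened: ", count_flat(@x, @y), "\n";        # prints flattened: 8
my ($nx, $ny) = sizes(\@x, \@y);
print "by reference: $nx and $ny\n";                  # prints by reference: 5 and 3
```

The reference version only helps if the sub is written for it, which is the caveat above.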

      - Yes, I reinvent wheels.
      - Spam: Visit eurotraQ.
      

Re: Re: Re: Idiomatic optimizations
by thelenm (Vicar) on Apr 30, 2002 at 16:59 UTC
    Nah, no castigation here. When I whipped up my test, I had thought that the alphabet x 500 was a pretty big string, but now that I'm thinking clearly that's not very big at all.

    To test out a really big string, I replicated Romeo and Juliet 500 times, read the whole thing into a string, then ran almost the same regular expressions. I removed /o from the 'chars' sub, which actually made it a little faster. The string was about 70 MB. Here is my new test code:

    use strict;
    use Benchmark qw(cmpthese);

    local $/ = undef;
    open IN, "romeo-and-juliet-500-times.txt";
    my $text = <IN>;
    close IN;

    # Ten iterations is enough with a 70 MB string!
    cmpthese(10, {
        'i' => sub { $text =~ /abc/ig },
        'chars' => sub { $text =~ /[Aa][Bb][Cc]/g },
    });
    To my surprise (again!), the /i version ran in about a third of the time of the character-class version. Here is the output on my machine:
    Benchmark: timing 10 iterations of chars, i...
         chars: 40 wallclock secs (38.37 usr +  0.04 sys = 38.41 CPU) @  0.26/s (n=10)
             i: 12 wallclock secs (11.43 usr +  0.01 sys = 11.44 CPU) @  0.87/s (n=10)
          s/iter chars     i
    chars   3.84    --  -70%
    i       1.14  236%    --
    I'm amazed. Am I not testing the right thing? Or has /i really been cleaned up in recent versions of Perl? I'm running 5.6.1.
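One possible variation on the test, offered as a sketch rather than a verdict on the numbers above: counting the matches in list context makes every iteration scan the whole string, whatever context Benchmark calls the sub in, and it also confirms that both patterns find the same matches. The small synthetic string is a stand-in assumption for the 70 MB file, and count_i and count_chars are hypothetical names.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed stand-in for the real text: three matches per repetition.
my $text = 'xyz ABC abc aBc ' x 20_000;

# The "= () =" idiom forces list context, so //g finds every
# match in the string and $n receives the count.
sub count_i     { my $n = () = $text =~ /abc/ig;         return $n }
sub count_chars { my $n = () = $text =~ /[Aa][Bb][Cc]/g; return $n }

print "matches: ", count_i(), " vs ", count_chars(), "\n";

# Timings will vary by machine and Perl version, so none are claimed here.
cmpthese(5, {
    'i'     => \&count_i,
    'chars' => \&count_chars,
});
```

To measure memory rather than time, as the parent node suggests, you would have to watch the process size externally; Benchmark only reports time.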