
Re: Re: Idiomatic optimizations

by samtregar (Abbot)
on Apr 30, 2002 at 07:59 UTC (#163010)

in reply to Re: Idiomatic optimizations
in thread Idiomatic optimizations

The problem with //i isn't (wasn't?) that it's slower on small strings. It's that it uses twice the memory of an equivalent character class, and when you start matching against huge strings that can really make a difference. Try your example against a 50MB string and I think you'll see what I mean. If not, you can justly castigate me for being too lazy to test my own assertions.

Eagerly awaiting the second edition,

Re: Re: Re: Idiomatic optimizations
by hakkr (Chaplain) on Apr 30, 2002 at 11:40 UTC
    Always pass references, not data structures; hence the \ operator is an optimisation: sub(\@array) instead of sub(@array)
    Only use what you need from modules: use CGI qw(:standard);
    Also, I like shortcut operators:
    my $i ||=0 ; my $i =shift || 0;
    Also, I like the ?: operator instead of ifs.
    Is !~ an optimisation over just negating the result of =~? I dunno, but I think !~ looks better.
      Just a few remarks (though i've got a sneaking suspicion i'm correcting typos here):

      > my $i ||=0 ;

      here $i will always turn out to be 0 (because of the my operator), so my $i=0; is more efficient.

      > $i?$i=1:$i=0;
      How about $i=$i?1:0; - that's also a bit more readable. (at least to my eyes).
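A caveat on the quoted form, offered as a sketch rather than a verdict: as far as I know, assignment binds more loosely than ?:, so `$i ? $i = 1 : $i = 0` is parsed as `($i ? ($i = 1) : $i) = 0` and leaves $i at 0 either way. The helper names below are made up for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.

# The quoted form. Parsed as: ($i ? ($i = 1) : $i) = 0
sub quoted_form {
    my $i = shift;
    $i ? $i = 1 : $i = 0;
    return $i;
}

# The suggested rewrite: any true value becomes 1, any false value 0.
sub rewritten_form {
    my $i = shift;
    $i = $i ? 1 : 0;
    return $i;
}

for my $input (0, 5) {
    printf "input=%d quoted=%d rewritten=%d\n",
        $input, quoted_form($input), rewritten_form($input);
}
# prints:
# input=0 quoted=0 rewritten=0
# input=5 quoted=0 rewritten=1
```

So the rewrite is not only more readable; it also appears to be the only one of the two that does what was intended.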


      Don't forget that ?: can get dangerous, not unlike juggling running chainsaws. It's a great show, but you're liable to injure yourself something fierce:
      $foo = $a? $b? $c : $d? $e : $f : $g : $h;
      Sometimes an if is more verbose, but undeniably precise.

      Instead of getting carried away with ?:, you can sometimes compact it using the regular logical operators || and &&. It really depends on what you're working with.
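A minimal sketch of that kind of compaction, assuming the common "value or default" and "guard then value" shapes. The sub names are hypothetical; note that || treats 0 and '' as false, so it only fits when those are not legitimate values.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration.

# "Value or default": the ternary and the || form behave the same.
sub label_ternary { my $name = shift; return $name ? $name : 'anonymous'; }
sub label_or      { my $name = shift; return $name || 'anonymous'; }

# "Guard then value": && returns its left operand when that is false,
# otherwise its right operand.
sub first_ternary { my $list = shift; return $list ? $list->[0] : $list; }
sub first_and     { my $list = shift; return $list && $list->[0]; }

print label_or('hakkr'), "\n";     # hakkr
print label_or(''), "\n";          # anonymous
print first_and([7, 8, 9]), "\n";  # 7
```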
        Hmm, I sort of agree with your point, but I'm troubled by your fierce but syntactically incorrect example (ternary ops should always have the same number of '?' as ':').
        $a=1;$b=2;$c=3;$d=4;$f=5;$g=6;$h=7;
        $foo = $a? $b? $c : $d? $e : $f : $g : $h;
        print $foo;
        __END__
        syntax error at C:\temp\ line 2, near "$g :"
        Execution of C:\temp\ aborted due to compilation errors.
        I believe that you meant to say
        $foo = $a ? $b ? $c : $d ? $e : $f : $g;
        My personal rule of thumb is that ternary ops should not be nested but may be chained. Thus I would say that your example could be rewritten
        $foo = !$a ? $g : $b ? $c : $d ? $e : $f;
        And it's a little less troublesome. Even then I personally would add some whitespace so it would look like
        $foo = !$a ? $g
             :  $b ? $c
             :  $d ? $e
             :       $f;
        This and a bit of parentheses would also help your nested example
        $foo = $a ? ($b ? $c : ($d ? $e : $f)) : $g;
        I find that formatting ternaries like this makes them only marginally more difficult to read than if statements, but I still tend to avoid nested ternaries.
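To check that the chained rewrite really is equivalent to the nested original, here is a small sketch that tries every combination of the three conditions. The subs and the flag names ($ta, $tb, $td stand in for $a, $b, $d) are hypothetical, and the result values are replaced by the letters of the variables they came from.

```perl
use strict;
use warnings;

# Nested form: $a ? ($b ? $c : ($d ? $e : $f)) : $g
sub nested {
    my ($ta, $tb, $td) = @_;
    return $ta ? ($tb ? 'c' : ($td ? 'e' : 'f')) : 'g';
}

# Chained form: !$a ? $g : $b ? $c : $d ? $e : $f
sub chained {
    my ($ta, $tb, $td) = @_;
    return !$ta ? 'g'
         :  $tb ? 'c'
         :  $td ? 'e'
         :        'f';
}

# Exhaustively compare the two forms over all truth combinations.
my $mismatches = 0;
for my $ta (0, 1) {
    for my $tb (0, 1) {
        for my $td (0, 1) {
            $mismatches++ if nested($ta, $tb, $td) ne chained($ta, $tb, $td);
        }
    }
}
print "mismatches: $mismatches\n";   # mismatches: 0
```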

        Yves / DeMerphq
        Writing a good benchmark isnt as easy as it might look.


      Puh-lease, use some whitespace! Here are some alternatives:

      $i ? $i = 1 : $i = 0;
      $i = $i ? 1 : 0;
      $i = !!$i || 0;
      $foo = $foo ? 1 : 0; # Single-letter variable names:
                           # easy to type, hard to read

      Always pass references not data structures

      Most references are the root of data structures, so I think you meant "Always pass references instead of flattened hashes or lists". Note that you can't use this if the sub in question doesn't expect it.
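A short sketch of the difference, with hypothetical sub names. The first pair shows a sub written for a flattened list next to one written for a reference; the second shows how flattening loses the boundary between two arrays, which is why the sub has to be written to expect references.

```perl
use strict;
use warnings;

# Hypothetical subs for illustration.

# Expects a flattened list: every element is copied into @nums.
sub total_by_copy {
    my @nums = @_;
    my $sum = 0;
    $sum += $_ for @nums;
    return $sum;
}

# Expects an array reference: only one scalar is passed.
sub total_by_ref {
    my ($nums) = @_;
    my $sum = 0;
    $sum += $_ for @$nums;
    return $sum;
}

# Reports how many arguments actually arrived.
sub count_args { return scalar @_ }

my @data = (1 .. 10);
print total_by_copy(@data), "\n";   # 55
print total_by_ref(\@data), "\n";   # 55

my @x = (1, 2);
my @y = (3, 4, 5);
print count_args(@x, @y), "\n";     # 5: the two arrays were flattened
print count_args(\@x, \@y), "\n";   # 2: one reference per array
```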

      - Yes, I reinvent wheels.
      - Spam: Visit eurotraQ.

Re: Re: Re: Idiomatic optimizations
by thelenm (Vicar) on Apr 30, 2002 at 16:59 UTC
    Nah, no castigation here. When I whipped up my test, I had thought that the alphabet x 500 was a pretty big string, but now that I'm thinking clearly that's not very big at all.

    To test out a really big string, I replicated Romeo and Juliet 500 times, read the whole thing into a string, then ran almost the same regular expressions. I removed /o from the 'chars' sub, which actually made it a little faster. The string was about 70 MB. Here is my new test code:

    use strict;
    use Benchmark qw(cmpthese);

    local $/ = undef;
    open IN, "romeo-and-juliet-500-times.txt";
    my $text = <IN>;
    close IN;

    # Ten iterations is enough with a 70 MB string!
    cmpthese(10, {
        'i'     => sub { $text =~ /abc/ig },
        'chars' => sub { $text =~ /[Aa][Bb][Cc]/g },
    });
    To my surprise (again!), the /i version ran in about 1/3 the time of the character-class version. Here is the output on my machine:
    Benchmark: timing 10 iterations of chars, i...
         chars: 40 wallclock secs (38.37 usr +  0.04 sys = 38.41 CPU) @  0.26/s (n=10)
             i: 12 wallclock secs (11.43 usr +  0.01 sys = 11.44 CPU) @  0.87/s (n=10)
          s/iter chars     i
    chars   3.84    --  -70%
    i       1.14  236%    --
    I'm amazed. Am I not testing the right thing? Or has /i really been cleaned up in recent versions of Perl? I'm running 5.6.1.
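One thing that may be worth ruling out when asking whether the test measures the right thing: as far as I know, m//g in scalar (or void) context finds only the next match per call and remembers pos(), whereas in list context it returns every match. I can't tell from the output whether that matters for the timings above. The sketch below uses a small synthetic string (an assumption, not the 70 MB file) to show the difference and to confirm that both patterns find the same matches.

```perl
use strict;
use warnings;

# Small synthetic stand-in for the big text; the real test used a 70 MB file.
my $text = join ' ', ('xyz Abc abC ABC') x 1000;

# List context: //g returns every match, so assigning to () counts them all.
my $count_i     = () = $text =~ /abc/ig;
my $count_chars = () = $text =~ /[Aa][Bb][Cc]/g;

# Scalar context: //g finds one match per call and advances pos(),
# so it takes one call per match to walk the whole string.
pos($text) = 0;
my $calls = 0;
$calls++ while $text =~ /abc/ig;

print "i: $count_i, chars: $count_chars, scalar-context calls: $calls\n";
# prints: i: 3000, chars: 3000, scalar-context calls: 3000
```

If a benchmark is meant to time a full scan with many matches, forcing list context inside each sub (as in the counting lines above) might be the safer way to express it.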
