
Benchmarking chop/substr/split

by larryl (Scribe)
on Mar 17, 2001 at 01:04 UTC ( #65029=perlquestion )
larryl has asked for the wisdom of the Perl Monks concerning the following question:

I tried a few benchmarks in response to discussion around Big Willy's Frequency Analyzer. The code I tested and the results can be summarized by:

  • chop: 909.92/s
    while ($_ ne '') { $letters{lc chop}++; }
  • split_0: 519.41/s
    $letters{$_}++ for split//, lc;
  • split_1: 481.23/s
    $letters{lc $_}++ for split//;
  • substr: 475.51/s
    while ($_ ne '') { $letters{lc substr($_,-1)}++; substr($_, -1) = ""; }
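
For anyone who wants to reproduce a comparison like this, here is a minimal sketch using the core Benchmark module. The sample text and the harness layout are assumptions, since the post does not show its test data, so the rates will not match the figures above. The substr_1 entry is the four-argument substr variant that appears later in the thread.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample input; the original test data is not shown in the post.
my $text = join ' ', ('The quick brown fox jumps over the lazy dog') x 20;

# Each sub works on its own copy of the text in $_ and returns its counts,
# so the destructive variants (chop, substr) cannot affect the others.
my %variants = (
    chop => sub {
        local $_ = $text;
        my %letters;
        while ($_ ne '') { $letters{lc chop}++ }
        return \%letters;
    },
    split_0 => sub {
        local $_ = $text;
        my %letters;
        $letters{$_}++ for split //, lc;
        return \%letters;
    },
    split_1 => sub {
        local $_ = $text;
        my %letters;
        $letters{lc $_}++ for split //;
        return \%letters;
    },
    substr => sub {
        local $_ = $text;
        my %letters;
        while ($_ ne '') { $letters{lc substr($_, -1)}++; substr($_, -1) = '' }
        return \%letters;
    },
    substr_1 => sub {
        local $_ = $text;
        my %letters;
        while ($_ ne '') { $letters{lc substr($_, -1, 1, '')}++ }
        return \%letters;
    },
);

# A negative count asks Benchmark to run each variant for at least
# that many CPU seconds, then print a rate comparison table.
cmpthese(-1, \%variants);
```

Returning the hash from each sub makes it easy to confirm that all variants produce identical counts before trusting the timings.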

I have two questions:

  1. I'm not that familiar with the for syntax used in the split_[01] tests. Can someone help me mentally parse those, and possibly explain why one would be faster than the other?

  2. It seems odd that chop() benchmarks so much faster than substr($_,-1). I've seen the same in other stuff I've benchmarked. Wouldn't you think the compiler would treat substr($_,-1) as a special case?
(Ok, that was really just one question and one musing...)

Replies are listed 'Best First'.
Re: Benchmarking chop/substr/split
by Albannach (Prior) on Mar 17, 2001 at 01:14 UTC
    Deparse is cool!:

    perl -MO=Deparse -e "$letters{$_}++ for split //, lc;"

    foreach $_ (split(//, lc $_, 0)) { ++$letters{$_}; }
    Which I imagine you can understand.

    I'd like to be able to assign to an luser
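
To spell out the Deparse output a little further, here is a sketch of both split variants written as ordinary foreach loops. The sample string and variable names are assumptions for illustration. The visible difference is that split_0 calls lc once on the whole string, while split_1 calls it once per character, which is a plausible (though unverified) explanation for the gap in the benchmark.

```perl
use strict;
use warnings;

my $text = 'Hello World';   # hypothetical sample input

# split_0 in longhand: lowercase the whole string once, then split it.
my %split_0;
{
    my @chars = split //, lc $text;   # a single lc call for the entire string
    foreach my $char (@chars) {
        $split_0{$char}++;
    }
}

# split_1 in longhand: split first, then lowercase each character.
my %split_1;
{
    my @chars = split //, $text;
    foreach my $char (@chars) {
        $split_1{ lc $char }++;       # one lc call per character
    }
}

printf "%s: %d / %d\n", $_, $split_0{$_}, $split_1{$_} for sort keys %split_0;
```

Both loops end up with the same counts; only the number of lc calls differs.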

(tye)Re: Benchmarking chop/substr/split
by tye (Sage) on Mar 17, 2001 at 01:34 UTC

    Well, I bet it would be possible to teach Perl to optimize substr($str,-1) (when not being assigned to and when the "-1" is a literal constant) into chop. But then, why? That is, would the bit of bloat (and the potential for introducing bugs) be worth allowing people to write chop long-hand without the performance penalty?

    BTW, they are all close enough to the same speed that I don't think I'd ever worry about the speed difference. (:

            - tye (but my friends call me "Tye")

      True, it might not be worth the bloat for something not frequently used. I'm just in a fuss because I read that gnat wants to take chop() out of the core. I like chop. chop is my friend. I don't want to have to type $a = substr($_,-1); substr($_,-1) = '' when what I want is $a = chop.


      Besides, it's at least twice as fast every place I've tested it. If you need to check a few thousand credit cards a day, it makes a noticeable difference.

        Yeah, I like chop too. FYI, you don't have to repeat the substr: $a= substr($_,-1,1,"");
        Though this doesn't work in older versions of Perl.

                - tye (but my friends call me "Tye")
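
A small sketch comparing chop with the four-argument substr mentioned above. The variable names and the sample string are made up for illustration. On a non-empty string both remove the last character and return it.

```perl
use strict;
use warnings;

my $with_chop   = 'abc';
my $with_substr = 'abc';

# chop removes the last character and returns it.
my $last_chop = chop $with_chop;

# Four-argument substr: replace the last character with '' and
# return the text that was replaced.
my $last_substr = substr($with_substr, -1, 1, '');

print "chop:   got '$last_chop', left '$with_chop'\n";
print "substr: got '$last_substr', left '$with_substr'\n";
```

Both lines report 'c' removed and 'ab' left behind.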
Re: Benchmarking chop/substr/split
by MeowChow (Vicar) on Mar 17, 2001 at 13:55 UTC
    If efficiency is your primary concern, then use an array instead of a hash:
    $letters[ord chop]++ while $_;
    I also intentionally left out the lc function. Frequency counts can be consolidated at the end of the loop, instead of performing an lc upon every iteration. Save yourself the O(n) step, and get more detailed frequency information while you're at it.
                   s aamecha.s a..a\u$&owag.print
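
A sketch of the array-based count with the case folding deferred to the end, as described above. The sample text and the names @letters and %folded are assumptions. One deliberate change from the one-liner: the loop tests length instead of the truth of $_, because a remaining string of "0" is false in Perl and would end the loop one character early. The sketch also assumes small character codes (plain ASCII text); very large code points would make the array sparse and large.

```perl
use strict;
use warnings;

my @letters;

{
    local $_ = '0 Hello World';      # hypothetical sample input

    # Count by character code, with no lc call inside the loop.
    # "while length" keeps going even when only "0" is left in $_.
    $letters[ord chop]++ while length;
}

# Consolidate upper and lower case once, after the loop.
my %folded;
for my $code (0 .. $#letters) {
    next unless $letters[$code];     # skip codes that never occurred
    $folded{ lc chr $code } += $letters[$code];
}

printf "'%s' => %d\n", $_, $folded{$_} for sort keys %folded;
```

The per-code array still holds separate counts for 'H' and 'h', so the case-sensitive detail stays available after folding.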
Re: Benchmarking chop/substr/split
by larryl (Scribe) on Mar 17, 2001 at 02:58 UTC

    Update: A new data point based on tye's suggestion

    • substr_1: 714.54/s
      while ($_ ne '') { $letters{lc substr($_,-1,1,'')}++; }
