Benchmarking chop/substr/split

larryl has asked for the wisdom of the Perl Monks concerning the following question:

I tried a few benchmarks in response to discussion around Big Willy's Frequency Analyzer. The code I tested and the results can be summarized by:

chop: 909.92/s

while ($_ ne '') {
   $letters{lc chop}++;
}

split_0: 519.41/s

$letters{$_}++ for split//, lc;

split_1: 481.23/s

$letters{lc $_}++ for split//;

substr: 475.51/s

while ($_ ne '') {
   $letters{lc substr($_,-1)}++;
   substr($_, -1) = "";
}
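
The post gives the rates but not the harness that produced them. The following is a minimal sketch of how the comparison could be reproduced with the core Benchmark module. The sample text, the `%tests` layout, and the `bench` command-line argument are assumptions for illustration, not taken from the original post. It also includes the substr_1 variant suggested later in the thread.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample input; the original post does not say what text was used.
my $text = "The quick brown fox jumps over the lazy dog. " x 20;

# Each routine works on its own copy of the text in $_ and
# returns a hash reference of character counts.
my %tests = (
    chop => sub {
        local $_ = $text;
        my %letters;
        while ($_ ne '') { $letters{lc chop}++ }
        return \%letters;
    },
    split_0 => sub {
        local $_ = $text;
        my %letters;
        $letters{$_}++ for split //, lc;
        return \%letters;
    },
    split_1 => sub {
        local $_ = $text;
        my %letters;
        $letters{lc $_}++ for split //;
        return \%letters;
    },
    substr => sub {
        local $_ = $text;
        my %letters;
        while ($_ ne '') {
            $letters{lc substr($_, -1)}++;
            substr($_, -1) = '';
        }
        return \%letters;
    },
    substr_1 => sub {
        local $_ = $text;
        my %letters;
        while ($_ ne '') { $letters{lc substr($_, -1, 1, '')}++ }
        return \%letters;
    },
);

# Sanity check: every routine should produce identical counts.
my (%result, %summary);
for my $name (sort keys %tests) {
    $result{$name}  = $tests{$name}->();
    $summary{$name} = join ',',
        map { "$_=$result{$name}{$_}" } sort keys %{ $result{$name} };
}
my %distinct = map { $_ => 1 } values %summary;
die "routines disagree\n" if keys(%distinct) != 1;
print "all ", scalar(keys %tests), " routines agree\n";

# Run the timing comparison only when asked, e.g. `perl bench.pl bench`.
# A negative count means "run each test for at least that many CPU seconds".
cmpthese(-1, \%tests) if @ARGV && $ARGV[0] eq 'bench';
```

Absolute rates will differ from the figures above depending on the machine, the Perl version, and the input text, so only the relative ordering is worth comparing.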

I have two questions:

1. I'm not that familiar with the for syntax used in the split_[01] tests. Can someone help me mentally parse those, and possibly explain why one would be faster than the other?
2. It seems odd that chop() benchmarks so much faster than substr($_,-1). I've seen the same in other stuff I've benchmarked. Wouldn't you think the compiler would treat substr($_,-1) as a special case?


Replies are listed 'Best First'.
Re: Benchmarking chop/substr/split by Albannach (Monsignor) on Mar 17, 2001 at 01:14 UTC
Deparse is cool! `perl -MO=Deparse -e "$letters{$_}++ for split //, lc;"` gives `foreach $_ (split(//, lc $_, 0)) { ++$letters{$_}; }`, which I imagine you can understand. -- I'd like to be able to assign to an luser
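
The Deparse output covers split_0. For split_1, and for the question of why one might be faster, a hand-expanded sketch may help. This is not Deparse output, and the sample string and variable names are hypothetical. The only difference between the two forms is where lc is applied: split_0 lowercases the whole string once before splitting, while split_1 splits first and then calls lc once per character. That extra per-character call is a plausible reason for the small gap in the benchmark, though the thread does not confirm it.

```perl
use strict;
use warnings;

my $text = "Hello World";   # hypothetical sample input
my (%split_0, %split_1);

# split_0: $letters{$_}++ for split //, lc;
# Lowercase the whole string once, then split it into single characters.
{
    local $_ = $text;
    my @chars = split //, lc $_;          # one call to lc in total
    for my $c (@chars) { $split_0{$c}++ }
}

# split_1: $letters{lc $_}++ for split //;
# Split into single characters first, then lowercase each one.
{
    local $_ = $text;
    my @chars = split //, $_;
    for my $c (@chars) { $split_1{lc $c}++ }   # one call to lc per character
}

# Both forms end up with the same counts.
for my $c (sort keys %split_0) {
    print "'$c': $split_0{$c} / $split_1{$c}\n";
}
```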
Re: Re: Benchmarking chop/substr/split by larryl (Monk) on Mar 17, 2001 at 01:19 UTC
Excellent! Thanks, Brother Albannach!
(tye)Re: Benchmarking chop/substr/split by tye (Sage) on Mar 17, 2001 at 01:34 UTC
Well, I bet it would be possible to teach Perl to optimize `substr($str,-1)` (when not being assigned to and when the "-1" is a literal constant) into chop. But then, why? That is, would the bit of bloat (and the potential for introducing bugs) be worth allowing people to write chop long-hand without the performance penalty? BTW, they are all close enough to the same speed that I don't think I'd ever worry about the speed difference. (: - tye (but my friends call me "Tye")
Re: (tye)Re: Benchmarking chop/substr/split by larryl (Monk) on Mar 17, 2001 at 02:15 UTC
True, it might not be worth the bloat for something not frequently used. I'm just in a fuss because I read that gnat wants to take chop() out of the core. I like chop. chop is my friend. I don't want to have to type `$a = substr($_,-1); substr($_,-1) = ''` when what I want is `$a = chop`. Pout. Besides, it's at least twice as fast every place I've tested it. If you need to check a few thousand credit cards a day, it makes a noticeable difference.
(tye)Re2: Benchmarking chop/substr/split by tye (Sage) on Mar 17, 2001 at 02:19 UTC
Yeah, I like chop too. FYI, you don't have to repeat the substr: `$a= substr($_,-1,1,"");` does the removal and the assignment in one call, though this doesn't work in older versions of Perl. - tye (but my friends call me "Tye")
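
As a quick illustration of that four-argument form, here is a minimal sketch comparing it with chop on a non-empty string. The variable names are made up for the example.

```perl
use strict;
use warnings;

my $s = "abc";
my $t = "abc";

# chop removes the last character from its argument and returns it.
my $via_chop = chop $s;

# Four-argument substr replaces the selected span (here the last
# character) with the fourth argument and returns what was replaced.
my $via_substr = substr($t, -1, 1, "");

print "chop:   got '$via_chop', left '$s'\n";
print "substr: got '$via_substr', left '$t'\n";
```

Both loops in the benchmark guard with `$_ ne ''`, so neither form is ever applied to an empty string there.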
Re: Benchmarking chop/substr/split by MeowChow (Vicar) on Mar 17, 2001 at 13:55 UTC
If efficiency is your primary concern, then use an array instead of a hash: `$letters[ord chop]++ while $_;` I also intentionally left out the lc function. Frequency counts can be consolidated at the end of the loop, instead of performing an lc upon every iteration. Save yourself the O(n) step, and get more detailed frequency information while you're at it. MeowChow s aamecha.s a..a\u$&owag.print
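
A minimal sketch of that idea, including the consolidation step the reply describes but does not show. The sample text and the names `@count` and `%letters` are hypothetical. One deliberate change from the one-liner: the loop tests `length` rather than `$_`, because a remaining string of "0" is false in Perl and `while $_` would stop before counting it.

```perl
use strict;
use warnings;

my $text = "0 Hello, World";   # hypothetical input; starts with "0" on purpose
my @count;                      # indexed by character code

{
    local $_ = $text;
    # Count raw characters; no lc inside the loop.
    $count[ord chop]++ while length;
}

# Consolidate once at the end: fold upper and lower case together.
my %letters;
for my $code (0 .. $#count) {
    next unless $count[$code];
    $letters{lc chr $code} += $count[$code];
}

print "'$_': $letters{$_}\n" for sort keys %letters;
```

The array keeps the case-sensitive counts (for example `$count[ord 'H']`), so the more detailed information is still available after the case-folded hash has been built.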
Re: Benchmarking chop/substr/split by larryl (Monk) on Mar 17, 2001 at 02:58 UTC
Update: A new data point based on tye's suggestion: substr_1: 714.54/s `while ($_ ne '') { $letters{lc substr($_,-1,1,'')}++; }`
