in reply to what's faster than .=

Devel::Peek is very useful for finding out this sort of low-level detail about perl's variables. When using it to dump a string, you'll see two values listed at the end - "CUR" is the current length of the actual string, and "LEN" is the size of the memory area currently reserved for it.

A simple test of $string .= $nextchar shows that (with perl-5.8.0 on this platform at least) the LEN is changing in lockstep with CUR under this approach - perl is resizing the string buffer to be just big enough for the new string each time, so there is a lot of reallocing and therefore slow string copying going on.
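To see this for yourself, here's a quick sketch (Dump prints to STDERR, and the exact output varies by perl version and build):

```perl
use Devel::Peek;

my $string = "";
for (1 .. 4) {
    $string .= "a";
    Dump($string);   # compare the CUR and LEN lines on each pass
}
```

On a perl that resizes in lockstep you'll see LEN tracking CUR closely; newer perls may over-allocate in bigger steps.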

One simple workaround is to presize the buffer, which of course will work best if you've got a good idea how big the string is likely to get. Here's a benchmark to demonstrate that:

    use Benchmark;
    Benchmark::cmpthese(shift, {
        a => q{ $a[++$i] = ""; $a[$i] .= "a" for 1 .. 10000; },
        b => q{ $b[++$j] = "b" x 10000; $b[$j] = ""; $b[$j] .= "b" for 1 .. 10000; },
    });
and the results:
    Benchmark: running a, b for at least 1 CPU seconds...
             a:  1 wallclock secs ( 1.03 usr +  0.01 sys =  1.04 CPU) @ 124.04/s (n=129)
             b:  1 wallclock secs ( 1.08 usr +  0.01 sys =  1.09 CPU) @ 191.74/s (n=209)
           Rate    a    b
    a     124/s   --  -35%
    b     192/s  55%   --
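For clarity, here is the presizing trick used in case b above on its own (illustrative; 10_000 stands in for whatever size you expect the string to reach):

```perl
my @chunks = ("a") x 10_000;

my $buf = "x" x 10_000;   # force perl to allocate a ~10k buffer
$buf    = "";             # empty the string; the buffer (LEN) is kept
$buf   .= $_ for @chunks; # appends now reuse the preallocated space

print length($buf), "\n"; # prints 10000
```

Assigning "" shortens CUR but doesn't shrink the allocation, so the subsequent appends avoid repeated reallocs.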

Update: added "on this platform"


Replies are listed 'Best First'.
Re: Re: what's faster than .=
by pg (Canon) on Mar 08, 2003 at 17:36 UTC
    Your post and testing results made me think about why the approach Perl uses to allocate memory for hashes is so different from what it does for strings.

    My answer is that the designers had different expectations for strings and hashes.

    For a string, most of the time you don't expect it to grow that much. More importantly, even if it does grow, that does not mean it will continue to grow.

    But a hash is different: as a collection of elements, you expect it to grow all the time. More importantly, if you see some growth, it is reasonable to expect more growth. To speed things up, Perl simply assumes that what has happened will happen again, and doubles the allocated memory whenever what has been allocated is used up.
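    The doubling strategy described above can be sketched in plain Perl (just an illustration of the idea, not perl's actual hash implementation):

```perl
# Grow a capacity geometrically: whenever it runs out, double it,
# so n insertions cost only O(log n) "reallocations" overall.
my $capacity = 8;
my $used     = 0;

sub insert_one {
    $used++;
    if ($used > $capacity) {
        $capacity *= 2;   # one cheap doubling instead of many small grows
    }
}

insert_one() for 1 .. 100;
print "$used elements, capacity $capacity\n";  # 100 elements, capacity 128
```

    The same reasoning is why presizing helps strings: you are doing by hand the geometric over-allocation that perl does automatically for hashes.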

    It is all about expectation and analysis of behavior.