http://www.perlmonks.org?node_id=241333


in reply to what's faster than .=

Devel::Peek is very useful for finding out this sort of low-level detail about perl's variables. When using it to dump a string, you'll see two values listed at the end - "CUR" is the current length of the actual string, and "LEN" is the size of the memory area currently reserved for it.

A simple test of $string .= $nextchar shows that (with perl-5.8.0 on this platform at least) LEN changes in lockstep with CUR under this approach. In other words, perl is resizing the string buffer to be just big enough for the new string each time, so there is a lot of reallocation, and therefore slow string copying, going on.
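For anyone who wants to repeat that observation, here is a minimal sketch. It assumes the core B module as a convenient way to read CUR and LEN programmatically (Devel::Peek's Dump only prints to STDERR), and cur_len is a hypothetical helper name of mine, not something from the original test. Other perl versions and platforms may well grow the buffer in bigger steps than the lockstep behaviour described above, so treat the printed numbers as something to inspect rather than a fixed result.

```perl
use strict;
use warnings;
use B ();            # core module; exposes CUR and LEN as methods
use Devel::Peek ();  # core module; Dump() writes its report to STDERR

# Hypothetical helper: return (CUR, LEN) for the scalar behind a reference.
sub cur_len {
    my ($ref) = @_;
    my $sv = B::svref_2object($ref);
    return ($sv->CUR, $sv->LEN);
}

my $string = "";
my @growth;
for my $n (1 .. 20) {
    $string .= "a";                       # the append under test
    push @growth, [ cur_len(\$string) ];  # record CUR and LEN after each append
}

# One line per append, so you can see whether LEN tracks CUR closely.
printf "CUR=%2d LEN=%2d\n", @$_ for @growth;

# The same information as Devel::Peek reports it (look for CUR and LEN).
Devel::Peek::Dump($string);
```

If LEN creeps up by a byte or two on every line, you are seeing the realloc-per-append pattern; if it jumps in larger steps, your perl is already over-allocating for you.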

One simple workaround is to presize the buffer, which of course will work best if you've got a good idea how big the string is likely to get. Here's a benchmark to demonstrate that:

use Benchmark;
Benchmark::cmpthese(shift, {
    a => q{
        $a[++$i] = "";
        $a[$i] .= "a" for 1 .. 10000;
    },
    b => q{
        $b[++$j] = "b" x 10000;
        $b[$j] = "";
        $b[$j] .= "b" for 1 .. 10000;
    },
});
and the results:
Benchmark: running a, b for at least 1 CPU seconds...
         a:  1 wallclock secs ( 1.03 usr +  0.01 sys =  1.04 CPU) @ 124.04/s (n=129)
         b:  1 wallclock secs ( 1.08 usr +  0.01 sys =  1.09 CPU) @ 191.74/s (n=209)
   Rate    a    b
a 124/s   -- -35%
b 192/s  55%   --

Update: added "on this platform"

Hugo