Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

non schlemiel painter string concat

by Anonymous Monk
on Dec 18, 2013 at 16:27 UTC ( #1067653=perlquestion: print w/ replies, xml ) Need Help??
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello all. In Ruby, one can have for String variable :  var << "lok"
This appends "lok" to var, but the order of complexity is independent of var's length. How can one do this in PERL?
Thanks.

Comment on non schlemiel painter string concat
Download Code
Re: non schlemiel painter string concat
by toolic (Chancellor) on Dec 18, 2013 at 16:31 UTC
Re: non schlemiel painter string concat
by Anonymous Monk on Dec 18, 2013 at 16:57 UTC

    You don't really need to worry about it, because perl's concatenation operator is optimized for that.

    use Time::HiRes qw( time ); for my $exp (3 .. 8) { my $count = 10**$exp; my $start = time(); my $x = ''; for (1 .. $count) { $x .= 'x'; } printf "%d %.5f\n", $count, time() - $start; }
    1000 0.00012 10000 0.00113 100000 0.01161 1000000 0.11535 10000000 1.09300 100000000 11.02793
Re: non schlemiel painter string concat
by davido (Archbishop) on Dec 18, 2013 at 19:06 UTC

    Putting Perl aside for a moment, there are approximately four ways to grow a string:

    • Allocate more space than anyone could ever want, and hope for the best. Null-terminate, or keep a pointer to the end. This is fragile at best, and is the source of security problems caused by buffer overruns. Perl doesn't do this.
    • Allocate just enough space. If the string grows, allocate a new buffer and move everything to the larger buffer. This becomes quite inefficient, and is probably the "schmliel the painter" algorithm to which you refer.
    • Allocate just enough space and store a pointer to it. Concatenation becomes the logical operation of allocating just enough space again, and adding another pointer to the list of discontiguous memory blocks. But discontiguous storage is complicated, and doesn't benefit from the efficiencies of accessing contiguous blocks of memory.
    • Allocate a little extra space so that the string can grow a little without the "allocate and copy" dance. Every time the string's growth runs out of buffer space, allocate a new buffer with room for the concatenated string, plus extra space.

    The last one is what Perl does. By allocating extra space, the "allocate and copy" inefficiencies are mitigated. In brief searching I haven't found the definitive source on how Perl approaches this, but my recollection is this:

    1. For strings under 256 bytes, allocate a 256 byte buffer.
    2. When a string grows to exceed buffer space, jump to 1kb and copy into new buffer.
    3. When a string grows to exceed the 1kb, allocate 2kb, and copy everything over.
    4. Repeat, doubling each time.

    Moving contiguous blocks of memory around is a pretty quick operation for most processors. The key is to avoid doing it too often. The other key is to not waste too much memory by allocating way too much extra. Just where Perl sets the transition points I don't recall exactly.

    The process does grow in time complexity as the string's growth continues doubling. But it doesn't (ideally) happen on each iteration of growth. I believe that people suggest that concatenation occurs in amortized constant time. In other words, any given iteration may be expensive, but the bulk of the iterations will be cheap, to the point that it's as good as constant time in the aggregate. The same principle applies to hash insertions; constant amortized time, even though an individual insertion may be relatively expensive.

    Ruby probably uses the same strategy as Perl. There really isn't any better strategy. The generalized form of the algorithm also applies to Perl's arrays, C++'s std::vector container, C++'s std::string data type, and so on. It just keeps popping up because it's so useful. Details of threshold points and other optimizations may change, but the underlying algorithm is mostly the same.


    Dave

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1067653]
Approved by toolic
Front-paged by toolic
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others lurking in the Monastery: (7)
As of 2014-08-30 09:37 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (292 votes), past polls