Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?

Re: Re: Re: Re: Re: Re: Slowness when inserting into pre-extended array

by tilly (Archbishop)
on Jul 20, 2003 at 16:54 UTC ( [id://276061]=note: print w/replies, xml ) Need Help??

in reply to Re: Re: Re: Re: Re: Slowness when inserting into pre-extended array
in thread Slowness when inserting into pre-extended array

You're right, I just glanced at av.c and it isn't a power of 2. It is 20%. Perhaps I hadn't read it carefully enough originally? (In which case someone might look at my unshift speedup since that does go by powers of 2.) Or perhaps I am mixing it up with the fact that many things I have looked at (hash resizing, some malloc arrangements) like working with powers of 2?

However 20% per time is still a geometric arrangement and therefore is a constant amortized cost per new element added. It is just a higher amortized cost. Let's calculate it in theory. Suppose that we have been adding one element at a time with push, and we have just had to recopy the array. Suppose that our size is now N. How many recopyings have we needed (ignoring the rounding because we only move an integer number)? Well N elements got moved this time. When we moved we wind up with 6/5'ths as much space, so 5/6'ths of our elements got moved on the previous time we had to move. And 5/6'ths of 5/6'ths of them got moved the round before that. etc. This is just a geometric series with r=5/6. The well-known trick for calculating those is to multiply and divide by 1-r:

N + r*N + r*r*N + r*r*r*N = (N + r*N + r*r*N + r*r*r*N)*(1-r)/(1-r) = (N + (r*N-r*N) + (r*r*N-r*r*N) + (r*r*r*N-r*r*r*N) + ...)/(1-r) = N/(1-r)
In our case r is 5/6, so 1-r is 1/6 and 1/(1-r) is 6.

Therefore, worst case, the amount of recopying that we would have to do averages out to 6 times per element in the array. Best case, just before we have to recopy is 5 times. So growing an array incrementally we have to recopy each element 5-6 times.

Now how do theory and practice work out? Well consider the following silly program to calculate the exact number of resizes after building up an array to any size, taking into account rounding etc:

my $max_size = 0; my $size = 0; my $recopies = 0; my $last_pow = 1; while (++$size) { if ($size >= $max_size) { use integer; my $old_size = $size - 1; $max_size = $size + $old_size/5; $recopies += $old_size; } if (not $size%2 and ($size>>1) == $last_pow) { my $avg = $recopies/$size; printf("% 8d: % 9d recopies. Avg %1.5f per element\n", $size, $recopies, $avg); $last_pow = $last_pow + $last_pow; } }
and the output from this is:
2: 1 recopies. Avg 0.50000 per element 4: 6 recopies. Avg 1.50000 per element 8: 28 recopies. Avg 3.50000 per element 16: 81 recopies. Avg 5.06250 per element 32: 195 recopies. Avg 6.09375 per element 64: 390 recopies. Avg 6.09375 per element 128: 783 recopies. Avg 6.11719 per element 256: 1332 recopies. Avg 5.20312 per element 512: 2726 recopies. Avg 5.32422 per element 1024: 5605 recopies. Avg 5.47363 per element 2048: 11566 recopies. Avg 5.64746 per element 4096: 23916 recopies. Avg 5.83887 per element 8192: 41278 recopies. Avg 5.03882 per element 16384: 85513 recopies. Avg 5.21930 per element 32768: 177228 recopies. Avg 5.40857 per element 65536: 367397 recopies. Avg 5.60603 per element 131072: 761722 recopies. Avg 5.81148 per element 262144: 1316176 recopies. Avg 5.02081 per element 524288: 2729094 recopies. Avg 5.20533 per element 1048576: 5658908 recopies. Avg 5.39676 per element 2097152: 11734158 recopies. Avg 5.59528 per element 4194304: 24331784 recopies. Avg 5.80115 per element 8388608: 42045207 recopies. Avg 5.01218 per element
So you see, as long as Perl can continue allocating more and more space as it wants, the amount of recopying work needed scales linearly with the size of the array.

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://276061]
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (2)
As of 2024-07-19 03:42 GMT
Find Nodes?
    Voting Booth?

    No recent polls found

    erzuuli‥ 🛈The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.