Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked

Re^5: perl process slower and slower when loop number increase

by dave_the_m (Monsignor)
on Jan 22, 2018 at 20:57 UTC ( #1207704=note: print w/replies, xml ) Need Help??

in reply to Re^4: perl process slower and slower when loop number increase
in thread perl process slower and slower when loop number increase

You are misinterpreting the graph: note that it has a logarithmic x-axis. Everything in this thread has shown that (subject to measurement noise), the execution time of this perl code:
for ($i = 0; $i < N, $i++) {}
increases linearly with N.

For the purpose of focusing the discussion, do you (1) dispute that the relationship between time and N is linear, or (2) agree that it is linear, but wish it to be sub-linear?


Replies are listed 'Best First'.
Re^6: perl process slower and slower when loop number increase
by pryrt (Prior) on Jan 22, 2018 at 22:11 UTC

    If the Anonymous OP is confused by the logarithmic x-axis: for an equal distance along the x-axis, the x-value goes up by a constant factor. (So if the ticks were 1cm apart, and 10 is on the far left, then 1cm to the right of 10 is 100, and 1cm to the right of 100 is 1000: ie, every factor of 10 gets an equal amount of space on the x-axis). gnuplot labels the ticks, so it was unambiguous; however, not everyone has experience looking at a graph with a logarithmic x-axis, so the curve the OP saw might not be immediately identified in the OP's mind as linear.

    If the OP wants a graph with a linear x-axis, then edit choroba's code by changing the following lines in sub plot { ...: change the lines here:

    print {$gp} join "\n", 'set term png;', 'set output "measure.png";', 'set key left;', 'set logscale x;';
    into the lines here:
    print {$gp} join "\n", 'set term png;', 'set output "measure.png";', 'set key left;'; # 'set logscale x;';

    When I plotted with a linear x-axis, the curve for perl is almost perfectly straight -- I think that might communicate the idea to the OP with more clarity.

      in my plot,

      if I comment 'set logscale x;';

      yes the line perfectly straight, but my X axis start after 1x10e6

      so my X axis list is: 0 1x10e7 2x10e7 3x10e7 4x10e7 5x10e7 6x10e7 7x10e7 8x10e7 9x10e7 1x10e8

      its same as my reply for choroba comment, what is "1MB limit" before it slowing down ?

      and why beyond 1Mb number,perl (and now similiar ruby) slowing down more extreme than another language

      and this 1MB limit is not just about limit of $i but like the whole loop operation

      in previous comment, using this variant: time perl -e 'for($i=0;$i<=1000;$i++){for($j=0;$j<=1000;$j++){for($k +=0;$k<=100;$k++){}}}' have the same time with just using 1 loop with $i=1x10e8

      I tought if I make the loop with smaller number It will break 1MB limit, but it's not...

      so i still want to find this "magic setting" (maybe through ENV or system setting ) that can increase this 1MB limit, and hoping without need to hack perl core or recompile from source

        Did you see the line between 0 and 1e7 (again, your notation is flawed)? That's also from your data... an it's linear down there, too. Everything says it's linear. Why do you think differently?

        it's linear. There isn't a limit. Using the same code, change the input x data to as many points as you want, going from 1e5 to 1e7, distributed however you want. You will see it makes a straight line. There is no discontinuity, there is no limit. If you counted at one per second, it would take you a million seconds to count to a million, and a billion seconds to count to a billion. On a different scale, that's what is happening here (since CPU counts faster than you).

        The reason your multi-loop (1000 in the i loop, 1000 in the j, and 100 in the k) doesn't change behavior is because that is equivalent to a single loop of 1e8 elements. The perl time is behaving linearly, as we keep saying. One thousand loops takes 1000 times as long as one loop; one million loops takes 1000 times as long as one thousand loops. A hundred million takes 100 times as long as one million. That's the nature of a loop.

        To add more mods for you to try, to better understand what's going on with perl's linearity: If you want to be able to explore your data and zoom into different ranges, still keeping the linear axis, change to

        my $ubound = 1e7; # set this to change the upper bound for graph +ing; [choroba]'s would be 1e8. print {$gp} join "\n", 'set term png;', 'set output "measure.png";', 'set key left;', # 'set logscale x;', "set xrange [10:$ubound]", ;
        where you can now edit the $ubound value to change where you zoom in to.

        Also, a couple lines down from the block above, change with lines to with linespoints to get it to draw points as well as lines.

        Finally, let's change the main for loop to the following. This will add a perfectly linear set of data as well.

        for my $n (10, 100, 1e3, 1e4, 1e5, 1e6, 2.5e6, 5e6, 7.5e6, 1e7, 2.5e7, 5e7, 7.5e7, 1e8 ) { print STDERR "$n\n"; for my $lang (sort keys %run) { $time{$n}{$lang} = measure($run{$lang}, $n, prepare($lang, + $n)); } $time{$n}{linear} = 7e-8 * $n + 1e-6; # [pryrt] added th +is perfect mathematical line: y = mx + b # Anonymous Monk should use + 7e-7 instead of 7e-8 } $run{linear} = undef; # [pryrt] added so + that plot() below will include 'linear'

        For example, on my system, with $ubound = 1e7 and linespoints, see for a side-by-side of logscale-x and linear-x, and noticing that there is data below "1x106" (1e6) on the linear-x graph -- it's just hard to see because it's all overlapping. (I also included the complete data set to 1e8 on both logscale and linear-scale)

        When I do this, the perl graph looks linear. The C graph is mostly flat, implying that either the setup time for c is dwarfing its per-loop time, or the c compiler has optimized out the dummy s+=i;, since the result was never used.

        Since you keep saying there's a "1MB limit" (which, as I already said, is incorrect notation; we aren't dealing with bytes): look at my graphs. Where do you see this limit at 1e6 (1.0x106)? I see a reasonably-straight line. It's not perfectly straight, like y=mx+b, but very little in the real world is. And I picked my y=mx+b slope to match the data you've shown1, not to match my data, so there are different slopes between the my perl and the math-line. But as someone who looks at real world electronics measurements for a living, I would call that perl data "a straight line".

        1: oops, no, it's off by a factor of 10, sorry; I should have used a slope of 7e-7, not 7e-8, to get ~70sec at n=1e8, sorry.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1207704]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others musing on the Monastery: (4)
As of 2019-07-20 09:16 GMT
Find Nodes?
    Voting Booth?

    No recent polls found