http://www.perlmonks.org?node_id=175261

moof1138 has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I have a script in which I have a few arrays of about 100 strings, each string being about 200 characters long, give or take. When I wrote this thing I declared them in the main scope, even though some arrays are only really used by one function, assuming that it was better loading them once right at the beginning, since otherwise they would be pushed into memory anew each time the function was called, slowing things down a little. Now I wonder if this might actually not be the case. Is it better to put the arrays in the scope of the function that is using them, leave them scoped as they are now, or does it matter at all, and why? Any thoughts appreciated.

Re: scoping large arrays - newbie Q
by perrin (Chancellor) on Jun 18, 2002 at 05:36 UTC
    Why not benchmark it and see? Use the Benchmark module. One thing you may not know is that the memory used by lexical variables is not freed when they go out of scope. Perl keeps it allocated in case you use the same lexical again.
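
A rough sketch of what such a benchmark could look like, using cmpthese from the core Benchmark module. The data is a stand-in (100 generated strings of 200 characters, matching the sizes in the question), and the sub names use_outer and use_inner are made up for illustration. Numbers will vary by machine and perl version, so treat the output as a hint, not a verdict.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Stand-in data: 100 strings of 200 characters each (assumed sizes from the question).
my @outer = map { 'x' x 200 } 1 .. 100;

# Point A: the sub reads an array that was built once, outside the sub.
sub use_outer {
    my $total = 0;
    $total += length for @outer;
    return $total;
}

# Point B: the sub rebuilds its own array on every call.
sub use_inner {
    my @inner = map { 'x' x 200 } 1 .. 100;
    my $total = 0;
    $total += length for @inner;
    return $total;
}

# A negative count means "run each variant for at least that many CPU seconds".
cmpthese(-1, {
    point_a => \&use_outer,
    point_b => \&use_inner,
});
```

cmpthese prints a table of rates and relative percentages for the two variants. If the gap is small next to everything else the script does, cleaner scoping costs you nothing that matters.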
      That was news to me as well - perrin++

      Makeshifts last the longest.

Re: scoping large arrays - newbie Q
by samtregar (Abbot) on Jun 18, 2002 at 04:11 UTC
    It's a little hard to know exactly what you're talking about since you didn't include any code. But my guess is you need to learn about references. With references you can create an array in one function and pass it to another without paying the penalty to recreate it. This is known as "pass-by-reference" to comp-sci geeks. Here's an example:

    sub foo {
        my @array = ( 0 .. 100 );    # create a new array
        bar(\@array);                # pass it to bar() by reference
    }

    sub bar {
        my $array_ref = shift;       # get reference to foo()'s array
        foreach (@$array_ref) {      # print out each value
            print;
        }
    }

    If this is your first encounter with references then you've got some learning to do. I suggest you pick up a copy of Learning Perl or Programming Perl and dig in!

    -sam

      Thank you for the response. I do know about using references; I have used them here and there to pass something to a function that was not in that function's scope. That's not really what I am looking for here. I was not clear enough. I just really wonder whether it is more efficient to set up my array at point A or at point B in the pseudocode below, which is trying to illustrate my question:
      #!/usr/bin/perl -w
      use strict;

      my @bigarray = ("insert", "a very", "long list here");    # point A

      my $thing1 = mySub();
      my $thing2 = mySub();
      # ...etc - using mySub repeatedly

      sub mySub {
          # point B - should I declare my @bigarray here instead, and why?
          for (@bigarray) {
              # do stuff with array
          }
      }
        With the example you have given, I say point A, because if you declare your array at point B, you will rebuild it every time you call mySub(). Also, consider passing @bigarray to mySub() as a reference. Just be sure to define mySub() before declaring @bigarray; otherwise @bigarray is still directly visible inside mySub():
        use strict;

        sub mySub {
            my $ref = shift;
            for (@$ref) {
                # do stuff with array
            }
        }

        my @bigarray = ("insert", "a very", "long list here");

        my $thing1 = mySub(\@bigarray);
        my $thing2 = mySub(\@bigarray);
        If the array in question is only pertinent to the subroutine, and either that sub will only be called once or the array will change with each sub call, then declare the array inside the subroutine.

        jeffa

        L-LL-L--L-LL-L--L-LL-L--
        -R--R-RR-R--R-RR-R--R-RR
        B--B--B--B--B--B--B--B--
        H---H---H---H---H---H---
        (the triplet paradiddle with high-hat)
        
        Can I choose point C? Here's an alternative:

        {
            my @bigarray = ("insert", "a very", "long list here");    # c

            sub mySub {
                for (@bigarray) {
                    # do stuff with array
                }
            }
        }

        This keeps @bigarray private to the subroutine but only initializes it once. You get the best of both worlds at no added cost!
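
A minimal sketch to check the "only initializes it once" claim. The counter $builds and the sub name count_items are made up for illustration and are not from the thread.

```perl
use strict;
use warnings;

my $builds = 0;    # hypothetical counter: how many times the list gets built

{
    # Visible only inside this block, so only count_items() can reach it.
    my @bigarray = ("insert", "a very", "long list here");
    $builds++;

    sub count_items {
        return scalar @bigarray;
    }
}

my @results = map { count_items() } 1 .. 3;
print "results: @results, builds: $builds\n";
# prints "results: 3 3 3, builds: 1"
```

One caveat: a bare block runs at run time, in file order, so it has to execute before the first call to the sub. If the sub can be called from code that sits above the block, turning it into a BEGIN block is the usual fix.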

        -sam

Re: scoping large arrays - newbie Q
by Aristotle (Chancellor) on Jun 18, 2002 at 04:39 UTC

    Generally, from what I've seen so far Perl is rather good at managing memory efficiently and quickly.

    Having the array persistent may speed things up if the function is being called tens of thousands of times, but it is also an invitation to hogging memory, not to mention that global variables generally tend to lead to headaches. Creating a lexically scoped array anew slows things down a wee notch, but keeps the code clean and the memory footprint lean.

    We're only talking 20,000 bytes of data here, that's not something I'd consider much of an array.

    Two options you have if you really need the performance are passing around references instead, to keep things properly scoped without the overhead of array allocation; or using a closure like so:

    {
        my @not_global;

        sub operate_on_not_global {
            @not_global = whatever();
        }
    }
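
As written, the snippet above reassigns the array on every call. One possible variation, offered as a sketch and not as what the poster had in mind, fills the array on first use only. Here load_strings is a hypothetical stand-in for whatever(), and $loads is a counter added purely to show how often the data gets built.

```perl
use strict;
use warnings;

my $loads = 0;    # hypothetical counter, only here to show how often the data is built

# Hypothetical stand-in for whatever() in the snippet above.
sub load_strings {
    $loads++;
    return map { "string $_" } 1 .. 100;
}

{
    my @not_global;    # persists between calls, invisible outside this block

    sub operate_on_not_global {
        # Fill the array on the first call only; later calls reuse it.
        @not_global = load_strings() unless @not_global;
        return scalar @not_global;
    }
}

operate_on_not_global() for 1 .. 5;
print "loads: $loads\n";
# prints "loads: 1"
```

The `unless @not_global` test assumes an empty list is never a valid result. If it can be, a separate flag variable inside the block would be the safer guard.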

    Bottom line: personally, I always take the defensive approach - I can always add dirty tricks later if my clean code is not fast enough, but if I start out dirty at the scratchpad, I'll never get a handle on things and end up with a big ball of mud.

    Makeshifts last the longest.

      Thank you Screamer, that confirms what I had been getting the nagging feeling about. I had been leery of the globally scoped vars, since I would not consider using them in other languages. While I was wondering about the question in general, after I thought about it for a minute, I realized that performance is not really an issue for this project - it is just an AIM bot, and it winds up needing to sleep at times to keep it from hitting the AIM flood control limits anyway. I just reworked it, scoping all arrays to the functions that use them, and I don't feel any performance difference, really.