PerlMonks  

is 'my' that much faster than 'local'?

by gregw (Beadle)
on Mar 26, 2001 at 21:13 UTC (id://67242)

gregw has asked for the wisdom of the Perl Monks concerning the following question:

I've heard that "my" variables are faster than "local" variables, but I lack the understanding of how things work to assess A) why they're faster, and B) to what degree they'd be faster. My basic question is: would converting all the "local" variables in my program to "my" variables yield any significant performance improvement?

Background: I've been converting a series of 20-150k scripts our company uses to work under use strict to accelerate them with mod_perl. But they use almost all global variables. We've figured out that we can automate 90% of the localizing process with the following TIMTOWTDI hack: create another script to scan through the source code, wrap the whole target script in a subroutine named wrapper which is called once, create declarations for all the formerly implicit global variables making them "local" variables with package name wrapper::variablename. While inelegant, this seemed to be the fastest "time-to-market" approach. But I'm wondering about performance tradeoffs. Given that we have a couple hashes, a dozen arrays and 50-100 scalar variables, I'm trying to figure out if we'd get more than a modest 5-25%-type performance payoff from me spending the time converting these scripts to fully localized my-based ones.

Also, are there other gotchas with this approach that leap out at you?

Replies are listed 'Best First'.
Re: is 'my' that much faster than 'local'?
by arturo (Vicar) on Mar 26, 2001 at 21:32 UTC

    local saves away the value of a global variable and substitutes a new one within the block in which it is called as well as for any code called from that block; my declares a new variable that is only visible within the block in which it is declared. my is faster because it doesn't have to save anything away. My understanding is that my is about 10% faster. Given that variable declaration is probably a small part of the overall program, you will see a whiz-bang speed improvement only if you're calling local a lot (like in loops).

    The 'gotchas' that lurk have to do with the fact that local acts on global variables, whereas my does not; so if your subroutines call other subroutines that read globals, you could very well get unexpected behavior. However, it's a very good idea to use my when that's what you mean; when you're passing info to your subroutines via globals, you're just asking for confusion. So I would recommend getting it to work under my for reasons other than pure performance.
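A minimal sketch of the difference described above. The package variable `$level` and the subs `report`, `with_local`, and `with_my` are hypothetical names chosen for illustration; they do not come from the thread.

```perl
use strict;
use warnings;

our $level = "global";            # hypothetical package variable

sub report { return $level }      # reads the package variable

sub with_local {
    local $level = "localized";   # old value saved on entry, restored on exit
    return report();              # the callee sees the temporary value
}

sub with_my {
    my $level = "lexical";        # a new variable, visible only in this block
    return report();              # the callee still sees the package variable
}

print with_local(), "\n";   # localized
print with_my(), "\n";      # global
print report(), "\n";       # global (local's change has been undone)
```

The second call is the gotcha: swapping local for my silently changes what any called subroutine sees.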

    For the longer version of all this, see Variable Scoping in Perl: the basics =)

    Philosophy can be made out of anything. Or less -- Jerry A. Fodor

      Great, thanks! ~10%, more if calling local repeatedly in loops is what I was looking for. We don't call local within any loops; all the locals are at the top, just inside the sub wrapper { } block which surrounds all the other code in the script (except for the call to &wrapper).

      I think I'm avoiding the problem you raise in the second paragraph by making *all* variables local, with a long set of "local $wrapper::varname;"s at the top of the wrapper subroutine. Thus subroutines pass variables by using local variables all defined in their parent (or grandparent) subroutine, &wrapper. So I don't think I have any globals that local would mess with. And the few modules we're using are pretty common so I presume they're well-behaved in regards to using globals.

        Some of the answers here seem to be missing the point - or maybe it is me missing the point? The transformation you describe on your code is, as I understand it,

        sub Main {
            local ($list_of_vars);
            [old code, which thinks the above locals are globals]
        }

        Yes? In which case, the 'global' variables (global to the old code, but not _global_ :) are only instantiated _once_, throughout the lifetime of the program. So, unless the main amount of work you do is defining a load of variables (highly unlikely), you won't gain anything switching local to my.
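As a rough sketch of the transformation being discussed, assuming it produces something like the following: the variables `$wrapper::total` and `@wrapper::items` and the sub `add_items` are invented for illustration, and the real generated code may look different.

```perl
use strict;
use warnings;

sub wrapper {
    # These declarations run once per call to wrapper(), not once per use.
    local $wrapper::total = 0;
    local @wrapper::items = (1, 2, 3);

    add_items();                  # old code that used to rely on implicit globals
    return $wrapper::total;
}

sub add_items {
    # Fully qualified names are accepted under strict without a declaration.
    $wrapper::total += $_ for @wrapper::items;
}

print wrapper(), "\n";   # 6
```

Because the local statements execute once at the top of the wrapper, their save-and-restore cost is paid once per run, which is why it is unlikely to show up in overall timings.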

        To be honest, if you want faster scripts with minimal optimization (i.e., no algorithm analysis, etc.) just run it on a faster machine. A lot of people will baulk at that, but look at the costs involved and make the decision.


        Cheers.

Re: is 'my' that much faster than 'local'?
by tadman (Prior) on Mar 26, 2001 at 21:24 UTC
    It would be difficult to Benchmark a so-called typical script, since the way the variables are used, how many there are, and what kind they are would likely affect performance, gains or otherwise.

    The "wrapper" approach seems interesting, but might be missing out on some gains.

    Why not this? A 'local' finder program that generates a list of local declarations using a loose regexp. You will have to prune this because some things can't be declared using my, such as '$_' and filehandles, among others. Parsing Perl perfectly (PPP?) is difficult, but this is a pretty specific task, and any failures in parsing should be easy to detect and correct. Perhaps, as a start (quick hack):
    $_ = join ('', <STDIN>);
    s/\blocal(\s*\(?([\$\@\%][a-zA-Z0-9_]*\s*,?\s*)+\)?\s*[;=])/my$1/g;
    print;
    Obviously a more sophisticated example would try and figure out what is being declared, and avoid some traps such as 'my ($_)'. I have tried to avoid 'my-ifying' strings which contain the word 'local' by checking for a '=' or ';' somewhere in there, as well as things that look like variables ($x, @x, %x).
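To see what the substitution does and does not catch, here is a small sketch that applies the same regexp to a few sample statements. The helper name `myify` and the sample inputs are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution from the post.
sub myify {
    my ($source) = @_;
    $source =~ s/\blocal(\s*\(?([\$\@\%][a-zA-Z0-9_]*\s*,?\s*)+\)?\s*[;=])/my$1/g;
    return $source;
}

print myify('local $x;'), "\n";              # my $x;
print myify('local ($a, $b) = @_;'), "\n";   # my ($a, $b) = @_;
print myify('print "local news";'), "\n";    # unchanged: no sigil follows "local"
print myify('local $_;'), "\n";              # my $_;  (the trap mentioned above)
```

The last case shows why the output still needs a manual pass or a diff review before it is trusted.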

    The 'de-localizer' program, for lack of a better name, would output a 'my-ified' version of any source you feed it. You could diff the two, and edit the diff if any errors crop up, or alternatively, write a second utility to analyse the Perl warnings and automatically switch back to local for those particular lines.

    Just my $2e-2 worth.

    Of course, this program could completely bust your logic because of variable scoping issues that arise from the subtle differences between local() and my(). Be sure to test thoroughly.
Re: is 'my' that much faster than 'local'?
by MeowChow (Vicar) on Mar 26, 2001 at 22:31 UTC
    Realistically, it's quite unlikely that switching all your global variables to lexicals will significantly affect your script's performance, unless the script is CPU-bound rather than IO-bound.

    Your "hack" puzzles me, because you are describing a conversion of implicit local variables to explicit package-specified global variables, not local to lexical. Regardless, there are serious problems with any automated approach, and I would never attempt such a thing.

    Perl is virtually impossible to parse, and even a "simple" substitution can lead to nasty, unexpected results. If your substitution actually succeeds, you would immediately break any code that accesses the symbol table directly, uses soft (symbolic) references, or relies on the stack behaviour of "local" (though this last problem can, I suppose, be eliminated by not converting locals in subroutines). Moreover, these would be elusive run-time bugs, rather than compile-time errors (even under strict), as values would quietly be set to undef, and errors/warnings are reported from seemingly unrelated locations.
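A minimal sketch of the symbolic-reference problem, assuming hypothetical variables `$color` and `$shade` and a hypothetical `lookup` helper. Code that finds variables by name only sees package variables, so a variable converted to my quietly becomes undef from its point of view.

```perl
use strict;
use warnings;

our $color = "red";    # package variable: has a symbol-table entry
my  $shade = "dark";   # lexical: not reachable through the symbol table

sub lookup {
    my ($name) = @_;
    no strict 'refs';            # symbolic references are refused under strict refs
    return ${"main::$name"};     # looks the name up in package main at run time
}

print defined(lookup("color")) ? "color found\n" : "color is undef\n";   # color found
print defined(lookup("shade")) ? "shade found\n" : "shade is undef\n";   # shade is undef
```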

    Additionally, turning all your subroutines into inner named subroutines may not necessarily be a source of bugs, but could easily lead to confusion. You also mention that you plan to run this under mod_perl, which means that you would have doubly nested named subroutines, and I don't even want to begin pondering the implications of that.
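One concrete way nested named subroutines can confuse, offered as an illustration rather than as what the poster necessarily had in mind: a named inner sub binds to the outer sub's lexical only once. This affects my variables declared in the outer sub, not local package variables. The names `outer` and `inner` are hypothetical.

```perl
use strict;
use warnings;
no warnings 'closure';   # silences 'Variable "$value" will not stay shared'

sub outer {
    my ($value) = @_;
    sub inner { return $value }   # named sub: keeps the $value of the first call
    return inner();
}

my $first  = outer("one");
my $second = outer("two");

print "$first\n";    # one
print "$second\n";   # one (inner() never sees the second $value)
```

With warnings left fully enabled, Perl reports this at compile time, which is a good reason to keep them on while converting.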

       MeowChow                                   
                   s aamecha.s a..a\u$&owag.print
'my' IS much faster than 'local' !
by arhuman (Vicar) on Mar 26, 2001 at 22:07 UTC
    my is definitely faster than local,
    according to Advanced Perl Programming and the Camel Book; the real gurus will explain why in detail...
    (tilly, merlyn, anyone else ???)

    But in short (I don't master the topic yet ;-) 'my variables' uses 'scratchpad' (a special table assigned to a scope),
    rather than the usual typeglobs table (and they don't use the assign/reassign value mechanism used by local).

    The gain is double :
    You access your variable directly through the scratchpad ($a)->(the address of $a)
    rather than via the typeglob table (a)->(the typeglob table)->(the address of $a in this typeglob table)
    You avoid saving/restoring the previous value of the local variable when you enter/leave the scope.

    "Trying to be a SMART lamer" (thanx to Merlyn ;-)
Re: is 'my' that much faster than 'local'?
by Dominus (Parson) on Mar 26, 2001 at 22:33 UTC
    Who cares which one is faster? They don't do the same thing, so it's like asking if you should use a screwdriver instead of a drill because it's lighter. That's an irrational reason for using a screwdriver. You use a screwdriver when you want to drive screws, not because it's light.

      Who cares which one is faster ?
      Larry Wall, Merlyn, and Tom Christiansen, who wrote it in the Camel book...
      gregw, who asked the question, and all those who answered...

      They don't do the same thing
      But they often can achieve the same result.
      And usually people use them without even noticing the difference...

      But you may be right if you say that we should underline the difference rather than focus on the speed.
      Anyway knowing ALL the aspects of my and local (and speed is one of them) can only be a good thing IMHO.

      UPDATE :
      This was NOT supposed to be an attack toward Dominus...
      But just a rather (goofed) attempt to say : This question (about the speed difference) is interesting even if it's not the most important one (the often ignored difference is more important to me)
      The middle of my post, which seems to be the controversial point,
      Is not saying my == local ! I just give my idea on why they are usually associated...

      I JUST put in bold the important sentence in my post, no other modification was made...
        zenmaster wrote:
        But they often can achieve the same result.
        And usually people use them without even noticing the difference...
        Well, let's look at a few other things that people use "without even noticing the difference":
        • chop and chomp.

          They're pretty darned close, until you discover the last line of the file you read from doesn't have a newline.

        • &somesub and somesub().

          Usually you won't notice a difference, but the first one makes the sub call and passes the current value of @_ as the arguments. Hmm... if you make such a sub call from within another sub and are expecting no arguments to be passed...

        • $somearray[0] and @somearray[0].

          This one bites quite a few programmers. It usually doesn't cause a problem, but guess what happens when you do something like @vals[3] = somesub() and &somesub uses wantarray? Then you've got trouble and it can be a nasty bug to find.

        • my and local.

          my creates a lexical variable that is private to the enclosing block or file in which it is declared (note that this is NOT a package variable). local temporarily gives an existing package variable a new value, which is restored when the enclosing block exits.
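The first three pitfalls in the list above can be reproduced with a short sketch. All sub and variable names here are made up for the example.

```perl
use strict;
use warnings;

# chop removes the last character no matter what; chomp removes only a trailing newline.
my $chopped = "last line";   # no trailing newline, like the final line of some files
my $chomped = "last line";
chop $chopped;               # now "last lin"
chomp $chomped;              # still "last line"

# &name without parentheses passes the caller's current @_ along.
sub count_args { return scalar @_ }
sub caller_sub {
    my $implicit = &count_args;    # sees caller_sub's three arguments
    my $explicit = count_args();   # sees an empty argument list
    return ($implicit, $explicit);
}
my ($implicit, $explicit) = caller_sub('a', 'b', 'c');

# Assigning to a one-element slice calls the right-hand side in list context.
sub context { return wantarray ? 'list' : 'scalar' }
my @vals;
$vals[0] = context();        # scalar assignment
{
    no warnings 'syntax';    # silences "better written as $vals[1]"
    @vals[1] = context();    # list assignment
}

print "$chopped | $chomped\n";     # last lin | last line
print "$implicit $explicit\n";     # 3 0
print "@vals\n";                   # scalar list
```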

        Run the following code:

        use strict;
        use warnings;

        print test();

        sub test {
            local $x;
            $x = 17;
        }
        Guess what? That generates an error similar to the following:
        Global symbol "$x" requires explicit package name at C:\WINNT\Profiles\Ovid\Desktop\test.pl line 7
        That's because local does not create variables. Replace the local with a my and it works fine.
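If local is what you actually want, one way to keep strict happy is to declare the package variable first. This is a sketch using `our` (available since Perl 5.6); the sub name `broken` and the variable `$y` are hypothetical.

```perl
use strict;
use warnings;

our $x = 1;               # declares the package variable for strict

sub test {
    local $x = 17;        # fine now: $x is a known package variable
    return $x;
}

print test(), "\n";   # 17
print $x, "\n";       # 1 (restored when test() returned)

# The undeclared form still fails to compile under strict.
my $ok    = eval q{ sub broken { local $y; $y = 17 } 1 };
my $error = $@;
print "compile error: $error" unless $ok;
```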

        For more information, see Coping with Scoping and Seven Useful Uses of local. Both, oddly enough, were written by Dominus, the monk you chose to take to task.

        Cheers,
        Ovid

        Update: After chatting with zenmaster and reading his update, I see that this mostly appears to be a misunderstanding of what he meant to say, but I'll leave the post as it's useful information.

        Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

        But they often can achieve the same result. And usually people use them without even noticing the difference... And there is a difference... a quick example:
        use Time::HiRes qw(gettimeofday);
        $start_time = gettimeofday();
        $global_variable = "hello!";
        for ($i = 0; $i < 10000000; $i++) {
            # replace this with "my"
            local $global_variable = "redefined. . ";
        }
        print gettimeofday - $start_time." seconds\n";
        print "$global_variable\n";
        outputs
        31.4016380310059 seconds
        hello!
        and
        14.1771370172501 seconds
        hello!
        when using "my" instead.
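The same comparison can be sketched with the core Benchmark module, which subtracts loop overhead and prints relative rates. The iteration count and the labels are arbitrary choices for this example, and the numbers will vary with the machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

our $global_variable = "hello!";

# Run each snippet a fixed number of times; 'none' suppresses timethese's own report.
my $results = timethese(1_000_000, {
    'local' => sub { local $global_variable = "redefined"; return },
    'my'    => sub { my $lexical = "redefined"; return },
}, 'none');

cmpthese($results);   # prints a comparison table of the two rates
```

Both versions measure a declaration executed ten million or a million times in a tight loop, which is the worst case for local and not representative of a script that declares its variables once.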
Re (tilly) 1: is 'my' that much faster than 'local'?
by tilly (Archbishop) on Mar 27, 2001 at 07:02 UTC
    Your scripts are 20-150k and almost all globals?

    That must be a maintenance nightmare!

    I would recommend (over time of course) reorganizing, using strict, and using tightly scoped lexicals. The reduction in debugging effort alone will make this pay back many times over...

      Agree.

      You want performance improvements? Analyse the box to see what the load problem is (cpu/memory/disk/network).

      If you are doing lots of Perl CGI, is it because you do a lot of processing per request, or is it a high number of requests with little processing (so the CGI fork is killing you)?

      In the former case, you have some CGIs which do a lot of processing. Benchmark those and try and improve the hotspots (think about algorithm changes, moving stuff out of loops, changing your data structures around).

      In the latter case you need to get away from CGI. So bite the bullet and go for a maintainable re-implementation (with all that tilly mentioned) of what you have now (whilst possibly adding a little more grunt to your server(s) in the short term to cut you some slack until the mod_perl version is ready).

      You *really* don't want to take your existing code, run it through a local->my mangler and *then* port it to mod_perl. Or maybe you do, but I wouldn't want to.

Re: is 'my' that much faster than 'local'?
by darobin (Monk) on Mar 27, 2001 at 01:53 UTC

    A recent thread on the mod_perl mailing list covered precisely this topic (you might want to explore one of the archives pointed to from http://perl.apache.org/). The conclusion was roughly: my() doesn't make that much difference performance-wise. It can even be slightly slower in some cases (though benchmarking it truthfully is a pain).

    However, do keep in mind that under mod_perl you want to avoid globals as much as possible. They'll be your plague, your worst nightmare. When you use globals, you have to be sure that they are all reset to their initial values at the end of your processing (unless you want to keep cross-request data, but that's the exception). Otherwise, you start seeing really weird stuff happening.
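The effect can be simulated without mod_perl by calling a handler several times in one process, which is roughly what a persistent interpreter does between requests. This is only a sketch: the handler names and the `@seen_global` array are hypothetical, and no mod_perl API is used.

```perl
use strict;
use warnings;

our @seen_global;                 # survives between calls, like a global under mod_perl

sub handle_with_global {
    my ($item) = @_;
    push @seen_global, $item;     # never reset, so earlier "requests" leak in
    return scalar @seen_global;
}

sub handle_with_lexical {
    my ($item) = @_;
    my @seen = ($item);           # fresh on every call
    return scalar @seen;
}

my @global_counts  = map { handle_with_global($_) }  qw(a b c);
my @lexical_counts = map { handle_with_lexical($_) } qw(a b c);

print "@global_counts\n";    # 1 2 3
print "@lexical_counts\n";   # 1 1 1
```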

    There's a lot in the Guide about this. You probably want to read that first.

    -- darobin
