Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number

Why doesn't perl optimize this?

by nbtrap (Sexton)
on Jun 03, 2013 at 23:44 UTC ( #1036853=perlquestion: print w/replies, xml ) Need Help??
nbtrap has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

When I run the following code:

#!/usr/bin/perl use strict; use feature 'say'; sub foo { my @a = 'aaaaa' .. 'zzzzz'; say time; return [ @a ]; } say time; my $a = foo(); say time;

I get this for output:

1370302025 1370302028 1370302032

Whereas, when I run this code:

#!/usr/bin/perl use strict; use feature 'say'; sub foo { my @a = 'aaaaa' .. 'zzzzz'; say time; return \@a; } say time; my $a = foo(); say time;

I get this output:

1370302601 1370302604 1370302604

In other words, in the former case, two huge arrays are created, but in the latter case, only one huge array is created. My question is: Why doesn't perl optimize the former code to only produce one huge array? It must know that the reference count on @a goes to zero at the same time it's copied. Can't it optimize the former's return statement to work just like the latter's? Is it really as simple as it seems, or am I overlooking some complicating factor(s)?

Replies are listed 'Best First'.
Re: Why doesn't perl optimize this?
by BrowserUk (Pope) on Jun 04, 2013 at 02:20 UTC
    Why doesn't perl optimize the former code to only produce one huge array?

    Because no one has programmed it to do so. And that in major part is because there is no call for it.

    The number of occasions when such an optimisation would be triggered is so few. Very few people would construct an anonymous array, from a list, from an existing array, in order to return a reference.

    Conversely, the cost of checking for those extremely rare opportunities would have to be paid for every subroutine that returned an array reference.

    Its a high-cost, low-benefit complication to the core that no one has seen fit to write.

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Why doesn't perl optimize this?
by ikegami (Pope) on Jun 04, 2013 at 04:35 UTC
    Note that It can't safely be optimised that way at compile-time. Consider
    sub f { my @a; tie @a, 'SomeClass' if $_[0]; g(\@a); return [ @a ]; }

    If @a or one of its elements is magical, if an element of @a overloads =, or if the REFCNT of @a or one of its elements is greater than one, then the optimisation can't be applied.

    So if f's first argument is true, then the optimisation can't be applied. If g saved the reference to @a, then the optimisation can't be applied. etc.

    We're talking about a massive amount of resources to avoid a rare annoyance. Make a perlcritic rule instead.

Re: Why doesn't perl optimize this?
by rjt (Deacon) on Jun 04, 2013 at 01:55 UTC

    Perl doesn't know whether you are going to use @a again in your first example. It's true that you don't use it, but the reference count doesn't actually go to zero until the end of the block, which comes after the [ @a ] construct.

    But that is really beside the point: square brackets convert a list into an array ref. \@a, on the other hand, creates a reference to an array. Lists are not arrays.


      Maybe clearer: [@a] creates the ref to a copy (and copying is expensive) while \@a is the ref to the original array (no costs!).

      Cheers Rolf

      ( addicted to the Perl Programming Language)

        I know the difference. My question was not about the difference, it was about why the difference is not optimized away in cases like the one I cited.
      Yes, perl doesn't know, else it wouldn't copy the array. But my question is _why_ doesn't it know? Unless I'm mistaken, there's enough information in the code to know that @a will not be used again--even at compile time. This seems like a trivial optimization to me, but as I say, perhaps I'm overlooking something.


        All tasks are simple ... if you're not the one doing the work. Sure, there's enough information for perl to do the optimization you suggest. In fact, there may be enough information for it to simplify the code to:

        say time; say time; say time;

        However, code optimization is harder than it looks[1]. Notice that both say and time are calls to other code, which could possibly change @a. Not in this case, obviously[3][5][6], but for perl to know that a priori, it would have to track whether or not @a was possibly aliased to another glob. Using static analysis, that could be determined, except that perl has features that may make static analysis intractable or impossible.

        Disclaimer: I know nothing substantial about the internals of perl. I wrote a compiler some years ago, and spent a little (very little) time trying to put in some optimization. It was shortly after reading the Dragon book[2] . When you read the book and start imagining the data structures and algorithms you need to build it gives you a good appreciation of the problem.

        It's easy to go down rabbit-holes, too. How much time should it work on optimizing the code? If there are easy optimizations with big payoffs, then it's a no brainer. But after a while, you'll start running into diminishing returns. If you spend too much time on the optimization phase, then you can lose more time than you save.


        1. If it were easy, the code we write wouldn't need optimization, would it?
        2. The first edition, as the second didn't exist yet. It's a fascinating read, you should pick it up if you're interested in language construction and/or design!
        3. To us humans[4], anyway.
        4. Yes, the phrasing might be odd, considering my moniker.
        5. I don't really feel like renumbering these notes.
        6. I was wondering whether superscripts would stack, now I know.
        7. With silly notes...


        When you don't get enough sleep for an extended period of time, your posts can come off as a bunch of stream-of-consciousness blather[7].

        What if @a is tied? What if it has other magic attached? You and I both know it almost never does, but the cost of detecting those cases is substantial.

        Personally, I do prefer that it does not know. Any attempt to make a program smarter than the user, turns this program into monster prone to errors and hard to control. There must be a balance between "smartness" and "complexity". After all we, people, also should use our brains :)

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1036853]
Approved by davido
Front-paged by ww
erix stops reading as recommended by the blog itself
[erix]: my attention being jealously guarded ;)
[shmem]: erix, yes, it's a WTF. Since...
[erix]: antisemitic && (pro-)russian. Makes sense :P
[shmem]: erix: what's antisemitic in there? Tell me.
[erix]: AngloZionist
[shmem]: and it is *not* pro-russian. It describes the western world.
[erix]: I'm not going to read it so I could be mistaken but it reeks of useful idiocy
[shmem]: AngloZionist is not an antisemitic term.
[shmem]: erix: be assured that we all are horribly mistaken, always. What would we write tests for, if not for that fact?

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (6)
As of 2018-03-19 20:55 GMT
Find Nodes?
    Voting Booth?
    When I think of a mole I think of:

    Results (246 votes). Check out past polls.