
Pre vs Post Incrementing variables

by SavannahLion (Pilgrim)
on Sep 12, 2010 at 07:22 UTC ( [id://859811]=perlquestion )

SavannahLion has asked for the wisdom of the Perl Monks concerning the following question:

Can someone help me understand why Perl does this with a Pre-Increment:

    my $i = 0;
    my ($j, $k) = (++$i, ++$i);   # Pre-increment
    print "\$j: $j\t\$k: $k\n";
which outputs:
$j: 2 $k: 2
When what I actually expected was:
$j: 1 $k: 2

Whereas Perl outputs the following for this code:

    my $i = 0;
    my ($j, $k) = ($i++, $i++);   # Post-increment
    print "\$j: $j\t\$k: $k\n";
Which, as expected, outputs:
$j: 0 $k: 1

According to the Camel:

The ++ and -- operators work as in C. That is, when placed before a variable, they increment or decrement the variable before returning the value, and when placed after, they increment or decrement the variable after returning the value.
To my surprise, a similar test program in a C++ IDE netted the exact same result. Wow...OK, fair enough.

I can accept this; it just means I'll have to expand my code by a few extra lines to accommodate it. I would like to know the why of it, though. Why do both ++$i expressions get evaluated before being fed into print, whereas $i++ is evaluated in sequence? What's the logic behind that?

Replies are listed 'Best First'.
Re: Pre vs Post Incrementing variables
by Erez (Priest) on Sep 12, 2010 at 08:14 UTC

    I think that if you read on in the description of the auto-increment operators, or check perlop, you'll notice the following warning:

    Note that just as in C, Perl doesn't define when the variable is incremented or decremented. 
    You just know it will be done sometime before or after the value is returned. 
    This also means that modifying a variable twice in the same statement will lead to undefined behaviour. 
    Avoid statements like:
        $i = $i ++;
        print ++ $i + $i ++;
    Perl will not guarantee what the result of the above statements is.


    "Principle of Least Astonishment: Any language that doesn’t occasionally surprise the novice will pay for it by continually surprising the expert."

      The main reason it's declared as "undefined" is that the current implementation surprises many people, and can only be justified by exposing the implementation. Out of the two evils "having undefined behaviour" and "exposing the implementation to the language", I think the first one is the lesser.

      If you really need to increment $i twice, just write it in more than one statement:

      print $i + 1, $i + 2; $i += 2;
      Or:
      print do {$i += 1}, do {$i += 1};
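      A minimal sketch of the one-increment-per-statement form, assuming the goal is the (1, 2) result the original question expected. The variable names simply mirror the question, and nothing here depends on unspecified behaviour:

```perl
use strict;
use warnings;

# Starting value chosen to mirror the question.
my $i = 0;

# One increment per statement, so every read of $i sees a settled value.
my $j = ++$i;   # $i becomes 1; scalar assignment copies the value into $j
my $k = ++$i;   # $i becomes 2; scalar assignment copies the value into $k

print "\$j: $j\t\$k: $k\n";   # $j is 1 and $k is 2
```

      It costs an extra line, but the result no longer hinges on how a particular perl happens to build the list.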
Re: Pre vs Post Incrementing variables
by moritz (Cardinal) on Sep 12, 2010 at 08:11 UTC
    What's the logic behind that?

    The logic is that if you don't define strict rules for order of execution within statements, you have some potential for optimizations.

    This was particularly important for C, where assembler level statement re-ordering was (and maybe still is) an important optimization technique. I don't know how much perl 5 benefits from this freedom, if at all.

    Curiously we had a similar discussion in #perl6 the other day, and it turns out the Perl 6 specification defines sequence point operators within statements which synchronize evaluation. That means that on the right-hand side of such an operator you can rely on changes to variables made on the left-hand side.

    But in general it's much safer to use a variable only once in a statement where it's modified.

    Perl 6 - links to (nearly) everything that is Perl 6.
Re: Pre vs Post Incrementing variables
by BrowserUk (Patriarch) on Sep 12, 2010 at 09:42 UTC

    Because long, long ago, in a land far away, the designers of a language wrote in their specification:

    The precedence and associativity of all the expression operators is summarised in section 18. Otherwise, the order of evaluations of expressions is undefined. In particular the compiler considers itself(*) free to compute sub-expressions in the order it believes(*) most efficient, even if the sub-expressions involve side effects. The order in which side effects take place is unspecified. Expressions involving a commutative and associative operator (*, +, &, |, ^) may be rearranged arbitrarily, even in the presence of parentheses; to force a particular order of evaluation, an explicit temporary must be used.

    (*) Who knew they had sentient software way back then :)

    Essentially, the designers of the C language traded source code ambiguity, programmer intuition, and program correctness, for performance.

    Instead of allowing the programmer to specify exactly the order in which sub-expressions would be executed, through the intuitive use of parentheses and precedence rather than the clumsy hack of creating unnecessary temporary variables, they opted to allow compiler writers to re-order those sub-expressions "for efficiency".

    Perhaps the most short-sighted and pervasive premature optimisation ever.

    In the days when CPU clock speeds were measured in low megahertz and single opcodes could take dozens of those clock cycles to perform, in a language that was explicitly designed to be compiled directly to machine code, such optimisations could have a significant impact on program performance. So programmers would tolerate such inconveniences for the gains that could be achieved.

    Now, it is arguable that the trade-off no longer makes sense. With processors whose speeds are measured in gigahertz executing multiple instructions per clock cycle, and in a language where even the simplest of sub-expressions takes dozens if not hundreds of opcodes, the scope for performance optimisation through the reordering of sub-expressions is essentially non-existent.

    But, for better or worse, Perl apes many of the rules laid down by the C language. In part I believe, because it made things easier for C programmers moving to Perl. So, we have the situation whereby the order of sub-expression evaluation is "unspecified", meaning that you cannot predict the result of using two or more side-effectful operations on a single variable within the same expression, on the basis of the language description alone.

    It also means that when ambiguities arise in the implementation, such as the one you describe, you cannot point a finger and call it a bug, because "the behaviour is unspecified". You'll simply be told: "Don't do that!".

    The fact that there is no logical reason, performance or otherwise, for having pre- and post-increment operate differently in this regard:

    $n=0; print ++$n, ++$n, ++$n, ++$n;;
    4 4 4 4
    $n=0; print $n++, $n++, $n++, $n++;;
    0 1 2 3

    is simply dismissed as user error. "Because you shouldn't be doing it anyway!".

    One day, someone will see through these hoary ol' chestnuts of programming lore, and define a language that allows the programmer to write source code that specifies exactly what he wants to happen, and know it will happen, without having to resort to inventing unnameable intermediary variables for no good reason.

    Till then, you'll just have to get used to not using two or more side-effectful operators, on the same variable, within the same statement. {shrug}


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      It's got nothing to do with operand evaluation order. It has to do with pre-increment returning lvalues.

      f(++$i, ++$i)
      is more or less equivalent to
      do {
          local @_;
          alias $_[0] = ++$i;
          alias $_[1] = ++$i;
          &f;
      }

      Since the pre-increment operator returns the variable itself and not a copy, the above is equivalent to

      do {
          local @_;
          ++$i; alias $_[0] = $i;
          ++$i; alias $_[1] = $i;
          &f;
      }

      As you see, f() sees the same value for both arguments as both arguments are the same variable.
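      A small sketch of the aliasing described above. It assumes a perl 5 that behaves as reported in this thread; perlop does not guarantee any of it. The helper overwrite_first is hypothetical: it assigns through $_[0], so whichever scalar the caller passed is the one that changes.

```perl
use strict;
use warnings;

# Hypothetical helper: writes through the first element of @_.
sub overwrite_first { $_[0] = 100; return }

my $pre = 0;
overwrite_first(++$pre);    # @_ holds $pre itself, so $pre is overwritten

my $post = 0;
overwrite_first($post++);   # @_ holds a temporary copy, so $post is untouched

print "pre: $pre, post: $post\n";   # observed on current perls: pre: 100, post: 1
```

      The write lands on $pre but not on $post, which is the difference between receiving the variable and receiving a copy of its value.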

        It's got nothing to do with operand evaluation order.

        Who said anything about operand evaluation order? (How could there be an operand evaluation order for a unary operator.)

        I'll come back to this.

        It has to do with pre-increment returning lvalues.
        $i = 0; ++$i = 'fred';;
        Can't modify preincrement (++) in scalar assignment

        So, not an lvalue.

        is more or less equivalent to ... alias

        Ah! That well-known Perl keyword 'alias'....

        It's got nothing to do with operand evaluation order.

        Hm. "It" has everything to do with the fact that the evaluation order of sub-expressions is unspecified.

        Where "it" is the contradictory and useless behaviour, observed by the OP, and repeated in my post.

        Because, if it were specified, then the implementation would not be able to get away with producing those totally illogical, useless results, for those unwise or unknowing enough to try and use what could be a useful behaviour.

        Your attempts to explain how the implementation produces these useless results from this unspecified and therefore deprecated code, does naught to detract from the reason why it has been possible to enshrine this broken behaviour in the implementation.

        The fact that f( ++$n, ++$n ) passes an alias to $n, rather than the value resulting from the preincrement, is just another broken behaviour. It is equivalent to C allowing:

        #include <stdio.h>

        void f( int *a, int *b ) {
            printf( "a: %d, b: %d \n", *a, *b );
            *a = 1;
            *b = 2;
        }

        int main( int argc, char ** argv ) {
            int x = 0;
            f( &( ++x ), &( ++x ) );
            return 0;
        }

        Which it doesn't:

        junk.c
        junk.c(11) : error C2102: '&' requires l-value
        junk.c(11) : error C2102: '&' requires l-value
        junk.c(11) : error C2198: 'f' : too few arguments for call

Re: Pre vs Post Incrementing variables
by repellent (Priest) on Sep 12, 2010 at 08:55 UTC
    The docs give no guarantees about the result of such code. Here's what it seems to be doing (at the time of writing):
    • ++$i increments $i and then returns an alias to $i
    • $i++ returns a copy of $i and then increments $i

    (++$i, ++$i) returns two aliases to $i, which has been incremented twice, resulting in (2, 2).

    ($i++, $i++) returns two copies of $i, one before and one after the first post-increment, resulting in (0, 1).
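    One way to check the alias-versus-copy description, sketched under the assumption that current perl 5 behaviour holds (it is observed, not documented). The helper same_scalar is hypothetical; it compares the addresses of the two @_ slots to see whether both arguments are the same scalar.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when both arguments are one and the same scalar,
# judged by comparing the addresses of the two @_ slots.
sub same_scalar { return \$_[0] == \$_[1] ? 1 : 0 }

my $i = 0;
my $pre_shared = same_scalar(++$i, ++$i);    # observed: 1, both slots alias $i

$i = 0;
my $post_shared = same_scalar($i++, $i++);   # observed: 0, two separate copies

print "pre shared: $pre_shared, post shared: $post_shared\n";
```

    If both pre-increments hand back $i itself, any later read of the list sees its final value, which matches the (2, 2) result.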

Re: Pre vs Post Incrementing variables
by cdarke (Prior) on Sep 12, 2010 at 11:14 UTC
    a similar test program in a C++ IDE netted the exact same result.

    In my experience, C and C++ compilers differ. Take the code:
    int x = 0;
    printf("%d %d %d\n",++x,++x,++x);
    A similar line in Perl (allowing for syntax differences) gives 3 3 3, because of optimisation.
    PHP gives 1 2 3, which probably denotes a lack of same.
    With MS Visual Studio 6 C++ you get 3 3 3 when compiled in Release (again, optimisation) but (wait for it) 3 2 1 in Debug (no optimisation).
    gcc (Windows and Linux) gives 1 2 3 regardless of the optimisation level.

    Upshot of that? Don't do it!

      Well... In the samples I gave above, the C++ IDE I used was MS Visual Studio 8 (or 2005, whatever it wants to call itself now) which gave 2 2 (or 3 3 3 in your sample) in the default Debug mode. So I guess virtually add it to the list there.

      It is, however, useful to know this about the compilers. I know the C family compilers have their own idiosyncrasies; I guess I'll put this on the pile of things to note.

        I have noticed differences in later .Net versions of Visual Studio. I did not have them to hand when I wrote the node, and didn't want to guess. It could be that there is optimisation even in Debug. Check the project settings?

        So far as I can tell though, the order of execution (in C) of parameters is not defined. The reason why VS 6.0 (Debug) executed them from right to left was because that is the order of the C calling convention, __cdecl.
Re: Pre vs Post Incrementing variables
by ikegami (Patriarch) on Sep 12, 2010 at 16:14 UTC

    Given my explanation of the problem, the workaround would be to convert the lvalues returned by the pre-increment operator into rvalues.

    >perl -E"say ++$i,++$i;"
    22
    >perl -E"say 0+(++$i),0+(++$i);"
    12

    Technically, the operand evaluation order for the comma operator in list context is not defined, but it's not likely to ever change, especially since the operand evaluation order for the comma operator in scalar context is defined.

      Technically, the operand evaluation order for the comma operator in list context is not defined,
      The perlop manual page says in the section about the comma operator:
      In list context, it’s just the list argument separator, and inserts both its arguments into the list. These arguments are also evaluated from left to right.
      This has been in perlop since April 2006.
Re: Pre vs Post Incrementing variables
by tomfahle (Priest) on Sep 12, 2010 at 08:32 UTC
      I don't see how this is a precedence problem, ++ has tighter precedence than the comma, both for pre- and post increment - no surprises here.

      Also adding parens around the (++$i) doesn't change anything.

Re: Pre vs Post Incrementing variables
by SavannahLion (Pilgrim) on Sep 12, 2010 at 14:29 UTC

    Many thanks for the answers. As much as it annoys me to split up a nice single line of code into several to accommodate something like this, it is something that I can readily deal with and lose little sleep over.

Re: Pre vs Post Incrementing variables
by DrHyde (Prior) on Sep 15, 2010 at 10:16 UTC

    In this:

    my $i = 0;
    my ($j, $k) = (++$i, ++$i);
    print "\$j: $j\t\$k: $k\n";
    $j: 2 $k: 2

    It first evaluates the ++es (because it's a pre-increment, so it happens before anything else), so $i = 2. Then, it builds a list of (2, 2), and finally assigns that list to ($j, $k).

    In this:

    my $i = 0;
    my ($j, $k) = ($i++, $i++);
    print "\$j: $j\t\$k: $k\n";
    $j: 0 $k: 1

    Because it's a post-increment, the ++es get evaluated *after* $i is "evaluated" to build the list. So the first $i is evaluated, it's 0, so 0 goes into the list. Then the ++ happens, setting $i = 1. Then the next $i is evaluated, putting 1 into the list, and finally $i is incremented again. End result, a list of (0, 1) is assigned to ($j, $k).

    But as others have pointed out, having multiple ++es and --es in a statement is Bad Juju. What I've explained is what happens in current perls, but I wouldn't trust it to work the same in the next point release of perl 5, let alone in perl 6. See perlop.

Re: Pre vs Post Incrementing variables
by girarde (Hermit) on Sep 14, 2010 at 12:44 UTC
    My guess: the entire right hand expression is evaluated before assignments are made. Reasonable except when you want something else, but in that case, what should the rule be instead?

      My guess: the entire right hand expression is evaluated before assignments are made.

      Of course it does, but it doesn't explain anything. The question can be phrased as "Why is the output of the first snippet different from the output of the following snippets in the following code?"

      $ perl -E'$i=0; ($j, $k) = (++$i, ++$i); say "j:$j k:$k";'
      j:2 k:2
      $ perl -E'$i=1; ($j, $k) = ($i++, $i++); say "j:$j k:$k";'
      j:1 k:2
      $ perl -E'$i=0; ($j, $k) = (++$i+0, ++$i+0); say "j:$j k:$k";'
      j:1 k:2

      The answer is here.

      My guess: the entire right hand expression is evaluated before assignments are made.
      Good guess. Doesn't really answer the question though, as any assignment will evaluate the RHS before doing the assignment. (If it didn't, what would it assign? Purple monkeys?)

        Pink. Purple is on Thursdays.

        --MidLifeXis

        Actually, it does. Evaluating that right hand increments $i twice. This occurs before either of the assignments.
