SavannahLion has asked for the wisdom of the Perl Monks concerning the following question:
Can someone help me understand why Perl does this with a Pre-Increment:
my $i = 0;
my ($j, $k) = (++$i, ++$i); #Pre-increment
print "\$j: $j\t\$k: $k\n";
which outputs:
$j: 2 $k: 2
When what I actually expected was:
$j: 1 $k: 2
Whereas Perl outputs the following with the following code:
my $i = 0;
my ($j, $k) = ($i++, $i++); #Post-increment
print "\$j: $j\t\$k: $k\n";
Which, as expected, outputs:
$j: 0 $k: 1
According to the Camel:
The ++ and -- operators work as in C. That is, when placed before a variable, they increment or decrement the variable before returning the value, and when placed after, they increment or decrement the variable after returning the value.
To my surprise, a similar test program in a C++ IDE netted the exact same result. Wow...OK, fair enough.
I can accept this, and it just means I'll have to expand my code a few extra lines to accommodate. I would like to know the why of it though. Why do both ++$i expressions get evaluated before being fed into print, whereas the $i++ expressions are evaluated in sequence? What's the logic behind that?
Re: Pre vs Post Incrementing variables
by Erez (Priest) on Sep 12, 2010 at 08:14 UTC
|
I think that if you read on, on the description of the auto-increment operators, or check perlop, you'll notice the following warning:
Note that just as in C, Perl doesn't define when the variable is incremented or decremented.
You just know it will be done sometime before or after the value is returned.
This also means that modifying a variable twice in the same statement will lead to undefined behaviour.
Avoid statements like:
$i = $i ++;
print ++ $i + $i ++;
Perl will not guarantee what the result of the above statements is.
"Principle of Least Astonishment: Any language that doesn’t occasionally surprise the novice will pay for it by continually surprising the expert."
| [reply] [d/l] |
|
print $i + 1, $i + 1;
$i += 2;
Or:
print do {$i += 1}, do {$i += 1};
| [reply] [d/l] [select] |
Re: Pre vs Post Incrementing variables
by moritz (Cardinal) on Sep 12, 2010 at 08:11 UTC
|
What's the logic behind that?
The logic is that if you don't define strict rules for order of execution within statements, you have some potential for optimizations.
This was particularly important for C, where assembler level statement re-ordering was (and maybe still is) an important optimization technique. I don't know how much perl 5 benefits from this freedom, if at all.
Curiously we had a similar discussion in #perl6 the other day, and it turns out the Perl 6 specification defines sequence point operators within statements which synchronize evaluation. That means that on the right-hand side of such an operator you can rely on changes to variables made on the left-hand side.
But in general it's much safer to use a variable only once in a statement where it's modified.
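A minimal sketch of that advice, applied to the code from the original question: each increment gets its own statement, so nothing depends on what happens when a variable is modified twice in one statement. The helper name next_two is hypothetical and not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: take two successive values from a counter,
# modifying the counter only once per statement.
sub next_two {
    my ($start) = @_;
    my $i = $start;
    my $j = ++$i;    # first increment, value copied into $j
    my $k = ++$i;    # second increment, value copied into $k
    return ($j, $k);
}

my ($j, $k) = next_two(0);
print "\$j: $j\t\$k: $k\n";    # the result the original poster expected: 1 and 2
```

It costs an extra line or two, but the outcome follows from the documented behaviour of ++ alone.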
Perl 6 - links to (nearly) everything that is Perl 6.
| [reply] |
Re: Pre vs Post Incrementing variables
by BrowserUk (Patriarch) on Sep 12, 2010 at 09:42 UTC
|
Because long, long ago, in a land far away, the designers of a language wrote in their specification:
The precedence and associativity of all the expression operators is summarised in section 18. Otherwise, the order of evaluations of expressions is undefined. In particular the compiler considers itself(*) free to compute sub-expressions in the order it believes(*) most efficient, even if the sub-expressions involve side effects. The order in which side effects take place is unspecified. Expressions involving a commutative and associative operator (*, +, &, |, ^) may be rearranged arbitrarily, even in the presence of parentheses; to force a particular order of evaluation, an explicit temporary must be used.
(*) Who knew they had sentient software way back then :)
Essentially, the designers of the C language traded source code ambiguity, programmer intuition, and program correctness, for performance.
Instead of allowing the programmer to specify exactly the order in which sub-expressions would be executed--through the intuitive use of parentheses and precedence, rather than the clumsy hack of creating unnecessary temporary variables--they opted to allow compiler writers to re-order those sub-expressions "for efficiency".
Perhaps the most short-sighted and pervasive premature optimisation ever.
In the days when CPU clock speeds were measured in low megahertz and single opcodes could take dozens of those clock cycles to perform; in a language that was explicitly designed to be compiled directly to machine code; such optimisations could have a significant impact on program performance. So programmers would tolerate such inconveniences, for the gains that could be achieved.
Now, it is arguable that the trade-off no longer makes sense: when you have processors with speeds measured in gigahertz executing multiple instructions per clock cycle; in a language where even the simplest of sub-expressions takes dozens if not hundreds of opcodes. The scope for performance optimisation through the reordering of sub-expressions is essentially non-existent.
But, for better or worse, Perl apes many of the rules laid down by the C language. In part I believe, because it made things easier for C programmers moving to Perl. So, we have the situation whereby the order of sub-expression evaluation is "unspecified", meaning that you cannot predict the result of using two or more side-effectful operations on a single variable within the same expression, on the basis of the language description alone.
It also means that when ambiguities arise in the implementation, such as the one you describe, you cannot point a finger and call it a bug, because "the behaviour is unspecified". You'll simply be told: "Don't do that!".
The fact that there is no logical reason, performance or otherwise, for having pre- and post-increment operate differently in this regard:
$n=0; print ++$n, ++$n, ++$n, ++$n;;
4 4 4 4
$n=0; print $n++, $n++, $n++, $n++;;
0 1 2 3
is simply dismissed as user error. "Because you shouldn't be doing it anyway!".
One day, someone will see through these hoary ol' chestnuts of programming lore, and define a language that allows the programmer to write source code that specifies exactly what he wants to happen, and know it will happen. Without having to resort to inventing unnameable intermediary variables for no good reason.
Till then, you'll just have to get used to not using two or more side-effectful operators, on the same variable, within the same statement. {shrug}
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] |
|
f(++$i, ++$i)
is more or less equivalent to
do {
    local @_;
    alias $_[0] = ++$i;
    alias $_[1] = ++$i;
    &f;
}
Since the pre-increment operator returns the variable itself and not a copy, the above is the equivalent to
do {
    local @_;
    ++$i; alias $_[0] = $i;
    ++$i; alias $_[1] = $i;
    &f;
}
As you see, f() sees the same value for both arguments as both arguments are the same variable.
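The pseudo-code above uses an "alias" keyword that Perl does not have. A rough way to observe the same thing in runnable Perl, assuming the behaviour of the perls discussed in this thread (it is not guaranteed by the documentation), is to compare the addresses of the elements of @_. The helper name same_scalar is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true if both arguments are the very same scalar.
# Elements of @_ are aliases to the caller's arguments, so references
# to them can be compared for identity.
sub same_scalar {
    return \$_[0] == \$_[1] ? 1 : 0;
}

my $i = 0;
my $pre = same_scalar(++$i, ++$i);    # both arguments are $i itself

$i = 0;
my $post = same_scalar($i++, $i++);   # each argument is a separate copy

print "pre-increment arguments are one scalar:  ", ($pre  ? "yes" : "no"), "\n";
print "post-increment arguments are one scalar: ", ($post ? "yes" : "no"), "\n";
```

If the two pre-increment arguments are one and the same scalar, they necessarily show the same value, which is what f() sees above.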
| [reply] [d/l] [select] |
|
$i = 0; ++$i = 'fred';;
Can't modify preincrement (++) in scalar assignment
So, not an lvalue.
is more or less equivalent to ... alias
Ah! That well-known Perl keyword 'alias'....
It's got nothing to do with operand evaluation order.
Hm. "It" has everything to do with the fact that the evaluation order of sub-expressions is unspecified.
Where "it" is the contradictory and useless behaviour, observed by the OP, and repeated in my post.
Because, if it were specified, then the implementation would not be able to get away with producing those totally illogical, useless results, for those unwise or unknowing enough to try and use, what could be a useful behaviour.
Your attempts to explain how the implementation produces these useless results from this unspecified and therefore deprecated code, do naught to detract from the reason why it has been possible to enshrine this broken behaviour in the implementation.
The fact that f( ++$n, ++$n ) passes an alias to $n, rather than the value resulting from the preincrement, is just another broken behaviour. It is equivalent to C allowing:
#include <stdio.h>

void f( int *a, int *b ) {
    printf( "a: %d, b: %d \n", *a, *b );
    *a = 1;
    *b = 2;
}

int main( int argc, char ** argv ) {
    int x = 0;
    f( &( ++x ), &( ++x ) );
    return 0;
}
Which it doesn't:
junk.c
junk.c(11) : error C2102: '&' requires l-value
junk.c(11) : error C2102: '&' requires l-value
junk.c(11) : error C2198: 'f' : too few arguments for call
| [reply] [d/l] [select] |
Re: Pre vs Post Incrementing variables
by repellent (Priest) on Sep 12, 2010 at 08:55 UTC
|
The docs give no guarantees for the result of such code. Here's what it seems to be doing (at the time of writing):
- ++$i increments $i and then returns an alias to $i
- $i++ returns a copy of $i and then increments $i
(++$i, ++$i) returns two aliases to $i, which has been incremented twice, resulting in (2, 2).
($i++, $i++) returns two copies of $i, one before and one after the first post-increment, resulting in (0, 1).
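A small sketch of why an alias and a copy behave differently here, assuming (as described above, and not guaranteed by the docs) that ++$i hands back $i itself while $i++ hands back a copy. Each statement modifies $i only once; the helper names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: capture what ++$i returns, increment again,
# then look at the captured scalar.
sub pre_then_bump {
    my $i = 0;
    my $first = \(++$i);    # reference to the scalar returned by ++$i
    ++$i;                   # second increment, in its own statement
    return ($$first, $i);   # captured scalar, then $i itself
}

# Same experiment with post-increment.
sub post_then_bump {
    my $i = 0;
    my $first = \($i++);    # reference to the scalar returned by $i++
    $i++;
    return ($$first, $i);
}

print "pre:  @{[ pre_then_bump() ]}\n";
print "post: @{[ post_then_bump() ]}\n";
```

With an alias, the first captured value keeps tracking $i through the second increment; with a copy, it stays at the value $i had when the copy was made.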
| [reply] [d/l] [select] |
Re: Pre vs Post Incrementing variables
by cdarke (Prior) on Sep 12, 2010 at 11:14 UTC
|
a similar test program in a C++ IDE netted the exact same result.
In my experience, C and C++ compilers differ. Take the code:
int x = 0;
printf("%d %d %d\n",++x,++x,++x);
A similar line in Perl (allowing for syntax differences) gives 3 3 3, because of optimisation. PHP gives 1 2 3, which probably denotes a lack of same. With MS Visual Studio 6 C++ you get 3 3 3 when compiled in Release (again, optimisation) but (wait for it) 3 2 1 in Debug (no optimisation). gcc (Windows and Linux) gives 1 2 3 regardless of the optimisation level.
Upshot of that? Don't do it!
| [reply] [d/l] |
|
Well... In the samples I gave above, the C++ IDE I used was MS Visual Studio 8 (or 2005, whatever it wants to call itself now) which gave 2 2 (or 3 3 3 in your sample) in the default Debug mode. So I guess virtually add it to the list there.
It is, however, useful to know this about the compilers. I know the C family compilers have their own idiosyncrasies; I guess I'll put this on the pile of things to note.
| [reply] |
|
I have noticed differences in later .Net versions of Visual Studio. I did not have them to hand when I wrote the node, and didn't want to guess. It could be that there is optimisation even in Debug. Check the project settings?
So far as I can tell though, the order of execution (in C) of parameters is not defined. The reason why VS 6.0 (Debug) executed them from right to left was because that is the order of the C calling convention, __cdecl.
| [reply] |
|
Re: Pre vs Post Incrementing variables
by ikegami (Patriarch) on Sep 12, 2010 at 16:14 UTC
|
Given my explanation of the problem, the workaround would be to convert the lvalues returned by the pre-increment operator into rvalues.
>perl -E"say ++$i,++$i;"
22
>perl -E"say 0+(++$i),0+(++$i);"
12
Technically, the operand evaluation order for the comma operator in list context is not defined, but it's not likely to ever change, especially since the operand evaluation order for the comma operator in scalar context is defined.
| [reply] [d/l] |
|
Technically, the operand evaluation order for the comma operator in list context is not defined,
The perlop manual page says in the section about the comma operator:
In list context, it’s just the list argument separator, and inserts
both its arguments into the list. These arguments are also evaluated
from left to right.
This has been in perlop since April 2006.
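A quick way to watch that left-to-right order, as a sketch: each list element is produced by a call that records its label before returning it. The helper name evaluation_order and the labels are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: build a three-element list with the comma operator
# in list context and report the order in which the elements were evaluated.
sub evaluation_order {
    my @order;
    my $note = sub {
        my ($label) = @_;
        push @order, $label;    # side effect: remember when we were called
        return $label;
    };
    my @list = ($note->('a'), $note->('b'), $note->('c'));
    return @order;
}

print "evaluated in order: @{[ evaluation_order() ]}\n";
```

Note that this only shows the order in which the operands are evaluated; it says nothing about whether an operand yields a copy or the variable itself, which is what trips up (++$i, ++$i).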
| [reply] |
Re: Pre vs Post Incrementing variables
by tomfahle (Priest) on Sep 12, 2010 at 08:32 UTC
|
| [reply] |
Re: Pre vs Post Incrementing variables
by SavannahLion (Pilgrim) on Sep 12, 2010 at 14:29 UTC
|
Many thanks for the answers. As much as it annoys me to split up a nice single line of code into several to accommodate something like this, it is something that I can readily deal with and lose little sleep over.
| [reply] |
Re: Pre vs Post Incrementing variables
by DrHyde (Prior) on Sep 15, 2010 at 10:16 UTC
|
In this:
my $i = 0;
my ($j, $k) = (++$i, ++$i);
print "\$j: $j\t\$k: $k\n";
$j: 2 $k: 2
It first evaluates the ++es (because it's a pre-increment, so it happens before anything else), so $i = 2. Then, it builds a list of (2, 2), and finally assigns that list to ($j, $k).
In this:
my $i = 0;
my ($j, $k) = ($i++, $i++);
print "\$j: $j\t\$k: $k\n";
$j: 0 $k: 1
Because it's a post-increment, the ++es get evaluated *after* $i is "evaluated" to build the list. So the first $i is evaluated, it's 0, so 0 goes into the list. Then the ++ happens, setting $i = 1. Then the next $i is evaluated, putting 1 into the list, and finally $i is incremented again. End result, a list of (0, 1) is assigned to ($j, $k).
But as others have pointed out, having multiple ++es and --es in a statement is Bad Juju. What I've explained is what happens in current perls, but I wouldn't trust it to work the same in the next point release of perl 5, let alone in perl 6. See perlop.
| [reply] [d/l] [select] |
Re: Pre vs Post Incrementing variables
by girarde (Hermit) on Sep 14, 2010 at 12:44 UTC
|
My guess: the entire right hand expression is evaluated before assignments are made. Reasonable except when you want something else, but in that case, what should the rule be instead?
| [reply] |
|
$ perl -E'$i=0; ($j, $k) = (++$i, ++$i); say "j:$j k:$k";'
j:2 k:2
$ perl -E'$i=1; ($j, $k) = ($i++, $i++); say "j:$j k:$k";'
j:1 k:2
$ perl -E'$i=0; ($j, $k) = (++$i+0, ++$i+0); say "j:$j k:$k";'
j:1 k:2
The answer is here.
| [reply] [d/l] |
|
My guess: the entire right hand expression is evaluated before assignments are made.
Good guess. Doesn't really answer the question though, as any assignment will evaluate the RHS before doing the assignment. (If it didn't, what would it assign? Purple monkeys?)
| [reply] |
|
Actually, it does. Evaluating that right hand increments $i twice. This occurs before either of the assignments.
| [reply] [d/l] |