PerlMonks  

In quantum physics, watching something happen can change its behavior. In particular, a single particle going through two slits interferes with itself - unless you measure which slit it goes through, which destroys the interference.

Perl's ++ operator seems to behave like this, when the "watching" is done with a subroutine call.

Consider the output of

$m = 20; print ++$m + $m++;
and
$m = 20; print noop(++$m) + $m++; sub noop{ return shift }
What's going on here?

If you'd like to see how arcane your knowledge of Perl really is, try guessing what this prints without reading ahead or evaluating the code.

For those with way too much time on their hands, this document and the code samples it refers to can be found at http://cs.marlboro.edu/talks/increment_weirdness/.


Disclaimer

Before going into any detail, I'd like to make it clear that this is an academic exercise only.

Steve Oualline has a nice description of the use of increment operators and their side effects in his book "Practical C++ Programming", pg 79. Essentially he says that if you want to understand how these tricky increment expressions work, the right answer should really be

"If you don't write code like this, then you don't have to worry about these sorts of questions."

The problem

Even though clearly none of *us* would write code like this (ahem), a friend (Mark Francillon) gave me some expressions like (++$m + $m++) as puzzlers, and I was curious enough to look into it.

To really understand the pre and post increment ops, I wrote my own preInc and postInc subroutines, doing what I *thought* these operators were supposed to do.

    # This is an attempt at emulating ++$m with preInc($m)
    sub preInc {
        $_[0] = $_[0] + 1;   # Increment input argument (the side effect),
        return $_[0];        # and return the new, incremented value.
    }

    # And this is an attempt at emulating $m++ with postInc($m)
    sub postInc {
        my $temp = $_[0];    # Remember original value (leaving @_ intact),
        $_[0] = $_[0] + 1;   # increment input argument (the side effect),
        return $temp;        # and return the old, un-incremented value.
    }

    my $m = 20;
    print preInc($m) + postInc($m);   # This prints 42.
                                      # The final value of $m is 22.
This all makes perfect sense to me. What's going on here is
  1. The preInc($m) increments $m to 21, and returns 21.
  2. The postInc($m) returns 21 ($m's current value), then
  3. increments $m a second time to its final value of 22, leaving
  4. the value of the sum as 21+21=42.
Both C and Java give this same value 42 for similar expressions, by the way; see the files run, Inc.c, Inc.java, Inc.pl, and their outputs in run_output.txt.

If this was the whole story then I wouldn't be writing all this down. However, the value returned by the Perl interpreter is *not* 42, but 43. If you don't believe me, try it for yourself.

    my $m = 20;
    print ++$m + $m++;   # This prints 43 !
                         # The final value of $m is still 22.
And then I started pulling out my hair.
look()-ing at intermediate results

My first attempt at understanding what was going on was to write a subroutine that would examine the intermediate results. You can find all the gory minutiae in increment_detail.pl.

    my $m = 20;    # Declared before look() so that the sub sees this $m.

    # Print values and addresses of passed argument and $m.
    sub look {
        print "look was passed '" . $_[0] . "' at " . \$_[0] . ".\n";
        print "while \$m is '" . $m . "' at " . \$m . ".\n";
        return $_[0];
    }

    my $p = look(++$m) + look($m++);
    print $p;
But that's where the quantum weirdness popped up.

After a variety of attempts it became clear that any subroutine call wrapped around (++$m) changes the result of the calculation to 42.

    sub noop {
        # do nothing
        return shift;
    }

    my $m = 20;
    print noop(++$m) + $m++;   # This prints 42 !
So if I tried to watch it do this weird thing, it wouldn't do it.

By now this felt like a conspiracy.
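
To put the three cases side by side, here is a small sketch. The helper names bare, watched and copied are made up for this comparison and are not in the files mentioned above. It assumes a perl that behaves as described in this post; changing a variable twice in one expression is not something the documentation makes promises about, so treat the numbers as observed behavior rather than a guarantee.

```perl
use strict;
use warnings;

sub noop { return shift }    # hands back a copy of its argument

# The bare expression: both increments act on the same $m.
sub bare {
    my $m = 20;
    my $p = ++$m + $m++;
    return $p;
}

# "Watched" through a subroutine call, as in the noop example above.
sub watched {
    my $m = 20;
    my $p = noop(++$m) + $m++;
    return $p;
}

# No subroutine at all: adding 0 also yields a fresh temporary value.
sub copied {
    my $m = 20;
    my $p = (++$m + 0) + $m++;
    return $p;
}

printf "bare: %d  watched: %d  copied: %d\n", bare(), watched(), copied();
# prints "bare: 43  watched: 42  copied: 42"
```

So the subroutine call itself isn't special; anything that turns the left term into its own separate value before $m++ runs seems to be enough to get 42.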


overload-ing to look without touching

Another friend (Brandt Kurowski) suggested using operator overloading to watch the intermediate steps, without disturbing the calculation. This works, and has helped me understand what's going on, but hasn't quite answered all my questions.

See IncrementOverload.pm for that analysis.

    ==== increment weirdness: ++$m + $m++ ==========
    m = 20 at 0x80ab23c
    p = ++m + m++
    *** inc 0x80ab23c : 20 --> 21
    m is 21 at 0x80ab23c
    *** copy 0x80ab23c --> 0x804c120
    m is 21 at 0x80ab23c
    *** inc 0x804c120 : 21 --> 22
    m is 22 at 0x804c120
    *** add : 22 at 0x804c120 + 21 at 0x80ab23c = 43 at 0x80ab11c
    m is 22 at 0x804c120
    p = 43 at 0x80ab11c
The discussion on pg 357 and thereabouts of the Camel describes some of the inner workings of the increment operators; in particular, if there's more than one pointer to something then it makes a copy first and increments the copy. Thus to overload ++, you must also overload the copy operator. In the listing above, the "inc", "copy" and "add" lines are printed out by overloaded subroutines, all invoked while evaluating ++$m + $m++.
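
For anyone who hasn't met the copy operator before, here is a minimal sketch of the "copy first, then increment" rule. This is not the code in IncrementOverload.pm; the Counter class and its field n are hypothetical, invented only to show a '++' handler and a '=' (copy constructor) handler working together when two variables share one object.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

use overload
    # '++' mutates the object in place.
    '++' => sub { print "*** inc\n";  $_[0]{n}++; $_[0] },
    # '=' is the copy constructor: perl calls it before a mutator
    # such as ++ when the object has more than one reference to it.
    '='  => sub { print "*** copy\n"; Counter->new($_[0]{n}) },
    # Stringification, so the objects can be printed.
    '""' => sub { $_[0]{n} };

sub new {
    my ($class, $n) = @_;
    return bless { n => $n }, $class;
}

package main;

my $first  = Counter->new(5);
my $second = $first;    # two names, one underlying object
++$second;              # shared object: copied first, then incremented

print "first = $first, second = $second\n";
# prints "first = 5, second = 6"
```

Without the '=' handler, the ++ on a shared hash-based object like this one has no way to leave $first alone, which is why the two overloads go together.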

Here's a blow by blow account.

The calculation starts out as I'd expect, with ++$m incrementing $m (..23c) in place.

Since the left term of the sum is going to be needed later, I'm guessing that another name (pointer), say left_sum, is also now given to that (..23c) value.

The second increment, $m++, now sees something with several names, and so makes a copy (..120) before proceeding. The original address, namely (..23c) which now contains 21, is given another name, something like right_sum, to be used later when the terms are added together.

The copy is then incremented to 22 (..120), which is the final value of $m.

In the last addition step, the left_sum pointer has apparently been carried along with the renaming of $m; that is, by the time the sum is evaluated its value is 22.

On the other hand, when ++$m is wrapped in a subroutine call, left_sum isn't "carried along" in this way; left_sum remains 21 after $m gets to 22.

How exactly the addition operation keeps track of what it's going to be adding isn't entirely clear to me; the devil seems to be in the details of what the thing I'm calling left_sum ends up referring to.

Staring at all this long enough gives the gist of *how* Perl gets 43 : ++$m evaluates to $m itself, which is incremented *again* by $m++. So by the time the addition operation actually needs a value for the term on the left, its value is 22.
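
The claim that ++$m evaluates to $m itself can be checked without overloading, by comparing addresses the same way look() does. This is a sketch only; is_m is a hypothetical helper, and it relies on the fact that @_ aliases the arguments a subroutine is called with.

```perl
use strict;
use warnings;

my $m = 20;

# Returns 1 if the argument passed in is the very same scalar as $m.
sub is_m { return \$_[0] == \$m ? 1 : 0 }

my $pre_is_m  = is_m(++$m);    # ++$m hands back $m itself
my $post_is_m = is_m($m++);    # $m++ hands back a temporary with the old value

print "++\$m is \$m: $pre_is_m\n";     # prints "++$m is $m: 1"
print "\$m++ is \$m: $post_is_m\n";    # prints "$m++ is $m: 0"
```

That matches the picture above: the left term of the sum is still $m when $m++ bumps it again, while the right term is a separate value frozen at 21.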

There are more comments at the end of Increment.pm, along with printouts for some other (perhaps illuminating) variations.


Why?

After all this analysis on *how* this works, I'm still left with one big question.

Why is 43 the right answer?
Clearly Perl's answer must be the right one, and so I'm sure there's some really good reason why the operators *should* behave this way in this context - I just can't quite get my head around what that reason might be. :)

Anyway, it was fun trying to look a bit under the hood.

  - barrachois


In reply to Quantum Weirdness and the Increment Operator by barrachois
