PerlMonks  

Re: Undocumented join() feature, now defunct? (optimization)

by tye (Sage)
on Oct 30, 2014 at 00:48 UTC ( [id://1105574] )


in reply to Undocumented join() feature, now defunct?

My expectation was that the EXPR would be evaluated once and the invariant result used between every element of the LIST.

Then you should probably adjust (and loosen) your expectations about exactly how much optimization has or hasn't been done to a particular feature in a particular build of Perl.

If you manage to construct code that behaves differently depending on whether the implementation of some feature (especially one that uses the value more than once) evaluates the value once or twice, then you've written code that is likely to break when an optimization gets done.

If I had reason to pass a magical scalar that changes values to join, then I'd tell Perl to fix the value first via:

say join "$inc", @arr;
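
A minimal sketch of that advice, assuming a hypothetical tied class named Counter (it is not from the original post) whose value goes up by one on every read. Stringifying with "$inc" reads the tied scalar once and hands join a plain copy, so the separator cannot change part-way through:

```perl
use strict;
use warnings;

# Hypothetical tied-scalar class (an assumption for illustration): each read
# returns the next integer, so the value changes every time it is looked at.
{
    package Counter;
    sub TIESCALAR { my( $class, $start ) = @_; return bless { n => $start }, $class }
    sub FETCH     { my( $self ) = @_; return $self->{n}++ }
    sub STORE     { my( $self, $value ) = @_; $self->{n} = $value }
}

tie my $inc, 'Counter', 1;
my @arr = qw( a b c d );

# "$inc" reads the tied scalar once and passes join an ordinary string copy.
my $joined = join "$inc", @arr;
print "$joined\n";    # prints "a1b1c1d"
```

Because join only ever sees the copy, the result does not depend on how many times a given build of Perl looks at its first argument.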

and the wording looks to be the same for all versions

Yeah, people don't usually update the documentation of the basic functionality when an optimization is made.

In your mind, one of these two has to be broken?

sub join1 {
    my( $sep, @vals ) = @_;
    my $str = '';
    while( @vals ) {
        $str .= shift @vals;
        $str .= $sep if @vals;
    }
    return $str;
}

sub join2 {
    my( $sep, @vals ) = @_;
    $sep = "$sep"; # Added
    my $str = '';
    while( @vals ) {
        $str .= shift @vals;
        $str .= $sep if @vals;
    }
    return $str;
}
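
One way to observe a difference between these two is sketched below. It assumes a hypothetical separator class named Ticker (not from the thread) that overloads stringification to return a new number each time. A tied scalar would not show the difference here, since copying out of @_ already reads it once. join1 and join2 are repeated from above only so the block runs on its own:

```perl
use strict;
use warnings;

# Hypothetical separator class (an assumption for illustration): it
# stringifies to a new number every time it is used as a string.
{
    package Ticker;
    use overload '""' => sub { my( $self ) = @_; return ++$self->{n} },
                 fallback => 1;
    sub new { my( $class ) = @_; return bless { n => 0 }, $class }
}

sub join1 {
    my( $sep, @vals ) = @_;
    my $str = '';
    while( @vals ) {
        $str .= shift @vals;
        $str .= $sep if @vals;    # stringifies $sep on every pass
    }
    return $str;
}

sub join2 {
    my( $sep, @vals ) = @_;
    $sep = "$sep";                # stringifies $sep exactly once
    my $str = '';
    while( @vals ) {
        $str .= shift @vals;
        $str .= $sep if @vals;
    }
    return $str;
}

my $out1 = join1( Ticker->new, qw( a b c ) );
my $out2 = join2( Ticker->new, qw( a b c ) );
print "$out1\n";    # prints "a1b2c"
print "$out2\n";    # prints "a1b1c"
```

Both versions agree for every ordinary separator; they only diverge when the separator's value depends on how often it is looked at.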

Now if I go and optimize out the copying of most of @_ into @vals, that breaks something too? Or are both of those broken because they don't avoid copying from @_?

Next you'll tell me that join2 is broken because it evaluates $sep even when @vals is empty and so $sep isn't actually used.

IMO, none of these are bugs. They are implementation details that reasonably should be expected to change at any time when some bug gets fixed, some code gets optimized, some code gets refactored, etc.

Making scalars that change every time you look at them is weird stuff. You have to put up with weird things happening and take extra care when you do stuff like that.

- tye        

Replies are listed 'Best First'.
Re^2: Undocumented join() feature, now defunct? (optimization)
by johngg (Canon) on Oct 30, 2014 at 10:18 UTC

    Thank you for your reply but I think you might have misunderstood the direction I'm coming from. I was not attempting to write code to exploit an undocumented, or unclearly documented, feature of join. Rather, I was going to write an equivalent function that did allow for an EXPR that evaluates more than once. The discovery that earlier versions of join also did this was purely by chance.

    Then you should probably adjust (and loosen) your expectations

    My expectation was based on the wording of the documentation which does not explicitly state that EXPR could be evaluated more than once.

    then you've written code that is likely to break when an optimization gets done.

    To my mind, an optimization improves performance but does not alter behaviour. A bug fix alters behaviour. I should, of course, have read perl5180delta before writing the OP. In it under "Selected Bug Fixes" we find

    join and "@array" now call FETCH only once on a tied $" [perl #8931].

    This answers my question: the behaviour was considered to be a bug which has now been fixed. Anonymonk's reply informs us that it was "a bug fixed by accident" as part of an optimization but perhaps the documentation ought to clarify the behaviour and how it has now changed.

    I should add that none of this was for production code but was just playing around exploring language features.

    Cheers,

    JohnGG

      Thank you for your reply

      You are most welcome.

      but I think you might have misunderstood the direction I'm coming from.

      No. I didn't assume you were trying to do any of the things you speculate about (including impacting production code). I was just commenting on your expectations that you explicitly expressed.

      My expectation was based on the wording of the documentation which does not explicitly state that EXPR could be evaluated more than once.

      Well, EXPR is certainly only evaluated once (per call). But it also didn't explicitly state that the (possibly tied) scalar resulting from EXPR could be accessed more than once.

      Of course, it doesn't explicitly state that the scalar might be accessed only once. Documentation doesn't generally (for good reason) specify exactly how many times the implementation might decide to just look at some value you gave to it. Documenting it would mean that you'd be breaking a documented feature if you came up with an optimization that involved just looking at the value one more or one fewer time. It is wise to not tie your implementers' hands so tightly.

      Documenting that there is no guarantee as to how many times join() (in particular) might look at the value of one specific argument would be quite silly. Documenting this rather mundane consequence of internal details of implementations changing in a general manner would be fine (and it may already be done somewhere in the Perl docs).

      The root problem is your expectation that documentation will mention how many times a value is looked at. It usually doesn't. It shouldn't.

      join and "@array" now call FETCH only once on a tied $" [perl #8931].
      This answers my question: the behaviour was considered to be a bug which has now been fixed.

      But you can tell that it was considered an optimization "bug" not a feature "bug", because no feature documentation was updated to assure users of this detail (so not really a "bug" by how I would use that word, just an optimization). This "behavior" wasn't even nailed down. The comment says "only once" not "exactly once". Based on that comment, how many times will FETCH be called for join( $tied, $one ) ? Maybe 1, maybe 0. Neither choice would be a feature bug. And the answer is fairly likely to change at some point (even if accessing it many times continues to be considered an unfortunate/inefficient implementation choice).
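
The question of how often FETCH runs for a call like join( $tied, $one ) can be checked rather than guessed. The sketch below assumes a hypothetical tied class named CountingSep (not from the thread) that counts its own reads. No expected counts are shown, because they are exactly the implementation detail that may differ between Perl versions:

```perl
use strict;
use warnings;

# Hypothetical tied-scalar class (an assumption for illustration): it always
# returns the same value but counts how many times it has been read.
{
    package CountingSep;
    our $fetches = 0;
    sub TIESCALAR { my( $class, $value ) = @_; return bless \$value, $class }
    sub FETCH     { my( $self ) = @_; $fetches++; return $$self }
    sub STORE     { my( $self, $value ) = @_; $$self = $value }
}

tie my $tied, 'CountingSep', '-';

my %fetches_for;
for my $n ( 0 .. 3 ) {
    $CountingSep::fetches = 0;              # reset the counter for this run
    my $str = join $tied, ('x') x $n;       # join a list of $n items
    $fetches_for{$n} = $CountingSep::fetches;
    print "$n item(s): FETCH called $fetches_for{$n} time(s) on Perl $]\n";
}
```

Running it under several installed Perls shows directly whether the count has changed between them.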

      To my mind, an optimization improves performance but does not alter behaviour.

      Then I guess you don't have much experience with optimizations. Optimizations very often change subtle behavior. Optimizations should not break feature behavior. How many times the code simply looks at something isn't feature behavior. The fact that using tie makes it possible to notice how many times your variable is looked at doesn't mean that how many times your variable is looked at is something that must be specified and controlled for every feature implementation. Far from it.

      Using tie to make a scalar whose value is different every time you look at it isn't hard and certainly can be cute, but it also is fundamentally fragile. And there are tons and tons of optimizations that can have an impact when such is done. That isn't due to a problem with those optimizations. It is due to somebody doing something so fragile. When you do that, you should expect weird stuff and/or be very careful about how you pass that value around.

      You most certainly should not expect to not be surprised by the results you get.

      - tye        

        But you can tell that it was considered an optimization "bug" not a feature "bug"

        I don't know about this instance specifically, but other operators have similarly been changed as a feature (so the result is more in line with expectations), not as an optimization (though that's obviously a benefit as well).

        However, I completely agree that the developers don't want to be held down by any promises in this area. I would consider the exact number of times a variable is accessed to be subject to change.

Re^2: Undocumented join() feature, now defunct? (optimization)
by Loops (Curate) on Oct 30, 2014 at 01:04 UTC

    Hi Tye,

    You make some good points with which I agree. But don't you think Perl (or join?) should reliably specify the behavior for say:

    print join ++$i,  qw( a b c);

      The first argument to join is not code to be executed over and over again like with map and grep. So your example of join ++$i, @a doesn't actually pose a problem, unlike passing in a magical scalar that changes each time you look at it.

      So, the behavior of your example code is already well defined.
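
A short sketch of that distinction, using only core Perl: the expression ++$i is evaluated once before join runs, whereas the block given to map is executed once per element.

```perl
use strict;
use warnings;

# The first argument to join is an ordinary expression, evaluated once.
my $i = 0;
my $joined = join ++$i, qw( a b c );
print "$joined\n";    # prints "a1b1c"
print "$i\n";         # prints "1"

# The block given to map is code, executed once for every element.
my $j = 0;
my @mapped = map { ++$j } qw( a b c );
print "@mapped\n";    # prints "1 2 3"
```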

      - tye        

        "The first argument to join is not code to be executed over and over again like with map and grep. "

        Agreed. But I think Loops' point is that nothing in the documentation of join makes the above explicit.

        Given:

        my $str1 = $a . $x . $b . $x . $c;
        my $str2 = join $x, $a, $b, $c;

        You might assume they are equivalent. However, if $x is tied like in the example above, or is an overloaded object, they might not be. Given that johngg's code demonstrates that the behaviour of join in this regard has actually changed recently, it's not unreasonable to expect this behaviour to now be documented and ensured by the test suite.
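
To see whether the two forms agree on a particular Perl, something like the following can be run. It assumes a hypothetical tied class named Counter (not from the thread) and uses $p, $q and $r in place of $a, $b and $c so as not to touch sort's special variables. No expected output is given, since the result is what the thread reports as varying between versions:

```perl
use strict;
use warnings;

# Hypothetical tied-scalar class (an assumption for illustration): each read
# returns the next integer.
{
    package Counter;
    sub TIESCALAR { my( $class, $start ) = @_; return bless { n => $start }, $class }
    sub FETCH     { my( $self ) = @_; return $self->{n}++ }
    sub STORE     { my( $self, $value ) = @_; $self->{n} = $value }
}

tie my $x, 'Counter', 1;
my( $p, $q, $r ) = qw( a b c );

# Explicit concatenation mentions $x twice.
my $str1 = $p . $x . $q . $x . $r;

$x = 1;    # reset the counter (calls STORE) before the second form

# join receives $x once as its first argument.
my $str2 = join $x, $p, $q, $r;

print "concat: $str1\n";
print "join:   $str2\n";
print $str1 eq $str2 ? "same on Perl $]\n" : "different on Perl $]\n";
```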

        Yes, okay I see that. To expect otherwise would require the creation of a temporary every time you access a scalar more than once in your code. One implication of executable code being well defined as the first parameter to join, is that the following will fix johngg's code to work on old and new versions of Perl:

        say join $inc+0, @arr;

        Where the creation of a temporary to hold the result is implied.

      What makes you think it isn't?
