PerlMonks
Why is the execution order of subexpressions undefined?

by BrowserUk (Pope)
on Apr 11, 2005 at 23:41 UTC ( #446796=perlquestion )
BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

The title is the entire question, really. I just got to wondering why the execution order for some expressions, for example:

my $rv = func( $i, ++$i, $i+2 );

is intentionally left undefined. Is it because defining the execution order:

  • is too hard?
  • would disable some potential compile-time or runtime optimisation?
  • would have some undesirable syntactic or semantic knock-on effect?
  • other?

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco.
Rule 1 has a caveat! -- Who broke the cabal?

Re: Why is the execution order of subexpressions undefined?
by ikegami (Pope) on Apr 11, 2005 at 23:53 UTC

    Arguments are evaluated from left to right. (For me. I don't know if this is defined or undefined behaviour.) However, everything is passed by reference (or would that be "passed by alias" in Perl terminology?). That explains why ++$i affected $i. It must be this way to allow $_[0] = 4 to work.

    The following is the C++ equivalent (ignoring the memory leaks). Try it and you will see similar results.

    int func(int &at0, int &at1, int &at2);
    func(i, *(new int(++i)), *(new int(i+2)));

    The following is a good example of the execution order in Perl:

    my $i = 5;
    func($i+0, $i, ++$i, $i+0);  # 5 6 6 6 for me.
    #    ^^^^  ^^
    #   rvalue lvalue

    From the above, you can also see the solution to your problem. You can work around the problem by converting lvalues to rvalues:

    my $i = 5;
    func($i, ++$i, $i+2);    # 6 6 8 for me.
    #    ^^ lvalue

    my $i = 5;
    func($i+0, ++$i, $i+2);  # 5 6 8 for me.
    #    ^^^^ rvalue

    Of course, you can no longer do $_[0] = 4; from within func if you apply this fix.
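    The aliasing behaviour and the rvalue workaround described above can be sketched as follows (sub names are illustrative, not from the thread):

```perl
use strict;
use warnings;

sub set_first { $_[0] = 4 }    # assigns through @_'s alias to the caller's variable

my $x = 0;
set_first($x);                 # $x itself is aliased by $_[0]
print "$x\n";                  # 4

my $y = 0;
eval { set_first($y + 0) };    # $y + 0 is an rvalue; $_[0] aliases a temporary
                               # (assigning to it may even die), never $y itself
print "$y\n";                  # still 0: the caller's $y is untouched
```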

Re: Why is the execution order of subexpressions undefined?
by blokhead (Monsignor) on Apr 12, 2005 at 00:07 UTC
    It is my understanding that for the purposes of efficiency, the actual opcodes that constitute a statement like ++$i can be split up in expressions like this. Possible example: the value of $i is fetched, incremented, then other (unrelated) opcodes are done, then finally the new value is placed in $i.

    The expressions that constitute the arguments are no longer atomic transactions! That would explain why in these cases you can sometimes even get behavior that differs from any execution order (at least, any that evaluates arguments atomically).

    I could be totally off-base here, but maybe some perlgutsgeek can authoritatively clarify the situation.

    blokhead

Re: Why is the execution order of subexpressions undefined?
by dragonchild (Archbishop) on Apr 12, 2005 at 03:11 UTC
    The answer is actually pretty simple - Perl's syntax is derived primarily from C, especially vis-a-vis operators. The order of evaluation is undefined within the C spec, thus it's undefined within the Perl spec.

    Now, the reason it's undefined within the C spec is that K&R felt that different machines may have a better optimization for the evaluation of certain subexpressions. In other words, it was recognition from the writers of the C spec that micro-optimizations mattered in 1972.

      The order of evaluation is undefined within the C spec, thus it's undefined within the Perl spec.

      I understand the reasoning behind this decision in C, but my question really centres upon one word in the sentence I quoted above: "thus". The most salient definition of that word in relation to that sentence is:

      Therefore; consequently:

      Does it follow that because code could be made more efficient at the C level by not defining execution order, that the same would be true for Perl? Given that most if not all Perl opcodes are fairly heavy relative to the equivalent C code, I wonder if there is any real scope for improving efficiency at the Perl level by not defining execution order.

      Ultimately, I wonder if Perl 6 could define an execution order without penalty? And whether that would be a good thing to do in terms of usability and the principle of least surprise?


        I personally agree with you that the order of subexpression evaluation could be defined in all cases, within Perl. (Heck, I think it could be defined now within C, as well.) But, I think tye has a very good point - who's going to go through the work of defining them? I certainly don't want Larry, Damian, Alison, or anyone on the P6L team doing it for P5. Maybe, this could be your big contribution to P6, but I'd get Larry's signoff on it, first. (Or, at the very least, run it past him and hope he doesn't deep-six the idea.)
        Ultimately, I wonder if Perl 6 could define an execution order without penalty?
        Actually, I wouldn't be surprised if the execution order becomes more undefined in Perl 6. If the execution order is undefined, the execution may happen in parallel. Considering the rise of multi-CPU machines, there can be a nice performance gain in not defining execution order.

        IMO, the less code relies on execution order, the clearer the code is.

Re: Why is the execution order of subexpressions undefined? (why not?)
by tye (Cardinal) on Apr 12, 2005 at 04:25 UTC

    Why should it be strictly defined? Why tie your hands so thoroughly?

    There really isn't some one compelling argument on either side.

    There are advantages to defining the order of things and advantages to leaving the order of things unspecified. Your question really boils down to "Why don't we define the order of everything?". And I think it doesn't take much contemplation to realize that the middle ground is the place to be.

    Perl goes further than C in defining the order of some things. There is advantage in knowing that ( foo(), bar() )[-1] will call foo() before it calls bar(). It allows you to express some complex instructions in concise ways (which is a very Perlish thing to allow).

    A similar expression in C might call the functions in either order, depending on the implementation. There is good reason for this. C runs on systems where the low-level calling conventions specified that you pushed the first argument first (and where C wanted to use the same conventions to ease interoperability with the non-C routines on such systems), but C also runs on systems that take good advantage of pushing the first argument last (which allows you to pass in extra arguments with little impact). Overspecifying such things could mean slower code on some systems (having to compute the values in one order and then push them onto the stack in the reverse order) that was also more complicated and more likely to be buggy.

    To go all the way in one direction would require writing up all of the details of exactly how order is determined, and require a lot of work to figure out all of the bugs in this specification and the implementation. It just isn't worth the small gain of being able to define exactly how things happen, even when you write something for which most people won't have a good feel for the exact evaluation order.

    Leaving some things unspecified gives you the freedom to re-implement things in a way that makes more sense, is more efficient, or to just not have to sweat so much when making a trivial change because it might have some subtle effect on the order of evaluation of some convoluted expression whose exact evaluation order no one should have to bother to figure out.

    You define what it makes sense to nail down and give warning that the rest may or may not change because the work hasn't been done to define exactly how it must stay the same and there may be a real benefit to changing it some day (like the real benefit of recently making the order of hash keys vary between runs of the same Perl script).

    A major goal of Perl 6 is to support code optimization. If you over-specify the order of evaluation, then you lose the ability to perform some optimizations. You can't move opcodes around to take advantage of efficient ways of combining certain things.

    The reason that sometimes the fact that some orders are not defined is so pointedly noted is that people have a tendency to assume that once they figure out X, that X will always be true. You can see it happening with the expression you posted. Some people have already convinced themselves that they know exactly what order things are done and why and even say things like "It must be this way to allow ..." and then they go and write code that breaks if things aren't exactly the way they think they must be (I'm not saying that ikegami writes such code, just that ikegami has demonstrated the first step or two along that path to doom).

    And that scenario is worse than having the order exactly specified, because you have different groups tying peoples' hands in different ways for no benefit. So it is important to at least occasionally beat people up and remind them that some things are not defined and if they depend on them being one particular way, the fact that it works for them now doesn't mean it will always work.

    - tye        

      Why tie your hands so thoroughly?

      I don't think it would tie my hands (where I am the Perl 6 programmer). Indeed, I think that it would free me from having to consider whether I am writing an expression that could have dependency issues--which is the basic reason for my asking if it would be possible and, if so, what penalties it would carry.

      It may tie the hands of the Perl 6 compiler/interpreter writer to some degree, but I think that this is much less than you are implying. I've been playing around with some expression parsing code and I've reached the conclusion that the only time that execution order could be varied in a way that might be an optimisation is when dealing with native integer manipulations. There are occasionally some small gains to be had by reordering operations on integers and uints that allow you to avoid performing intermediate stores by retaining them in a register across higher level sub-expression boundaries.

      But not only are these rare, you cannot generate these occasions directly from your Perl code--as far as I can see. All Perl level manipulations involve PMCs, which are effectively objects, and all operations will (again, as far as I can see) be effected through method calls. The order of execution of the Perl level subexpressions will be a chain of load a VM register, index through the vtable, call the methods, move onto the next subexpression. There is no scope for optimisations across those boundaries that I can see.

      There is potentially scope for optimisation across method call boundaries once the code has been JITed, but that would require a very clever machine-code level optimiser capable of performing out-of-order execution analysis, and would obviously need to be specific to the processor upon which it runs. Do you envisage Parrot providing that level of optimiser?

      If I am right, and it's not possible to optimise across perl-level subexpression boundaries--which would mean that there would be no point in moving (Parrot-level) opcodes around--then defining an execution order could be beneficial by removing another level of uncertainty from the Perl code. And no one would have to beat anyone up for writing code that different implementations might choose to execute differently on some whim.

      I'm also not at all sure that the task of specifying the execution order is that onerous, either. There are, as far as I can tell, actually not that many situations where the execution order isn't already well-defined by context and other influences, but I'm not very sure of my ground here. Is there a document that identifies when the execution order is not well-defined?

      So, in answer to your question:

      Why should it be strictly defined?

      Because it would save me guessing when it is not, and anything that removes uncertainty is a good thing--I think. But I am not sure, hence my question.


      Some people have already convinced themselves that they know exactly what order things are done and why and even say things like "It must be this way to allow ..."

      I don't know how I could have convinced myself I "know exactly what order things are done and why" when I specifically mentioned I didn't know whether execution order is defined in this circumstance.

      "It" in "It must be this way to allow ..." refers to passing by reference, not execution order. There is no way of allowing $_[0] = ... to edit the caller's variable without passing the variable in by reference (by definition). Perl5 doesn't currently have a mechanism to specify whether a variable should be passed by value or by reference (or maybe it does and it wasn't used here), so everything must be passed by reference to allow $_[0] = ... to edit the caller's variable.

      My post only explains why ++$i affected the $i, nothing more.

        Sure it does. func( $pass_by_reference, "$pass_by_value" ). You can do that to strings, booleans, and numbers easily. Use Storable::dclone() if you want to pass in an object or data structure as a value.

        ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Re: Why is the execution order of subexpressions undefined?
by Zaxo (Archbishop) on Apr 12, 2005 at 05:03 UTC

    Perl somehow knows what the execution order should be!

    $ perl -MO=Concise,-exec -e'sub func {@_} my $rv = func( $i, ++$i, $i+2 )'
    1  <0> enter
    2  <;> nextstate(main 2 -e:1) v
    3  <0> pushmark s
    4  <$> gvsv(*i) s
    5  <$> gvsv(*i) s
    6  <1> preinc sKM/1
    7  <$> gvsv(*i) s
    8  <$> const(IV 2) s
    9  <2> add[t2] sKM/2
    a  <$> gv(*func) s
    b  <1> entersub[t3] sKS/TARG,1
    c  <0> padsv[$rv:2,3] sRM*/LVINTRO
    d  <2> sassign vKS/2
    e  <@> leave[1 ref] vKP/REFC
    -e syntax OK
    $

    Perl does make some guarantees about execution order - they are implied by the rules of precedence and associativity in perlop. In your example, the parens around the argument list make it a TERM, the highest precedence, so the enclosed expression is evaluated first. Within that expression, the preincrement operator has highest precedence, so it is applied next. Then comes addition in the third term and finally the commas. The commas tie in precedence, so associativity comes into it and they evaluate left to right - a trivial operation in this case. I think that func acts as a list operator, so assignment is evaluated next. Being right associative, assignment evaluates func first with the arguments we already calculated, and copies the result into $rv (in scalar context).

    As far as I know, Perl doesn't make the distinction between operator-comma and punctuation-comma that C and C++ do. I believe that comma is always an operator in Perl (if I'm mistaken, I expect I'll hear about it!). It seems to be only scalar vs. list context which distinguishes between the sequence operator and list separators--giving func the ($) prototype results in the same order of operations exposed by B::Concise.
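    The scalar-versus-list context distinction for comma can be observed directly. A minimal sketch (the left-to-right call order implied by the log is what current perls happen to do, not a documented guarantee--which is the point of this thread):

```perl
use strict;
use warnings;

my @log;
sub foo { push @log, 'foo'; 1 }
sub bar { push @log, 'bar'; 2 }

my $scalar = (foo(), bar());    # scalar context: comma operator, bar()'s value wins
my @list   = (foo(), bar());    # list context: list separator, both values kept

print "$scalar\n";              # 2
print "@list\n";                # 1 2
print scalar(@log), "\n";       # 4: all four calls were made
```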

    Where C's undefinedness does creep in is through the increment and decrement operators. They are "nonassoc" in the precedence table, and are explicitly documented to have undefined results when used multiple times on the same variable in a single statement. It also seems to be discouraged to evaluate the variable itself, as in your example.

    Currently, Perl permits those undefined operations and does them consistently, though oddly. Such expressions turn up pretty regularly in SoPW. Their behavior can be understood by realizing that post-inc and -dec return values while pre-inc and -dec return aliases (so reflect changes from subsequent operations), and that the operators are evaluated from left to right. Again, I emphasize that Perl docs warn against relying on this.
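    The value-versus-alias distinction can be checked by taking references. A sketch of the behaviour described above (again, the docs warn against relying on it):

```perl
use strict;
use warnings;

my $i = 5;
my $pre  = \(++$i);    # pre-increment returns $i itself, so this is \$i
my $post = \($i++);    # post-increment returns a copy of the old value

print $pre  == \$i ? "pre: alias\n"  : "pre: copy\n";    # pre: alias
print $post == \$i ? "post: alias\n" : "post: copy\n";   # post: copy
print "$$post\n";      # 6: the value $i held before the post-increment
print "$i\n";          # 7
```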

    The precedence table approach to deciphering what perl will do is not perfect. For instance, being punctuation, brackets for array indexing or referencing are not covered in the table (though they obviously have a high effective precedence). I wonder if they come in under the TERM heading?

    After Compline,
    Zaxo

      I think you've fallen victim to the temptation that I posted about. I don't believe that Perl defines the order of evaluation of expressions based on the precedence table. That determines what the results will be, but need not restrict Perl to arriving at the prescribed value in the exact order that you appear to have defined based on how Perl has so far been implemented.

      It would be sad if Perl 6 were not free to optimize

      my $x = $z * ( $y + 1 ) + $w / ( $y + 1 );

      By noting that $y + 1 is used twice and therefore could be computed beforehand and the result just used twice rather than computing it twice. But if you tie Perl's hand by saying that execution order is defined for such things, then Perl 6 will have no choice but to not optimize expressions, thwarting part of what was advertised as a big motivation for Parrot: the ability to optimize bytecode.
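      The hand-applied version of that common-subexpression elimination, which a programmer can always write out explicitly, looks like this (the values are illustrative):

```perl
use strict;
use warnings;

my ($z, $y, $w) = (2, 3, 8);

my $t = $y + 1;               # compute the shared subexpression once
my $x = $z * $t + $w / $t;    # 2*4 + 8/4 = 10, same as the original expression
print "$x\n";                 # 10
```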

      - tye        

        Under an "as-if" rule, a bytecode optimizer need not obey the precedence rules, but the initial interpretation of your source must. Precedence and associativity are there to ensure that the expression you write is the one that gets calculated.

        After Compline,
        Zaxo

        It would be sad if Perl 6 were not free to optimize
         my $x = $z * ( $y + 1 ) + $w / ( $y + 1 );

        Why would defining execution order affect that optimisation?

        That's just common subexpression elimination--clearly defined by the parens. The only sane defined ordering would be to execute the contents of parens first?


      Perl somehow knows what the execution order should be!
      Eh, no. That's perl that knows, for a particular version of perl. The fact that it isn't defined in the language may mean that a next version may do it differently.
      they are implied by the rules of precedence and associativity in perlop.
      Eh, no. The rules of precedence and associativity only deal with parsing the language. They do not imply order of evaluation. The rules of precedence say that the value of the expression:
      E1 + E2 * E3
      is calculated by multiplying the results of E2 and E3, and adding that to E1. It doesn't say E2 needs to be evaluated before E1 or E3. And indeed, current perl doesn't. In fact, it will evaluate E1, E2 and E3 in the same order as:
      E1 * E2 + E3
      even if the precedence of + and * are different.
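      That point is easy to observe with side-effecting stand-ins for E1, E2 and E3 (the logged fetch order is whatever one particular perl happens to do, which is exactly why it should not be relied upon):

```perl
use strict;
use warnings;

my @log;
sub e { push @log, $_[0]; $_[0] }    # record which operand was fetched, pass it through

my $v = e(1) + e(2) * e(3);    # precedence fixes the *result*: 1 + (2*3) = 7
print "$v\n";                  # 7
print "@log\n";                # the operand fetch order this perl happened to use
```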
Re: Why is the execution order of subexpressions undefined?
by ysth (Canon) on Apr 12, 2005 at 08:59 UTC
    Because there's no pressing need to define it? Perl has historically been a try-it-and-see language, with the actual details of how different features interact left up to the implementation. There was at one point a counter-movement picking some areas and trying to make them well defined, resulting in doc such as perlnumber, but it never really took off.

    So, it boils down to "try it and see". And if you can imagine a different implementation doing something differently, be prepared for a different implementation doing it differently :).

Re: Why is the execution order of subexpressions undefined?
by ambrus (Abbot) on Apr 12, 2005 at 11:00 UTC

    As others have explained, this is evaluated left to right, but $i is passed by reference. You can try something like

    sub copy { my @t = @_ };
    my $rv = func( copy($i), copy(++$i), copy($i+2) );
    or just
    my $rv = func( 0 + $i, 0 + ++$i, $i + 2);

    (This one tricked me too once, until the monks were kind enough to explain it in the cb.)

    In fact, perl evaluates most operators from left to right, and surely always evaluates comma from left to right (whether in scalar or list context), even if this is undocumented. There can still be exceptions; for example, an assignment is evaluated rhs first.

    While execution order is undocumented in perl, you can count on it much more than in languages where it is actually undefined (like Scheme or C or C++), because there's only one perl interpreter and it's not likely to change much.

    There is a certain amount of religious war between fixed and undefined evaluation order. The arguments go like this. Some say that undefined evaluation order is better because it can be optimized better. Others say that this is nonsense, since the compiler can often determine that simple subexpressions don't have side effects, and thus evaluate them in a different order anyway; and when the subexpressions do have side effects, undefined evaluation order can cause confusion and mysterious bugs. There's a point in both arguments, and I'm not sure which one I support.

      Thanks ambrus. You summed up the situation exactly.

      It was actually really hard to remember a situation where the evaluation order affected the outcome of the statement. I've been bitten by it a couple of times over the years. I've taken part in a couple of threads where it was the cause of a problem or otherwise the subject of discussion.

      It means that each time I write any kind of a compound statement, or when I look over someone else's code here, looking for the source of their problem, I have to stop, look and think hard about whether there is some kind of evaluation order dependency involved. Often, I will break compound statements into two parts, frequently needing to add a temporary variable, just so I can try it, to see if it makes any difference. Mostly it doesn't, but I'm sure it did once a long while ago.

      And that's the point. Perl allows, and even encourages the use of compound statements. Like 'em or hate 'em, Perl allows them and they do have utility.

      Sometimes they can clarify. Or simplify. Or reduce the chance of errors by making it harder for a single compound action to be interspersed accidentally. Sometimes they produce an elegant solution (I wish I could find the discussion from around 3 years ago where Abigail-II demonstrated his triple assignment trick!).

      But as is, with the nebulous existence of the possibility of an evaluation order dependency, using compound statements always leaves a doubt in my mind, and one which I find extremely hard to erase through knowledge or practice.

      I think that given the rarity with which Perl code could actually benefit in any meaningful way from the possibility of the optimisations that an undefined evaluation order affords, I'd prefer that the loophole was quashed and my doubts eradicated.

      This is one Perl 5 ambiguity that could and should be closed.


Re: Why is the execution order of subexpressions undefined?
by Anonymous Monk on Apr 12, 2005 at 16:50 UTC
    What we'd really like to do is outlaw goofy side-effecting code like...
    my $rv = func( $i, ++$i, $i+2 );
    ...but that would be a hard problem. So instead we merely say the results are undefined, and hope that it scares away anybody who wants to do something ugly like that.
      What we'd really like to do is outlaw goofy side-effecting code like.

      Hey! Don't knock it. It took me hours of trial and error to come up with that example!


Re: Why is the execution order of subexpressions undefined?
by gam3 (Curate) on Apr 12, 2005 at 19:42 UTC
    Because that is the way it is in C.

    Update: Sure vote me down. It is still true.

    -- gam3
    A picture is worth a thousand words, but takes 200K.
      Maybe it's because dragonchild already said that? It's not important.
      Because that is the way it is in C.

      Update: Sure vote me down. It is still true.

      No, it isn't. Perl expressions do not get translated into C expressions. They get translated into opcodes. So there is no real connection between evaluation order of expressions in C and evaluation order of expressions in Perl. Perl can (and does) define more order restrictions on expressions and can (but doesn't) undefine ones that are defined in C. And (just to be extra clear) the evaluation order of expressions in Perl is determined by the Perl implementation and is not affected by the evaluation order of the same / similar expression in the C compiler used to build Perl.

      - tye        

        You are right, but that doesn't mean gam3 is wrong. Autodecrement/increment is the way it is because it's that way in C. That doesn't mean Perl gets translated to C expressions - it means that people implemented perl to mimic the C behaviour (for numbers, strings are different), and they decided to not define the order of evaluation. Presumably, for similar reasons as that C doesn't.

        And (just to be extra clear) the evaluation order of expressions in Perl is determined by the Perl implementation
        Eh, no. Take pre-5.8.1 hashes. The order in which keys are returned by the keys function is "determined by the Perl implementation". That neither meant the order was defined, nor did it mean the order would be the same from version to version. Had Perl ever decided to define the order in which hash keys would be returned (oh, this will never change), we now wouldn't have a defence against a DoS.
Re: Why is the execution order of subexpressions undefined? (basics)
by tye (Cardinal) on Apr 16, 2005 at 21:04 UTC

    There is one point that BrowserUk keeps insisting on that everyone else in this thread (as far as I can tell) has the opposite conclusion. So let's discuss just that one point: Does a defined or undefined evaluation order allow or prevent the compiler from introducing implicit parallelization?

    If I write $x = f() + g(); and the order of evaluation is defined, let's say from left to right, then f() must be called before g(). If the evaluation order is undefined, then f() could be called first or g() could be called first or f() and g() could be called in parallel.

    So, undefined evaluation order allows the compiler to perform operations in parallel (because the programmer can indicate that f() and g() can be done in any order by putting them into the same expression, inside of which evaluation order is not restricted by a definition).

    If EO is defined, then f() and g() can't be optimized to be called in parallel unless the compiler can fully analyze f() and g() and determine that they have no effect on each other. This is a fundamentally difficult problem and so, in the general case, this analysis cannot be done. So defined EO prevents the compiler from automatically adding parallelization (or other optimizations made possible by changing the order in which steps are performed).

    With undefined EO, if f() and g() should not be called in parallel, then the programmer instead writes $x = f(); $x += g();. With defined EO, there is no difference between the two ways of writing that sum, so the programmer has no way to distinguish "these must be run in a particular order" from "these can be run in parallel".
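    With illustrative stand-ins for f() and g(), the two spellings compute the same sum; the difference is only in what the programmer tells the compiler about ordering:

```perl
use strict;
use warnings;

sub f { 10 }    # stand-ins: any two independent computations
sub g { 32 }

# One expression: under undefined EO, f() and g() may run in either order
# (or in parallel), because the programmer has asserted they are independent.
my $x = f() + g();

# Two statements: the semicolon is a sequence point, so g() follows f().
my $y = f();
$y += g();

print "$x $y\n";    # 42 42
```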

    Now, BrowserUk had an example closer to $x = f($i) + g(++$i); and so perhaps BrowserUk is thinking that a defined EO is important to allow the compiler to add parallelization because otherwise writing this expression is basically an error -- because what gets passed to f() and/or g() depends on the implementation. If so, then he has a point there. But that point is undermined by two observations.

    Firstly, the first part of this node shows that this same definition of EO prevents f() and g() from being parallelized so the ability to write $x = f($i) + g(++$i); can't provide the specified gain. You can now write this expression safely, but the compiler can't (in general) do any optimization on it that involves calling g() other than after f() has finished.

    Secondly, defined EO is not required for the programmer to express that. If that is what the programmer wants, then they write

    $x = f($i) + g(1+$i); $i+=1;
    or
    $i+=1; $x = f($i-1) + g($i);
    or
    my $t = $i++; $x = f($t) + g($i);

    or whatever they really meant (I don't know, because I don't know what precise evaluation order was in the mind of this theoretical programmer when they wrote that expression, which is ambiguous in the absence of a specific, strict definition of EO).
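    With stand-in definitions (here f and g simply return their argument), the rewrites above pin the intended values down explicitly:

```perl
use strict;
use warnings;

sub f { $_[0] }    # illustrative stand-ins, not from the thread
sub g { $_[0] }

my $i = 5;
my $sum1 = f($i) + g(1 + $i);    # 5 + 6
$i += 1;                         # the increment, moved to its own statement

my $j = 5;
my $t = $j++;                    # $t gets the old value; $j becomes 6
my $sum2 = f($t) + g($j);        # 5 + 6

print "$sum1 $sum2\n";           # 11 11
```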

    Note that such an expression of the algorithm would prevent the compiler from moving the assignment to $i around, because the programmer ends up specifying the assignment in a separate statement, which introduces a sequence point (denoted by the semicolon, ";") across which the compiler is forbidden to move operations (in time) -- well, unless the compiler can perform an analysis that shows that such a movement would not change the outcome.

    So, you could make the point that the undefined EO has required the programmer to write code in a way that prevents the compiler from optimizing parts of the code. This is true. However, the gain of being able to call f() and g() in parallel is likely much greater than any gain to be had by being able to move around the much simpler operation of assigning to $i. Further, the compiler is much more likely to be smart enough to be able to tell when a simple assignment can be moved than to be able to tell when two arbitrary expressions (calls to the f and g functions in our examples) are totally independent.

    And that is part of the point. f() and g() are stand-ins for arbitrary expressions and so are the important things to consider parallelizing. And the process of making them parallelizable (because EO is undefined) while still well-defined (in the face of undefined EO) is easy to do by, at worst, introducing temporary variables. And the fact that the simple assignments to these temporary variables are restricted by being in separate statements is still a net win because 1) they are tiny operations, so optimizing them is much less important, and 2) they are extremely simple operations, so analysing them to the point of determining that they can be safely moved despite the sequence points is often feasible.

    In summary, a defined EO prevents the compiler from moving around operations (in time) in order to optimize them (including running them in parallel) unless the compiler can analyse the operations to the point of determining that such a move does not impact the results.

    An undefined EO allows the programmer to write expressions that tell the compiler "you don't need to perform a difficult or impossible analysis of the parts of this expression in order to be able to prove that the order in which parts are done doesn't matter, because I am confident that the order doesn't matter" but can also require the programmer to move certain simple operations out in to separate statements. This prevents the compiler from optimizing these separate statements, except (as in the previous paragraph) when it can prove that the optimizations are safe.

    Finally, from experience, I assert that it is not hard to come across cases of f() and g() in the real world where it is easy for the programmer to understand (without being able to prove it) that they are not intertwined while the amount of analysis required by a compiler to prove independence is substantial, often to the point of being infeasible or even impossible.

    I hope that helps cut through the near total lack of communication that I perceive in the majority of this thread and allows someone to get to the crux of this persistent misunderstanding.

    - tye        

      Maybe, just maybe, I've hit upon the explanation that will clarify the defined EO thing.

      Your assertion is that:

      If I write $x = f() + g(); and the order of evaluation is defined, let's say from left to right, then f() must be called before g().

      First, for the sake of generality, let's assume that the defined order is right to left. That's okay in your example because addition is commutative, so your expression will still do the right thing.

      But now let's switch to a non-commutative operator: $x = f() - g();. What happens?

      Nothing. Because the defined EO only pertains to the order in which the functions are called, not to the order in which the results they produce are used. The defined EO doesn't force the results to be used in reverse.
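      That distinction (the order in which the calls are made versus the order in which their results are consumed) can be made concrete. In the sketch below (Python; f and g are hypothetical), the calls are issued right to left by hand, yet the subtraction still uses the results by their source positions:

```python
calls = []

def f():
    calls.append("f")
    return 10

def g():
    calls.append("g")
    return 3

# Simulate a right-to-left defined EO by evaluating g before f.
rv2 = g()
rv1 = f()

# The operator still consumes the results by source position:
# it is f's value minus g's value, whatever the call order was.
x = rv1 - rv2

print(x, calls)     # 7 ['g', 'f']
```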

      Now come back to your assertion that defined EO means that f() and g() cannot be parallelised because f() must be called before g().

      Why?

      Once f() and g() have returned their values, you have a simple expression: $rv1 + $rv2; and there is nothing about defined EO that prevents that from being completed.

      And prior to that, f() and g() can be called in parallel, because I'm explicitly stating that with defined EO, parallelisable operations on either side of a non-serialising operator can be called in parallel.

      That is, I'm not just saying I think they can, I'm saying that defined EO can be defined such that it is so.
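      A minimal sketch of that claim, with Python threads standing in for whatever parallel machinery a compiler might use (f and g are hypothetical, independent operations): the two operands run concurrently, and the operator combines their results afterwards in source order, so the value is deterministic.

```python
from concurrent.futures import ThreadPoolExecutor

def f():
    return 10

def g():
    return 3

# Launch both operand computations in parallel; the subtraction is
# performed only after both results are available, in source order.
with ThreadPoolExecutor(max_workers=2) as pool:
    fut_f = pool.submit(f)
    fut_g = pool.submit(g)
    x = fut_f.result() - fut_g.result()

print(x)    # 7
```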

      Where the defined EO comes into play is when you have a compound expression. This is, as you picked up on when you said:

      Now, BrowserUk had an example closer to $x = f($i) + g(++$i); and so perhaps BrowserUk is thinking that a defined EO is important to allow the compiler to add parallelization because otherwise writing this expression is basically an error -- because what gets passed to f() and/or g() depends on the implementation. If so, then he has a point there.

      the purpose of the parameters I showed in my examples. They are the simplest sub-expressions that have a side effect, which is their entire purpose for being there.

      It is at the level of the compound expression, where the defined EO becomes important.

      I'm going to contrive an example for the purposes of discussion. It will probably be weak and somewhat nonsensical in its contrivance, but I'm admitting that up front to avoid the ensuing argument about it. I'm contriving the circumstance using the simplest expressions possible to demonstrate the point, but the expressions and the logic behind them act as placeholders for any similar set of circumstances that could exist.

      Here I want the first value returned by $o->next given to f() and the second to g():

      $x = f( $o->next ) + g( $o->next );
      $y = f( $o->next ) - g( $o->next );

      Here, I want the reverse:

      $x = g( $o->next ) + f( $o->next );
      $y = -g( $o->next ) + f( $o->next );
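      For comparison, in a language that does define left-to-right EO, the first pair already reads exactly as intended. A Python sketch (itertools.count standing in as a hypothetical $o->next; f and g are hypothetical):

```python
import itertools

counter = itertools.count(1)    # stand-in for $o->next: yields 1, 2, 3, ...

def f(v):
    return v * 10

def g(v):
    return v + 1

# Left-to-right EO: f gets the first value, g the second ...
x = f(next(counter)) + g(next(counter))    # f(1) + g(2) = 10 + 3 = 13

# ... and swapping the calls in the source swaps which value each gets.
y = g(next(counter)) + f(next(counter))    # g(3) + f(4) = 4 + 40 = 44

print(x, y)    # 13 44
```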

      Yes, I can achieve the same thing in the presence of undefined EO by doing:

      my $temp1 = $o->next;
      my $temp2 = $o->next;
      $x = f( $temp1 ) + g( $temp2 );
      $temp1 = $o->next;
      $temp2 = $o->next;
      $y = g( $temp1 ) - f( $temp2 );

      and reversed

      my $temp1 = $o->next;
      my $temp2 = $o->next;
      $x = f( $temp2 ) + g( $temp1 );
      $temp1 = $o->next;
      $temp2 = $o->next;
      $y = f( $temp1 ) - g( $temp2 );

      but why should I have to? Did you see how much clearer the first pair are than the second?

      And did you notice that the second pair contain an error? An error that would have stood out like a sore thumb in the first pair, but that gets lost in the noise of the second.

      Of course, nothing forces anyone to use the first form if EO is defined; it just enables it for those who wish to. There is no shotgun either way with defined EO, but without it one option is excluded--not very Perlish.

      And yes, in the presence of defined EO, f() and g() can be executed in parallel--if we chose to allow that. No compiler analysis is required, the programmer is explicitly condoning it.

      If that does not convince you that defined EO is a gain-something, lose-nothing proposal, then probably nothing will, but your being unconvinced will not change that it is so.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco.
      Rule 1 has a caveat! -- Who broke the cabal?

        Your definition of EO is significantly different from what I think everyone else assumed it could mean. And, I would like to make the observation that your definition of EO is counter-intuitive - at least to the majority of those who have participated in this thread.

        For all intents and purposes, your defined EO is pretty much undefined. Which gets called first - f or g? You're saying that you're not defining this. This is a seriously underdefined execution order, because only bits and pieces (and a very difficult bit and piece to define, as is evidenced by this thread) are defined, but not the entire expression.

        It's an interesting definition. I don't think there is much of a gain if only because you've left it pretty much completely undefined. (What if f or g affect $o? What order then?) We've gained so little, and so counter-intuitively, that I'm not sure it's significant enough to put forth code to implement it.

        Parallelism is not something we've gained - it's something we still lose. Any definition of execution order is, well, by definition, a definition of serialisation. For example, no matter what, the addition of the two values must happen prior to the assignment to the LHS. We can't add and assign in parallel. Similarly, if we must evaluate all the arguments to functions prior to calling any of them, then we are introducing a sequence point (which is quite invisible, which is why it's counter-intuitive) where we must synchronise everything before parallelising anything.

        Further, I'm not sure why you're only defining this for arguments to functions. Why not full subexpressions? That is, defining whether the LHS of a binary operator is evaluated before or after the RHS. Why the arbitrariness of the definition? It would be more intuitive if the whole expression was defined. It's confusing to only go half way, and to stop at some arbitrary point which doesn't seem to have any purpose to it.
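        As a data point for the "why not the whole expression?" question: Python's language reference does define left-to-right evaluation for the full expression, including both operands of a binary operator, and that is directly observable (probe is a hypothetical helper that records when each operand is evaluated):

```python
order = []

def probe(tag, value):
    order.append(tag)     # record when this operand was evaluated
    return value

# Both operands of a binary operator are evaluated left to right.
x = probe("lhs", 10) - probe("rhs", 3)

print(x, order)    # 7 ['lhs', 'rhs']
```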

        You may think I'm opposing you for the sake of opposing. And I'm definitely not as tactful as TimToady. But, at the very least, I hope that I'm helping to refine your idea, or at least the presentation of it. At this moment, however, I do remain unconvinced that there is an idea here that would make life better, so, at the most, I'm hoping that I convince you to see the light. ;-)

Re: Why is the execution order of subexpressions undefined?
by BrowserUk (Pope) on Apr 21, 2005 at 04:42 UTC

    As my categorically final contribution to this thread, I offer this link to a paper I read online in html form, but couldn't relocate until now as it seems to have moved countries.

    If you can see the parallels (sic) between the main topic of controversy in this thread and the phrases "out of order execution", "branch prediction" and "non-determinism", and especially if you see the logic embodied in section 2.3 "Multi-threaded ISAs", and would like to discuss that seriously, then feel free to contact me via the email on my homepage, which will stay active until it starts getting spam.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco.
    Rule 1 has a caveat! -- Who broke the cabal?
Re: Why is the execution order of subexpressions undefined? (a supporting argument)
by BrowserUk (Pope) on Apr 30, 2005 at 10:16 UTC

    Maybe I'm not totally mad?

    Sather


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about question the status quo. Questioning authority".
    The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.

Node Type: perlquestion [id://446796]
Approved by sh1tn
Front-paged by tlm