PerlMonks  

What's going on with either constants folding or B::Deparse output in this case?

by vr (Curate)
on Feb 25, 2017 at 11:35 UTC ( [id://1182813]=perlquestion )

vr has asked for the wisdom of the Perl Monks concerning the following question:

I'm puzzled by this little problem (the fragment below only reproduces it; never mind what the code was doing originally):

    C:\>perl -we "sub parse{${$_[0]}=~/\Gfoo/gc or die} parse \('foo') for 1..2"
    Died at -e line 1.

    C:\>perl -we "sub parse{${$_[0]}=~/\Gfoo/gc or die} parse \(''.'foo') for 1..2"

    C:\>perl -MO=Deparse -we "sub parse{${$_[0]}=~/\Gfoo/gc or die} parse \('foo') for 1..2"
    BEGIN { $^W = 1; }
    sub parse {
        die unless ${$_[0];} =~ /\Gfoo/cg;
    }
    parse \'foo' foreach (1 .. 2);
    -e syntax OK

    C:\>perl -MO=Deparse -we "sub parse{${$_[0]}=~/\Gfoo/gc or die} parse \(''.'foo') for 1..2"
    BEGIN { $^W = 1; }
    sub parse {
        die unless ${$_[0];} =~ /\Gfoo/cg;
    }
    parse \'foo' foreach (1 .. 2);
    -e syntax OK

I'd expect, because of constant folding, the two fragments of code to behave the same. The B::Deparse output is also identical for both. Yet the code runs differently.

Re: What's going on with either constants folding or B::Deparse output in this case? (updated)
by haukex (Archbishop) on Feb 25, 2017 at 11:58 UTC

    I am not an expert on this, but I do think I have an idea of what's going on: AFAIK the state that m//gc keeps is attached to each string (pos). See the output of the following, just your test cases with debugging added:

    use warnings;
    use strict;
    use Carp;
    use Devel::Peek;

    sub parse {
        Dump $_[0];
        ${$_[0]}=~/\Gfoo/gc or confess;
    }

    warn "##### Case 1 #####\n";
    parse \(''.'foo') for 1..2;
    warn "##### Case 2 #####\n";
    parse \('foo') for 1..2;

    In "Case 1", the dot operator (concat) appears* to create a new string on each execution of the loop. In "Case 2", the same string (same memory address etc.) is passed to the function each time, so m/\G.../gc keeps its state, and so the second call fails.
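The per-string state described above can be observed directly with pos(). The following is only a minimal sketch: the helper name try_parse and the variable $text are made up for illustration, and the helper returns a status instead of dying so that both calls can be inspected.

```perl
use strict;
use warnings;

# Hypothetical helper: try to match "foo" at the scalar's current pos().
# Returns 1 on success and 0 on failure instead of dying.
sub try_parse {
    my ($ref) = @_;
    return $$ref =~ /\Gfoo/gc ? 1 : 0;
}

my $text = 'foo';
my @results;
push @results, try_parse(\$text);   # matches at offset 0; pos($text) becomes 3
push @results, pos($text);
push @results, try_parse(\$text);   # \G now anchors at offset 3, so no match
push @results, pos($text);          # /c keeps pos at 3 after the failure
print "@results\n";                 # prints "1 3 0 3"
```

The second call fails only because the same scalar, with its remembered position, is passed in again.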

    As for B::Deparse, from its docs:

    The output of B::Deparse won't be exactly the same as the original source, since perl doesn't keep track of comments or whitespace, and there isn't a one-to-one correspondence between perl's syntactical constructions and their compiled form, but it will often be close.

    * Update: Added the "appears to", since my further investigation below has made me uncertain as to what the exact technical explanation is.

      I understand (at least I think so) why my 1st command line (your "Case 2") dies. I also understand there's no "one-to-one correspondence" between source and B::Deparse'd output (e.g. or die vs die unless). But, then I don't understand why constant folding doesn't work as expected.

      About the Devel::Peek output, I see there are Readonly flags in "Case 2" but not in "Case 1". Is that the reason for the demonstrated behavior? Can I count on this 'fix' (e.g. prepending an empty string) to parse a hard-coded string several times, if a third-party module uses a similar mechanism (as shown) for parsing? (maybe I'm asking too much here)

      Update. ... So, answering my last question: it is better not to count on this behavior, but to "add more lines of code", such as a dummy variable.

      Update 2. Deleted, from update above, what was probably speculation and wishful thinking. Still unclear, what's going on.
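The dummy-variable approach mentioned in the first update could look roughly like the sketch below. The names parse_ok and $copy are hypothetical, and the matcher returns a status instead of dying; the point is only that a freshly assigned lexical starts with no remembered match position.

```perl
use strict;
use warnings;

# Same matcher as in the question, but returning a status instead of dying.
sub parse_ok {
    my ($ref) = @_;
    return $$ref =~ /\Gfoo/gc ? 1 : 0;
}

my @status;
for my $round (1 .. 2) {
    my $copy = 'foo';                # fresh assignment, so pos($copy) is undefined
    push @status, parse_ok(\$copy);
}
print "@status\n";                   # prints "1 1"
```

This does not rely on whether the literal was constant-folded or flagged Readonly.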

        I don't understand why constant folding doesn't work as expected.

        I'm not an expert on the Perl internals; for now I'm just drawing my conclusions from the documentation and the debug output. All I can offer at the moment is the output of B::Concise, which I have trimmed down to only the relevant differences between the two:

        $ perl -MO=Concise -we 'sub parse{${$_[0]}=~/\Gfoo/gc or die} parse \(""."foo") for 1..2'
        ...
        a  <1> srefgen sKM/1 ->b
        -     <1> ex-list lKRM ->a
        9        <$> const(PV "foo") sPRM/FOLD ->a
        ...

        $ perl -MO=Concise -we 'sub parse{${$_[0]}=~/\Gfoo/gc or die} parse \("foo") for 1..2'
        ...
        9  <$> const(IV \"foo") sM/FOLD ->a
        ...

        The way I interpret that is that in the first case, the constant ""."foo" is apparently folded down to "foo", but taking a reference to that string apparently isn't folded. "Constant Folding" in perlop only talks about the constant folding of strings and numeric values, so I'm sorry I don't have a very solid explanation of why the two examples are different (other than my above interpretation of the concat op, which no longer seems to fully explain the situation).

        In general, I would try to find a solution that doesn't depend on internals like whether certain optimizations were performed or certain internal flags like Readonly are set.

        Can I count on this 'fix' (e.g. prepending an empty string) to parse a hard-coded string several times

        If the problem is that you want to reset m/\G.../gc matches, then what you can do reliably is pos($string)=undef; to reset that particular state, I posted an example here.
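A minimal sketch of the pos() reset suggested above, assuming the string lives in an ordinary variable. The helper name parse_from_start and the variable $input are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: forget any earlier m//g position, then match at \G.
sub parse_from_start {
    my ($ref) = @_;
    pos($$ref) = undef;                  # reset the per-string match position
    return $$ref =~ /\Gfoo/gc ? 1 : 0;
}

my $input = 'foo';
my @ok = map { parse_from_start(\$input) } 1 .. 3;
print "@ok\n";                           # prints "1 1 1"
```

Because the reset happens inside the helper, the same scalar can be parsed repeatedly without depending on how the caller built it.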

Re: What's going on with either constants folding or B::Deparse output in this case?
by shmem (Chancellor) on Feb 25, 2017 at 23:21 UTC
    I'd expect, because of constant folding, the two fragments of code to behave the same.

    I would not. Just looking at the code, without getting out the Devel toolbox or the debugger and such, the first variant gives a string and the second an expression to the reference-operator (backslash). The expression must be resolved in order to take a reference to its result. Constant folding takes place in either case, but the result of an expression is stored in its own (new) SV, whereas a string unaltered by constant folding is passed as is. Consider:

    sub parse{${$_[0]}=~/\Gfoo/gc or die} parse \(''.'foo'.$_) for 1..2

    Here, the terms '' and 'foo' are folded into foo by the compiler (constant folding), then (at runtime) it is concatenated with the loop variable. A reference is generated for the result of that expression.
    The same steps are taken for (''.'foo') - but without the concatenation (since no such op).

    The result of constant folding needs to be stored somewhere. It is not shoehorned back into either PV of '' or 'foo' (which one?), and its storage is volatile.
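The runtime-concatenation case above can be checked with a small sketch. It only illustrates that each reference to the result of 'foo' . $_ points at its own scalar, so no match position carries over; it makes no claim about what the compiler does with the fully constant case. The variable names are made up for illustration.

```perl
use strict;
use warnings;

# Keep every reference alive so that no scalar can be freed and reused.
my @refs;
push @refs, \('foo' . $_) for 1 .. 2;

# Match each referenced string at its own pos(); collect 1/0 instead of dying.
my @matched = map { ${$_} =~ /\Gfoo/gc ? 1 : 0 } @refs;

printf "distinct scalars: %s\n", ($refs[0] != $refs[1] ? 'yes' : 'no');
print  "matched: @matched\n";
```

Both matches succeed because each expression result is stored in a separate scalar with its own pos().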

    That's my view of it... I'd be glad to be corrected by somebody more familiar with perl internals.

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
