PerlMonks  

Assigning a variable inside of (?{})

by monsterzero (Monk)
on May 03, 2006 at 15:57 UTC (#547187)
monsterzero has asked for the wisdom of the Perl Monks concerning the following question:

Hi Everyone,

I have a problem assigning a variable when using (?{}) inside of a regex. When I run the code below, I get this output, including two warnings:

Sunrise is:03:12Sunset is:21:08
Use of uninitialized value in print at D:\scripts\reg.pl line 7, <DATA> line 2.
Use of uninitialized value in print at D:\scripts\reg.pl line 7, <DATA> line 2.
Sunrise is:Sunset is:

If I comment out the use strict line of the code and remove the my($sunrise, $sunset) line the code will run. I really don't want to comment out the use strict part of my code. So my question is how can I change my code so the (?{}) part will play nice with use strict?

Thanks

use strict;
use warnings;

while (<DATA>) {
    my ( $sunrise, $sunset ) = get_time($_);
    print "Sunrise is:", ${$sunrise}, "Sunset is:", ${$sunset}, "\n";
}

sub get_time {
    my ($string) = @_;
    my ($sunrise, $sunset);
    $string =~ /sunrise:\s+(\d+:\d+)(?{ $sunrise = $^N})
                \s+sunset:\s+(\d+:\d+)(?{$sunset = $^N})/x;
    return ( \$sunrise, \$sunset );
}

__DATA__
Aberdeen, Scotland 57 9 N 2 9 W sunrise: 03:12 sunset: 21:08
Adelaide, Australia 34 55 S 138 36 E sunrise: 06:52 sunset: 16:41

Re: Assigning a variable inside of (?{})
by davidrw (Prior) on May 03, 2006 at 16:09 UTC
    You can just use the capture variables $1 and $2 to store the values (also note the warning in perlre about (?{ code }) being highly experimental). Also, there is no need for string references here.

    use strict;
    use warnings;

    while (<DATA>) {
        next unless /\S/;
        my ( $sunrise, $sunset ) = get_time($_);
        next unless $sunrise && $sunset;
        printf "Sunrise is: %s Sunset is: %s\n", $sunrise, $sunset;
    }

    sub get_time {
        my $string = shift;
        return unless $string =~ /sunrise:\s+(\d+:\d+)\s+sunset:\s+(\d+:\d+)/;
        return ( $1, $2 );
    }

    __DATA__
    Aberdeen, Scotland 57 9 N 2 9 W sunrise: 03:12 sunset: 21:08
    Adelaide, Australia 34 55 S 138 36 E sunrise: 06:52 sunset: 16:41

    But where the undef was coming from was a blank line in <DATA> ... so the match failed, so $sunrise/$sunset were undef. Just a quick check for a valid string/success on the match resolves it.

      But where the undef was coming from was a blank line in <DATA> ... so the match failed...

      No it didn't. Place an "or die qq/failed\n/;" after the regexp and you'll see. While the OP certainly wouldn't ever know whether it failed (because he doesn't check), when I checked, it was not failing. The real problem is a scoping issue related to when regexes are compiled.

      Update:
      I do want to congratulate you on zeroing in on the optimal solution, however. I see no good reason to be relying on (?{...}) and $^N, with their inherent difficulties, just to emulate what $1 and $2 do automatically, and without the complexity.


      Dave

      Hello,

      I don't think that is the problem. If you look at the error, it is in line 2. If what you say were true, it would have said line 3. I believe (from the other answer) it is a scoping issue? Anyway, thanks for the reply.

Re: Assigning a variable inside of (?{})
by davido (Archbishop) on May 03, 2006 at 16:11 UTC

    This is a cool little bug in your script based on the fact that the first time a regexp is compiled, $sunrise and $sunset are, in effect, locked in. Subsequent calls to get_time() create new lexical $sunrise and $sunset variables, but the regexp is still using the original (unreachable to you) $sunrise and $sunset from the first get_time() call. One solution is to use globals for this situation. ...or rethink your regexp strategy.

    Observe the behavior of the following snippet, and take notice of the fact that on the second call to get_time() the reference to $sunrise printed within the regexp is a different reference than the reference to $sunrise printed outside the regexp. This supports the claim that the regexp got compiled once, the first time, locking in the first $sunrise, while each call to get_time() created a new lexical that the regexp simply wasn't using:

    use strict;
    use warnings;

    while (<DATA>) {
        my ( $sunrise, $sunset ) = get_time($_);
        print "Sunrise is:", ${$sunrise}, "Sunset is:", ${$sunset}, "\n";
    }

    sub get_time {
        my ($string) = @_;
        my ($sunrise, $sunset);
        # Print the reference as seen from inside the regexp's code block ...
        $string =~ /sunrise:\s+(\d+:\d+)(?{ $sunrise = $^N; print \$sunrise, qq!\n!;})
                    \s+sunset:\s+(\d+:\d+)(?{$sunset = $^N})/x;
        # ... and as seen from the enclosing sub, for comparison.
        print \$sunrise, "\n";
        return ( \$sunrise, \$sunset );
    }

    __DATA__
    Aberdeen, Scotland 57 9 N 2 9 W sunrise: 03:12 sunset: 21:08
    Adelaide, Australia 34 55 S 138 36 E sunrise: 06:52 sunset: 16:41

    Also, you ought to be checking to ensure that you actually had a successful match before trusting and assuming that you did. It's not the issue in this particular case, but could become an issue silently, without you ever realizing it since you're not checking.

    ...by the way, Perl is behaving as documented.


    Dave

      As davido says, you can move them to be globals (I'm presuming that changing the my declaration to an our solves this in the same way). You could also wrap the regex in an eval. This will force it to be compiled fresh every time, and you'll be able to use your block-scoped variables as you have them.

      (edit) Hmmm, having tested this, it doesn't quite work as I expected. An eval block doesn't work, but evalling that line as a (single-quoted) string does work. Shrug.
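
The string-eval workaround described in the edit above might look like the sketch below. It reuses the OP's get_time and sample data; the q{...} quoting, the trailing "1;" and the "or die $@" check are additions for illustration, not something the posters wrote. Because the string is compiled each time the eval runs, the (?{...}) blocks see the lexicals of the current call. The price is recompiling the regex on every call.

```perl
use strict;
use warnings;

sub get_time {
    my ($string) = @_;
    my ( $sunrise, $sunset );

    # Single-quoted string eval: the regex, including its (?{...}) blocks,
    # is compiled on every call, in the scope of this call's lexicals.
    eval q{
        $string =~ /sunrise:\s+(\d+:\d+)(?{ $sunrise = $^N })
                    \s+sunset:\s+(\d+:\d+)(?{ $sunset = $^N })/x;
        1;
    } or die $@;    # report compile errors instead of hiding them

    return ( \$sunrise, \$sunset );
}

for my $line (
    'Aberdeen, Scotland 57 9 N 2 9 W sunrise: 03:12 sunset: 21:08',
    'Adelaide, Australia 34 55 S 138 36 E sunrise: 06:52 sunset: 16:41',
) {
    my ( $sunrise, $sunset ) = get_time($line);
    print "Sunrise is: ${$sunrise} Sunset is: ${$sunset}\n";
}
```
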
      One solution is to use globals for this situation.

      Yes, global (lexical or package) variables will do the trick, but localized package variables would be better:

      sub get_time {
          my ($string) = @_;
          local our ($sunrise, $sunset);
          $string =~ /
              sunrise: \s+ (\d+:\d+) (?{ $sunrise = $^N })
              \s+
              sunset:  \s+ (\d+:\d+) (?{ $sunset = $^N })
          /x;
          return ( \$sunrise, \$sunset );
      }

      ...or rethink your regexp strategy.

      I think this would be the proper course of action here. Using (?{...}) is overkill, and using $^N imposes a requirement for Perl v5.8.0+ needlessly. The OP could use the following simple expression:

      sub get_time {
          my ($string) = @_;
          my ($sunrise, $sunset) = $string =~ /
              sunrise: \s+ (\d+:\d+)
              \s+
              sunset:  \s+ (\d+:\d+)
          /x;
          return (\$sunrise, \$sunset);
      }

      By the way, why return references to the values? return ($sunrise, $sunset); would make more sense.
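
A minimal sketch of that suggestion, assuming the OP's input format: get_time returns plain values, and an empty list signals a failed match so the caller can skip the line. The "or next" idiom and the inline sample lines are illustrative choices, not taken from the posts above.

```perl
use strict;
use warnings;

# Returns ($sunrise, $sunset) as plain strings, or an empty list on no match.
sub get_time {
    my ($string) = @_;
    # A match in list context returns its captures, or the empty list on failure.
    return $string =~ / sunrise: \s+ (\d+:\d+) \s+ sunset: \s+ (\d+:\d+) /x;
}

for my $line (
    'Aberdeen, Scotland 57 9 N 2 9 W sunrise: 03:12 sunset: 21:08',
    '',
    'Adelaide, Australia 34 55 S 138 36 E sunrise: 06:52 sunset: 16:41',
) {
    # List assignment in scalar context yields the number of values returned,
    # so lines that do not match are skipped.
    my ( $sunrise, $sunset ) = get_time($line)
        or next;
    print "Sunrise is: $sunrise Sunset is: $sunset\n";
}
```
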
