PerlMonks  

Re: Extraneous behaviour of match variables

by ikegami (Pope)
on Nov 03, 2006 at 03:49 UTC (#582001)


in reply to Extraneous behaviour of match variables

But the second 'if' reset the match variables in a failed test.

Are you talking about the blank output for "N"? "N" is reached on a successful match. It prints a blank because the successful match cleared $1: /video/ has no captures, so $1 is left undefined.
if ( $1 !~ /video/ )
simply means
if ( !( $1 =~ /video/ ) )
The negation occurs *after* the match fails or succeeds.
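
To see both effects in one place, here is a minimal sketch. The probe helper and the sample strings are made up for illustration and are not from the original question. It sets $1 with a first match, tests $1 with !~, and reports what $1 holds afterwards.

```perl
use strict;
use warnings;

# Hypothetical helper: run a first match that sets $1, then test $1
# with !~ and report what $1 holds afterwards.
sub probe {
    my ($text) = @_;
    $text =~ /href="(.+)"/ or return;     # first match sets $1
    my $negated = ( $1 !~ /video/ );      # same as !( $1 =~ /video/ )
    my $after   = defined $1 ? $1 : '(undef)';
    return ( $negated ? 'true' : 'false', $after );
}

for my $text ( 'href="/story/video/43480/"', 'href="/story/43480/"' ) {
    my ( $result, $after ) = probe($text);
    print "$result -> $after\n";
}
# prints:
# false -> (undef)
# true -> /story/43480/
```

In the first case /video/ matched, so the negated test is false and $1 has already been reset by the time it is read. In the second case the match failed, so $1 still holds the earlier capture.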

Update: Maybe the following will make things a little clearer:

    for (qw( video book )) {
       'unchanged' =~ /(.*)/;  # Set $1

       if ( $_ !~ /(video)/ ) {
          print "true -> $1\n";
       } else {
          print "false -> $1\n";
       }
    }

    false -> video
    true -> unchanged

When the match succeeds and the expression returns false, $1 is set.
When the match fails and the expression returns true, $1 remains unchanged.

Update: And finally, a solution to your problem

    foreach my $link (
       '<a href="/story/43480/">The Bottled Water Lie</a>',
       '<a href="/story/video/43480/">The Bottled Water Lie</a>',
    ) {
       my ($url, $title) = $link =~ m{href="(.+)">(.+)</a>}
          or next;

       $url =~ /video/
          and next;

       print("$url: $title\n");
    }

or

    foreach my $link (
       '<a href="/story/43480/">The Bottled Water Lie</a>',
       '<a href="/story/video/43480/">The Bottled Water Lie</a>',
    ) {
       my ($url, $title) = $link =~ m{href="((?:(?!video).)+)">(.+)</a>}
          or next;

       print("$url: $title\n");
    }

Replace the print with whatever you want.


Re^2: Extraneous behaviour of match variables
by explorer (Chaplain) on Nov 03, 2006 at 13:34 UTC

    Thanks, ikegami, for illumination.

    One more question: is ((?:(?!video).)+) better than ((?:(?<!video).)+)?

      Better might be to use an HTML parser (or something like HTML::LinkExtor) and simplify what you have to look at.

      (?:(?<!video).)+ is wrong.

          for my $re (
             qr/"(?:(?!video).)+"/,
             qr/"(?:(?<!video).)+"/,
             qr/"(?:.(?<!video))+"/,
          ) {
             print("$re\n");
             for (
                '"...video..."',
                '"...video"',
                '"video..."',
                '"video"',
             ) {
                print("$_: ", /$re/?1:0, "\n");
             }
          }

          (?-xism:"(?:(?!video).)+")
          "...video...": 0
          "...video": 0
          "video...": 0
          "video": 0
          (?-xism:"(?:(?<!video).)+")
          "...video...": 0
          "...video": 1 <----- XXX
          "video...": 0
          "video": 1 <----- XXX
          (?-xism:"(?:.(?<!video))+")
          "...video...": 0
          "...video": 0
          "video...": 0
          "video": 0

      (?:(?!video).)+ and (?:.(?<!video))+ should be equivalent. You can do benchmarks to be sure.
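
One way to run such a comparison is sketched below with the core Benchmark module. The sample strings, the disagreements helper, and the labels are assumptions chosen for illustration. The script first checks that the two patterns agree on the samples, then times them. The timings depend on the Perl version and the machine, so none are shown here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# The two patterns claimed to be equivalent in this context.
my $ahead  = qr/"(?:(?!video).)+"/;
my $behind = qr/"(?:.(?<!video))+"/;

# Illustrative inputs only; extend with strings from your own data.
my @samples = (
    '"...video..."', '"...video"', '"video..."', '"video"',
    '"/story/43480/"',
);

# Hypothetical helper: count the samples on which the patterns disagree.
sub disagreements {
    my $count = 0;
    for my $s (@samples) {
        my $x = $s =~ $ahead  ? 1 : 0;
        my $y = $s =~ $behind ? 1 : 0;
        $count++ if $x != $y;
    }
    return $count;
}

print "disagreements: ", disagreements(), "\n";

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    lookahead  => sub { my $n = grep { $_ =~ $ahead }  @samples; },
    lookbehind => sub { my $n = grep { $_ =~ $behind } @samples; },
} );
```

Agreement on a handful of samples is not a proof of equivalence, only a sanity check before comparing speed.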
