PerlMonks  

Regular expression quantifiers and the /gsmx modifiers

by davis (Vicar)
on Jun 16, 2012 at 21:22 UTC ( #976599=perlquestion )
davis has asked for the wisdom of the Perl Monks concerning the following question:

*blows dust off perlmonks.org account* I've been away a while, but thought my Perl-fu was up to this rather simple task.

I believe I'm suffering from a rather simple misunderstanding of the /x modifier, and of the {} quantifiers. It's also the first time I've used the named capture buffers, but I'm not sure that matters.

Here's a complete script which should produce a match, so what have I done wrong?

#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

my $vg_details = '
--- Physical volumes ---
PV Name /dev/dsk/c14t3d1
PV Name /dev/dsk/c15t3d1 Alternate Link
PV Status available
Total PE 15997
Free PE 0
Autoswitch On
Proactive Polling On
';

while($vg_details =~ m/^\s*PV\s+Name\s*(?<pv_name>\S+)\s*$
    (^\s*PV\s+Name\s+(?<alt_link>\S+)\s+Alternate\s+Link\s*$){0,20}  # skip them
    ^\s*PV\s+Status\s+(?<pv_status>\S+)\s*$
    ^\s*Total\s+PE\s+(?<total_pe>\S+)\s*$
    ^\s*Free\s+PE\s+(?<free_pe>\d+)\s*$
    ^\s*Autoswitch\s+(?<autoswitch>\S+)\s*$
    ^\s*Proactive\s+Polling\s+(?<proactive_polling>\S+)\s*$/gsmx) {
    my $pv_name = $+{pv_name};
    print "matched $pv_name";
}

The example data is slightly contrived, in that the "Alternate Link" lines are optional (and there may be many). Removing the additional 6 lines below the "PV Name" line in the regex makes it work, so have I completely confused the multi-line comment switch?

Also, I know this is a slightly ludicrous method to process VG data, but I've been handed a big, big list of "vgdisplay -v" output, and this particular edge case is failing. I've reduced it to this minimal example and my eyes still can't spot what's wrong. What silly mistake am I making?


davis

Replies are listed 'Best First'.
Re: Regular expression quantifiers and the /gsmx modifiers
by morgon (Curate) on Jun 16, 2012 at 22:03 UTC
    I think the problem is that $ matches BEFORE a newline and ^ matches AFTER a newline (that is what /m does; /s only affects the dot), but neither of them consumes the newline itself, so you have to match it explicitly.

    Consider:

    use strict;

    my $s = <<"__end__";
    hubba
    bubba
    __end__

    print "matched 1\n" if $s =~ m/^hubba$ ^bubba$/smx;
    print "matched 2\n" if $s =~ m/^hubba$ \n ^bubba$/smx;

    Here only the second regex matches: in the first, after the $ matches there is still a newline left, so the ^ cannot match. You need to consume that newline in the regex, as the second example shows.
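
To make the zero-width behaviour of $ visible, here is a minimal sketch that inspects where the match ends. The variable names ($s, $end) are only illustrative and not from the thread; $+[0] is Perl's built-in offset of the end of the last successful match.

```perl
use strict;
use warnings;

my $s = "hubba\nbubba\n";

# Under /m, $ is a zero-width assertion: it matches just before the
# newline and does not consume it.
$s =~ m/^hubba$/m or die "no match";
my $end = $+[0];    # offset where the match ended
printf "match ended at offset %d\n", $end;
printf "next character is a newline: %s\n",
    substr($s, $end, 1) eq "\n" ? "yes" : "no";

# Consuming the newline explicitly lets the next ^ match at the start
# of the second line.
my $two_lines = $s =~ m/^hubba$ \n ^bubba$/mx ? 1 : 0;
print "two-line match ok\n" if $two_lines;
```

The match stops right in front of the newline, which is why a following ^ has nothing to anchor to until that newline has been matched.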

      Damn. I completely missed that.

      #!/usr/bin/perl
      use warnings;
      use strict;
      use Data::Dumper;

      my $vg_details = '
      --- Physical volumes ---
      PV Name /dev/dsk/c14t3d1
      PV Name /dev/dsk/c15t3d1 Alternate Link
      PV Status available
      Total PE 15997
      Free PE 0
      Autoswitch On
      Proactive Polling On
      ';

      while($vg_details =~ m/^\s*PV\s+Name\s*(?<pv_name>\S+)\s*$ \n
          (^\s*PV\s+Name\s+(?<alt_link>\S+)\s+Alternate\s+Link\s*$ \n){0,20}  # skip them
          ^\s*PV\s+Status\s+(?<pv_status>\S+)\s*$ \n
          ^\s*Total\s+PE\s+(?<total_pe>\S+)\s*$ \n
          ^\s*Free\s+PE\s+(?<free_pe>\d+)\s*$ \n
          ^\s*Autoswitch\s+(?<autoswitch>\S+)\s*$ \n
          ^\s*Proactive\s+Polling\s+(?<proactive_polling>\S+)\s*$ \n/gsmx) {
          my $pv_name = $+{pv_name};
          print "matched $pv_name";
      }

      Seems to DWIM. I genuinely cannot believe I've been gawping at that for so long. My thanks.


      davis
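
One thing to watch when there really are many "Alternate Link" lines: a capture group inside a quantified group keeps only the text from its last iteration, so $+{alt_link} would hold just the final link. A minimal sketch of one way around that, assuming you want every link: capture the whole run of alternate lines as one block, then pull the device names out with a second match. The sample data, the alt_block capture name, and the %pv and @alt_links variables are made up for illustration.

```perl
use strict;
use warnings;

my $details = "PV Name /dev/dsk/c14t3d1\n"
            . "PV Name /dev/dsk/c15t3d1 Alternate Link\n"
            . "PV Name /dev/dsk/c16t3d1 Alternate Link\n"
            . "PV Status available\n";

my (%pv, @alt_links);

if ($details =~ m/^PV\s+Name\s+(?<pv_name>\S+)\s*$ \n
                  (?<alt_block>
                      (?: ^PV\s+Name\s+\S+\s+Alternate\s+Link\s*$ \n )*
                  )
                  ^PV\s+Status\s+(?<pv_status>\S+)\s*$/mx) {
    # Copy the named captures first: the next successful match
    # would replace the contents of %+.
    %pv = %+;

    # Second pass over the captured block, one device name per line.
    @alt_links = $pv{alt_block} =~ m/^PV\s+Name\s+(\S+)/mg;

    print "pv_name=$pv{pv_name}\n";
    print "alt_links=@alt_links\n";
    print "pv_status=$pv{pv_status}\n";
}
```

Copying %+ into an ordinary hash before the second match matters, because the capture variables always refer to the most recent successful match.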

Re: Regular expression quantifiers and the /gsmx modifiers
by Kenosis (Priest) on Jun 16, 2012 at 22:33 UTC

    I removed the ^ and $ notations in your regex, and your matching worked perfectly:

    #!/usr/bin/perl
    use Modern::Perl;

    my @matched = qw {pv_name alt_link pv_status total_pe free_pe autoswitch proactive_polling};

    my $vg_details = '
    --- Physical volumes ---
    PV Name /dev/dsk/c14t3d1
    PV Name /dev/dsk/c15t3d1 Alternate Link
    PV Status available
    Total PE 15997
    Free PE 0
    Autoswitch On
    Proactive Polling On
    ';

    $vg_details =~ m/^\s*PV\s+Name\s*(?<pv_name>\S+)\s*
        (\s*PV\s+Name\s+(?<alt_link>\S+)\s+Alternate\s+Link\s*)  # skip them
        \s*PV\s+Status\s+(?<pv_status>\S+)\s*
        \s*Total\s+PE\s+(?<total_pe>\S+)\s*
        \s*Free\s+PE\s+(?<free_pe>\d+)\s*
        \s*Autoswitch\s+(?<autoswitch>\S+)\s*
        \s*Proactive\s+Polling\s+(?<proactive_polling>\S+)\s*/gsmx;

    say $+{$_} for @matched;

    Results:

    /dev/dsk/c14t3d1
    /dev/dsk/c15t3d1
    available
    15997
    0
    On
    On

      I removed the ^ and $ notations in your regex, and your matching worked perfectly...

      Of course, the reason is that a newline (\n) is a member of the \s character class, and there is an abundance of \s* in the modified regex.
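
A small sketch of that point, using a two-line sample string that is made up for illustration (the capture names are borrowed from the thread). With no ^, $ or explicit \n, the \s* after the first capture is what swallows the line break.

```perl
use strict;
use warnings;

# A newline counts as whitespace as far as \s is concerned.
my $newline_is_space = "\n" =~ /\s/ ? 1 : 0;
print "newline matches \\s\n" if $newline_is_space;

my $text = "Total PE 15997\nFree PE 0\n";

my ($total_pe, $free_pe);

# No anchors and no explicit \n: the \s* after the first capture
# consumes the line break between the two lines.
if ($text =~ m/Total\s+PE\s+(?<total_pe>\S+)\s*
               Free\s+PE\s+(?<free_pe>\d+)/x) {
    ($total_pe, $free_pe) = ($+{total_pe}, $+{free_pe});
    print "total_pe=$total_pe free_pe=$free_pe\n";
}
```

The trade-off is that \s+ and \s* can also run across line breaks where you did not intend them to, so the anchored version with explicit \n is the stricter of the two.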

Node Type: perlquestion [id://976599]
Approved by Corion