
Here's my take on a solution:

#!/usr/bin/env perl

use strict;
use warnings;

my $re = qr{ set \s zone \s (?> id \s \d+ \s | ) \" ( [^"]+ ) }x;
my $out_format = "Config line=> %s;    Value=> %s;    zone=> %s\n";

while (<DATA>) {
    next unless /$re/;
    chomp;
    printf $out_format => $., $_, $1;
}

__DATA__
set zone "VLAN" vrouter "trust-vr"
set zone id 100 "Internet_Only"

Output:

$ pm_pref_quote_regex.pl
Config line=> 1;    Value=> set zone "VLAN" vrouter "trust-vr";    zone=> VLAN
Config line=> 2;    Value=> set zone id 100 "Internet_Only";    zone=> Internet_Only

Note that I've used the (?> ... ) construct (an atomic group, which the regex engine will not backtrack into once it has matched). It is documented in perlre under Extended Patterns. Using this construct for alternations is a Perl Best Practices recommendation (which may, or may not, be important to you).
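
To illustrate what the atomic group changes, here is a minimal sketch. The test string and the helper name try_match are made up for the demonstration; they are not taken from your config. It compares an ordinary non-capturing group with an atomic one on the same input:

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical helper: report whether a pattern matches a string.
sub try_match {
    my ($str, $re) = @_;
    return $str =~ $re ? 'match' : 'no match';
}

my $str = 'aaab';

# Ordinary group: a+ first takes 'aaa', then gives one 'a' back
# so that the trailing 'ab' can match.
print "(?: ... ): ", try_match($str, qr{(?:a+)ab}), "\n";

# Atomic group: a+ keeps whatever it took, so 'ab' can never match.
print "(?> ... ): ", try_match($str, qr{(?>a+)ab}), "\n";
```

In the regex I posted, the same rule means that once "id \s \d+ \s" has matched, the engine will not go back and retry the empty alternative.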

I also added some lines to test for skipped (i.e. not matched) input and arbitrary surrounding text:

__DATA__
set zone "VLAN" vrouter "trust-vr"
set zone id 100 "Internet_Only"
blah
blah blah set zone "extra" whatever
blah blah blah "set zone id 12345 "extra2_a" something "extra2_b"

These tests were successful:

$ pm_pref_quote_regex.pl
Config line=> 1;    Value=> set zone "VLAN" vrouter "trust-vr";    zone=> VLAN
Config line=> 2;    Value=> set zone id 100 "Internet_Only";    zone=> Internet_Only
Config line=> 4;    Value=> blah blah set zone "extra" whatever;    zone=> extra
Config line=> 5;    Value=> blah blah blah "set zone id 12345 "extra2_a" something "extra2_b";    zone=> extra2_a
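
If you'd rather not check that output by eye, the same cases could be written as assertions. This is only a sketch: it uses the core Test::More module, and the helper name extract_zone is my own invention for illustration.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

use Test::More;

# Same regex as in the script above.
my $re = qr{ set \s zone \s (?> id \s \d+ \s | ) \" ( [^"]+ ) }x;

# Hypothetical helper: return the zone name, or undef if the line is skipped.
sub extract_zone {
    my ($line) = @_;
    return $line =~ $re ? $1 : undef;
}

# Each entry: [ input line, expected zone (undef means "no match") ]
my @cases = (
    [ 'set zone "VLAN" vrouter "trust-vr"',  'VLAN' ],
    [ 'set zone id 100 "Internet_Only"',     'Internet_Only' ],
    [ 'blah',                                undef ],
    [ 'blah blah set zone "extra" whatever', 'extra' ],
    [ 'blah blah blah "set zone id 12345 "extra2_a" something "extra2_b"',
      'extra2_a' ],
);

for my $case (@cases) {
    my ($line, $expected) = @$case;
    is( extract_zone($line), $expected, "zone for: $line" );
}

done_testing;
```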

You may be interested in Regexp::Debugger. This tool provides a visualisation of your regex in action. It is very easy to use: just add use Regexp::Debugger; near the start of your code and run your script.

Another tool is YAPE::Regex::Explain. However, do be aware of its limitations: "There is no support for regular expression syntax added after Perl version 5.6, ...". Using this on your supplied regex produces the following (somewhat lengthy) output:

$ perl -MYAPE::Regex::Explain -e '
my $re = q{^set\szone\s("([^"]*)"|id\s\d+\s"([^"]*)")};
print YAPE::Regex::Explain->new($re)->explain;
'
The regular expression:

(?-imsx:^set\szone\s("([^"]*)"|id\s\d+\s"([^"]*)"))

matches as follows:
  
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  set                      'set'
----------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
  zone                     'zone'
----------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    "                        '"'
----------------------------------------------------------------------
    (                        group and capture to \2:
----------------------------------------------------------------------
      [^"]*                    any character except: '"' (0 or more
                               times (matching the most amount
                               possible))
----------------------------------------------------------------------
    )                        end of \2
----------------------------------------------------------------------
    "                        '"'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    id                       'id'
----------------------------------------------------------------------
    \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
----------------------------------------------------------------------
    \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
    "                        '"'
----------------------------------------------------------------------
    (                        group and capture to \3:
----------------------------------------------------------------------
      [^"]*                    any character except: '"' (0 or more
                               times (matching the most amount
                               possible))
----------------------------------------------------------------------
    )                        end of \3
----------------------------------------------------------------------
    "                        '"'
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
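
The explanation shows the zone name being captured to \2 in one branch and to \3 in the other, so any code using that pattern has to check both. A minimal sketch of what that looks like (the helper name zone_from_either_branch is hypothetical, chosen just for this illustration):

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Pattern exactly as passed to YAPE::Regex::Explain above.
my $re = qr{^set\szone\s("([^"]*)"|id\s\d+\s"([^"]*)")};

# Hypothetical helper: return the zone name from whichever branch matched,
# or undef if the line does not match at all.
sub zone_from_either_branch {
    my ($line) = @_;
    return unless $line =~ $re;
    # The first branch fills $2; the second branch fills $3.
    return defined $2 ? $2 : $3;
}

for my $line ('set zone "VLAN" vrouter "trust-vr"',
              'set zone id 100 "Internet_Only"') {
    my $zone = zone_from_either_branch($line);
    printf "%-40s => %s\n", $line, defined $zone ? $zone : '(no match)';
}
```

With the single capture group in my version, the zone name is always in $1, so no such check is needed.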

-- Ken

In reply to Re: Problem with alternating regex? by kcott
in thread Problem with alternating regex? by dwlepage

	PerlMonks