G'day dwlepage,
Here's my take on a solution:
#!/usr/bin/env perl
use strict;
use warnings;
my $re = qr{ set \s zone \s (?> id \s \d+ \s | ) \" ( [^"]+ ) }x;
my $out_format = "Config line=> %s; Value=> %s; zone=> %s\n";
while (<DATA>) {
next unless /$re/;
chomp;
printf $out_format => $., $_, $1;
}
__DATA__
set zone "VLAN" vrouter "trust-vr"
set zone id 100 "Internet_Only"
Output:
$ pm_pref_quote_regex.pl
Config line=> 1; Value=> set zone "VLAN" vrouter "trust-vr"; zone=> VLAN
Config line=> 2; Value=> set zone id 100 "Internet_Only"; zone=> Internet_Only
Note that I've used the (?> ... ) construct (an atomic, i.e. non-backtracking, group), which is documented in perlre under Extended Patterns. Using this construct for alternations is a Perl Best Practices recommendation (which may, or may not, be important to you).
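To illustrate what the atomic group does, here is a minimal sketch. The first pair of patterns is a generic example of my own (not from your config), and zone_of is a hypothetical helper added purely for the comparison. As far as I can tell, for the regex in this post the atomic group only saves a pointless retry: the two alternatives cannot both lead to a match at the same position, so the captured values are the same either way.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Generic case: an atomic group commits to the first alternative that matches.
my $plain  = qr{ (?: a | ab ) c }x;    # may backtrack and retry 'ab'
my $atomic = qr{ (?> a | ab ) c }x;    # keeps 'a' and never retries 'ab'

printf "plain:  %s\n", 'abc' =~ $plain  ? 'match' : 'no match';    # match
printf "atomic: %s\n", 'abc' =~ $atomic ? 'match' : 'no match';    # no match

# The regex from this post, with and without the atomic group.
my $re_atomic = qr{ set \s zone \s (?> id \s \d+ \s | ) " ( [^"]+ ) }x;
my $re_plain  = qr{ set \s zone \s (?: id \s \d+ \s | ) " ( [^"]+ ) }x;

# Hypothetical helper: return the captured zone name, or undef on no match.
sub zone_of {
    my ($re, $line) = @_;
    return $line =~ $re ? $1 : undef;
}

for my $line (
    'set zone "VLAN" vrouter "trust-vr"',
    'set zone id 100 "Internet_Only"',
    'set zone id 100 Internet_Only',       # no opening quote: neither regex matches
) {
    my ($with, $without) =
        map { zone_of($_, $line) // '(no match)' } $re_atomic, $re_plain;
    print "$line\n  atomic: $with\n  plain:  $without\n";
}
```

The generic pair also shows the caveat: inside (?> ... ) the order of the alternatives matters, because the engine will not come back to try a later one.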
I also added some lines to test for skipped (i.e. not matched) input and arbitrary surrounding text:
__DATA__
set zone "VLAN" vrouter "trust-vr"
set zone id 100 "Internet_Only"
blah
blah blah set zone "extra" whatever
blah blah blah "set zone id 12345 "extra2_a" something "extra2_b"
These tests were successful:
$ pm_pref_quote_regex.pl
Config line=> 1; Value=> set zone "VLAN" vrouter "trust-vr"; zone=> VLAN
Config line=> 2; Value=> set zone id 100 "Internet_Only"; zone=> Internet_Only
Config line=> 4; Value=> blah blah set zone "extra" whatever; zone=> extra
Config line=> 5; Value=> blah blah blah "set zone id 12345 "extra2_a" something "extra2_b"; zone=> extra2_a
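If you'd rather have these checks repeatable than eyeball the output, the same five lines could be turned into a small test script. This is only a sketch using the core Test::More module; zone_name is a hypothetical helper name, and the expected values are simply the ones shown in the output above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Test::More;

my $re = qr{ set \s zone \s (?> id \s \d+ \s | ) " ( [^"]+ ) }x;

# Hypothetical helper: return the zone name, or undef when the line is skipped.
sub zone_name {
    my ($line) = @_;
    my ($zone) = $line =~ $re;    # list context: $1 on a match, empty list otherwise
    return $zone;
}

# Each case is [ input line, expected zone ]; undef means "should be skipped".
my @cases = (
    [ 'set zone "VLAN" vrouter "trust-vr"'  => 'VLAN' ],
    [ 'set zone id 100 "Internet_Only"'     => 'Internet_Only' ],
    [ 'blah'                                => undef ],
    [ 'blah blah set zone "extra" whatever' => 'extra' ],
    [ 'blah blah blah "set zone id 12345 "extra2_a" something "extra2_b"' => 'extra2_a' ],
);

for my $case (@cases) {
    my ($line, $expected) = @$case;
    is zone_name($line), $expected, "zone for: $line";
}

done_testing;
```

Adding another config line to check is then just one more entry in @cases.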
You may be interested in Regexp::Debugger. This tool provides a visualisation of your regex in action. It is very easy to use: just add use Regexp::Debugger; near the start of your code and run your script.
Another tool is YAPE::Regex::Explain. However, do be aware of its limitations: "There is no support for regular expression syntax added after Perl version 5.6, ...". Using this on your supplied regex produces the following (somewhat lengthy) output:
$ perl -MYAPE::Regex::Explain -e '
my $re = q{^set\szone\s("([^"]*)"|id\s\d+\s"([^"]*)")};
print YAPE::Regex::Explain->new($re)->explain;
'
The regular expression:
(?-imsx:^set\szone\s("([^"]*)"|id\s\d+\s"([^"]*)"))
matches as follows:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  set                      'set'
----------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
  zone                     'zone'
----------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    "                        '"'
----------------------------------------------------------------------
    (                        group and capture to \2:
----------------------------------------------------------------------
      [^"]*                    any character except: '"' (0 or more
                               times (matching the most amount
                               possible))
----------------------------------------------------------------------
    )                        end of \2
----------------------------------------------------------------------
    "                        '"'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    id                       'id'
----------------------------------------------------------------------
    \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
----------------------------------------------------------------------
    \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
    "                        '"'
----------------------------------------------------------------------
    (                        group and capture to \3:
----------------------------------------------------------------------
      [^"]*                    any character except: '"' (0 or more
                               times (matching the most amount
                               possible))
----------------------------------------------------------------------
    )                        end of \3
----------------------------------------------------------------------
    "                        '"'
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
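One practical consequence of that breakdown: with your original regex the zone name lands in \2 for the first branch and in \3 for the second, so the calling code has to check both. A minimal sketch of that, assuming Perl 5.10 or later for the // operator (orig_zone is a hypothetical helper name, not something from your code):

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# The regex from the original question, unchanged.
my $orig_re = qr{^set\szone\s("([^"]*)"|id\s\d+\s"([^"]*)")};

# Hypothetical helper: return the zone name, or undef if the line does not match.
sub orig_zone {
    my ($line) = @_;
    return unless $line =~ $orig_re;
    return $2 // $3;    # $2 is set by the first branch, $3 by the second
}

for my $line (
    'set zone "VLAN" vrouter "trust-vr"',
    'set zone id 100 "Internet_Only"',
) {
    printf "%s => %s\n", $line, orig_zone($line) // '(no match)';
}
```

The regex in my solution avoids this by having a single capture group after the alternation, so the zone name is always in $1.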