SquireJames has asked for the wisdom of the Perl Monks concerning the following question:
Hail all ye Monkley monks,
I'm sure that this is a pretty straightforward question, but I've really only just started to get my feet wet with regexps and I'm fresh out of ideas.
I'm writing a script to release a number of emails from Quarantine. Essentially, the user is provided with a list of messages to release. I am trying to allow the user to enter a range of options, for example: 1,4-6,a. "1" would release the first message, "4-6" would release messages 4 through 6, and "a" would release all messages. I am trying to write a regular expression that will evaluate a string of any length (i.e. with an unlimited number of message numbers) and ensure that it is in a valid format. Currently I am using the following piece of code, which only checks whether the first piece of data entered matches any of the expressions given:
use strict;
my $releasing;
while ($releasing !~ /[0-9][0-9]?,?|[aA],?|[0-9][0-9]?\-[0-9][0-9]?,?/) {
    print ("\n\nPlease type the numbers of the messages that are to be released (n-n and n \nare allowable, A for all): ");
    $releasing = <STDIN>;
    chomp $releasing;
}
I know that I could just accept the data, split it by comma and then evaluate each section, but I'd like to be able to continue prompting for the message selection then and there if the data entered is incorrect. For the record, I have tried using /^[0-9][0-9]?,?$|^[aA],?$|^[0-9][0-9]?\-[0-9][0-9]?,?$/ but that didn't help much either (only allowing one item entry). I have a feeling that I should be using /g and \G, but I can't really find a decent resource on the 'net to give me a good start at this.
Thanks
Re: Regular Expression Question
by Anonymous Monk on Dec 09, 2003 at 07:12 UTC
/^(?:(?:\d+|\d+-\d+|a)(,\s*|$))+$/
That's excellent, thanks muchly.
Now all I have to do is work out what it's doing....
use YAPE::Regex::Explain;
print YAPE::Regex::Explain->new(qr/^(?:(?:\d+|\d+-\d+|a)(,\s*|$))+$/)->explain;
outputs:
The regular expression:
(?-imsx:^(?:(?:\d+|\d+-\d+|a)(,\s*|$))+$)
matches as follows:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  (?:                      group, but do not capture (1 or more times
                           (matching the most amount possible)):
----------------------------------------------------------------------
    (?:                      group, but do not capture:
----------------------------------------------------------------------
      \d+                      digits (0-9) (1 or more times
                               (matching the most amount possible))
----------------------------------------------------------------------
     |                         OR
----------------------------------------------------------------------
      \d+                      digits (0-9) (1 or more times
                               (matching the most amount possible))
----------------------------------------------------------------------
      -                        '-'
----------------------------------------------------------------------
      \d+                      digits (0-9) (1 or more times
                               (matching the most amount possible))
----------------------------------------------------------------------
     |                         OR
----------------------------------------------------------------------
      a                        'a'
----------------------------------------------------------------------
    )                        end of grouping
----------------------------------------------------------------------
    (                        group and capture to \1:
----------------------------------------------------------------------
      ,                        ','
----------------------------------------------------------------------
      \s*                      whitespace (\n, \r, \t, \f, and " ")
                               (0 or more times (matching the most
                               amount possible))
----------------------------------------------------------------------
     |                         OR
----------------------------------------------------------------------
      $                        before an optional \n, and the end of
                               the string
----------------------------------------------------------------------
    )                        end of \1
----------------------------------------------------------------------
  )+                       end of grouping
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
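A quick way to see what the pattern explained above accepts is to run it over a few sample strings. The sketch below is only illustrative: the helper name is_valid_selection and the sample inputs are made up for this check and are not part of the original posts. It brings out two edge cases: a trailing comma is accepted (the (,\s*|$) group can consume the final comma, after which the closing $ matches at the end of the string), and an upper-case "A" is rejected because the pattern has no /i modifier.

```perl
use strict;
use warnings;

# Pattern copied from the reply above; the helper name is hypothetical.
my $selection = qr/^(?:(?:\d+|\d+-\d+|a)(,\s*|$))+$/;

# Returns 1 if the whole input matches the pattern, 0 otherwise.
sub is_valid_selection {
    my ($input) = @_;
    return $input =~ $selection ? 1 : 0;
}

# Sample inputs chosen for illustration only.
for my $input ('1,4-6,a', '1, 4-6', '1,4-6,', '1,4-6,,a', 'A') {
    printf "%-12s %s\n", "'$input'",
        is_valid_selection($input) ? 'matches' : 'does not match';
}
```

If a trailing comma should be rejected, a pattern built as "item, then any number of comma-plus-item" avoids the problem, because the comma is only ever matched together with the item that follows it.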
Re: Regular Expression Question
by Abigail-II (Bishop) on Dec 09, 2003 at 10:27 UTC
#!/usr/bin/perl
use strict;
use warnings;
my $item = qr /(?:[Aa]|\d+(?:-\d+)?)/;
my $list = qr /^$item(?:,$item)*$/;
while (<DATA>) {
    chomp;
    print "$_: ", /$list/ ? "matches\n" : "does not match\n";
}
__DATA__
1
1-3
a
1,4-6,a
1,4-6,,a
1,4-6,
,1,4-6,a
1,4-6,b
1,4-6-8,a
1,4-a,2
1,4-6,4-6,1234-12,a

1: matches
1-3: matches
a: matches
1,4-6,a: matches
1,4-6,,a: does not match
1,4-6,: does not match
,1,4-6,a: does not match
1,4-6,b: does not match
1,4-6-8,a: does not match
1,4-a,2: does not match
1,4-6,4-6,1234-12,a: matches
Abigail
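To connect this back to the original question (keep prompting until the whole line is valid), the two patterns can be dropped into a read-validate-retry loop. This is a minimal sketch, not anyone's posted solution: the helper name prompt_for_selection, the prompt text, and the decision to strip whitespace before matching are all assumptions. The helper takes its input and output handles as arguments so that it can be exercised with in-memory handles, as the demo at the bottom does.

```perl
use strict;
use warnings;

# Patterns taken from the post above.
my $item = qr/(?:[Aa]|\d+(?:-\d+)?)/;
my $list = qr/^$item(?:,$item)*$/;

# Hypothetical helper: keep reading lines from $in until one is a valid
# selection. Returns the selection, or undef if input runs out first.
sub prompt_for_selection {
    my ($in, $out) = @_;
    while (1) {
        print {$out} "Messages to release (n, n-n, or A for all): ";
        my $line = <$in>;
        return undef unless defined $line;   # end of input
        chomp $line;
        $line =~ s/\s+//g;                   # assumption: whitespace is not significant
        return $line if $line =~ $list;
        print {$out} "'$line' is not a valid selection, please try again.\n";
    }
}

# Demo with in-memory handles: the first line is rejected, the second accepted.
open my $in, '<', \"1,4-6,,a\n1,4-6,a\n" or die "cannot open in-memory input: $!";
open my $out, '>', \my $transcript or die "cannot open in-memory output: $!";
my $releasing = prompt_for_selection($in, $out);
print "accepted: $releasing\n";
```

In an interactive script the call would be prompt_for_selection(\*STDIN, \*STDOUT) instead of the in-memory handles.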
Re: Regular Expression Question
by Aragorn (Curate) on Dec 09, 2003 at 10:17 UTC
I think the following comes pretty close to what you want (even more, multiple messages/ranges on a line and continued prompting):
#!/usr/bin/perl
use strict;
use warnings;
my $done = 0;
while (not $done) {
    print "\n\nPlease type the numbers of the messages that are to be released (n-n and n \nare allowable, A for all): ";
    my $input = <STDIN>;
    last unless defined $input;    # Stop at end of input
    chomp($input);
    foreach my $range (split(/,/, $input)) {
        $range =~ s/\s+//g;    # Remove whitespace
        if ($range =~ /^[Aa]$/) {
            # All messages, assume we're done
            print "All messages!\n";
            # ...
            $done = 1;
        }
        elsif ($range =~ /^(\d+)    # Single message number
                          (?:       # Can be followed by a dash with a
                            -(\d+)  # second number (which denotes the end
                          )?        # of the range
                          $/x) {
            if (defined $2) {
                print "Message range: #$1 to #$2\n";
                # ...
            }
            else {
                print "Single message: #$1\n";
                # ...
            }
        }
        elsif ($range =~ /^[Qq]$/) {
            # Give the user the option to quit.
            $done = 1;
        }
        else {
            # Wrong input
            print "Wrong input!\n";
            # ...
        }
    }
}
Arjen
Re: Regular Expression Question
by delirium (Chaplain) on Dec 09, 2003 at 10:51 UTC
If you know the number of messages available ahead of time, you could pass that along with the range string to this function. It will return undef if the string is invalid, otherwise it will return a reference to an array containing each message number in question based on a given range.
sub range_populate {
    my ($max, $range) = @_;
    my @range = ();
    $range =~ s/\s+//g;    # ignore whitespace around numbers and commas
    if (lc $range eq 'a') { return [0..$max]; }
    elsif ($range !~ /[-,]/) { push @range, $range; }
    else {
        my @mini = split /,/, $range;
        for (@mini) {
            return undef if /-.*?-/;    # 1 dash per subsection, e.g., "2-6-9, 24" is invalid
            if ( !/-/ ) { push @range, $_; }
            else {
                return undef unless /^(\d+)-(\d+)$/;
                return undef unless $1 < $2;
                push @range, ($1..$2);
            }
        }
    }
    return undef if $#range == -1;
    # Require plain digits: a non-numeric entry such as "b" would otherwise
    # be treated as 0 in the numeric comparison and slip through.
    for (@range) { return undef unless (/^\d+$/ && $_ <= $max); }
    return \@range;
}
# example
my $num_messages = 10;
my $sample_range = '3-6,9';
my $rng_ref = range_populate($num_messages, $sample_range);    # returns reference to array (3, 4, 5, 6, 9)
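As a variation on the same idea, a selection can be expanded into a sorted list with duplicates collapsed, so that an input like "4-6,4-6" does not release a message twice. This is a sketch under assumptions the thread does not state: messages are numbered from 1 to $max, "a" or "A" means all of them, and a reversed range such as 6-4 is treated as invalid. The name expand_selection is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a selection such as "1,4-6,a" into a sorted
# list of unique message numbers. Returns an array reference, or undef if
# a part is malformed, falls outside 1..$max, or is a reversed range.
sub expand_selection {
    my ($max, $selection) = @_;
    my %seen;
    for my $part (split /,/, $selection) {
        my ($from, $to);
        if (lc $part eq 'a') {
            ($from, $to) = (1, $max);
        }
        elsif ($part =~ /^(\d+)(?:-(\d+))?$/) {
            # "+ 0" forces numeric values so the range below is numeric
            $from = $1 + 0;
            $to   = defined $2 ? $2 + 0 : $from;
        }
        else {
            return undef;
        }
        return undef if $from < 1 || $to > $max || $from > $to;
        $seen{$_} = 1 for $from .. $to;
    }
    return [ sort { $a <=> $b } keys %seen ];
}

my $numbers = expand_selection(10, '3-6,9,4-5');
print "@$numbers\n";
```

Checking the bounds before expanding the range means an input like "1234-12" is rejected without building a large list first.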