smls has asked for the wisdom of the Perl Monks concerning the following question:
Consider the example of having as input a list of numbers, some of which are not known exactly but only known to be inside a given range (that is encoded in a string value).
The following code snippet adds up the given numbers, returning an upper and lower bound for the result:
#!/usr/bin/perl
use strict; use warnings;
my @ranges = ('15', '28-31', '3-4', '40', '17-19');
my ($total_min, $total_max);
foreach my $range (@ranges) {
my ($min) = $range =~ /^(\d+)/;
my ($max) = $range =~ /(\d+)$/;
$total_min += $min;
$total_max += $max;
}
print "total is between $total_min and $total_max\n";
It works fine, but regarding the regex part, the need to
- separate it into multiple statements
- introduce temporary variables (here $min and $max)
always seriously bothers me in cases like this.
I would much prefer to be able to write the whole body of the loop in the above example in the form:
$total_min += SELF_CONTAINED_FUNCTIONAL_STATEMENT;
$total_max += SELF_CONTAINED_FUNCTIONAL_STATEMENT;
Is there an alternative syntax for string extraction using regexes that would allow this?
It's not about performance or such. It's about my brain receiving a nice dose of dopamine whenever I write a line of concise, functional, self-contained code - and the opposite when I can't.
PS: Even better would be the following, but unfortunately it seems that Perl's += operator does not work that way:
($total_min, $total_max) += STATEMENT_RETURNING_A_LIST_OF_TWO_NUMS;
Re: Is there a more functional regex syntax?
by BrowserUk (Patriarch) on Sep 18, 2012 at 16:26 UTC
use List::Util qw[ sum ];
my @ranges = ('15', '28-31', '3-4', '40', '17-19');
my $tMin = sum map{ /^(\d+)/ } @ranges;
my $tMax = sum map{ /(\d+)$/ } @ranges;
print "$tMin : $tMax";;
103 : 109
I think this is the most readable solution for this specific problem posted so far, thanks for posting it.
I have always shied away from List::Util and friends so far, but I'm starting to see its beauty...
Re: Is there a more functional regex syntax?
by tobyink (Canon) on Sep 18, 2012 at 15:54 UTC
#!/usr/bin/perl
use strict; use warnings;
my @ranges = ('15', '28-31', '3-4', '40', '17-19');
my @totals = (0, 0);
foreach my $range (@ranges) {
$totals[$_] += (split /-/, $range)[$_] for 0, -1;
}
print "total is between $totals[0] and $totals[-1]\n";
... it's arguably less readable though.
Nice trick using the -1 index like that, I'll have to remember that...
Regarding the original question though, I was hoping for a solution that keeps the regexes, because even if replacing them with split is possible in this case, that won't always be feasible if dealing with more complex regexes.
$totals[$_] += ($range =~ /(\d+)/g)[$_] for 0, -1;
Re: Is there a more functional regex syntax?
by kennethk (Abbot) on Sep 18, 2012 at 15:54 UTC
What about something more like:
$range =~ /^(\d+)(?:-(\d+))?$/;
$total_min += $1;
$total_max += $2//$1;
I personally think holding onto $min and $max reads easier, but I can understand the desire for conciseness.
Or, as an improvement in the regex, /^(?=(\d+))(?:\d+-)?(\d+)$/, so that both $1 and $2 are always defined. So, as 1 line,
$totals[$_] += ($range =~ /^(?=(\d+))(?:\d+-)?(\d+)$/)[$_] for 0,1;
Or more simply, if you want to keep the variables separate,
$total_min += ($range =~ /^(\d+)/)[0];
$total_max += ($range =~ /(\d+)$/)[0];
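To see why the lookahead form always defines both captures, here is a minimal sketch. The helper name `bounds` is made up for illustration and is not part of the thread's code; a lone number is assumed to mean min = max.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (min, max) for a range string,
# or an empty list if the string does not look like a range.
sub bounds {
    my ($range) = @_;
    # The lookahead captures the leading number into $1 without consuming it,
    # so the final (\d+) can match the same digits when there is no "-max" part.
    return $range =~ /^(?=(\d+))(?:\d+-)?(\d+)$/;
}

for my $range ('15', '28-31') {
    print "$range => (", join(', ', bounds($range)), ")\n";
}
```

For '15' both captures hold 15; for '28-31' they hold 28 and 31.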
Thanks, this is what I've been looking for!
It's not extremely pretty, but it should scale nicely for the general case of needing the result of a regex string extraction as a right-hand-side value.
PS: The regex suggested by tobyink above can make the single-line case much more readable: /(\d+)/g
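Put together, the list-slice idiom with the /(\d+)/g regex could look like the sketch below. It assumes every element contains at least one number; an element with no digits would yield undef and a warning.

```perl
use strict;
use warnings;

my @ranges = ('15', '28-31', '3-4', '40', '17-19');
my ($total_min, $total_max) = (0, 0);

for my $range (@ranges) {
    # In list context, m//g returns every captured number; slicing that list
    # with [0] and [-1] picks the first and the last one.
    $total_min += ($range =~ /(\d+)/g)[0];
    $total_max += ($range =~ /(\d+)/g)[-1];
}

print "total is between $total_min and $total_max\n";
```

The parentheses around the match are what force list context, so the same pattern should work for any regex whose captures are needed as a right-hand-side value.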
Re: Is there a more functional regex syntax?
by Arunbear (Prior) on Sep 18, 2012 at 16:19 UTC
A functional approach (more for fun though):
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(reduce);
use Data::Dump 'pp';
my @ranges = ('15', '28-31', '3-4', '40', '17-19');
my ($total_min, $total_max) = @{
reduce { [$a->[0] + $b->[0], $a->[1] + $b->[1]] }
map { @$_ == 1 and [($_->[0]) x 2] or $_ }
map { [ split /-/ ] } @ranges
};
pp ($total_min, $total_max);
__DATA__
('15', '28-31', '3-4', '40', '17-19')
|
v
([15], [28, 31], [3, 4], [40], [17, 19])
|
v
([15, 15], [28, 31], [3, 4], [40, 40], [17, 19])
|
v
[103, 109]
Re: Is there a more functional regex syntax?
by AnomalousMonk (Archbishop) on Sep 18, 2012 at 16:54 UTC
I, too, would be inclined toward something like the initial form of the code given in the OP for reasons of readability and maintainability. However, here's another way to glue everything together:
>perl -wMstrict -le
"my @ranges = ('15', '28-31', '3-4', '40', '17-19');
my ($total_min, $total_max);
;;
m{ \A (\d+) (?{ $total_min += $^N })
(?: - (\d+))? (?{ $total_max += $^N })
\z
}xmsg for @ranges;
;;
print qq{total between $total_min and $total_max};
"
total between 103 and 109
Update:
I would much prefer to be able to write the whole body of the loop in the above example in the form:
$total_min += SELF_CONTAINED_FUNCTIONAL_STATEMENT;
$total_max += SELF_CONTAINED_FUNCTIONAL_STATEMENT;
This seems to come a bit closer to what smls asked for (but I like BrowserUk's solution better!) and has a bit of input validation:
perl -wMstrict -le
"my @ranges = qw(15 28-31 3-4 40 17-19 99- -99 -99- x x-x);
my ($total_min, $total_max);
;;
my $extract_ranges = qr{ \A (\d+) (?: - (\d+))? \z }xms;
for (@ranges) {
$total_min += /$extract_ranges/ && $1;
$total_max += /$extract_ranges/ && $^N;
}
;;
print qq{total between $total_min and $total_max};
"
total between 103 and 109
(Update: Now that I look back on this thread, my second approach looks rather like kennethk's first idea.)
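For readers unfamiliar with $^N, the following sketch shows what it holds after a match with the same pattern. The helper name `parse_range` is hypothetical and only there to make the behaviour easy to inspect.

```perl
use strict;
use warnings;

my $extract_ranges = qr{ \A (\d+) (?: - (\d+))? \z }xms;

# Hypothetical helper: returns (min, max), or an empty list for invalid input.
sub parse_range {
    my ($input) = @_;
    return unless $input =~ $extract_ranges;
    # $^N is the text of the most recently closed capture group:
    # group 2 when a "-max" part matched, otherwise group 1.
    return ($1, $^N);
}

for my $input (qw(15 28-31 99- x)) {
    my @pair = parse_range($input);
    print "$input: ", (@pair ? "min=$pair[0] max=$pair[1]" : 'rejected'), "\n";
}
```

Because the pattern is anchored with \A and \z, malformed strings such as '99-' or 'x' simply fail to match, which is where the input validation comes from.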
Re: Is there a more functional regex syntax?
by rjt (Curate) on Sep 18, 2012 at 15:54 UTC
I'm not exactly sure what the expected output is for the cases where there is only one number. I took a guess that min = max in these cases, so if there is no second number, I use the first again.
s|^(\d+)(\-(\d+))?$|$total_min += $1; $total_max += $3//$1|e for (@ranges);
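One thing to keep in mind with the substitution approach: s///e replaces the matched text with the value of the last expression, so the elements of @ranges are overwritten while the totals accumulate. A minimal sketch of the same idea with a plain match, which leaves the input intact (again assuming that a lone number means min = max, and Perl 5.10+ for the // operator):

```perl
use strict;
use warnings;

my @ranges = ('15', '28-31', '3-4', '40', '17-19');
my ($total_min, $total_max) = (0, 0);

# A plain match instead of s///e: the captures feed the totals,
# and the elements of @ranges are not modified.
/^(\d+)(?:-(\d+))?$/ and ($total_min += $1, $total_max += $2 // $1) for @ranges;

print "total is between $total_min and $total_max\n";
```

Elements that do not match the pattern are silently skipped here, which may or may not be what is wanted.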
Re: Is there a more functional regex syntax?
by kcott (Archbishop) on Sep 19, 2012 at 07:34 UTC
G'day smls,
In the code below, I've eliminated both temporary variables and reduced the foreach block of code to a single line:
$ perl -Mstrict -Mwarnings -e '
my @ranges = qw{15 28-31 3-4 40 17-19};
my ($total_min, $total_max);
s{^(\d+)-?(\d*)$}{$total_min += $1; $total_max += $2 ? $2 : $1}e for @ranges;
print "total is between $total_min and $total_max\n";
'
total is between 103 and 109