Re: Matching numbers by regex.
by GrandFather (Saint) on Apr 19, 2006 at 10:01 UTC
* is greedy: it matches as many characters as it can, but it can also match none at all. In (\d+).*(\d*\d+) the \d* is redundant, because the following \d+ already matches one digit or as many as are available. The .* before it matches as many characters as it can, which includes every digit of the last number except one, so the final \d+ is left with a single digit. One way to fix the problem is:
use strict;
use warnings;
my $data = "Exlief 4 page : 1 /10";
my $match = qr/pag\w+\s*:\s*(\d+)[^\d]*(\d+)/;
print "Pages : $1 / $2\n" if $data =~ $match;
$data = "Exlief 4 page : 1 / 5";
print "Pages : $1 / $2\n" if $data =~ $match;
Prints:
Pages : 1 / 10
Pages : 1 / 5
Note that a precompiled regex (qr//) is used so the pattern does not have to be retyped (and perhaps mistyped) for each match. The 'match any character' (.*) has been replaced by 'match any character except a digit' ([^\d]*), and the redundant digit match has been removed.
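To see the difference side by side, here is a minimal sketch. It assumes the original pattern had the same pag\w+\s*:\s* prefix as the fix above (the thread only quotes the (\d+).*(\d*\d+) part), and captures() is a helper name made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the two captures of $re against $text,
# or an empty list if the match fails.
sub captures {
    my ($re, $text) = @_;
    return $text =~ $re ? ($1, $2) : ();
}

my $data = "Exlief 4 page : 1 /10";

# Greedy .* takes everything it can and gives back only the single
# digit that the trailing \d+ needs.
my @greedy = captures(qr/pag\w+\s*:\s*(\d+).*(\d*\d+)/, $data);

# Keeping digits out of the filler leaves the whole number for group 2.
my @fixed = captures(qr/pag\w+\s*:\s*(\d+)[^\d]*(\d+)/, $data);

print "greedy: @greedy\n";   # greedy: 1 0
print "fixed:  @fixed\n";    # fixed:  1 10
```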
DWIM is Perl's answer to Gödel
|
[^\d]* may be written as \D*, which will be more efficient as well, since it avoids calls to utf8::IsDigit internally.
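A quick sketch to confirm that the two spellings capture the same thing on the sample strings from this thread. The helper name page_numbers() is made up for the example, and it only checks the captures, not the speed claim.

```perl
use strict;
use warnings;

# Hypothetical helper: return the captured numbers, or an empty list.
sub page_numbers {
    my ($re, $text) = @_;
    my @found = $text =~ $re;
    return @found;
}

my $negated_class = qr/pag\w+\s*:\s*(\d+)[^\d]*(\d+)/;
my $shorthand     = qr/pag\w+\s*:\s*(\d+)\D*(\d+)/;

for my $data ("Exlief 4 page : 1 /10", "Exlief 4 page : 1 / 5") {
    my @a = page_numbers($negated_class, $data);
    my @b = page_numbers($shorthand, $data);
    print "@a | @b\n";
}
```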
• another intruder with the mooring in the heart of the Perl
Re: Matching numbers by regex.
by prasadbabu (Prior) on Apr 19, 2006 at 09:56 UTC
Here is one way to do it. Your regex is greedier than it needs to be. Take a look at perlre.
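my $data = "Exlief 4 page : 1 /10";
# (\d+) rather than (\d*) so the second number cannot match as an empty string
if ($data =~ /pag[^:]*:\s*(\d+)[^\d]*(\d+)/) {
print "Pages : $1 / $2\n";
}
# reuse $data; a second "my" would mask the first declaration
$data = "Exlief 4 page : 1 / 5";
if ($data =~ /pag[^:]*:\s*(\d+)[^\d]*(\d+)/) {
print "Pages : $1 / $2\n";
}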
output:
Pages : 1 / 10
Pages : 1 / 5
Re: Matching numbers by regex.
by Samy_rio (Vicar) on Apr 19, 2006 at 09:56 UTC
#!/usr/bin/perl
my $data = "Exlief 4 page : 1 /10";
if ($data =~ /pag\w+\s*:\s*(\d+)[^\d]*(\d+)/) {
print "Pages : $1 / $2\n";
}
$data = "Exlief 4 page : 1 / 5";
if ($data =~ /pag\w+\s*:\s*(\d+)[^\d]*(\d+)/) {
print "Pages : $1 / $2\n";
}
__END__
Pages : 1 / 10
Pages : 1 / 5
Regards, Velusamy R. eval"print uc\"\\c$_\""for split'','j)@,/6%@0%2,`e@3!-9v2)/@|6%,53!-9@2~j';
Re: Matching numbers by regex.
by jonadab (Parson) on Apr 19, 2006 at 12:51 UTC
As others have noted, the greediness of .* is your problem. However,
they all seem to want to fix it by making it not match any digits, which
seems odd, since the problem isn't that it can match digits, but rather
that it is greedy. I would just change .* to .*? to make it non-greedy.
The results will be just about the same in this particular instance,
however.
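A small sketch of that change, assuming the same pag\w+\s*:\s* prefix used in the other replies (the original pattern is not quoted in full here). lazy_pages() is just an illustrative name.

```perl
use strict;
use warnings;

# Hypothetical helper: match with a non-greedy .*? between the two numbers.
sub lazy_pages {
    my ($text) = @_;
    # .*? takes as few characters as possible, so it stops in front of
    # the first digit it meets and leaves the whole number to (\d+).
    return $text =~ /pag\w+\s*:\s*(\d+).*?(\d+)/ ? ($1, $2) : ();
}

for my $data ("Exlief 4 page : 1 /10", "Exlief 4 page : 1 / 5") {
    my ($page, $total) = lazy_pages($data);
    print "Pages : $page / $total\n";
}
# Pages : 1 / 10
# Pages : 1 / 5
```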
Sanity? Oh, yeah, I've got all kinds of sanity. In fact, I've developed whole new kinds of sanity. Why, I've got so much sanity it's driving me crazy.
Re: Matching numbers by regex.
by japhy (Canon) on Apr 19, 2006 at 13:48 UTC
What was the purpose of \d*\d+ in the regex?
|
I thought that it behaved the opposite of greedy. Originally just the \d+ was there, but it only matched the last digit, so I thought I could force it to accept the first digit of the last number by adding a \d*.
That didn't work, so I came here for help. I'm new to this whole regexp business.
Re: Matching numbers by regex.
by CountZero (Bishop) on Apr 19, 2006 at 21:15 UTC
Just trying it a little bit differently:
$data =~ m{(\d+)\s*/\s*(\d+)}
This will give you the figures to the left and right of the slash. Whitespace may optionally separate the figures from the slash.
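For illustration, a minimal sketch of the slash-based match applied to the sample strings used elsewhere in this thread. The sub name slash_pair() is an assumption made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the numbers on either side of the slash,
# or an empty list if there is no such pair.
sub slash_pair {
    my ($text) = @_;
    # m{...} is used as the delimiter so the slash needs no escaping.
    my @pair = $text =~ m{(\d+)\s*/\s*(\d+)};
    return @pair;
}

for my $data ("Exlief 4 page : 1 /10", "Exlief 4 page : 1 / 5") {
    my ($page, $total) = slash_pair($data);
    print "Pages : $page / $total\n";
}
# Pages : 1 / 10
# Pages : 1 / 5
```

Because the pattern is tied to the slash and not to the word "page", the leading 4 in the sample is skipped: it is not followed by optional whitespace and a slash.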
CountZero "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law