Re: RegEx Question
by toolic (Bishop) on Sep 09, 2009 at 18:46 UTC
|
match a range of numbers with each digit just matching once?
Please clarify the question by providing some sample input and some desired output.
I've tried something like ^0-9{1,1} but its not working.
Please put 'code' tags around your code; without them it renders poorly. See Writeup Formatting Tips.
Here is what your regular expression means, according to YAPE::Regex::Explain:
use warnings;
use strict;
use YAPE::Regex::Explain;
print YAPE::Regex::Explain->new('[^0-9]{1,1}')->explain();
__END__
The regular expression:
(?-imsx:[^0-9]{1,1})
matches as follows:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  [^0-9]{1,1}            any character except: '0' to '9' (between
                         1 and 1 times (matching the most amount
                         possible))
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
| [reply] [d/l] |
|
Thanks for the reply. I like to match permutations of 0-8 with each digits occurring once using all 8 digits.
I've tried something like the following, but I can't get it right... :(
[0-8]{1, 1}
| [reply] [d/l] |
|
[0-8]{1, 1} matches one digit out of the set {0 1 2 3 4 5 6 7 8}, so it doesn't do what you need at all.
Actually, I am not sure that you can do this in one regex. I will show a simpler example using only permutations of the digits 0 - 3 (to make it easier to follow).
First you have to check that your string consists of exactly 4 digits, each in the range 0 - 3. That one is easy:
/^[0-3]{4}$/
If your string passes this test, then you check whether each digit occurs only once:
/^(\d)(?!\d*\1)(\d)(?!\d*\2)(\d)(?!\d*\3)\d$/;
This works by capturing each digit and using a negative look-ahead to check that the digit does not occur again later in the string. The following program proves that it works:
use warnings;
use strict;
for my $one (0 .. 3) {
    for my $two (0 .. 3) {
        for my $three (0 .. 3) {
            for my $four (0 .. 4) {    # 4 is out of range, so the first test rejects it
                my $test = "$one$two$three$four";
                # Both tests must pass: the right alphabet, and no repeated digit.
                print "$test\n"
                    if $test =~ m/^[0-3]{4}$/
                    and $test =~ m/^(\d)(?!\d*\1)(\d)(?!\d*\2)(\d)(?!\d*\3)\d$/;
            }
        }
    }
}
Output:
0123
0132
0213
0231
0312
0321
1023
1032
1203
1230
1302
1320
2013
2031
2103
2130
2301
2310
3012
3021
3102
3120
3201
3210
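The hand-written pattern above needs one more capture group for every extra digit, so for the original 0-8 case it may be easier to generate it than to type it. What follows is only a minimal sketch, assuming the digits always form the contiguous range 0 .. n-1; the helper name build_perm_regex is made up for illustration and does not come from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: build a regex that matches strings of exactly $n
# characters using each of the digits 0 .. $n-1 once.
sub build_perm_regex {
    my ($n) = @_;
    die "n must be between 1 and 10\n" if $n < 1 or $n > 10;
    my $class   = '[0-' . ($n - 1) . ']';
    my $pattern = '^';
    # Every digit except the last is captured and followed by a negative
    # look-ahead that forbids the same digit later in the string.
    for my $group (1 .. $n - 1) {
        $pattern .= "($class)(?!.*\\$group)";
    }
    # The last digit needs no look-ahead; \z also rejects a trailing newline.
    $pattern .= $class . '\z';
    return qr/$pattern/;
}

my $re = build_perm_regex(9);    # permutations of 0-8
for my $candidate (qw(012345678 876543210 012345677 01234567)) {
    printf "%-10s %s\n", $candidate, $candidate =~ $re ? 'ok' : 'not ok';
}
```

Because the character class is already part of the generated pattern, the separate /^[0-3]{4}$/ style pre-check is folded into the same regex here.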
CountZero
"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
| [reply] [d/l] [select] |
|
I like to match permutations of 0-8 with each digits occurring once using all 8 digits.
Um... the way I learned it, "0-8" represents 9 digits. (1-8 would be 8 digits, as would 0-7.) Is there some digit between 0 and 8 that you intend to leave out, and if so, which one?
Anyway, if you had said "permutations of 0-8 with each digit occurring once, using all 9 digits", then I would understand that you are looking only for strings of nine digit characters, such that all nine characters are distinct, and none of them is the digit "9":
#!/usr/bin/perl
use strict;
while (<DATA>) {
    chomp;
    if ( length() == 9 and not ( /[^0-8]/ or /(.).*\1/ )) {
        print "$_\n";
    } else {
        warn "rejected input: $_\n";
    }
}
__DATA__
01234
123456780
1234567890
223456781
345678012
234567890
a12345678
012345678
456782011
456782013
Update: I suspect that the regex used as the last stage in my conditional is a fairly expensive operation; for strings that actually meet the criteria (are not rejected), it has to do 8+7+6+...+1 (total of 36) character comparisons to finish. (There should be some sort of "O(...n...)" expression for that, but it escapes me.) So, it would most likely be better to use a split/hash solution, as suggested by others, especially if you'll be handling large quantities of input with a relatively high "success" rate. Something like this:
while (<DATA>) {
    chomp;
    if ( length() == 9 and not /[^0-8]/ ) {
        my %c = map { $_ => undef } split //;
        if ( keys %c == 9 ) {
            print "$_\n";
            next;
        }
    }
    warn "rejected input: $_\n";
}
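Another non-regex check, not proposed in this thread, is to sort the characters and compare the result with the one string that every valid input must sort to. A minimal sketch, assuming the target is exactly the digits 0-8; the sub name is_perm_by_sort is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $str is some ordering of the digits 0-8.
# Any permutation of 0-8 sorts to exactly '012345678', so one string
# comparison covers length, allowed characters, and uniqueness together.
sub is_perm_by_sort {
    my ($str) = @_;
    my @chars = split //, $str;
    return join('', sort @chars) eq '012345678';
}

for my $input (qw(123456780 223456781 a12345678 01234)) {
    print "$input: ", (is_perm_by_sort($input) ? 'ok' : 'rejected'), "\n";
}
```

Sorting nine characters is cheap, and the test stays readable if the required digit set changes, since only the comparison string has to be edited.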
| [reply] [d/l] [select] |
Re: RegEx Question
by Marshall (Canon) on Sep 10, 2009 at 06:55 UTC
|
Here is my 2 bits worth:
Please show some simple input and output. I didn't completely understand your question.
The code below shows some common techniques. If you want sequences of digits (a list of solutions) that don't repeat, the code is different, but not by much.
#!/usr/bin/perl -w
use strict;
use Data::Dumper;
my @test =(112,1234,1424);
foreach (@test)
{
    if (is_num_repeated($_) )
    {
        print "$_ FAILED digit is repeated\n";
    }
    else
    {
        print "$_ OK no digit repeated\n";
    }
}
foreach (@test)
{
    print "no_repeat: $_ string before repeat is: ",
          first_non_repeating_digits($_),"\n";
}
sub is_num_repeated
{
    my $num = shift;
    my @digits = split(//,$num);
    my %seen;
    foreach (@digits)
    {
        $seen{$_}++;
    }
    # this is grep in a scalar context...
    return (grep {$_ >1} values %seen);
}
sub first_non_repeating_digits
{
    my $num = shift;
    my @digits = split(//,$num);
    my %seen;
    my $result;
    foreach (@digits)
    {
        return $result if ($seen{$_}++);
        $result .= $_;
    }
    return $result;
}
__END__
112 FAILED digit is repeated
1234 OK no digit repeated
1424 FAILED digit is repeated
no_repeat: 112 string before repeat is: 1
no_repeat: 1234 string before repeat is: 1234
no_repeat: 1424 string before repeat is: 142
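For the other reading of the question mentioned above (producing the list of non-repeating digit sequences rather than testing a given string), a minimal recursive sketch follows. It is not Marshall's code; the sub name permutations is an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return every ordering of the given digits as a string.
# Each call picks one digit as the head and recurses on the remaining ones.
sub permutations {
    my @digits = @_;
    return ('') unless @digits;    # one way to arrange nothing
    my @out;
    for my $i (0 .. $#digits) {
        my @rest = @digits;
        my ($head) = splice @rest, $i, 1;
        push @out, map { "$head$_" } permutations(@rest);
    }
    return @out;
}

# Prints 012, 021, 102, 120, 201, 210 (one per line).
print "$_\n" for permutations(0 .. 2);
```

Calling permutations(0 .. 8) would return all 362880 orderings of the nine digits at once, so for large sets an iterator would use less memory than this list-building version.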
| [reply] [d/l] |
Re: RegEx Question
by grizzley (Chaplain) on Sep 10, 2009 at 07:27 UTC
|
D:\>perl -lne "print /^[0-8]{9}$/ && !/(.)(?=.*\1)/ ? 'ok' : 'not ok'"
1234
not ok
1234567689
not ok
012345678
ok
018273645
ok
010101010
not ok
^Z
And with one regexp:
D:\>perl -lne "print /^(?:([0-8])(?!.*\1)){9}$/ ? 'ok' : 'not ok'"
123456780
ok
123123123
not ok
123456781
not ok
^Z
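The one-liners above use Windows cmd quoting (double quotes around the program, ^Z to end input). As a sketch, the same single-regex test can be placed in a small script so that shell quoting no longer matters; the sub name is_perm_0_8 is made up here.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the single regex shown above.
# Each of the 9 repetitions captures one digit from 0-8, and the negative
# look-ahead rejects the string if that digit shows up again later.
sub is_perm_0_8 {
    my ($str) = @_;
    return $str =~ /^(?:([0-8])(?!.*\1)){9}$/ ? 1 : 0;
}

for my $input (qw(123456780 123123123 123456781)) {
    print "$input: ", (is_perm_0_8($input) ? 'ok' : 'not ok'), "\n";
}
```

Inside the repeated group, \1 always refers to the most recently captured digit, which is why a single capture group is enough for all nine positions.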
| [reply] [d/l] [select] |
|
Well, it's simpler than the second approach :>
And a sequence of 9 or 8 digits is mentioned in the discussion above.
| [reply] |