More than a year ago, I posted in this section a way of
solving the N-queens problem (place N queens on an NxN board
such that no two queens attack each other) using a regex.
However, the regex was full of (?{ }) and (?(?{ })|) constructs, so it wasn't a real regex, as it executed Perl
code all the time.
I never came around explaining how it works, until yesterday
when I gave a small talk at a local Perl mongers meeting.
Driving home last night, I started realizing that the problem
is solvable with pure regexes. No fancy (?{ }) or (?(?{ })|)
constructs.
The program below solves the N-queens problem using a pure
regex. It takes a few options: -n followed by a number
indicates the size of the board, and if you use -p, it prints out the regex (and the string it matches against).
-P only prints out the regex and string, but doesn't try
to match it. Since it's slow like hell (but I've some ideas
to speed it up), try -n 5 or -n 6. -n 8 (the default) takes
a long time.
There's no much explaination (yet), but if you see the string and the regex, you can figure it out.
#!/usr/bin/perl
use strict;
use warnings 'all';
use Getopt::Long;
Getopt::Long::Configure ("bundling");
GetOptions ('p|print' => \my $print,
'P|Print' => \my $Print,
'n|number=i' => \(my $nr_of_queens = 8)
);
my $nr_of_rows = $nr_of_queens;
my $nr_of_cols = $nr_of_queens;
my @rows = (1 .. $nr_of_rows);
my @cols = map {chr ($_ - 1 + ord 'a')} 1 .. $nr_of_cols;
# Return positions not attacked by a certain other position.
sub free {
my $pos = shift;
my ($col, $row) = $pos =~ /(\D+)(\d+)/;
$col = ord ($col) - ord ('a') + 1;
map {my $c = chr ($_ -> [0] - 1 + ord 'a'); "$c$_->[1]"}
grep {$_ -> [0] != $col && $_ -> [1] != $row &&
abs ($_ -> [0] - $col) != abs ($_ -> [1] - $row)}
map {my $c = ord ($_) - ord ('a') + 1; map {[$c, $_]} @rows} @
+cols;
}
my $str = join "\n" => map {my $c = $_;
my $l = join "," => map {"$c$_"} @rows;
",$l,"} @cols;
$str .= "\n;\n";
map {$str .= "$_:" . join ("," => free $_) . ",\n"}
map {my $c = $_; map {"$c$_"} @rows} @cols;
my $re = join "\n" => (".*,(\\w+),.*") x $nr_of_queens;
$re .= "\n";
map {my $q = $_;
$re .= "[\\x00-\\xFF]*\\n\\$q:";
map {$re .= ".*\\$_,"} grep {$_ ne $q} 1 .. $nr_of_queens;
$re .= ".*\n"} 1 .. $nr_of_queens;
if ($print || $Print) {
print "'$str' =~ \n/^$re/\n";
exit if $Print;
}
if (my @a = $str =~ /^$re/) {
print "[@a]\n";
}
__END__
Abigail
Re: The N-queens problem using pure regexes
by dragonchild (Archbishop) on Oct 08, 2003 at 16:56 UTC
|
Bravo! Very neat solution. And, a beautiful demonstration of the fact that the right data structure isn't just important, it's everything.
Also, I like the heavy usage of map and grep. A little harder to read at first (as a programmer who's functional at play, not at work), but very concise.
Question - how does the regex engine not choose a free square in d4 when looking for free squares in d3? I'm guessing it has to do with [\x00-\xFF]*, but I'm not positive as to how that's working ...
------ We are the carpenters and bricklayers of the Information Age. The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6 Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.
| [reply] [d/l] |
|
,a1,a2,a3,a4,a5,a6,a7,a8,
,b1,b2,b3,b4,b5,b6,b7,b8,
,c1,c2,c3,c4,c5,c6,c7,c8,
,d1,d2,d3,d4,d5,d6,d7,d8,
,e1,e2,e3,e4,e5,e6,e7,e8,
,f1,f2,f3,f4,f5,f6,f7,f8,
,g1,g2,g3,g4,g5,g6,g7,g8,
,h1,h2,h3,h4,h5,h6,h7,h8,
is being matched against the first part of the regex:
.*,(\w+),.*
.*,(\w+),.*
.*,(\w+),.*
.*,(\w+),.*
.*,(\w+),.*
.*,(\w+),.*
.*,(\w+),.*
.*,(\w+),.*
The newlines in the string and regex are significant here (. won't match
a newline). This part will make \1 one of 'a1' .. 'a8', \2 one of 'b1' ..
'b8', \3 one of 'c1' .. 'c8', etc. So, it will never try to place two
queens on the 'd' column.
But you may ask, "why doesn't it put a queen on d3 and one on e4?". That's
where the second part of the string and regex come in. \4 contains the
position of the queen on the d column, so, in this case, \4 equals 'd3'.
Then there's this line in the second part of the regex:
[\x00-\xFF]*\n\4:.*\1,.*\2,.*\3,.*\5,.*\6,.*\7,.*\8,.*
We know that \4 equals 'd3', so part of this lines read '\nd3:'. Looking
back at the second part of the string, there is only one line that starts
with 'd3:':
d3:a1,a2,a4,a5,a7,a8,b2,b4,b6,b7,b8,c1,c5,c6,c7,c8,e1,e5,e6,e7,e8,f2,f
+4,f6,f7,f8,g1,g2,g4,g5,g7,g8,h1,h2,h4,h5,h6,h8,
If you look carefully, after the colon are all the fields that aren't
attacked by a queen on d3. Specifically, the field 'e4' is missing.
But the line
[\x00-\xFF]*\n\4:.*\1,.*\2,.*\3,.*\5,.*\6,.*\7,.*\8,.*
is saying "match a line that starts with 'd3', and has after the colon
a list of fields that include the positions of all other 7 queens".
The [\x00-\xFF]* just skips enough lines to get to the next
queen.
Abigail
| [reply] [d/l] [select] |
|
| [reply] [d/l] |
|
Re: The N-queens problem using pure regexes
by Abigail-II (Bishop) on Oct 09, 2003 at 14:30 UTC
|
Since it's slow like hell (but I've some ideas to speed it up)
Speeding it up turned out to be easier than I thought it was.
Below is reworked program that is dramatically faster that
the original. But first a table comparing running times of
three versions, the original, pure regex solution from the
parent node, the faster (still pure regex) solution presented
below, and the non-pure variant presented last year. The
latter is still the faster solution though.
Timings (values in wall clock seconds):
N Original Faster Non-pure
4 0.035 0.034 0.035
5 0.045 0.036 0.036
6 0.769 0.041 0.038
7 4.833 0.042 0.038
8 0.082 0.049
9 0.072 0.044
10 0.113 0.056
11 3.504 0.051
12 0.096
13 0.071
14 0.577
15 0.467
16 3.864
17 2.289
18 19.630
19 1.324
20 117.227
Before giving the program, some sample output:
$ ./queens -p -n 4
';,a1,a2,a3,a4,
;,b1,b2,b3,b4,
b1:,a3,a4,
b2:,a4,
b3:,a1,
b4:,a1,a2,
;,c1,c2,c3,c4,
c1:,a2,a4,b3,b4,
c2:,a1,a3,b4,
c3:,a2,a4,b1,
c4:,a1,a3,b1,b2,
;,d1,d2,d3,d4,
d1:,a2,a3,b2,b4,c3,c4,
d2:,a1,a3,a4,b1,b3,c4,
d3:,a1,a2,a4,b2,b4,c1,
d4:,a2,a3,b1,b3,c1,c2,
' =~
/^;.*,(\w+),.*
;.*,(\w+),.*
[^;]*\2:.*,\1[^;]*
;.*,(\w+),.*
[^;]*\3:.*,\1.*,\2[^;]*
;.*,(\w+),.*
[^;]*\4:.*,\1.*,\2.*,\3[^;]*
/
[a3 b1 c4 d2]
$ ./queens -n 8
[a8 b4 c1 d3 e6 f2 g7 h5]
And here's the program:
#!/usr/bin/perl
use strict;
use warnings 'all';
use Getopt::Long;
Getopt::Long::Configure ("bundling");
GetOptions ('p|print' => \my $print,
'P|Print' => \my $Print,
'n|number=i' => \(my $nr_of_queens = 8)
);
my @rows = 1 .. $nr_of_queens;
my @cols = ('a' .. 'z') [0 .. $nr_of_queens - 1];
sub a2i {ord ($_ [0]) - ord ('a') + 1}
sub i2a {chr ($_ [0] + ord ('a') - 1)}
# Given a square, return all non-attacked squares on columns to
# the *left* of the given square. (a1 is the lower left corner).
sub free {
my ($C, $R) = $_ [0] =~ /(\D)(\d+)/;
$C = a2i $C;
map {join "" => i2a ($_ -> [0]), $_ -> [1]}
grep {$_ -> [0] != $C &&
$_ -> [1] != $R &&
abs ($_ -> [0] - $C) != abs ($_ -> [1] - $R)}
map {my $c = a2i $_; map {[$c, $_]} @rows} @cols [0 .. $C - 1]
}
my ($str, $re) = ("", "");
foreach my $c (@cols) {
$str .= ";," . (join "," => map {"$c$_"} @rows) . ",\n";
$re .= ";.*,(\\w+),.*\n";
next if $c eq 'a';
map {$str .= "$_:," . join ("," => free ($_)) . ",\n"} map {"$c$_"
+} @rows;
my $C = a2i $c;
$re .= "[^;]*\\$C:" . join ("" => map {".*,\\$_"} 1 .. $C - 1) . "
+[^;]*\n";
}
if ($print || $Print) {
print "'$str' =~ \n/^$re/\n";
exit if $Print;
}
if (my @a = $str =~ /^$re/) {
print "[@a]\n";
}
__END__
Abigail | [reply] [d/l] [select] |
|
N Original Faster Non-pure
4 0.035 0.034 0.035
5 0.045 0.036 0.036
6 0.769 0.041 0.038
7 4.833 0.042 0.038
8 0.082 0.049
9 0.072 0.044
10 0.113 0.056
11 3.504 0.051
12 0.096
13 0.071
14 0.577
15 0.467
16 3.864
17 2.289
18 19.630
19 1.324
20 117.227
I'm curious - have you had a chance to look at why the speeds actually improve when going from 8 to 9 for both Faster and Non-pure and from 10-11 for Non-pure, but slows down 30x for Faster? And, what's with 17, 18, and 19 when it's 2.289 -> 19.630 -> 1.324??
------ We are the carpenters and bricklayers of the Information Age. The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6 Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.
| [reply] [d/l] |
|
It has to do with how many positions are rejected before
a suitable one is found. The solutions found for n = 8 and
n = 9 are:
[a8 b4 c1 d3 e6 f2 g7 h5]
[a9 b7 c4 d2 e8 f6 g1 h3 i5]
As you can see, for n = 8, it never has to backtrack for
the first queen (a8 is choosen), but for the seconde queen,
b8, b7, b6, and b5 need to be rejected. b8 and b7 will be
rejected right away (as they are attacked by a8), but for
b6 and b5 to be rejected, lots of other queens will be
have to be placed. For n = 9, no backtracking for the
first queen is needed, and for the second queen, the positions b9 and b8 are rejected immediately. It's only
the third queen were there's some real backtracking going
on - c9, c8, c7, and c6 are rejected immediately, and only
for c5 more queens will be tried before rejecting it.
The timings for 'faster' with n >= 10 cannot be trusted,
as the program contained a bug for n >= 10 (see elsewhere
in this thread - the bug is now fixed). Here's a new table
(done on a different computer, and recording user times,
not wall clock time), with the fixed programs:
N Original Faster Non-Pure
4 0.06 0.05 0.04
5 0.07 0.04 0.05
6 1.57 0.07 0.05
7 9.29 0.06 0.05
8 0.23 0.06
9 0.16 0.06
10 0.50 0.07
11 0.41 0.07
12 2.64 0.14
13 1.58 0.10
14 37.23 0.82
15 35.45 0.70
16 5.45
17 3.18
18 27.17
19 1.89
20
And, in case you are interested, the code that generated
the table:
#!/usr/bin/perl
use strict;
use warnings;
no warnings qw /syntax/;
$| = 1;
my $width = 15;
my $time_out = 120;
my @cmds = ("./queens2 -n ",
"./queens3 -n ",
"./queens1 -f -n ");
my $nr_of_commands = @cmds;
my $N = 4;
print " N";
printf "%${width}s" => $_ for qw /Original Faster Non-Pure/;
print "\n";
while ($nr_of_commands) {
printf "%3d" => $N;
foreach my $cmd (@cmds) {
unless (defined $cmd) {
print " " x $width;
next;
}
local $SIG {ALRM} = sub {die "Time out!"};
alarm ($time_out);
eval {
my $time = (`/usr/bin/time -f "%U" $cmd $N 2>&1`) [-1];
alarm (0);
chomp $time;
printf "%$width.2f" => $time;
};
if ($@ && $@ =~ /Time out/) {
undef $cmd;
$nr_of_commands --;
print " " x $width;
}
}
print "\n";
$N ++;
}
Home work question: the code above is lacking something
vital. What is it not doing what it should do?
Abigail | [reply] [d/l] [select] |
|
|
|
|
For any odd N odd N not divisible by 3: $a_solution = [ map { chr(ord('a') + $_ - 1).(((2 * $_) - 1) % $N); } (1..$N) ]
(e.g., for $N = 11: [ a1, b3, c5, d7, e9, f11, g2, h4, i6, j8, k10 ])
This is equivalent to starting in the bottom-left corner of the board and moving right one square and up two squares (wrapping when you hit the edge) N-1 times.
If the regex picks the first space not-already-capturable in each column (From brief inspection, it appears to do so -- It finds an equivalent solution for odd-N), this is the first solution it will find. In the even-N case, this process will not leave any not-already-capturable squares in the last column on the first pass, so it must then backtrack.
Update: Whoa there, Ben. I spoke way too soon. The above solution only applies when N is odd AND not divisible by 3. (So, for N = (1,5) mod 6)
More update: I was wrong about most of the analysis, too. This won't be the first solution found.
| [reply] [d/l] |
Re: The N-queens problem using pure regexes
by thor (Priest) on Oct 10, 2003 at 01:33 UTC
|
I think I've found a bug:
thor@bravo:~/perl> queens.pl -n 10
[a10 b8 c6 d9 e1 f1 g1 h7 i5 j2]
Wouldn't e1, f1, and g1 all be able to attack one another?
thor
| [reply] [d/l] |
|
You are quite correct. The faulty solution can only happen
with n >= 10; it happens because 'e10' isn't attacked by 'f1',
and "e10" =~ /e1/. Luckely, the fix is simple:
#!/usr/bin/perl
use strict;
use warnings 'all';
use Getopt::Long;
Getopt::Long::Configure ("bundling");
GetOptions ('p|print' => \my $print,
'P|Print' => \my $Print,
'n|number=i' => \(my $nr_of_queens = 8)
);
my @rows = 1 .. $nr_of_queens;
my @cols = ('a' .. 'z') [0 .. $nr_of_queens - 1];
sub a2i {ord ($_ [0]) - ord ('a') + 1}
sub i2a {chr ($_ [0] + ord ('a') - 1)}
# Given a square, return all non-attacked squares on columns to
# the *left* of the given square. (a1 is the lower left corner).
sub free {
my ($C, $R) = $_ [0] =~ /(\D)(\d+)/;
$C = a2i $C;
map {join "" => i2a ($_ -> [0]), $_ -> [1]}
grep {$_ -> [0] != $C &&
$_ -> [1] != $R &&
abs ($_ -> [0] - $C) != abs ($_ -> [1] - $R)}
map {my $c = a2i $_; map {[$c, $_]} @rows} @cols [0 .. $C - 1]
}
my ($str, $re) = ("", "");
foreach my $c (@cols) {
$str .= ";," . (join "," => map {"$c$_"} @rows) . ",\n";
$re .= ";.*,(\\w+),.*\n";
next if $c eq 'a';
map {$str .= "$_:," . join (",," => free ($_)) . ",\n"} map {"$c$_
+"} @rows;
my $C = a2i $c;
$re .= "[^;]*\\$C:" . join ("" => map {".*,\\$_,"} 1 .. $C - 1) .
+"[^;]*\n";
}
if ($print || $Print) {
print "'$str' =~ \n/^$re/\n";
exit if $Print;
}
if (my @a = $str =~ /^$re/) {
print "[@a]\n";
}
__END__
Abigail | [reply] [d/l] |
Re: The N-queens problem using pure regexes
by Roy Johnson (Monsignor) on Nov 21, 2003 at 19:42 UTC
|
This may get me a very clever, -10, but I've tweaked the program to run rather quickly (-n 22 runs in a few seconds) by using a known general solution as the basis of my regex. Technically, it still generates a regular expression and matches it against the who-doesn't-attack-whom string, so -n 3 returns no match.
#!/usr/bin/perl
use strict;
use warnings 'all';
use Getopt::Long;
Getopt::Long::Configure ("bundling");
GetOptions ('p|print' => \my $print,
'P|Print' => \my $Print,
'n|number=i' => \(my $nr_of_queens = 8)
);
my @rows = 1 .. $nr_of_queens;
my @cols = ('a' .. 'z') [0 .. $nr_of_queens - 1];
sub a2i {ord ($_ [0]) - ord ('a') + 1}
sub i2a {chr ($_ [0] + ord ('a') - 1)}
# Given a square, return all non-attacked squares on columns to
# the *left* of the given square. (a1 is the lower left corner).
sub free {
my ($C, $R) = $_ [0] =~ /(\D)(\d+)/;
$C = a2i $C;
map {join "" => i2a ($_ -> [0]), $_ -> [1]}
grep {$_ -> [0] != $C &&
$_ -> [1] != $R &&
abs ($_ -> [0] - $C) != abs ($_ -> [1] - $R)}
map {my $c = a2i $_; map {[$c, $_]} @rows} @cols [0 .. $C - 1]
}
my ($str, $re) = ('', '');
my $solrow = 0;
foreach my $c (@cols) {
$str .= ";," . (join "," => map {"$c$_"} @rows) . ",\n";
$solrow += 2;
$solrow = 1 if $solrow > $nr_of_queens;
$re .= ".*\n" unless $c eq 'a';
$re .= ";.*,($c$solrow),";
next if $c eq 'a';
$re .= sprintf "(?:.*\n){%d}\\%d:", $solrow, a2i($c);
for my $i (1..a2i($c)-1) {
$re .= ".*,\\$i,";
}
$re .= sprintf "(?:.*\n){%d}", $nr_of_queens - $solrow;
map {$str .= "$_:," . join (",," => free ($_)) . ",\n"} map {"$c$_
+"} @rows;
}
if ($print || $Print) {
print "'$str' =~ \n/^$re/\n";
exit if $Print;
}
if (my @a = $str =~ /^$re/) {
print "[@a]\n";
}
__END__
| [reply] [d/l] |
|
|