Lately I've been wondering about just how powerful Perl's pattern
matching language is (commonly known as regular expressions). Because of backtracking and recursion, Perl's
patterns can match more than regular languages. They can match (at
least some)
#match context free grammar --[0{N}1{N}] (e.g. 01, 0011, 000111, etc.)
$cf = qr/01|0(??{$cf})1/;
$init = "00001111";
print "match\n" if ($init=~m/^$cf$/);
...they can also match (at least some)
context sensitive grammars...
#match context sensitive grammar --[0{N}1{N}2{N}] (012, 001122, 000111
+222, ...)
$h=qr/01|0(??{$h})1/;
$t=qr/12|1(??{$t})2/;
$cs=qr/(?=$h 2+$)(?=0+$t$)/x;
$init = "001122";
print "match\n" if ($init=~m/^$cs$/);
(I've haven't proven those two snippets correct, so any counter examples
would be appreciated)
We can also craft patterns that loop forever (not merely a long time)
(well, ignoring stack overflows that is)
#Word boundries are zero-width so the regex engine will never progress
my $inf; $inf=qr/\b(??{$inf})/;
On a more mundane level we can also compute simple things...
#simple computations with pattern matching operating on unary numbers
$"="";
$_ = 'lll+ll'; #3+2
@add = m/(l+)\+(l+)/;
print "$_ = @add\n";
$_ = 'lll*ll'; #3*2
@mult = m/l(?=.*\*(l*))/g;
print "$_ = @mult\n";
$_ = 'lll';
@sqr = m/(l)(?=(l*))(?=(l*))/g;
print "$_ squared = @sqr\n";
It is also known that certain cellular automata are universal
(
Rule 110), and with
some additional machinery, we can get a regex to emulate one.
#implement rule 110 cellular automata
$init = " X "; #initial value (pad with two spaces on either side)
$"="";
for (1..24)
{
print ' ' x (38-$_)."$init\n";
$init=" @{[($init=~m/^(\s) | #Start of line
(?<=\s)(\s)(?=\s) | #000
(?<=\s) \s (?=(X)) | #001
(?<=\s) (X)(?=\s) | #010
(?<=\s) (X) (?=X) | #011
(?<=X) (\s)(?=\s) | #100
(?<=X) \s (?=(X)) | #101
(?<=X) (X)(?=\s) | #110
(?<=X) X (?=X+(\s))| #111
(\s)$ #End of line
/gx)]} ";
}
Of course that might be considered cheating, because we've used an
external mechanism for looping. It gets more interesting when we
consider that
some
Diophantine equations are universal . And a few perl wizards have
managed to use regex to
solve linear Diophantine equations
and test for
primeness (see also section 6.16 of the Ram
book).
Your mission, should you choose to accept it, is to meditate on this
problem and see if you can enlighten the rest of us as to whether perl5
patterns are universal. (Or, as a lesser challenge, see if you can get a
regex to compute 2**N, or factorial(N), or Ackermann's function)