http://www.perlmonks.org?node_id=224866


in reply to Re: Re: Re: A few random questions from Learning Perl 3
in thread A few random questions from Learning Perl 3

NP-completeness is a property of a problem, not of an algorithm. It implies that no algorithm is known that solves the problem in polynomial time.
This means that, for every known exact algorithm, the worst-case execution time grows faster than any polynomial (typically exponentially) as the length of the input increases. (Of course there are input cases that can be handled in polynomial time, but many of interest cannot.) Essentially, it means that nothing fundamentally better than exhaustive search is known for solving the problem exactly in the worst case.

The question concerns the relation between the complexity of deciding membership in a language and the class to which that language belongs. For regular languages and context free languages, polynomial time membership algorithms are known. But since matching regular expressions with backreferences has been proven to be NP-complete, does this necessarily mean that the languages they describe form a superset of the regular and context free languages?

It certainly means it is hard to decide whether or not a certain string is an element of the language described by a regular expression with backreferences. But what does it tell us about the expressive power?

The expression /^(.*)\1$/ defines the language {ww | w in sigma*}, which is known to be neither regular nor context free. On the other hand, regular expressions with backreferences can't describe {a^n b^n | n >= 0}, which is definitely context free.
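
The first claim is easy to try out. The following is only a minimal sketch: the helper name is_square and the sample strings are my own choices, and the inputs are assumed to contain no newlines (otherwise . and $ behave differently).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the post): returns 1 if $s has the
# form ww for some string w, 0 otherwise. Assumes $s has no newlines.
sub is_square {
    my ($s) = @_;
    return $s =~ /^(.*)\1$/ ? 1 : 0;
}

# Arbitrary sample strings, some of the form ww and some not.
for my $s ('', 'aa', 'abab', 'aba', 'abcabc', 'abcab') {
    printf "%-10s %s\n", "'$s'", is_square($s) ? 'matches' : 'does not match';
}
```

The regex engine has to try the possible split points by backtracking, which hints at why matching with backreferences can get expensive.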

So on the one hand, regular expressions with backreferences describe languages that are not context free, but on the other hand they can't describe all context free languages either! This example illustrates that one has to be very careful when judging expressive power from algorithmic complexity. A high complexity is a sign that the expressive power must be high in some cases, but it doesn't guarantee that everything can be done.

Incidentally, the code below shows two Perl regular expressions, using the (??{ ... }) postponed subexpression construct, that describe non-regular languages:

{ a^n b^n | n >= 0 }
    /^ (a*) (??{sprintf("b{%d}", (length($1)))}) $/x
which is context free as mentioned above, and
{ a^n b^n c^n | n >= 0 }
    /^ (a*) (??{sprintf("b{%d}", (length($1)))}) (??{sprintf("c{%d}", (length($1)))}) $/x
which is context sensitive.
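
For anyone who wants to experiment, here is a small sketch that wraps the two patterns in test helpers. The names $anbn, $anbncn, in_anbn and in_anbncn are mine, the sample strings are arbitrary, and the inputs are again assumed to contain no newlines.

```perl
use strict;
use warnings;

# The two patterns from above, precompiled with qr//. Each (??{ ... })
# block builds a sub-pattern such as "b{3}" from the length of $1
# while the match is in progress.
my $anbn   = qr/^ (a*) (??{ sprintf("b{%d}", length($1)) }) $/x;
my $anbncn = qr/^ (a*) (??{ sprintf("b{%d}", length($1)) })
                       (??{ sprintf("c{%d}", length($1)) }) $/x;

# Hypothetical helpers: return 1 if the string is in the language, else 0.
sub in_anbn   { my ($s) = @_; return $s =~ $anbn   ? 1 : 0 }
sub in_anbncn { my ($s) = @_; return $s =~ $anbncn ? 1 : 0 }

for my $s ('', 'ab', 'aabb', 'aab', 'abb') {
    printf "a^n b^n      %-10s %s\n", "'$s'", in_anbn($s) ? 'yes' : 'no';
}
for my $s ('', 'abc', 'aabbcc', 'aabbc', 'abcc') {
    printf "a^n b^n c^n  %-10s %s\n", "'$s'", in_anbncn($s) ? 'yes' : 'no';
}
```

Note that (??{ ... }) runs arbitrary Perl code during the match, so these patterns go well beyond what backreferences alone can express.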

Just my 2 cents, -gjb-