http://www.perlmonks.org?node_id=224866


in reply to Re: Re: Re: A few random questions from Learning Perl 3
in thread A few random questions from Learning Perl 3

NP-completeness is a property of an algorithm. It implies that no algorithm is known to solve the problem in polynomial time.
This means that if you increase the length of the input for the problem, the execution time will increase exponentially. (Of course there are input cases which are polynomial, but many of interest are not). Essentially, it means that brute force is the only known method to tackle the problem exactly.

The question is on the relation between the behavior of an algorithm to decide on a language and the class to which this language belongs. For regular languages and context free languages polynomial time algorithms are known, but does this necessarily mean that since regular expressions with backreferences are proven to be NP-complete that the language they describe are a superset of regular and context free languages?

It certainly means it is hard to decide whether or not a certain string is an element of the language described by a regular expression with backreferences. But what does it tell us about the expressive power?

The expression /^(.*)\1$/ defines the language {ww | w in sigma*}, known neither to be regular, nor context free. On the other hand, regular expressions with backreference can't describe {a^n b^n | n >= 0} which is definitely context free.

So on the one hand, regular expressions with backreference describe languages that are not context free, but can't describe all context free languages either! This example illustrates that one has to be very careful when judging expressive power from algorithmic complexity. A high complexity is a sign that the expressive power must be high in some cases, but doesn't guarantee that everything can be done.

Incidently, the code below shows two Perl regular expressions that describe non-regular languages:

{ a^n b^n | n >= 0} /^ (a*) (??{sprintf("b{%d}", (length($1)))}) $/x
which is context free as mentioned above and
{ a^n b^n c^n | n >= 0 } /^ (a*) (??{sprintf("b{%d}", (length($1)))}) (??{sprintf("c{%d}", (length($1)))}) $/x
which is context sensitive.

Just my 2 cents, -gjb-

Replies are listed 'Best First'.
Re: Perl regular expressions vs. RL, CFL, CSL
by JadeNB (Chaplain) on Dec 14, 2009 at 17:08 UTC
    NP-completeness is a property of an algorithm. It implies that no algorithm is known to solve the problem in polynomial time.
    This means that if you increase the length of the input for the problem, the execution time will increase exponentially. (Of course there are input cases which are polynomial, but many of interest are not). Essentially, it means that brute force is the only known method to tackle the problem exactly.

    I think that this may be a little misleading. Right now (as 6 years ago), NP-completeness of a problem means that no polynomial-time algorithm is known, but that statement may eventually become false *. Maybe it's better to say “Computer scientists believe that, if a problem is NP-complete, then there is no polynomial-time algorithm to solve it”?

    Also, I'm not sure that it's fair to say that NP-completeness of a problem means that the time-complexity of the problem grows exponentially in the input. Again, we think that NP-completeness correlates with exponential time-complexity, but that could change *. For that matter, can't NP-complete problems have super-exponential complexity (like 2^(n^2))—or are you using ‘exponential’ in the generic sense of ‘faster-growing than polynomial’?

    * Although we all know that it won't really. :-)