PerlMonks  

Re: Re: Re: A few random questions from Learning Perl 3

by theorbtwo (Prior)
on Jan 07, 2003 at 04:37 UTC


in reply to Re: Re: A few random questions from Learning Perl 3
in thread A few random questions from Learning Perl 3

You're right, and you're wrong... I'm fairly certain that while ordinary regular expressions aren't up to parsing HTML, even on a theoretical basis, Perl regular expressions are a whole 'nother breed. Regular expressions with backreferences are NP-complete; it's been proven at least twice. (Well, three times, but one of them is buggy.) I suspect I'm missing something here... if anybody knows what (other than my mind), I'd love to hear it.


Warning: Unless otherwise stated, code is untested. Do not use without understanding. Code is posted in the hopes it is useful, but without warranty. All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node).

Perl regular expressions vs. RL, CFL, CSL
by gjb (Vicar) on Jan 07, 2003 at 05:40 UTC

    NP-completeness is a property of an algorithm. It implies that no algorithm is known to solve the problem in polynomial time.
    This means that if you increase the length of the input for the problem, the execution time will increase exponentially. (Of course there are input cases which are polynomial, but many of interest are not). Essentially, it means that brute force is the only known method to tackle the problem exactly.

    The question is about the relation between the behavior of an algorithm that decides a language and the class to which this language belongs. For regular languages and context free languages polynomial time algorithms are known, but since regular expressions with backreferences are proven to be NP-complete, does this necessarily mean that the languages they describe are a superset of the regular and context free languages?

    It certainly means it is hard to decide whether or not a certain string is an element of the language described by a regular expression with backreferences. But what does it tell us about the expressive power?

    The expression /^(.*)\1$/ defines the language {ww | w in sigma*}, which is known to be neither regular nor context free. On the other hand, regular expressions with backreferences can't describe {a^n b^n | n >= 0}, which is definitely context free.
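As a quick sanity check of the first claim, here is a minimal sketch that runs the backreference expression against a few strings. The helper name `is_square` and the sample strings are my own, chosen only for illustration; the pattern itself is the one quoted above.

```perl
use strict;
use warnings;

# The backreference pattern from the post: group 1 captures w,
# and \1 demands the very same text again, so only ww can match.
my $square = qr/^(.*)\1$/;

# Hypothetical helper: true if the string has the form ww.
sub is_square {
    my ($s) = @_;
    return $s =~ $square ? 1 : 0;
}

for my $s ('', 'aa', 'abab', 'abcabc', 'aba', 'abcab') {
    printf "%-10s %s\n", "'$s'", is_square($s) ? 'is of the form ww' : 'is not of the form ww';
}
```

Note that `.` does not match a newline without the /s modifier, so this sketch only considers single-line strings.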

    So on the one hand, regular expressions with backreferences describe languages that are not context free, but on the other hand they can't describe all context free languages either! This example illustrates that one has to be very careful when judging expressive power from algorithmic complexity. A high complexity is a sign that the expressive power must be high in some cases, but doesn't guarantee that everything can be done.

    Incidentally, the code below shows two Perl regular expressions that describe non-regular languages:

    { a^n b^n | n >= 0 }
        /^ (a*) (??{ sprintf("b{%d}", length($1)) }) $/x
    which is context free as mentioned above, and
    { a^n b^n c^n | n >= 0 }
        /^ (a*) (??{ sprintf("b{%d}", length($1)) }) (??{ sprintf("c{%d}", length($1)) }) $/x
    which is context sensitive.
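To try the two expressions out, a small sketch along these lines should do. The helper names `in_anbn` and `in_anbncn` and the test strings are assumptions of mine, not part of the original post; the patterns are the two given above, stored with qr//.

```perl
use strict;
use warnings;

# (??{ ... }) runs Perl code during the match and uses the returned
# string as a sub-pattern, so "b{%d}" becomes e.g. b{2} when $1 is "aa".
my $anbn   = qr/^ (a*) (??{ sprintf("b{%d}", length($1)) }) $/x;
my $anbncn = qr/^ (a*) (??{ sprintf("b{%d}", length($1)) }) (??{ sprintf("c{%d}", length($1)) }) $/x;

# Hypothetical helpers wrapping the two patterns.
sub in_anbn   { my ($s) = @_; return $s =~ $anbn   ? 1 : 0 }
sub in_anbncn { my ($s) = @_; return $s =~ $anbncn ? 1 : 0 }

for my $s ('', 'ab', 'aabb', 'aab', 'abb', 'ba') {
    printf "%-10s a^n b^n:     %s\n", "'$s'", in_anbn($s) ? 'yes' : 'no';
}
for my $s ('', 'abc', 'aabbcc', 'aabbc', 'aabcc', 'abcabc') {
    printf "%-10s a^n b^n c^n: %s\n", "'$s'", in_anbncn($s) ? 'yes' : 'no';
}
```

The patterns are written as literals, so no `use re 'eval'` is needed; that pragma only comes into play when a pattern containing code blocks is built from an interpolated string at run time.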

    Just my 2 cents, -gjb-

      NP-completeness is a property of an algorithm. It implies that no algorithm is known to solve the problem in polynomial time.
      This means that if you increase the length of the input for the problem, the execution time will increase exponentially. (Of course there are input cases which are polynomial, but many of interest are not). Essentially, it means that brute force is the only known method to tackle the problem exactly.

      I think that this may be a little misleading. Right now (as 6 years ago), NP-completeness of a problem means that no polynomial-time algorithm is known, but that statement may eventually become false *. Maybe it's better to say “Computer scientists believe that, if a problem is NP-complete, then there is no polynomial-time algorithm to solve it”?

      Also, I'm not sure that it's fair to say that NP-completeness of a problem means that the time-complexity of the problem grows exponentially in the input. Again, we think that NP-completeness correlates with exponential time-complexity, but that could change *. For that matter, can't NP-complete problems have super-exponential complexity (like 2^(n^2))—or are you using ‘exponential’ in the generic sense of ‘faster-growing than polynomial’?

      * Although we all know that it won't really. :-)
