Re: Are Perl patterns universal?

by blokhead (Monsignor)
on Nov 09, 2004 at 06:41 UTC (#406290)

in reply to Are Perl patterns universal?

The question is trivial if you allow (?{code}) and (??{code}) constructs, as you can encapsulate arbitrary Perl in the regex.

Do you mean, given an arbitrary Turing machine, can you build a regex $foo such that $x =~ /$foo/ if and only if $x is accepted by the Turing machine? Using just lookaheads/lookbehinds and captures/backreferences, I'd definitely say no. You don't even get all CFLs this way (consider {a^n b^n : n > 0}), yet you do get some CSLs (consider {a^n b a^n b a^n : n > 0}). See also Perl regular expressions vs. RL, CFL, CSL.
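As a small illustration of the CSL example above, here is a sketch of a backreference pattern for {a^n b a^n b a^n : n > 0}. It is written in Python rather than Perl on the assumption that `re` treats a plain capture and backreference the same way for this pattern; the names `TRIPLE` and `in_language` are made up for the example.

```python
import re

# One capture remembers the first run of a's; the two backreferences
# force the second and third runs to be exactly the same length.
TRIPLE = re.compile(r"(a+)b\1b\1")

def in_language(s):
    """Return True if s has the form a^n b a^n b a^n with n > 0."""
    return TRIPLE.fullmatch(s) is not None
```

The equivalent Perl pattern would presumably be /^(a+)b\1b\1$/.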

A Turing machine must have unbounded memory. A Perl regex using only captures and backreferences has a bounded (linear in the size of the input) amount of "memory" about its input: the number of captures is fixed when the regular expression is compiled, and each capture contains at most the entire string. Not only that, but access to this memory is very strictly limited: you can only match substrings. In addition, you don't have write access to this memory in any sort of arbitrary fashion (apart from trying a different substring of the input, which is hardly arbitrary and could be viewed as just a form of nondeterminism). Arbitrary write access is essential to the power of a Turing machine.
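To make the "bounded memory" point concrete, here is a rough sketch (in Python, assuming its `re` module is a fair stand-in for this purpose) showing that the number of captures is a property of the compiled pattern and that each capture is just a pair of offsets into the input. The pattern and the helper name `capture_spans` are invented for illustration.

```python
import re

# The capture count is fixed when the pattern is compiled,
# no matter how long the input turns out to be.
PATTERN = re.compile(r"(a+)(b+)\1\2")

def capture_spans(s):
    """Return the (start, end) offsets of every capture, or None if s does not match."""
    m = PATTERN.fullmatch(s)
    if m is None:
        return None
    # All the "memory" the match holds: PATTERN.groups pairs of positions in s.
    return [m.span(i) for i in range(1, PATTERN.groups + 1)]
```

For an input of length n there are only on the order of n^2 possible spans per capture, so the whole capture state is polynomially bounded in n.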

Even allowing (??{$re}) just gets you all CFLs (and various closures -- intersection and complement, etc), but doesn't allow for universal behavior because the unbounded memory just isn't there.

To do most "interesting" things with regexes, you'll need a layer of Perl around (or inside) the regex, to do either iterated substitutions, multiple matches, or build a different regex for each input. The last approach is a fun one used by various regex-reductions: Hamiltonian Cycle, 3SAT, N-Queens.
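A sketch of the "build a different regex for each input" idea, in the style of a SAT-to-regex reduction. This is not necessarily how the linked nodes do it; it is one common construction, written in Python on the assumption that backreferences to empty captures behave as in Perl. The function names and the DIMACS-like clause encoding (3 means x3, -3 means NOT x3) are assumptions made for the example.

```python
import re

def sat_regex(n_vars, clauses):
    """Build (subject, pattern) such that pattern matches subject iff the CNF is satisfiable."""
    # One 'x' per variable, then one "x," per clause.
    subject = "x" * n_vars + ";" + "x," * len(clauses)
    # Group i captures "x" (variable i true) or "" (variable i false).
    assign = "(x?)" * n_vars
    parts = []
    for clause in clauses:
        alts = []
        for lit in clause:
            if lit > 0:
                alts.append("\\%d" % lit)       # matches "x" only if the variable is true
            else:
                alts.append("\\%dx" % -lit)     # matches "x" only if the variable is false
        parts.append("(?:" + "|".join(alts) + "),")
    pattern = "^" + assign + "x*;" + "".join(parts) + "$"
    return subject, pattern

def satisfiable(n_vars, clauses):
    subject, pattern = sat_regex(n_vars, clauses)
    return re.search(pattern, subject) is not None
```

The regex engine's backtracking over the `(x?)` groups does the search over truth assignments, so running time can be exponential; the surrounding code only builds the pattern.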


Replies are listed 'Best First'.
Re^2: Are Perl patterns universal?
by sleepingsquirrel (Hermit) on Nov 09, 2004 at 17:30 UTC
    A Turing machine must have unbounded memory.
    Yeah. But I'm encouraged because of the possibility of infinite recursion (from above).
    In addition, you don't have write access to this memory in any sort of arbitrary fashion
    It is not obvious to me how to do arbitrary memory access in any of the following, but they're all universal.

    -- All code is 100% tested and functional unless otherwise noted.
      infinite recursion
      Recursion with (??{$re}) is only sufficient for context-free matching. Even primitive recursion requires some sort of argument passing. Pretty much the only thing you can "pass" is the current pos of the string, which is way too restricted: You have only a fixed number of values for pos, and a fixed number of regexes you could be "recursing" to, so you can answer the halting problem for these creatures (see footnote below).
      cellular automata
      Here the grid of automata must be unbounded. Otherwise, you only have a finite number of possible grid configurations (number of automata states ^ size of grid).
      Diophantine equations
      The number of variables in the equation is fixed, but their values can be arbitrarily large integers. If their values are bounded, then you only have a finite number of combinations to try (possible values ^ number of variables), and you can always halt while determining if a Diophantine equation has a solution.
      cyclical tag systems
      This is just like an automaton with a queue -- take from the front and add to the back. But you must allow for rules which increase the size of data in the queue, which can happen indefinitely. If you are not allowed to increase the size of the queue data (or if you have an upper limit on the queue size), you only have a finite number of queue contents and thus configurations of the automaton (number of states * (queue alphabet size ^ max queue size)).
      SK combinators
      I don't pretend to have any special insight on SK combinators. But what you have is a very restricted projection operator K, and a very restricted recursion operator S which still encompasses primitive recursion and μ recursion. The μ recursion is the key part of the universality of general recursive functions, as the value being minimized may grow arbitrarily large.

      Footnote: when a system has a finite number (say, N) of possible configurations on a given input, you can answer the halting problem for it as follows (where "halting" means entering some special subset of configurations): Simulate it for N steps. If it hasn't reached a halting configuration by then, it has visited N+1 configurations and so must have repeated one. Since the next configuration depends only on the previous configuration, it must be in an infinite loop and thus will never reach a halting configuration. Turing machines have an infinite tape and thus an infinite number of possible configurations.

      Clearly if you can answer halting queries on a system, it is not universal (Halting Problem).
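The footnote's argument can be sketched as a short decision procedure. This is a minimal Python illustration, assuming the system is deterministic and given as a hypothetical `step` function on configurations plus an `is_halting` predicate; none of these names come from the thread.

```python
def reaches_halt(step, start, is_halting, n_configs):
    """Decide whether a deterministic system with at most n_configs
    distinct configurations ever enters a halting configuration."""
    config = start
    for _ in range(n_configs):
        if is_halting(config):
            return True
        config = step(config)
    # n_configs steps taken, so n_configs + 1 configurations were visited:
    # if the last one is not halting either, some configuration repeated
    # and the system is stuck in a loop.
    return is_halting(config)
```

For example, with configurations 0..9 and `step = lambda c: (c * 2) % 10` starting from 1, the system cycles through 2, 4, 8, 6 forever, so it reaches 6 but never 5. No such bound N exists for a Turing machine, which is why the same procedure does not apply there.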

