http://www.perlmonks.org?node_id=598451

Sifmole has asked for the wisdom of the Perl Monks concerning the following question:

All,

Is anyone aware of a library ( or other method ) of converting a Perl regex to the POSIX format of the same regex?

Re: Perl regex to POSIX
by quester (Vicar) on Feb 06, 2007 at 07:39 UTC
    You might want to elaborate a little more on the actual regexes that you want to convert, and what program you will be feeding those converted regexes into. I think the reason no one has posted anything yet is that the most general case of a "perl regex" includes all manner of horrors that would be very difficult to translate properly, but are also not heavily used.

    Offhand, it looks like anything that appears in the perlre documentation before the topic (?=pattern) should be tolerable to convert to the POSIX regex format in, for example, http://www.tin.org/bin/man.cgi?section=7&topic=regex. The expressions described after that point in perlre are likely to be difficult. On the other hand, all of them except the zero-width lookahead and lookbehind assertions are marked "experimental", so it's quite likely that your particular REs don't use any of those features.

      I appreciate the input.

      Unfortunately, what I said just about covers the details, but I will try again.

      There is an application which allows users to input regular expressions, which are saved and used later for different operations. Each input regular expression is validated by compiling it with the qr// operator and is accepted if that does not throw an error. This has worked fine so far, because the later processing was being performed in Perl code.
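
The validation step described above might look roughly like the following. This is only a sketch: the helper name is_valid_perl_regex and the sample patterns are made up for illustration and are not taken from the application in question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $pattern compiles as a Perl regex,
# 0 otherwise. qr// dies on a malformed pattern and eval traps that.
sub is_valid_perl_regex {
    my ($pattern) = @_;
    my $compiled = eval { qr/$pattern/ };
    return defined $compiled ? 1 : 0;
}

# Made-up sample input: one well-formed pattern and two broken ones.
for my $candidate ('^\d{3}-\d{4}$', 'foo(bar', '[a-z') {
    printf "%-15s %s\n", $candidate,
        is_valid_perl_regex($candidate) ? 'accepted' : 'rejected';
}
```

A check like this only proves that Perl accepts the pattern. It says nothing about whether another regex engine will accept it, which is why the saved expressions need a second look before they are moved.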

      I have a requirement that involves moving that same processing into a database model ( Postgres ) and support the existing regular expressions which users have already saved. I thought to convert the existing expressions to POSIX format because Postgres supports that.

      Thanks in advance for any help.

        That's actually a much better problem description because you've a) nailed down the why behind what you're attempting to do which can dispel any lingering hints of X-Y problems, and b) given a more concrete target (convert Perl regexen into POSIX regexen as understood by Postgres).

        You might look into building a convertor on top of YAPE::Regex. It would parse an existing regex and then walk it emitting a Postgres regex equivalent (or throwing an error if it encounters something like (?{}) which doesn't have an analogue).

        Update: D'oh! Forgot that Pg can embed Perl; that'd be a much better route if you can swing it than translating. Go with the suggestion below if you can; parse and convert as a last resort.

Re: Perl regex to POSIX
by PreferredUserName (Pilgrim) on Feb 06, 2007 at 15:14 UTC
    As someone said, it's not easy in general, since you can embed various perlisms into Perl regexes that POSIX regexes have no way of knowing about (lacking a perl interpreter).

    A different approach would be to use Perl as a server-side language on Postgres. This page tells you how: http://www.oreillynet.com/pub/a/databases/2005/11/10/using-perl-in-postgresql.html

    Their example is:

    postgres$ createlang plperlu mydb
    A Simple Example

    The easy way to show how to use PL/Perl is to create a very simple function; one that would be a lot harder to do otherwise. Suppose that you want to test if a given piece of text is a palindrome (a word that reads the same backwards as forwards), disregarding white space and the case of the letters. Here's a piece of SQL to define the function:

    create function palindrome(text) returns boolean
    language plperl immutable as $$
        my $arg = shift;
        (my $canonical = lc $arg) =~ s/\s+//g;
        return ($canonical eq reverse $canonical) ? "true" : "false";
    $$;

    Given this function, you can write SQL like:

    select name, region, country, population from towns where palindrome(name);

      This is an option I was aware of but I wasn't sure of something related to it:

      Would Perl also need to be installed on the machine where the database is running? Or is the needed "whatever" included with the Postgres install/setup?

Re: Perl regex to POSIX
by PreferredUserName (Pilgrim) on Feb 06, 2007 at 15:31 UTC
    On the third hand ... it might also be adequate to simply *not cover* the general case.

    Change all the easy stuff (like \s, \w, etc.) into their POSIX equivalents, and just balk at any (?...) cases you don't cover. Unless you have very Perl-literate users, that might be good enough.
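
A minimal sketch of that idea follows. The helper name perl_to_posix and the mapping table are assumptions made for illustration. The mapping is only approximate (Perl's \w can match more than [[:alnum:]_] on Unicode strings), and the sketch deliberately refuses anything it does not recognize, including any backslash inside a bracket expression. Postgres's own regex flavor may already accept some of these shorthands, so check its documentation before translating more than you need to.

```perl
use strict;
use warnings;

# Assumed mapping from Perl shorthand escapes to POSIX bracket expressions.
my %POSIX_CLASS = (
    d => '[[:digit:]]',  D => '[^[:digit:]]',
    s => '[[:space:]]',  S => '[^[:space:]]',
    w => '[[:alnum:]_]', W => '[^[:alnum:]_]',
);

# Hypothetical helper: returns the translated pattern, or dies on any
# construct it does not know how to convert.
sub perl_to_posix {
    my ($re) = @_;
    my $out        = '';
    my $in_bracket = 0;    # true while scanning inside [...]
    pos($re) = 0;
    while (pos($re) < length $re) {
        if ($re =~ /\G\\(.)/gcs) {                      # backslash escape
            my $c = $1;
            die "backslash inside [...] is not handled\n" if $in_bracket;
            if    (exists $POSIX_CLASS{$c}) { $out .= $POSIX_CLASS{$c} }
            elsif ($c =~ /[A-Za-z0-9]/)     { die "unsupported escape \\$c\n" }
            else                            { $out .= "\\$c" }  # escaped punctuation
        }
        elsif (!$in_bracket && $re =~ /\G\(\?/gc) {     # (?...) extension
            die "unsupported (?...) construct\n";
        }
        elsif ($re =~ /\G(.)/gcs) {                     # any other character
            my $c = $1;
            if    ($c eq '[' && !$in_bracket) { $in_bracket = 1 }
            elsif ($c eq ']' &&  $in_bracket) { $in_bracket = 0 }
            $out .= $c;
        }
    }
    return $out;
}

print perl_to_posix('^\s*\w+\.com$'), "\n";
# prints: ^[[:space:]]*[[:alnum:]_]+\.com$

# Unsupported input is reported rather than silently mistranslated.
my $converted = eval { perl_to_posix('foo(?=bar)') };
print "rejected: $@" unless defined $converted;
```

Walking the pattern one token at a time, instead of running a global substitution, keeps an escaped parenthesis such as \(? from being mistaken for a (?...) group. The bracket tracking is naive (it does not handle a leading ] inside a class), so treat it as a starting point only.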

    Professional software development tip: say you'll cover those cases "in a future release", but then never do it. By the time the future arrives, you'll probably have moved on to greener pastures! :)

      This tends to only work when you work for people that don't know software. Unfortunately, I work for people that actually understand what we do; and they know this trick as well.