
regex escaping forward slash in regex

by gilemon (Initiate)
on Dec 09, 2009 at 16:22 UTC
gilemon has asked for the wisdom of the Perl Monks concerning the following question:

I'm writing a Perl script to replace deprecated PHP 5.2 functions like split.
I'm hitting a problem when it comes to replacing something like:

So far I've come up with this regex:


but this obviously doesn't work if the regex contains the forward slash character.
I also tried this:


which escapes too much, as it escapes every metacharacter.

Any secret escape sequence code that only escape forward slashes?
Any idea?

Replies are listed 'Best First'.
Re: regex escaping forward slash in regex
by moritz (Cardinal) on Dec 09, 2009 at 16:45 UTC
    Any secret escape sequence code that only escape forward slashes?

    Don't try to do it all at once. Write a simple function that escapes forward slashes, and call that in the replacement part of your substitution. You can use s{}{}ge to evaluate the replacement part as code, and comfortably call functions there.
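A minimal sketch of that suggestion, under stated assumptions: the helper names `escape_slashes` and `rewrite_split` are hypothetical, the pattern argument is assumed to be a simple quoted string with no escaped quotes inside it, and the output is assumed to need `/` delimiters around the pattern. It is an illustration of calling a function from an `s{}{}ge` replacement, not a complete PHP rewriter.

```perl
use strict;
use warnings;

# Hypothetical helper: escape every forward slash in a string.
sub escape_slashes {
    my ($text) = @_;
    $text =~ s{/}{\\/}g;
    return $text;
}

# Hypothetical rewriter: turn  split('PATTERN'  into  preg_split('/PATTERN/'.
# Assumes the first argument is a plain quoted string without escaped quotes.
sub rewrite_split {
    my ($code) = @_;
    $code =~ s{\bsplit(\s*\(\s*)(['"])(.*?)\2}{
        # Copy the captures before calling any code that runs its own regex.
        my ($open, $quote, $body) = ($1, $2, $3);
        "preg_split$open$quote/" . escape_slashes($body) . "/$quote"
    }ge;
    return $code;
}

print rewrite_split(q{$parts = split('a/b', $string);}), "\n";
# prints: $parts = preg_split('/a\/b/', $string);
```

The `\b` before `split` keeps an existing `preg_split` from being rewritten a second time, because `_` counts as a word character.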

Re: regex escaping forward slash in regex
by Marshall (Abbot) on Dec 09, 2009 at 17:59 UTC
    I don't know much about PHP, but here is a straightforward idea that might make some sense or I hope at least give you something to work from. The general idea is to break this down into smaller "chunks". I assume that the code that your Perl program "re-writes" does compile and that parens match up, etc. Compressing the spaces is also a simplifying assumption.
    #!/usr/bin/perl -w
    use strict;

    my $line = q{ split( '/', $string ) };
    $line =~ s/ //g;   # compress spaces
    # now just "split('/',$string)"

    if ($line =~ m/^split/)   # check for the split keyword
    {
        # get (parm1, parm2) of the split, ie the two
        # things separated by commas within the parens
        my ($parm1,$parm2) = ($line =~ m/\((.*?),(.*?)\)/)[0,1];

        # now get stuff between quotes in parm 1
        my $inside_quote = ($parm1 =~ m|'(.*?)'|)[0];

        # change any / to \/
        $inside_quote =~ s|/|\\/|g;

        # now just print back out
        print "preg_split('/$inside_quote/',$parm2)\n";
    }
    __END__
    Prints:
    preg_split('/\//',$string)

Re: regex escaping forward slash in regex
by ikegami (Pope) on Dec 09, 2009 at 17:27 UTC

    What's with "\s*?"? Don't you simply mean "\s*"?


    s{
        (?<!_)
        ( split \s* \( \s* )
        (
            " (?:[^\\"]+|\\.)* "
        |
            ' (?:[^\\']+|\\.)* '
        )
    }{
        my @x = ($1,$2);
        $x[1] =~ s{/}{\\/}g;
        "preg_$x[0]$x[1]"
    }xesg


    This will create an error if the "/" is already escaped: an existing \/ becomes \\/.
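One possible way around the already-escaped case, as a sketch only: match backslash-escaped pairs as whole units and leave them alone, and escape only a bare slash. The helper name `escape_unescaped_slashes` is hypothetical and is not part of the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: escape "/" only where it is not already escaped.
sub escape_unescaped_slashes {
    my ($text) = @_;
    # Alternative 1 consumes a backslash plus the character it escapes,
    # so "\/" and "\\" pass through unchanged. Alternative 2 is a bare "/".
    $text =~ s{(\\.)|/}{ defined $1 ? $1 : '\\/' }gse;
    return $text;
}

print escape_unescaped_slashes('a/b'),  "\n";   # prints: a\/b
print escape_unescaped_slashes('a\/b'), "\n";   # prints: a\/b
```

Because an escaped backslash is consumed as a pair, a slash that follows `\\` is still treated as bare and gets escaped.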

Re: regex escaping forward slash in regex
by JadeNB (Chaplain) on Dec 10, 2009 at 01:19 UTC

    ‘Dumb’ search-and-replaces (and I mean that as a comment on the code, not on you!) always strike fear in my heart; if we can't even parse such a rigid language as XML with regexes (which we can't, right? Or at least no sane person would?), how can we expect correctly to parse the rich grammar of a programming language? I always think of clbuttic.

    If this were my job, the first thing I'd do would be to look at some means of getting at the internal, not textual, representation of a PHP program. The first result for PHP + AST is php-ast; I'm not sure if it does what you want, but you might be able to fold, spindle, and mutilate it, or else just look further down in the results.

      Dumb search-and-replaces always strike fear in my heart

      I do dumb search-and-replaces a hundred times a day, mostly with my editor's search and replace box. There's nothing wrong with it if the process is supervised.

      In the OP's case, he can use a visual diff tool — I love Beyond Compare — to compare the pre- and post-change versions of his files and fix the mistakes.

      And then he has to do a manual search to change the ones the tool didn't catch.

Re: regex escaping forward slash in regex
by gilemon (Initiate) on Dec 10, 2009 at 09:01 UTC
    Thanks a LOT for the answers.

    It's now working as I wanted here:

    Basically, as you said, I had to split the job into easier chunks.

    As for the search and replace "paradigm", I do agree it's not the best solution. It is actually a very stressful way to go. I have bumped into automatic code modification a few times now, and the AST approach is definitely the way to go.
    My solution can't deal with cases like:
    $myreg = '/'

    But I guess this approach will do well enough for what it is intended for...

    Thanks Again!
