PerlMonks  

Re^2: Regex weirdness?

by Pic (Scribe)
on Mar 14, 2005 at 20:34 UTC ( [id://439523]=note )



in reply to Re: Regex weirdness?
in thread Regex weirdness?

Of course. What I'm currently using is this:
m/([^"']+|(?:"(?:[^"]|\\")*")|(?:'(?:[^']|\\')*'))/g
Which does what I want. I'll look into the \G variant and see if it makes sense in my head (I occasionally have trouble wrapping my head around regex stuff). My intent is to split a block of text into a list of elements that alternates between quoted strings (quotes included) and unquoted text. For example, the string print ( "some stuff", $more_stuff, "final stuff" ); should become this:
@list = ( q/print ( /, q/"some stuff"/, q/, $more_stuff, /, q/"final stuff"/, q/ );/ )
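
For reference, a minimal runnable sketch of that pattern applied to the example string. The variable name $text is only for illustration, and the input is written with q{} so that $more_stuff stays literal instead of being interpolated.

```perl
use strict;
use warnings;

# q{} keeps $more_stuff literal instead of interpolating it.
my $text = q{print ( "some stuff", $more_stuff, "final stuff" );};

# In list context, m//g returns the captured group from every match.
my @list = $text =~ m/([^"']+|(?:"(?:[^"]|\\")*")|(?:'(?:[^']|\\')*'))/g;

print "[$_]\n" for @list;
# [print ( ]
# ["some stuff"]
# [, $more_stuff, ]
# ["final stuff"]
# [ );]
```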

Replies are listed 'Best First'.
Re^3: Regex weirdness?
by Roy Johnson (Monsignor) on Mar 14, 2005 at 22:37 UTC
    To get a backreference to a quote, you have to put the quote in parens, which means it is going to be returned as a separate group. So I think you're going to have to stay with the separate alternatives for each type of quote.
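
    A small sketch of that side effect, using a hypothetical backreference pattern that is not taken from the thread and ignores backslash escapes. The extra group holding the quote shows up in the result list after every quoted string.

```perl
use strict;
use warnings;

my $text = q{say "hi" and 'bye'};

# Group 2 captures the opening quote so that \2 can demand the same closer.
# In list context, m//g returns every group of every match.
my @got = $text =~ m/((["']).*?\2)/g;

print "[$_]\n" for @got;
# ["hi"]
# ["]
# ['bye']
# [']
```

    Filtering out every second element would work, but keeping one alternative per quote type avoids the extra group altogether.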

    The /x option is absolutely straightforward: any whitespace within your regex is ignored. So you can pretty it up as you like. You can also put comments in it. I recommend you jump right into using it.
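
    As a rough illustration, here is the pattern from the question written once compactly and once spread out under /x. The layout and comments are my own; the two should match identically. Two things to watch: whitespace inside a bracketed character class is still significant, and a comment must not contain the pattern delimiter.

```perl
use strict;
use warnings;

my $text = q{print ( "some stuff", $more_stuff, "final stuff" );};

my @compact = $text =~ m/([^"']+|(?:"(?:[^"]|\\")*")|(?:'(?:[^']|\\')*'))/g;

my @spaced = $text =~ m/
    (
        [^"']+                        # a run of unquoted text
      | (?: " (?: [^"] | \\" )* " )   # a double-quoted string
      | (?: ' (?: [^'] | \\' )* ' )   # a single-quoted string
    )
/gx;

print(join("|", @compact) eq join("|", @spaced) ? "same\n" : "different\n");
# same
```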

    The \G anchor tells the pattern to resume looking from where it last left off with the string. I don't think it's going to help you with what you're trying to do here.
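
    For anyone unfamiliar with \G, a minimal sketch of what "resume where it last left off" looks like in a hand-rolled scanner. This is only an illustration under simplified assumptions (no backslash handling, made-up variable names), not a suggested replacement for the single pattern.

```perl
use strict;
use warnings;

my $text = q{print ( "some stuff", $more_stuff );};
my @tokens;

pos($text) = 0;    # start scanning at the beginning
while (pos($text) < length($text)) {
    # \G pins each attempt to the current pos(); /c keeps pos() on failure.
    if    ($text =~ /\G("[^"]*")/gc) { push @tokens, $1 }
    elsif ($text =~ /\G('[^']*')/gc) { push @tokens, $1 }
    elsif ($text =~ /\G([^"']+)/gc)  { push @tokens, $1 }
    else                             { last }    # unbalanced quote: stop
}

print "[$_]\n" for @tokens;
# [print ( ]
# ["some stuff"]
# [, $more_stuff );]
```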

    I notice that the backslash-protection of quotes doesn't work with your pattern. Consider that, within quotes, you will accept backslash followed by any character, and any run of non-backslash, non-quote characters. Or, you will accept a minimal match of any character leading up to a quote that is not preceded by a backslash. I illustrate both of these here (along with the use of /x):

    my @matches = m/([^"']+
                    |(?: " (?:\\.|[^\\"]+)* " ) # Double quote
                    |(?: ' .*? (?<!\\)' ))/gx;
    Update: note that the second version will not recognize that \\' does not protect the quote.
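
    A short sketch comparing the behaviours described above on a few test strings. The inputs and the names $orig, $dq, $sq and $sq_alt are made up for illustration; $sq_alt is my own assumption of what the first technique looks like when applied to single quotes.

```perl
use strict;
use warnings;

my $orig   = qr/"(?:[^"]|\\")*"/;             # double-quote branch from the question
my $dq     = qr/" (?: \\. | [^\\"]+ )* "/x;   # consume backslash pairs explicitly
my $sq     = qr/' .*? (?<!\\) '/x;            # minimal match plus lookbehind
my $sq_alt = qr/' (?: \\. | [^\\']+ )* '/x;   # assumption: same idea as $dq, for '

my $escaped_dq = q{"a \" b" tail};      # one backslash before the inner quote
my $escaped_sq = q{'a \' b' tail};      # one backslash before the inner quote
my $tricky     = q{'a \\\\' b' tail};   # q{} turns \\\\ into two backslashes

my %hit;
($hit{orig})        = $escaped_dq =~ /($orig)/;     # "a \"      (stops too early)
($hit{dq})          = $escaped_dq =~ /($dq)/;       # "a \" b"
($hit{sq})          = $escaped_sq =~ /($sq)/;       # 'a \' b'
($hit{sq_tricky})   = $tricky     =~ /($sq)/;       # 'a \\' b'  (runs past the real closer)
($hit{alt_tricky})  = $tricky     =~ /($sq_alt)/;   # 'a \\'

printf "%-11s %s\n", $_, $hit{$_} for sort keys %hit;
```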

    Caution: Contents may have been coded under pressure.
