Re: Regular Expresssion TroubleShoot Help plz

by hv (Prior)
on Mar 29, 2006 at 02:20 UTC ( [id://539850]=note )


in reply to Regular Expresssion TroubleShoot Help plz

The problem is precisely that \1 in a character class is not a backreference: it refers to the ASCII character chr(1), an abbreviation of the octal escape sequence \001.
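A quick way to see this for yourself is the small sketch below. The test strings are made up for illustration; the point is only that the class `[\1]` matches chr(1), and that `[^\1]` happily accepts a repeat of whatever group 1 captured.

```perl
use strict;
use warnings;

# Inside a character class, \1 is the octal escape for chr(1),
# so [\1] matches that control character, not any captured text.
my $ctrl_matches = ( "\x01" =~ /\A[\1]\z/ ) ? 1 : 0;

# Here (\W) captures '^', yet [^\1] still accepts a second '^',
# because the class only excludes chr(1).
my $caret_matches = ( '^^' =~ /\A(\W)[^\1]\z/ ) ? 1 : 0;

print "chr(1) matches [\\1]: $ctrl_matches\n";
print "'^^' matches (\\W)[^\\1]: $caret_matches\n";
```

Both flags should come out true, which is why a negated class cannot be used to mean "anything except the opening delimiter".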

You can achieve what you want with a slightly more complicated approach using negative lookahead:

    m{
        \A                  # anchor to start
        (\W)                # open text
        (?: (?!\1) . )*     # anything that isn't the closer
        \1                  # close text
        (\W)                # separator
    }xs
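As a rough illustration of how the lookahead version behaves, here is the same pattern wrapped in `qr{}` with one extra capture around the content. The extra capture, the variable names and both sample strings are assumptions added for this sketch, not part of the original question.

```perl
use strict;
use warnings;

# Lookahead pattern with an added capture around the content
# (the extra group is for illustration only).
my $re = qr{
    \A                    # anchor to start
    (\W)                  # open text
    ( (?: (?!\1) . )* )   # anything that isn't the closer
    \1                    # close text
    (\W)                  # separator
}xs;

# Hypothetical sample record: '^' qualifies text, '|' separates fields.
my $line = qq{^snafu^|^foobar^\n};
my ( $qual, $content, $sep ) = $line =~ $re;
print "qualifier=$qual content=$content separator=$sep\n";

# The content can never contain the qualifier, so a stray '^'
# inside the first field makes the whole match fail.
my $stray_ok = ( '^a^b^|' =~ $re ) ? 1 : 0;
print "stray qualifier accepted: $stray_ok\n";
```

The second test is the property the lookahead buys you: the field body is guaranteed free of the closer, rather than merely stopping at the first closer that lets the rest of the pattern succeed.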

That works for the general case, when you simply want to match a bunch of stuff not containing a given substring. In this case though, you want to match "up to the first occurrence" of that substring, so it's much simpler - you just need a minimal match:

    m{
        \A          # anchor to start
        (\W)        # open text
        .*?         # anything contained, up to the ...
        \1          # ... close text
        (\W)        # separator
    }xs

(I've taken the liberty of replacing your '+' with '*', on the assumption that you want to allow empty fields.)
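To make the difference between `*` and `+` concrete, the sketch below runs both minimal-match variants against a made-up record whose first field is empty. The sample string and variable names are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical record whose first field is empty.
my $empty = qq{^^|^foobar^\n};

# '*' lets the field be empty, so the separator is found right away.
my ( $q_star, $sep_star ) = $empty =~ m{ \A (\W) .*? \1 (\W) }xs;

# '+' demands at least one character, so the match has to stretch
# until some later qualifier is followed by a non-word character.
my ( $q_plus, $sep_plus ) = $empty =~ m{ \A (\W) .+? \1 (\W) }xs;

printf "star: separator is %s\n", $sep_star eq "\n" ? 'newline' : $sep_star;
printf "plus: separator is %s\n", $sep_plus eq "\n" ? 'newline' : $sep_plus;
```

With this input the `+` version does not fail outright; it still matches, but only by stretching to the end of the line, so it reports the trailing newline as the separator. That kind of silent misdetection is a good reason to allow empty fields.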

Hugo

Re^2: Regular Expresssion TroubleShoot Help plz
by MajingaZ (Beadle) on Apr 12, 2006 at 01:30 UTC
    $foo = qq{^snafu^|^foobar^\n};
    $foo =~ m/\A(\W)    # \A instead of ^ and match first non-word
              .+?       # +? Minimal match everything that isn't in \1
              \1(\W)    # Match non-word following the 2nd \1
             /xms;
    $TEXT_QUAL = $1;
    $FIELD_SEP = $2;


    Having now found the proper way to acquire the delimiters, the next question is how to use these newfound delimiters.
    Instead of creating one large regex, I'd prefer to store them in scalars, which is the core of this particular problem.
    $foo =~ /\G$TEXT_QUAL(.*?)$TEXT_QUAL[$FIELD_SEP\n]/xmsgc;
    Fails to work since the qualifiers are metacharacters used in regular expressions.


    $foo =~ /\G\$TEXT_QUAL(.*?)\$TEXT_QUAL[\$FIELD_SEP\n]/xmsgc;
    Fails to work as \$ is a literal $ followed by the name.


    $foo =~ /\G\\$TEXT_QUAL(.*?)\\$TEXT_QUAL[\\$FIELD_SEP\n]/xmsgc;
    Also fails to work, as \\ is a literal \. The only way I've found is:


    $LIT_TEXT_QUAL = qq{\\$TEXT_QUAL};
    $LIT_FIELD_SEP = qq{\\$FIELD_SEP};
    $foo =~ /\G$LIT_TEXT_QUAL(.*?)$LIT_TEXT_QUAL[$LIT_FIELD_SEP\n]/xmsgc;


    I do have reasons for using all those flags, as will show as this thread continues. However, with the intent of getting discrete answers to smaller problems, I'm hoping to reduce the amount of new information my brain will have to process.

    Basically, this post is looking for a way to use any variable in a regex that may or may not contain metacharacters. Edit: OK, yeah, missed the boat on this one; the answer is just the quotemeta function, from the CB. Thanks, guys!
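For anyone landing here later, a minimal sketch of the quotemeta approach might look like the following. The delimiter values are hard-coded on the assumption that they were detected as above, and the helper names `$q`, `$s` and `@fields` are made up for this example.

```perl
use strict;
use warnings;

my $foo = qq{^snafu^|^foobar^\n};

# Assumed to have been detected earlier, as in the snippet above.
my ( $TEXT_QUAL, $FIELD_SEP ) = ( '^', '|' );

# quotemeta backslash-escapes every non-word character, so the
# delimiters are matched literally when interpolated into a regex.
my $q = quotemeta $TEXT_QUAL;
my $s = quotemeta $FIELD_SEP;

# Walk the record field by field; \G continues where the last match ended.
my @fields;
while ( $foo =~ /\G $q (.*?) $q (?: $s | \n )/xmsgc ) {
    push @fields, $1;
}
print join( ',', @fields ), "\n";
```

The same escaping can be written inline with `\Q$TEXT_QUAL\E` instead of a separate variable. The grouped alternation is used here in place of the bracketed class partly as a precaution: a scalar followed directly by `[` inside a pattern can be read by Perl as an array subscript, and the spaces permitted by /x sidestep that guesswork.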
