PerlMonks  

regex problem with metachars

by sweetblood (Prior)
on Nov 07, 2003 at 18:00 UTC ( [id://305382]=perlquestion )

sweetblood has asked for the wisdom of the Perl Monks concerning the following question:

I've run into a problem with a script I'm developing that, among other things, changes a delimited file's delimiter, first checking that the new delimiter does not already exist in the data.
Everything works fine unless the delimiter is a regex metacharacter. So if the delimiter is ":" there's no problem, but if the delimiter is "|" there's a big problem, as the expression /$n_delim/ is always true when $n_delim is "|" (the pattern becomes /|/).
I've tried several approaches but they've all failed, as I won't know in advance what the delimiter may be. Most often, though, the delimiter would be "," or "|" or "\t". I've been banging my head against the wall for days with this and I just can't figure it out. The snippet below is from the offending script. For those of you wondering: I do have use strict; and use warnings; at the top.

open(FI, "<", $i_file) or die "Unable to open $i_file: $!";
while (sysread(FI, $_, 2 ** 16)){
    if (/$n_delim/){
        print "Delimiter: $n_delim already in $i_file\n";
        exit 2;
    }
}

Thanks in advance!
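
To see why "|" misbehaves, here is a minimal sketch. The helper name raw_match is hypothetical and not part of the original script. Interpolated as-is, "|" turns the pattern into /|/, which reads as "empty string or empty string" and so matches any input.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $data matches $delim used as a raw pattern.
sub raw_match {
    my ($data, $delim) = @_;
    return $data =~ /$delim/ ? 1 : 0;
}

print raw_match('a,b,c', ':'), "\n";   # 0: no colon in the data
print raw_match('a,b,c', '|'), "\n";   # 1: no pipe either, yet it matches
```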

Replies are listed 'Best First'.
Re: regex problem with metachars
by shenme (Priest) on Nov 07, 2003 at 18:04 UTC
    Use \Q and \E surrounding your possibly offending characters.   This will 'quote' any strange and nasty characters by escaping them with '\'.   Read about these in perlre.
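
A minimal sketch of the \Q...\E suggestion, assuming the delimiter variable holds the actual delimiter character. The helper name has_delim is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the literal text $delim occurs in $data.
sub has_delim {
    my ($data, $delim) = @_;
    return $data =~ /\Q$delim\E/ ? 1 : 0;
}

print has_delim('a,b,c', '|'), "\n";    # 0: no pipe in the data
print has_delim('a|b|c', '|'), "\n";    # 1: pipe found
print has_delim("a\tb",  "\t"), "\n";   # 1: a real tab character still matches
```

If the variable instead holds the two characters backslash and t (for example, typed on a command line), \Q escapes the backslash as well and the pattern looks for that literal two-character text, not for a tab.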
      Thanks for your reply, but using \Q would also put a backslash in front of any escape sequence such as "\t". The same goes for quotemeta().

        That's because the backslash in \t is itself a regex metacharacter. If you only want to escape some metacharacters, you'll need to figure out which ones.

        If you want to escape all metacharacters besides backslash, something like:

        s/([^A-Za-z_0-9\\])/\\$1/g;
        should do it.
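
As a rough sketch of how that substitution behaves, here it is wrapped in a hypothetical helper, escape_except_backslash, and applied to the delimiters mentioned in the question. It assumes "\t" arrives as the two characters backslash and t.

```perl
use strict;
use warnings;

# Hypothetical helper: escape everything except word characters and backslash.
sub escape_except_backslash {
    my ($delim) = @_;
    (my $escaped = $delim) =~ s/([^A-Za-z_0-9\\])/\\$1/g;
    return $escaped;
}

for my $delim ('|', ',', '\t') {    # '\t' here is two characters: backslash, t
    print $delim, ' becomes ', escape_except_backslash($delim), "\n";
}
```

Note that a delimiter consisting of a lone backslash passes through unescaped and would give an invalid pattern.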

        If the delimiter will always be one character, then you can assume anything longer than 1 character is an escape sequence like "\t". So you just call quotemeta() on delimiters with a length of 1.
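
The length check could be sketched like this. The helper name delim_pattern is hypothetical, and the code rests on the same assumption as the paragraph above: anything longer than one character is a regex escape sequence.

```perl
use strict;
use warnings;

# Hypothetical helper: quote single characters, pass longer strings through
# on the assumption that they are escape sequences such as '\t'.
sub delim_pattern {
    my ($delim) = @_;
    return length($delim) == 1 ? quotemeta($delim) : $delim;
}

print delim_pattern('|'),  "\n";   # prints \|
print delim_pattern('\t'), "\n";   # prints \t (two characters, left alone)
```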

        HTH

        Tedrek

Re: regex problem with metachars
by ptkdb (Monk) on Nov 07, 2003 at 18:07 UTC
Re: regex problem with metachars
by injunjoel (Priest) on Nov 08, 2003 at 01:43 UTC
    Greetings,
    Here are my thoughts on the current dilemma.
    Are you sure you want to do this test with a regex?
    I think index() might better serve your needs.
    So the line in your original post
    if (/$n_delim/){
    would be rewritten as
    if (index($_, $n_delim) != -1){
    Seeing as you are not doing anything regex-specific with the value of $n_delim.
    Hope that helps...

    Peace and blessings to you all
    -injunjoel
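
A minimal sketch of the index() approach, with a hypothetical helper name, contains_delim. Because index() does a plain substring search, no character in the delimiter has any special meaning.

```perl
use strict;
use warnings;

# Hypothetical helper: literal substring test, no regex involved.
sub contains_delim {
    my ($data, $delim) = @_;
    return index($data, $delim) != -1 ? 1 : 0;
}

print contains_delim('a,b,c', '|'), "\n";   # 0: no pipe in the data
print contains_delim('a|b|c', '|'), "\n";   # 1: pipe found
```

One thing to keep in mind: index() needs the actual delimiter character, so a delimiter supplied as the text \t would have to be converted to a real tab first.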

Node Type: perlquestion [id://305382]
Approved by HyperZonk
Front-paged by HyperZonk