PerlMonks  

Escaping quotemeta() in regex

by sweetblood (Parson)
on Nov 30, 2005 at 15:52 UTC (#512972, perlquestion)
sweetblood has asked for the wisdom of the Perl Monks concerning the following question:

Is there a way to craft an expression that thwarts quotemeta()? We have an app that takes user data and applies a regex to it, but first runs quotemeta on the string. This is expected and desirable, but there are times I'd rather pass some metacharacters through. For instance, I might want to pass a string like this:
\somedir\anotherdir\filename.\d{3}$
Once quotemeta is done with this it turns it into:
\\somedir\\anotherdir\\filename\.\\d\{3\}\$
What I'd like is to be able to bypass quotemeta's quoting.

I'd like the string to look like this:

\\somedir\\anotherdir\\filename\.\d{3}$
I had hoped that putting \E before the desired metacharacters would do the trick, but quotemeta just escaped it. I didn't really expect that to work, and I suppose it shouldn't, but I'd still like to find some way to do this.
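For reference, the behaviour described above can be reproduced with a minimal sketch like the following. The variable names are made up for illustration; only quotemeta itself comes from the question.

```perl
use strict;
use warnings;

# The pattern as the user would type it. Single quotes keep the
# backslashes literal, just like data read from user input.
my $input = '\somedir\anotherdir\filename.\d{3}$';

# quotemeta backslash-escapes every non-word character, including the
# backslashes and braces that were meant as regex syntax.
my $quoted = quotemeta $input;
print "$quoted\n";    # prints: \\somedir\\anotherdir\\filename\.\\d\{3\}\$

# At runtime \E is just a backslash followed by "E", so quotemeta
# escapes it like any other backslash.
my $with_e = quotemeta 'filename.\E\d{3}$';
print "$with_e\n";    # prints: filename\.\\E\\d\{3\}\$
```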

Does anyone have any suggestions?

TIA

Sweetblood

Re: Escaping quotemeta() in regex
by Roy Johnson (Monsignor) on Nov 30, 2005 at 16:10 UTC
    After you call quotemeta, do some post-processing. For example,
    my $expr = quotemeta 'foo.\E{2}\Qbar';
    $expr =~ s{\\\\E(.*?)(?:\\\\Q|$)}{
        (my $unprotect = $1) =~ s/\\(\\?)/$1/g;
        $unprotect;
    }ge;
    to handle embedded \E and \Q as needed. It removes any backslash by itself, and one backslash from every pair of backslashes; that should effectively undo a layer of quotemeta, if I'm thinking straight. I found that \E, when part of an interpolated string variable, is not recognized as a legitimate escape, so its use here is just as a flag for your post-processing.
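    Applied to the string from the original question, the same post-processing might look like the sketch below. The sub name quotemeta_with_escapes is hypothetical, and it assumes the user marks the unquoted portion with \E (and optionally ends it with \Q).

```perl
use strict;
use warnings;

# Hypothetical helper: quote everything except the spans the user has
# flagged with \E ... \Q (or \E ... end of string).
sub quotemeta_with_escapes {
    my ($raw) = @_;
    my $expr = quotemeta $raw;

    # After quotemeta the flags appear as \\E and \\Q. Strip one layer
    # of backslashes from whatever lies between them.
    $expr =~ s{\\\\E(.*?)(?:\\\\Q|$)}{
        (my $unprotect = $1) =~ s/\\(\\?)/$1/g;
        $unprotect;
    }ge;

    return $expr;
}

my $pattern = quotemeta_with_escapes('\somedir\anotherdir\filename.\E\d{3}$');
print "$pattern\n";    # prints: \\somedir\\anotherdir\\filename\.\d{3}$

print "match\n" if '\somedir\anotherdir\filename.123' =~ /$pattern/;
```

    Note that anything between the flags is handed to the regex engine as written, so this only makes sense when the people supplying the strings are trusted.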

    Caution: Contents may have been coded under pressure.
Re: Escaping quotemeta() in regex
by ikegami (Pope) on Nov 30, 2005 at 16:34 UTC

    That doesn't really make any sense. Does
    \dir\d\file.txt
    mean
    \\dir\\d\\file\.txt
    or
    \dir\d\\file.txt
    or something else?
    There's no way to tell what the user meant. There's a reason why regexps provide an escape character: it is required.
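    To make the ambiguity concrete, here is a small sketch comparing the two readings listed above. The variable names and the test strings are invented for the illustration.

```perl
use strict;
use warnings;

my $input = '\dir\d\file.txt';

# Reading 1: every character is literal.
my $literal = quotemeta $input;          # \\dir\\d\\file\.txt

# Reading 2: both \d are digit classes, the last backslash is a literal
# backslash, and the dot matches any character.
my $as_regex = '\dir\d\\\\file.txt';     # \dir\d\\file.txt

for my $candidate ('\dir\d\file.txt', '1ir2\fileXtxt') {
    printf "%-18s literal: %-3s regex: %s\n",
        $candidate,
        ($candidate =~ /\A$literal\z/  ? 'yes' : 'no'),
        ($candidate =~ /\A$as_regex\z/ ? 'yes' : 'no');
}
```

    Each reading accepts a string that the other rejects, so the program cannot pick one without extra information from the user.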

      Exactly, this is why I ask if there may be some way to tell quotemeta that this portion of the string should not be escaped because it contains valid regex metacharacters.

      Sweetblood

        What distinguishes the portion that should be escaped from the portion that shouldn't be?
Re: Escaping quotemeta() in regex
by ikegami (Pope) on Nov 30, 2005 at 19:03 UTC
    You might want to look at File::Glob if you're trying to get a list of files. It doesn't allow \d, but it does allow well-known file matching expressions, such as
    • file?.txt,
    • file*.txt,
    • file[0-9].txt and
    • file.{htm,html}.
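    A rough sketch of how two of those patterns behave with bsd_glob from File::Glob follows. The file names are invented and are created in a temporary directory so the example is self-contained.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Scratch directory with a few empty files to match against.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(file1.txt file2.txt fileA.txt file.htm file.html)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
}

# A character class stands in for \d: one digit between "file" and ".txt".
my @digits = sort map { basename($_) } bsd_glob("$dir/file[0-9].txt");

# Brace expansion lists the alternatives explicitly.
my @html = sort map { basename($_) } bsd_glob("$dir/file.{htm,html}");

print "@digits\n";    # prints: file1.txt file2.txt
print "@html\n";      # prints: file.htm file.html
```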
Re: Escaping quotemeta() in regex
by Anonymous Monk on Nov 30, 2005 at 21:35 UTC
    Does anyone have any suggestions?

    Yes.

    In your application, split your string up into parts you want quoted and parts you don't. Then use quotemeta to quote the parts you want quoted. Then put the strings back together again.

    You haven't said what your criteria are for deciding whether or not to quote a given part of a string, so I can't give you specific code to split the string up, but the general idea is: don't try to bypass security features like quotemeta by finding "loopholes". If there are loopholes, there are bugs; and you hope there *aren't* bugs, right? Instead, only apply a feature if you really want it in the first place...
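    As one possible illustration, suppose the application can collect the literal part and the regex part as two separate values. That convention and the sub name build_pattern are assumptions, not something stated in the question.

```perl
use strict;
use warnings;

# Hypothetical convention: the caller passes the literal text and the
# regex fragment separately. Only the literal text goes through quotemeta.
sub build_pattern {
    my ($literal, $regex_tail) = @_;
    return quotemeta($literal) . $regex_tail;
}

my $pattern = build_pattern('\somedir\anotherdir\filename.', '\d{3}$');
print "$pattern\n";    # prints: \\somedir\\anotherdir\\filename\.\d{3}$

print "match\n" if '\somedir\anotherdir\filename.042' =~ /$pattern/;
```

    The regex fragment is used as given, so it should only come from a source you are willing to let write regexes.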

Node Type: perlquestion [id://512972]
Approved by Corion
Front-paged by Corion