PerlMonks  

SOLVED - Can a better regex person than me tell me how to fix this?

by boftx (Deacon)
on Dec 06, 2013 at 03:24 UTC ( #1065885=perlquestion )
boftx has asked for the wisdom of the Perl Monks concerning the following question:

I'm not a regex expert by any means, so I'm not sure how to deal with this error message coming from a Win32 test box.

    # ''\518' resolved to '\o{51}8' in regex; marked by <-- HERE in m/^(?:C:\smoker\518 <-- HERE 11x64\cpan\build\Module-Cooker-v0.1.3-Ec2rZJ\blib\lib\Module\Cooker\default)/ at C:\smoker\51811x64\cpan\build\Module-Cooker-v0.1.3-Ec2rZJ\blib\lib/Module/Cooker.pm line 295.
    # '

The code in question looks like this:

    my $std_dir = $self->_profile_dir;
    my $src_type = ( $dir =~ /^(?:$std_dir)/ ) ? 'standard' : 'local';

$self->_profile_dir returns an absolute path created by Cwd::realpath(). It has passed on every *nix platform tested so far, but both Win32 testers have reported this error, so it appears to be specific to Windows. I suspect it has to do with a '\' preceding a digit in the path.

Now that I see it here I'm wondering if I should put $std_dir into a qr//. Can anyone offer some insight here? Thanks!

It helps to remember that the primary goal is to drain the swamp even when you are hip-deep in alligators.

Re: Can a better regex person than me tell me how to fix this? (quotemeta)
by Anonymous Monk on Dec 06, 2013 at 03:32 UTC
    On Windows, paths can contain backslashes, which are regular expression metacharacters, so use quotemeta, i.e. /^(?:\Q$std_dir\E)/
      use re 'debug';
      my $fudge = '.*';
      my $judge = '\d';
      print "$fudge eq $judge = @{[int( $fudge eq $judge )]}\n";
      $judge =~ /$fudge/;
      $judge =~ /\Q$fudge\E/;
      __END__
      .* eq \d = 0
      Compiling REx ".*"
      Final program:
         1: STAR (3)
         2:   REG_ANY (0)
         3: END (0)
      anchored(MBOL) implicit minlen 0
      Matching REx ".*" against "\d"
         0 <> <\d>                 |  1:STAR(3)
                                        REG_ANY can match 2 times out of 2147483647...
         2 <\d> <>                 |  3:  END(0)
      Match successful!
      <<<<<<<<<<<<<<< <<<<<<<<<<<<<<<
      Compiling REx "\.\*"
      Final program:
         1: EXACT <.*> (3)
         3: END (0)
      anchored ".*" at 0 (checking anchored isall) minlen 2
      Guessing start of match in sv for REx "\.\*" against "\d"
      Did not find anchored substr ".*"...
      Match rejected by optimizer
      <<<<<<<<<<<<<<< <<<<<<<<<<<<<<<
      Freeing REx: ".*"
      Freeing REx: "\.\*"
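
      Applied to the code in the question, the quotemeta suggestion might look something like the sketch below. The src_type helper and both paths are made up for illustration (they are not taken from Module::Cooker); the only point is to compare the unquoted and \Q...\E forms of the same match.

```perl
use strict;
use warnings;

# Hypothetical helper, not from Module::Cooker: classify $dir by whether it
# starts with $std_dir, treating $std_dir as literal text via \Q...\E.
sub src_type {
    my ($dir, $std_dir) = @_;
    return ( $dir =~ /^(?:\Q$std_dir\E)/ ) ? 'standard' : 'local';
}

# Made-up Windows-style paths; single quotes keep the backslashes as-is.
my $std_dir = 'C:\smoker\build';
my $dir     = 'C:\smoker\build\default';

# Unquoted interpolation: \s and \b are read as regex escapes, so the
# pattern no longer matches the literal path it was built from.
my $unquoted = ( $dir =~ /^(?:$std_dir)/ ) ? 'standard' : 'local';

print "unquoted: $unquoted\n";
print "quoted:   ", src_type($dir, $std_dir), "\n";
```

      With these sample paths the unquoted match reports 'local' (the \s wants whitespace where the path has a backslash), while the quoted one reports 'standard'.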

      Thanks! That seems to be the fix based on some quick and dirty testing.

Re: SOLVED - Can a better regex person than me tell me how to fix this?
by AnomalousMonk (Abbot) on Dec 06, 2013 at 20:35 UTC

    The answer you've already received is a complete solution to your problem and this reply is rather belated. However:

    I suspect it has to do with a '\' preceding a digit in the path.

    Just be aware that the problem is more extensive than that. In the substring '\smoker', the first pair of characters is interpreted, when interpolated into a regex, as the \s (whitespace) character class. Likewise, the initial pair in '\build' becomes the \b "word" boundary assertion. I also see \c and \l lurking about. These are all perfectly valid regex metasequences (or whatever the proper term should be) and are silently accepted! And simply interpolating $std_dir first into a qr// would not have helped, unless some metaquoting mechanism was also used.
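
    A small sketch of that last point, using a made-up path: a plain qr// still treats \s as a character class, so it silently matches the wrong string, while qr// combined with \Q...\E matches only the literal path. The variable names and the yn helper are illustrative only.

```perl
use strict;
use warnings;

my $path = 'C:\smoker';    # made-up path for illustration

my $plain  = qr/^(?:$path)/;        # \s is still a whitespace class here
my $quoted = qr/^(?:\Q$path\E)/;    # every character is taken literally

# Hypothetical helper to print match results readably.
sub yn { return $_[0] ? 'yes' : 'no' }

for my $candidate ( 'C:\smoker', 'C: moker' ) {
    printf "%-10s plain=%-3s quoted=%s\n",
        $candidate,
        yn( $candidate =~ $plain ),
        yn( $candidate =~ $quoted );
}
```

    Here the plain pattern rejects the real path 'C:\smoker' but accepts 'C: moker', with no warning either way; the quoted pattern does the opposite.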

Node Type: perlquestion [id://1065885]
Approved by GrandFather