
Regex and negative file test

by neilwatson (Priest)
on Dec 06, 2006 at 20:49 UTC (#588182=perlquestion)
neilwatson has asked for the wisdom of the Perl Monks concerning the following question:

If $file = cica2061205.gz (which is a regular file), why is this code matching?
# filter any unwholesome file names
next if ( !-f $file || $file =~ m/
    ^\.|              # begins with a '.'
    \.txt$|           # is a .txt file
    \$|&|<|>|@|\/     # contains any shell control characters
    /x );

Neil Watson

Replies are listed 'Best First'.
Re: Regex and negative file test
by Corion (Pope) on Dec 06, 2006 at 20:56 UTC

    Most likely because there is no file cica2061205.gz in the current directory. Or because the @| array contains something that makes the file match. Or something else. Why don't you try and break the regular expression down and sprinkle the code with debug messages to see closer what's going on?

    # filter any unwholesome file names
    my @reason;
    push @reason, "... a file with that name doesn't exist" if ! -f $file;
    push @reason, "... it starts with a dot" if $file =~ /^\./;
    ...
    if (@reason) {
        warn "Rejecting filename '$file' because ";
        warn $_ for @reason;
        next;
    };

    Also, it is a much saner approach to allow only what is permitted instead of rejecting what you know is bad. For example, I would simply allow sane filenames, instead of trying to weed out not-so-sane filenames:

    $file =~ /^\w[-_\w]+\.\w+$/ or push @reason, "... it doesn't look like a sane filename";
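
    The allow-list idea above can be sketched as a small standalone script. This is only an illustration under assumptions: the helper name looks_sane and the loop over @ARGV are hypothetical and not part of the original code; the pattern is the one given above. Note that this pattern needs at least two characters before the dot and exactly one dot, so names such as a.gz or my-file.tar.gz are rejected.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): true if the name matches
# the allow-list pattern suggested above, false otherwise.
sub looks_sane {
    my ($file) = @_;
    return $file =~ /^\w[-_\w]+\.\w+$/ ? 1 : 0;
}

# Check every name given on the command line and say why it is rejected.
for my $file (@ARGV) {
    my @reason;
    push @reason, "... a file with that name doesn't exist" if !-f $file;
    push @reason, "... it doesn't look like a sane filename"
        unless looks_sane($file);

    if (@reason) {
        warn "Rejecting filename '$file' because\n";
        warn "$_\n" for @reason;
        next;
    }
    print "Accepting '$file'\n";
}
```

    Run it with candidate names as arguments; if the existence test is the one that fires for cica2061205.gz, the file is probably not in the current directory.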
Re: Regex and negative file test
by johngg (Abbot) on Dec 06, 2006 at 21:13 UTC
    I agree with the approach Corion has suggested. One thing springs to mind with your posted code. Your alternation \$|&|<|>|@|\/ would be better expressed as a character class [$&<>@/].



      Your alternation \$|&|<|>|@|\/ would be better expressed as a character class [$&<>@/].

      Not really, no:

      use strict;
      use warnings;

      my $name = 'whatever';
      print $name =~ /[$&<>@/]/ ? 'Yep!' : 'Nope!';


      Unmatched right square bracket at C:\Perl\progs\ line 5, at end of line
      syntax error at C:\Perl\progs\ line 5, near "/[$&<>@/]"
      Search pattern not terminated or ternary operator parsed as search pattern at line 5.

      Drat! Let's escape that pesky slash:

      my $name = 'whatever';
      print $name =~ /[$&<>@\/]/ ? 'Yep!' : 'Nope!';


      Use of uninitialized value in concatenation (.) or string at line 5.
      Nope!

      What's happening now? Ah, perl seems to think that $& means $MATCH. Why can't it DWIM? Let's try this:

      my $name = 'whatever';
      print $name =~ /[&$<>@\/]/ ? 'Yep!' : 'Nope!';

      Aah, that's better!

      Update: fixed typo, closed blockquote tag.

      Update 2: just realised that there's yet another problem. The dollar sign has to be escaped as well:

      my $name = 'what$ever';
      print $name =~ /[\$&<>@\/]/ ? 'Yep!' : 'Nope!';
        Yes, that was stupid of me. Thank you for pointing out my error.

        Part of it is that I always use m{ ... } rather than / ... /, so I have lost the habit of escaping slashes. The other part is that I didn't realise there was interpolation inside a character class, so, as you point out, I failed to escape the $. I ought to escape the @ as well, I suppose. Not sure about the &?

        $ perl -le '
        > use strict; use warnings;
        > print q{whatever} =~ m{[\$&<>\@/]} ? q{Yep} : q{Nope};'
        Nope
        $ perl -le '
        > use strict; use warnings;
        > print q{v$table} =~ m{[\$&<>\@/]} ? q{Yep} : q{Nope};'
        Yep
        $ perl -le '
        > use strict; use warnings;
        > print q{} =~ m{[\$&<>\@/]} ? q{Yep} : q{Nope};'
        Yep
        $
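
        For anyone who wants to check several names at once, here is a minimal sketch built on the escaped character class from the one-liners above. The names $shell_chars and has_shell_char are made up for this illustration, and so are the sample names a&b and dir/file. Compiling the pattern once with qr{...} keeps the escaping in one place, and the brace delimiters mean the slash needs no backslash.

```perl
use strict;
use warnings;

# Escaped character class from the replies above. \$ and \@ stop
# interpolation; & < > and / are literal inside the brackets.
my $shell_chars = qr{[\$&<>\@/]};

# Hypothetical helper: 1 if the name contains any of the characters, else 0.
sub has_shell_char {
    my ($name) = @_;
    return $name =~ $shell_chars ? 1 : 0;
}

for my $name ('whatever', 'v$table', 'cica2061205.gz', 'a&b', 'dir/file') {
    printf "%-16s %s\n", $name, has_shell_char($name) ? 'Yep' : 'Nope';
}
```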


