PerlMonks  

how to state this by REL

by whatluo (Novice)
on Nov 09, 2005 at 05:42 UTC

whatluo has asked for the wisdom of the Perl Monks concerning the following question:

I need to check that a name conforms to the following conventions: valid names consist only of alphanumeric characters (a-z, A-Z, and 0-9) plus the characters '-', '_', and '.'. Names must begin with a letter.

Someone told me to use this:

$name =~ /\A[a-zA-Z][\w.-]*\z/;

but it treats "s@--__" and "s!!!@@@___" as valid, when they should be invalid names, because \w in Perl includes @ and ! and punctuation signs. Any hints on this?

Thanks,
--Whatluo

Replies are listed 'Best First'.
Re: how to state this by REL
by pg (Canon) on Nov 09, 2005 at 06:57 UTC

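    The regexp is correct. The problem is your string. Inside double quotes, Perl interpolates @- and @___ as arrays, so the string being tested is not the one you typed. Use single quotes instead of double quotes to avoid interpolation. Try this code: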

    {
        my $name = "s@--__";
        print $name, "\n";
        if ($name =~ /\A[a-zA-Z][\w.-]*\z/) {
            print "match\n";
        } else {
            print "not match\n";
        }
    }
    {
        my $name = "s!!!@@@___";
        print $name, "\n";
        if ($name =~ /\A[a-zA-Z][\w.-]*\z/) {
            print "match\n";
        } else {
            print "not match\n";
        }
    }
    {
        my $name = 's@--__';
        print $name, "\n";
        if ($name =~ /\A[a-zA-Z][\w.-]*\z/) {
            print "match\n";
        } else {
            print "not match\n";
        }
    }
    {
        my $name = 's!!!@@@___';
        print $name, "\n";
        if ($name =~ /\A[a-zA-Z][\w.-]*\z/) {
            print "match\n";
        } else {
            print "not match\n";
        }
    }

    Which prints:

    Possible unintended interpolation of @___ in string at math1.pl line 11.
    s-__
    match
    s!!!@@
    not match
    s@--__
    not match
    s!!!@@@___
    not match
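
    A minimal sketch of what the double quotes are doing, written to run under use strict. The variable names here are illustrative, not from the post. In a double-quoted literal, "@-" is the built-in array @-, which is empty before any successful match, so it simply vanishes from the string. The @___ case is left out because an undeclared array inside a double-quoted string is a compile-time error under strict rather than a warning.

```perl
use strict;
use warnings;

# Same pattern as in the thread.
my $re = qr/\A[a-zA-Z][\w.-]*\z/;

my $double  = "s@--__";     # "@-" is interpolated (empty array): string is 's-__'
my $single  = 's@--__';     # no interpolation: string is 's@--__'
my $escaped = "s\@--__";    # backslash keeps the @ literal: also 's@--__'

for my $pair ([double => $double], [single => $single], [escaped => $escaped]) {
    my ($label, $name) = @$pair;
    printf "%-8s %-8s %s\n", $label, $name, ($name =~ $re ? 'match' : 'not match');
}
```

    Only the double-quoted version matches, because it is the only one that no longer contains an @.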
      > The regexp is correct. The problem is your string. Use single
      > quote instead of double quote, to avoid interpolation.

      Yes, I got it. Now I use single quotes and it works. I would still like to know what \w really means in Perl: does it include @ and ! or not? Thanks for the info.
        According to the perlre documentation, it doesn't:
        A "\w" matches a single alphanumeric character (an alphabetic character, or a decimal digit) or "_", not a whole word.

        Arjen

        All that is gold does not glitter...
Re: how to state this by REL
by Errto (Vicar) on Nov 09, 2005 at 06:00 UTC
    because \w in Perl includes @ and ! and punctuation signs,

    This isn't true. See the following example,

    print "no match" unless 's!!!@@@___' =~ /\A[a-zA-Z][\w.-]*\z/;
    which prints "no match." \w only matches letters, digits, and underscores. Maybe you could show us an example program where this is happening so we can provide some better advice.
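
    The rule from the question can also be spelled out with an explicit character class instead of \w. This is only a sketch, and is_valid_name is a hypothetical helper name, not something from the thread. One reason to consider it: \w never matches @ or !, but it can match non-ASCII letters in Unicode strings (for example "\x{439}"), which the stated rule (a-z, A-Z, 0-9, '-', '_', '.') does not allow.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $name starts with an ASCII letter and
# otherwise contains only ASCII letters, digits, '-', '_' or '.'.
sub is_valid_name {
    my ($name) = @_;
    return 0 unless defined $name;
    return $name =~ /\A[A-Za-z][A-Za-z0-9_.-]*\z/ ? 1 : 0;
}

for my $candidate ('abc', 'a.b-c_9', 's@--__', 's!!!@@@___', '9lives') {
    printf "%-12s %s\n", $candidate, is_valid_name($candidate) ? 'valid' : 'invalid';
}

# A name containing a Cyrillic letter: accepted by the \w version,
# rejected by the explicit ASCII class.
my $cyrillic = "a\x{439}";
printf "\\w pattern accepts it:      %s\n",
    ($cyrillic =~ /\A[a-zA-Z][\w.-]*\z/ ? 'yes' : 'no');
printf "explicit class accepts it:  %s\n",
    (is_valid_name($cyrillic) ? 'yes' : 'no');
```

    For plain ASCII input the two patterns behave the same, so this only matters if the names can arrive as decoded Unicode text.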
Re: how to state this by REL
by reasonablekeith (Deacon) on Nov 09, 2005 at 11:07 UTC
    While the guy who gave you this regex obviously knows what he's doing, and admittedly, it does function according to spec, I'd still change it.

    The simple reason is that I had to look up what \A and \z do. Far more common anchors for matching the beginning and end of strings are ^ and $. Even the documentation describes them as "The two most common anchors".

    so I'd change it to

    $name =~ /^[a-zA-Z][\w.-]*$/
    I might even be tempted to explicitly escape that '.', even though it's not strictly necessary.
    $name =~ /^[a-zA-Z][\w\.-]*$/
    To me, that makes it more obvious that we're matching a dot.

    I wonder what percentage of 'average ability' Perl programmers could tell you the subtle difference between \z and \Z without reference to a manual?

    ---
    my name's not Keith, and I'm not reasonable.
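
    A quick check of the escaping point, as a sketch with made-up sample strings: inside a character class a dot is already literal, so the escaped and unescaped forms should accept exactly the same names.

```perl
use strict;
use warnings;

my $plain   = qr/\A[a-zA-Z][\w.-]*\z/;    # dot unescaped inside the class
my $escaped = qr/\A[a-zA-Z][\w\.-]*\z/;   # dot escaped inside the class

# Illustrative samples only.
my @samples = ('a.b', 'a-b', 'a,b', 'axb', 'a..', '.a');

# Collect any sample on which the two patterns disagree.
my @differences =
    grep { ($_ =~ $plain ? 1 : 0) != ($_ =~ $escaped ? 1 : 0) } @samples;

print @differences
    ? "patterns differ on: @differences\n"
    : "both patterns agree on every sample\n";
```

    Note that 'a,b' is rejected by both: the dot in the class matches only a literal dot, not any character.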
      Howdy!

      I wouldn't change that RE. Since you had to look up \A and \z, did you also look up ^ and $? They don't do what you say they do. \A and \z do exactly what you attribute to ^ and $.

      Further, escaping . in a character class is gratuitous.

      To elaborate: ^ matches the beginning of the line, while $ matches the end of the line (unless the line ends with a newline, in which case the anchor matches before the newline). \A is always the start of the string. \z is always the end of the string. ^ and $ depend on whether you are using the /m modifier or not.

      TheDamian makes fair arguments for using \A and \z (along with /x, /m, and /s routinely). The core argument is that doing so makes your regexen less surprising, all the time. Consistency, and all that.

      yours,
      Michael
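
      The difference between the anchors can be seen with a name that carries a trailing newline or an embedded one. This is a small sketch with illustrative variable names; it also shows the \z versus \Z distinction asked about above.

```perl
use strict;
use warnings;

my $trailing = "abc\n";      # valid-looking name followed by a newline
my $embedded = "abc\n!!!";   # junk after an embedded newline

# $ also matches just before a newline at the end of the string.
my $dollar  = $trailing =~ /^[a-zA-Z][\w.-]*$/   ? 1 : 0;   # 1
# \Z behaves like $ without /m: end of string or before a final newline.
my $big_z   = $trailing =~ /\A[a-zA-Z][\w.-]*\Z/ ? 1 : 0;   # 1
# \z matches only at the very end of the string.
my $small_z = $trailing =~ /\A[a-zA-Z][\w.-]*\z/ ? 1 : 0;   # 0

# With /m, ^ and $ match at every line boundary, so the first line alone
# is enough for the pattern to succeed.
my $no_m   = $embedded =~ /^[a-zA-Z][\w.-]*$/  ? 1 : 0;     # 0
my $with_m = $embedded =~ /^[a-zA-Z][\w.-]*$/m ? 1 : 0;     # 1

print "trailing newline: \$ => $dollar, \\Z => $big_z, \\z => $small_z\n";
print "embedded newline: without /m => $no_m, with /m => $with_m\n";
```

      For validating a whole name, only the \A ... \z form rejects both inputs regardless of modifiers.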
