
validation of posted data.

by hostux (Initiate)
on Jul 31, 2002 at 21:32 UTC ( #186609=perlquestion )

hostux has asked for the wisdom of the Perl Monks concerning the following question:

Hi again, peeps. Here is a bit of code I wrote to check usernames for illegal characters:
elsif ($display !~ /^\w*$/) { $message = $message.$errmsg ; $found_err = 1 ; }
Currently this does not allow . in usernames. Would someone please be able to rewrite this to allow . as well? Thanks, James

Replies are listed 'Best First'.
Re: validation of posted data.
by DamnDirtyApe (Curate) on Jul 31, 2002 at 21:44 UTC

    It's already been said here, but it's worth saying again:

    Don't try to exclude all `illegal' or `invalid' characters. You'll never get them all. Instead, decide what you will accept, and make sure your input contains that, and nothing else.

    I'd write your regexp something like this:

    unless ( $display =~ /^[a-z0-9\-\.]+$/ ) {
        # Invalid input
    }

    D a m n D i r t y A p e
      I've found

      if ($display =~ /[^\-a-z0-9\.]/) {
          # err
      }

      to be easier to read most of the time.
      Also, if you _read_ the code, yours says: "unless $display is all good chars, make an error", whereas this says: "if there are any illegal chars, make an error". I find the latter easier to understand.
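The two styles are close but not identical, which a short test can show. The sketch below wraps each pattern from the replies above in a helper (the names `ok_anchored` and `ok_negated` are made up for illustration) and runs both over a few sample strings. They disagree on the empty string and on a name with a trailing newline.

```perl
use strict;
use warnings;

# Hypothetical helper names; the character classes are the ones quoted in the replies.
sub ok_anchored { my ($s) = @_; return $s =~ /^[a-z0-9\-\.]+$/ ? 1 : 0 }
sub ok_negated  { my ($s) = @_; return $s =~ /[^\-a-z0-9\.]/   ? 0 : 1 }

for my $name ('john.doe', 'john doe', '', "john\n") {
    (my $shown = $name) =~ s/\n/\\n/g;    # make the newline visible in the output
    printf "%-12s anchored=%d negated=%d\n",
        sprintf("'%s'", $shown), ok_anchored($name), ok_negated($name);
}
```

The anchored form rejects the empty string because of the `+`, but accepts "john\n" because `$` also matches just before a trailing newline. The negated form does the opposite on both inputs. Either way, a separate length check (or `\z` in place of `$`) may be worth adding.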

      T I M T O W T D I

        Kind of a tangent, here, but isn't "." acceptable by itself (i.e., unescaped) in a character class, since its function as a metacharacter there wouldn't make much sense?

        --Your punctuation skills are insufficient!
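A quick check supports this: inside a character class, an unescaped `.` matches only a literal dot, so the backslash is harmless but not required. A minimal sketch:

```perl
use strict;
use warnings;

# Compare the escaped and unescaped dot inside a character class.
for my $s ('.', 'a', '-') {
    my $unescaped = $s =~ /^[.]$/  ? 'match' : 'no match';
    my $escaped   = $s =~ /^[\.]$/ ? 'match' : 'no match';
    printf "%-3s  [.] => %-8s  [\\.] => %s\n", $s, $unescaped, $escaped;
}
```

Only the first string, the literal dot, matches either class.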

(jeffa) Re: validation of posted data.
by jeffa (Bishop) on Jul 31, 2002 at 21:39 UTC
    Use a character class: /^[\w.]+$/
    use strict;

    my @user = map { 'foo'.$_.'bar' } qw(! @ # $ % ^ & * ( ) _ .);

    for (@user) {
        print /^[\w.]+$/
            ? "$_ is legal\n"
            : "$_ is illegal\n" ;
    }


    (the triplet paradiddle with high-hat)
      I doubt \w is appropriate in this context, since usernames are almost never allowed to contain Unicode or locale-dependent chars ;)
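To illustrate the concern, the sketch below tests one non-ASCII letter against three patterns. It assumes Perl 5.14 or later, since the `/a` modifier (which restricts `\w` to ASCII) does not exist in older versions; the explicit class works everywhere.

```perl
use 5.014;    # the /a modifier needs Perl 5.14 or later
use strict;
use warnings;

my $alpha = "\x{3b1}";    # GREEK SMALL LETTER ALPHA, a non-ASCII "word" character

print "plain \\w:     ", ($alpha =~ /^\w$/           ? "match" : "no match"), "\n";
print "\\w with /a:   ", ($alpha =~ /^\w$/a          ? "match" : "no match"), "\n";
print "explicit set: ", ($alpha =~ /^[A-Za-z0-9_]$/ ? "match" : "no match"), "\n";
```

Only the plain `\w` accepts the Greek letter. If usernames are meant to be ASCII only, spelling out the class (or adding `/a`) keeps such characters out.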

      T I M T O W T D I
Re: validation of posted data.
by amir (Sexton) on Jul 31, 2002 at 23:41 UTC
    The best thing to do, as CERT recommends, is to "sanitize" and only allow what you need:
    $_ = "the\\/bad\$dataStuff";    # your data of course :)
    $OK_CHARS='-a-zA-Z0-9_.@';      # allowed characters
    s/[^$OK_CHARS]/_/go;            # replace invalid chars with _
    $user_data = $_;                # sanitized version
    print $user_data;               # output: the__bad_dataStuff
    Excellent article from CERT.
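The same replacement can also be written with `tr///` instead of `s///`, which avoids interpolating a variable into the pattern. This is an alternative sketch, not the code from the post; the `sanitize` name is made up, and the allowed set is copied from `$OK_CHARS` above.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every character outside the allowed set with "_".
sub sanitize {
    my ($input) = @_;
    # /c complements the search list; the leading "-" is a literal hyphen.
    (my $clean = $input) =~ tr/-a-zA-Z0-9_.@/_/c;
    return $clean;
}

print sanitize("the\\/bad\$dataStuff"), "\n";    # prints: the__bad_dataStuff
```

Because it works on a copy, the caller's original string is left untouched.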
Re: validation of posted data.
by fs (Monk) on Jul 31, 2002 at 21:39 UTC
    I'd suggest that you verify that it only contains valid characters. So if your usernames can contain alpha and '.' only, you could use something like (untested):
    if ($display =~ /[^a-z\.]/i) {
        # error condition here
    }
    Modify the regex to add additional conditions to suit your exact needs.
Re: validation of posted data.
by demerphq (Chancellor) on Jul 31, 2002 at 22:51 UTC
    A minor point, but your regex allows "" (the empty string) to be used as a valid username.

    Yves / DeMerphq
    Writing a good benchmark isnt as easy as it might look.

      Hi there, I have a separate piece of code to validate blanks, along with usernames longer than 20 or shorter than 3 chars.
        All of that can be done in a regex:
        Will match ONLY word sequences that are 3 to 20 characters long.
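The regex from this reply did not survive in the text above. One pattern that fits the description, offered only as a guess at the idea and not as the original code, uses a `{3,20}` quantifier on `\w`. The helper name `is_valid_username` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper; the 3 and 20 limits come from the post above.
sub is_valid_username {
    my ($name) = @_;
    # \z anchors at the very end, so a trailing newline is not accepted.
    return $name =~ /^\w{3,20}\z/ ? 1 : 0;
}

for my $name ('ab', 'abc', 'a' x 20, 'a' x 21, '', 'john.doe') {
    printf "%-22s %s\n", "[$name]", is_valid_username($name) ? 'ok' : 'rejected';
}
```

As written, 'john.doe' is rejected because `\w` does not include the dot. Swapping `\w` for `[\w.]` would accept it, which is what the original question asked for.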

        Yves / DeMerphq
        Writing a good benchmark isnt as easy as it might look.
