
Filtering unwanted chars from input field

by Anonymous Monk
on Dec 17, 2012 at 18:29 UTC
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks!
I would like to know if anyone has a better way of filtering a file name coming from a form in a file upload script using Perl.
I use something like this:

sub filter {
    my $str = shift;
    for ($str) {
        return '' unless $_;
        s/\s+//g;
        s/'/\\'/g;
        tr{\*<>;()\"\'?#\/}{}d;
        s/<script//g;
    }
    $str;
}

Is there a better way? What if another character gets in because it's not listed in the subroutine?

Replies are listed 'Best First'.
Re: Filtering unwanted chars from input field
by CountZero (Bishop) on Dec 17, 2012 at 18:46 UTC
    Rather than deleting unwanted characters, perhaps you can keep acceptable characters?


    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
Re: Filtering unwanted chars from input field
by Kenosis (Priest) on Dec 17, 2012 at 19:04 UTC

    I agree with CountZero. One implementation of this can be:

    use strict;
    use warnings;

    print filter($_), "\n" for <DATA>;

    sub filter {
        my $str = shift;
        defined $str or return '';
        my $acceptable = 'a-z0-9_.';
        $str =~ s/[^$acceptable]//gi;
        return $str;
    }

    __DATA__
    Hello, world!
    te-st%$/*.
    All_of_this_is_OK.
    !@#$%^&*()12345

    Output:

    Helloworld
    test.
    All_of_this_is_OK.
    12345

    Edit: Used a simpler s///

      Instead of splitting and grepping it might be simpler to use tr (see Quote and Quote-like Operators).

      $ perl -E '
      > @data = (
      >     q{Hello, world!},
      >     q{te-st%$/*.},
      >     q{All_of_this_is_OK.},
      >     q{!@#$%^&*()12345},
      > );
      > say for map { tr{A-Za-z0-9_.}{}cd; $_ } @data;'
      Helloworld
      test.
      All_of_this_is_OK.
      12345
      $

      I hope this is of interest.



        Thank you, johngg. Not sure why I complicated it. Yours is, indeed, a more elegant and readable solution (++). Was brought back to this node after the following should-have-done-this-in-the-first-place solution occurred to me:

        $str =~ s/[^$acceptable]//gi;

        Edited my original comment to reflect this...

        Would I be breaking the law by allowing a "-" and updating the code to:
        sub filter {
            my $str = shift;
            defined $str or return '';
            $str =~ tr{A-Za-z0-9_.-}{}cd;
            return $str;
        }
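On the hyphen question: as far as I know, tr/// treats a hyphen at the very start or end of its list as a literal character rather than a range operator, so placing "-" last should keep it as intended. One thing a character whitelist alone does not cover is position: a name that begins with "-" can be mistaken for a command-line option if it is later handed to an external program. A minimal sketch, where filter_with_hyphen is a hypothetical name and stripping leading hyphens is an assumption about what is wanted:

```perl
use strict;
use warnings;

# Hypothetical helper: whitelist characters, then drop any leading hyphens.
sub filter_with_hyphen {
    my $str = shift;
    defined $str or return '';

    # Delete everything except letters, digits, "_", "." and "-".
    # The "-" is literal here because it is the last character in the list.
    $str =~ tr{A-Za-z0-9_.-}{}cd;

    # Assumption: a name should not start with "-", since it could be
    # read as an option by a command-line tool.
    $str =~ s/\A-+//;

    return $str;
}

print filter_with_hyphen($_), "\n" for 'te-st%$/*.', '-rf', '--help.txt';
```

With these three inputs the sketch prints te-st., rf and help.txt, so hyphens inside the name survive while leading ones are dropped.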
Re: Filtering unwanted chars from input field
by kennethk (Abbot) on Dec 17, 2012 at 19:07 UTC
    As CountZero says above, whitelisting is a much better idea in general for this type of issue, since you don't have to worry so much about what you missed; it's a literal fail-safe. I'd probably do something like:
    sub filter {
        my $file = shift;
        if (defined $file) {
            # Accept the name only if every character is a word character or a dot
            return $file if $file =~ /\A[\w.]+\z/;
        }
        return;
    }

    The biggest question in all this is: what are you going to do with the string when you are done? For example, if you are feeding it to client display, most templating modules (such as HTML::Template) can handle the escaping for display literals without much difficulty. If you are passing it to open, you can use the three-argument form to avoid a lot of vulnerability. If you are passing it to system, the multiple-argument (list) form bypasses the shell, so shell metacharacters in the name are not interpreted.
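The open and system points can be sketched as follows. This is only an illustration under stated assumptions: the scratch directory from tempdir() stands in for a real upload directory, 'report.txt' is a made-up name that is assumed to have already passed the filter, and the child command is just the running perl ($^X) checking the file size.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Assumption: a temporary directory stands in for the real upload directory.
my $upload_dir = tempdir(CLEANUP => 1);
my $name       = 'report.txt';    # hypothetical, already-filtered file name

my $path = File::Spec->catfile($upload_dir, $name);

# Three-argument open: the mode is its own argument, so characters such as
# ">" or "|" inside $path cannot change how the file is opened.
open(my $out, '>', $path) or die "Cannot write $path: $!";
print {$out} "uploaded data\n";
close($out) or die "Cannot close $path: $!";

# List-form system: with more than one argument the program is started
# directly with those arguments instead of going through a shell command line.
my $status = system($^X, '-e', 'exit((-s $ARGV[0]) ? 0 : 1)', $path);
print "child exit status: ", $status >> 8, "\n";
```

Neither form makes an arbitrary name safe by itself (a path containing "../" is still a path), so this complements the whitelist rather than replacing it.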

    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      Good stuff. This is only to accept a file being uploaded, just to make sure that the user doesn’t add weird characters like a single quote or \ or /, who knows. I like the whitelist suggestion. Thanks!
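For the upload case specifically, one possible sketch is to discard any directory part the client sent before whitelisting, and to fall back to a default when nothing usable is left. The name safe_upload_name and the fallback "upload" are hypothetical, and dropping leading dots is an assumption about what counts as an acceptable name:

```perl
use strict;
use warnings;

# Hypothetical helper; the fallback name "upload" is an assumption.
sub safe_upload_name {
    my $raw = shift;
    defined $raw or return 'upload';

    # Keep only the part after the last "/" or "\", in case the client
    # sent a full path rather than a bare file name.
    (my $name = $raw) =~ s{.*[/\\]}{}s;

    # Whitelist: delete everything except letters, digits, "_", "." and "-".
    $name =~ tr{A-Za-z0-9_.-}{}cd;

    # Assumption: names made only of dots (such as "..") or starting with
    # a dot are unwanted, so leading dots are removed.
    $name =~ s/\A\.+//;

    return length($name) ? $name : 'upload';
}

print safe_upload_name($_), "\n"
    for 'C:\\Users\\me\\my file.txt', "o'brien/../notes.txt", '???', '..';
```

These four inputs come out as myfile.txt, notes.txt, upload and upload. Two different uploads can still map to the same cleaned name, so a real script would also need to decide how to handle collisions.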
