
Preventing XSS

by techcode (Hermit)
on Sep 19, 2007 at 19:30 UTC ( #639978=perlquestion )
techcode has asked for the wisdom of the Perl Monks concerning the following question:

I thought I was all settled with the following code:
sub form {
    my $self   = shift;
    my %params = @_;    # I could use delete right?
    my $skip   = array_to_hash($params{'skip_fields'});    # Array/ArrayRef

    my $q    = $self->query();
    my %vars = $q->Vars();

    use HTML::Entities;
    foreach (keys %vars) {
        next if $skip->{$_};    # Don't encode if it's in skip list
        $vars{$_} = HTML::Entities::encode($vars{$_});
    }

    return \%vars;
}
But here is the problem: I use UTF-8 so that the site supports Serbian (Latin, not Cyrillic), and I end up with funky entities instead of letters like Š, Đ, Č, Ć and Ž.

When I hit preview I realised this site does the same thing :)

Is there any other way to filter the input that would not do this? I don't want an entity like &Scaron; instead of Š in my forms ...

I believe it's ok to have those chars not encoded since I set both header and meta charset to utf-8.

Have you tried freelancing? Check out Scriptlance - I work there. For more info about Scriptlance and freelancing in general check out my home node.

Replies are listed 'Best First'.
Re: Preventing XSS
by ikegami (Pope) on Sep 19, 2007 at 20:11 UTC

    Sounds to me like you want

    $vars{$_} = HTML::Entities::encode_entities($vars{$_}, '<>&"');

    To quote the HTML::Entities documentation:

    The default set of characters to encode are control chars, high-bit chars, and the <, &, >, ' and " characters. But this, for example, would encode just the <, &, > and " characters:

    $encoded = encode_entities($input, '<>&"');

    It converts plain text into tag-less HTML.
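
To make the difference concrete, here is a minimal sketch comparing the default call with the restricted character set suggested above. The sample string is made up for illustration, and the sketch assumes the input has already been decoded into a Perl character string (here via `use utf8` on a literal); form data that is still raw UTF-8 bytes may behave differently.

```perl
use strict;
use warnings;
use utf8;    # the string literal below is UTF-8 source text
use HTML::Entities qw(encode_entities);

binmode STDOUT, ':encoding(UTF-8)';

# Made-up sample input containing Serbian Latin letters and some markup.
my $input = 'Šećer & <b>Đorđe</b>';

# Default set: control chars, high-bit chars and <, &, >, ' and " are encoded,
# so the Serbian letters are turned into entities as well.
my $default = encode_entities($input);

# Restricted set: only <, >, & and " are encoded; everything else is left alone.
my $minimal = encode_entities($input, '<>&"');

print "default: $default\n";
print "minimal: $minimal\n";
```

With the restricted set the second line should read `Šećer &amp; &lt;b&gt;Đorđe&lt;/b&gt;`, i.e. the markup is neutralised while the letters survive untouched.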

      Note that I see no reason to encode quote characters here. It isn't like the result is being placed into an attribute value. (:

      And I probably wouldn't encode all ampersands since they are useful and rather low risk. If you are worried about the little-supported javascriptish ampersand stuff, then I'd only encode those ampersands. But I guess taking something useful away from users out of combined fear and laziness is not a shockingly rare result.

      Which leaves us with a couple of simple regexes and little reason to use a module:

      s/&\{/&amp;{/g; s/</&lt;/g;

      - tye        
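
A minimal sketch of the two-regex approach wrapped in a sub, for text that ends up in element content only (not inside attribute values, per the note above about quotes). The name `escape_minimal` and the sample string are hypothetical. The brace is written as `\{` here because some newer Perl versions warn about, or reject, an unescaped `{` in a pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: escape only "<" and the "&{" sequence, nothing else.
sub escape_minimal {
    my ($text) = @_;
    $text =~ s/&\{/&amp;{/g;    # neutralise only ampersands followed by "{"
    $text =~ s/</&lt;/g;        # no "<" means no tags can be opened
    return $text;
}

print escape_minimal('a < b &{alert(1)}; AT&T'), "\n";
# prints: a &lt; b &amp;{alert(1)}; AT&T
```

Ordinary ampersands such as the one in `AT&T` pass through unchanged, which is the point of the suggestion.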

Re: Preventing XSS
by b10m (Vicar) on Sep 19, 2007 at 19:44 UTC

    I'm afraid you don't get the concept of XSS. You're dealing with encoding/HTML Entity problems, which is bad, but completely different than XSS "protection".

    For XSS "protection", have a look at HTML::StripScripts, it works rather well :-)

    Update: after reading your post again, it does seem you want to prevent XSS attacks (by using HTML::Entities) yet you don't want your "crazy letters" to be lost ;-). I'm not sure HTML::Entities will bulletproof your script. Have a look at HTML::StripScripts, really. But experts may say HTML::Entities _is_ enough (I would love to hear opinions on this).


    All code is usually tested, but rarely trusted.
Re: Preventing XSS
by andreas1234567 (Vicar) on Sep 20, 2007 at 11:05 UTC
Re: Preventing XSS
by techcode (Hermit) on Sep 20, 2007 at 10:47 UTC
    I ended up using what ikegami suggested plus a whitelist of allowed characters in the fields. I needed to encode them since I print the values back (HTML::FillInForm) together with error messages.

    So in Data::FormValidator I created additional constraints, such as that ordinary fields may contain only alphanumerics, underscore, minus, space and dot. Email obviously takes out the space and adds @, while URLs add : and /
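
The character whitelists described above could look roughly like the following plain-regex sketch. It does not use Data::FormValidator itself; the hash `%allowed`, the sub `is_allowed` and the sample values are hypothetical, and reading "URLs add : and /" as "the ordinary set plus those two characters" is an assumption. `\p{L}` and `\p{N}` are used so that letters like Š and Đ count as alphanumerics, assuming decoded character strings.

```perl
use strict;
use warnings;
use utf8;    # the sample strings below are UTF-8 literals

# Hypothetical whitelist patterns, one per kind of field.
my %allowed = (
    ordinary => qr/\A[\p{L}\p{N}_\-. ]+\z/,      # alphanumerics, _ - space .
    email    => qr/\A[\p{L}\p{N}_\-.@]+\z/,      # no space, plus @
    url      => qr/\A[\p{L}\p{N}_\-. :\/]+\z/,   # ordinary set plus : and /
);

# Returns 1 if every character of $value is on the whitelist for $kind.
sub is_allowed {
    my ($kind, $value) = @_;
    my $pattern = $allowed{$kind} or die "unknown field kind '$kind'";
    return $value =~ $pattern ? 1 : 0;
}

for my $case (
    [ ordinary => 'Đorđe Žikić' ],
    [ ordinary => '<script>' ],
    [ email    => 'techcode@example.com' ],
    [ url      => 'http://example.com/a_b' ],
) {
    my ($kind, $value) = @$case;
    printf "%-8s %s\n", $kind, is_allowed($kind, $value) ? 'ok' : 'rejected';
}
```

These patterns only restrict which characters may appear; they do not check that an email address or URL is actually well formed.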

