http://www.perlmonks.org?node_id=155477


in reply to Am I javascript or not?

If you haven't gotten it yet: there's wild JavaScript in there (it's all tame, though). Read the source, Luke. The list below comes from notes I've been saving on my pad.

  • Script Tag

    First off, the SCRIPT tag: good, you stomped it. Checking for just JavaScript is bad, though. There are scripting languages other than JavaScript, including:
    JScript (blecch)
    VBScript (shudder)
    PerlScript (the horror). This is (of course?) actually of far greater concern here than on the internet at large.
    This document is less than authoritative concerning removal of VBScript and JScript. They'd require separate research, and I would not be surprised if there are many undocumented features concerning them.
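A rough Python sketch of the point above, assuming the filter only needs to notice SCRIPT elements. The names `ScriptFinder` and `script_languages` are made up for illustration. It uses the standard library's `html.parser` and deliberately does not care which language the tag claims:

```python
from html.parser import HTMLParser


class ScriptFinder(HTMLParser):
    """Records every SCRIPT element, whatever language it declares."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag == "script":
            declared = dict(attrs)
            self.found.append(
                declared.get("language") or declared.get("type") or "(unspecified)"
            )


def script_languages(html_text):
    """Return the declared language of each SCRIPT element found."""
    finder = ScriptFinder()
    finder.feed(html_text)
    finder.close()
    return finder.found
```

Any non-empty result means there is a SCRIPT element to stomp; the language names are only collected for logging, not for deciding what to let through.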
  • JavaScript Protocols

    The mocha: and javascript: protocols allow inline JavaScript in anchors. Mozilla has dropped support for mocha:.

    Example

    make mine a latte cup o' mud

    For those interested, this is actually why I have JavaScript enabled. My personal toolbar is full of this stuff, they are called "bookmarklets".
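One way to handle script protocols is to allow only known-good schemes rather than hunting for bad ones. This is a minimal Python sketch, not a complete URL validator. The function name `is_safe_href` and the contents of `ALLOWED_SCHEMES` are assumptions for illustration, and it expects HTML entities to have been decoded already:

```python
import re

# Assumed allowlist; adjust to whatever the site actually wants to permit.
ALLOWED_SCHEMES = {"http", "https", "ftp", "mailto"}

_SCHEME = re.compile(r"^([a-z][a-z0-9+.\-]*):", re.IGNORECASE)


def is_safe_href(href):
    """Return True if href is relative or uses an allowed scheme."""
    # Conservatively drop whitespace and control characters first, since
    # browsers may ignore some of them when reading the scheme.
    cleaned = re.sub(r"[\x00-\x20]+", "", href)
    match = _SCHEME.match(cleaned)
    if match is None:
        return True  # no scheme at all, so treat it as a relative URL
    return match.group(1).lower() in ALLOWED_SCHEMES
```

With an allowlist, mocha:, javascript: and anything not thought of yet are all rejected by the same test.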

  • JavaScript entities

    &{} is a form of inline JavaScript that is not commonly used. This is probably Netscape Navigator (NN) only, and support for it seems to have been dropped in recent 4.x builds.

    Example

    Entities
  • JavaScript Attributes

    onClick, onSubmit, etc. on any tag which is allowed through. Or, for extra safety, as a brother has been so kind as to demonstrate, remove them no matter what.
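A small Python sketch of the "remove them no matter what" approach, assuming attributes arrive as (name, value) pairs the way `html.parser` hands them over. The function name `filter_attributes` is hypothetical. It also drops any attribute whose value contains the `&{` JavaScript entity opener described earlier:

```python
def filter_attributes(attrs):
    """Drop event-handler attributes and values holding JavaScript entities."""
    kept = []
    for name, value in attrs:
        # onClick, onSubmit, onMouseOver, ...: anything starting with "on".
        if name.lower().startswith("on"):
            continue
        # &{...} inside a value is a JavaScript entity; refuse the attribute.
        if value is not None and "&{" in value:
            continue
        kept.append((name, value))
    return kept
```

Matching on the "on" prefix is deliberately blunt: it avoids maintaining a list of handler names, at the cost of dropping any harmless attribute that happens to start with the same two letters.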
  • HTML entities for ASCII printable characters

    These should be replaced with the characters they code for. The entities for &, < and > must of course be carefully examined and excluded from this.
    NOTE: one should not be strict about requiring the ';', as browsers are flaky on this. The replacement should be done as the first step of cleansing.

    Examples

    Gotta love your SGML entities.
    J'accuse: This works in Netscape Communicator 4.79 and K-Meleon 0.6. UPDATE: Mozilla 1.6 too.
    Variation on a theme.
    J'accuse: This does not work in Netscape Communicator 4.79 or K-Meleon 0.6. I would expect it to work in some browsers.
    Chaining/Stacking I (you could of course do them in the opposite order).
    J'accuse: This does not work in Netscape Communicator 4.79 or K-Meleon 0.6. I think that if URL-encoded works in a browser then at least one chaining would follow.
    Recursion I.
    J'accuse: This does not work in Netscape Communicator 4.79 or K-Meleon 0.6. This really ought not work anywhere.
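The decoding step could look something like this in Python. It is a sketch under assumptions: it handles only numeric character references (decimal or hex), the function name `decode_printable_refs` is made up, and it does not attempt the URL-encoded or chained variations. The trailing ';' is optional, and the references for &, < and > are left encoded:

```python
import re

# Numeric character references, decimal or hex, with the ';' optional.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));?")

# Characters that must stay encoded.
_KEEP_ENCODED = {"&", "<", ">"}


def decode_printable_refs(text):
    """Replace references to printable ASCII with the characters themselves."""

    def _replace(match):
        if match.group(1) is not None:
            code = int(match.group(1), 16)
        else:
            code = int(match.group(2))
        if 0x20 <= code <= 0x7E and chr(code) not in _KEEP_ENCODED:
            return chr(code)
        return match.group(0)  # leave everything else untouched

    return _NUMERIC_REF.sub(_replace, text)
```

Run this before any protocol or attribute checks, so that something like `&#106avascript:` is seen as `javascript:` by the later steps.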
  • Data Protocol

    No example here ;-), see the RFC below and think MIME-type.
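On the defensive side, a filter can at least read the MIME type out of a data: URL. The Python sketch below follows the `data:[<mediatype>][;base64],<data>` layout from RFC 2397; the function name `data_url_media_type` is an assumption, and simply refusing data: outright is the safer default:

```python
def data_url_media_type(url):
    """Return the media type of a data: URL, or None if it is not one."""
    scheme, colon, rest = url.partition(":")
    if not colon or scheme.strip().lower() != "data":
        return None
    header, comma, _payload = rest.partition(",")
    if not comma:
        return None  # malformed: RFC 2397 requires the comma
    media_type = header.split(";", 1)[0].strip().lower()
    # RFC 2397: an omitted media type defaults to text/plain.
    return media_type or "text/plain"
```

A result such as text/html is the case to worry about, since the payload would be rendered as a page of its own.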
  • UPDATE --

    META

    Something that is not itself directly a threat is <meta http-equiv="Content-Script-Type" content="text/javascript">. However, removing it could be prudent. If this META tag is used to set the preferred scripting language for the page, then once it is removed any scripts on the page MAY become invalid (assuming the browser cannot auto-detect the type; this is most likely for installed extensions such as PerlScript and TCL).
  • Further Reading

    Here are some related sources that are definitely worth a once-over:
    Mmmm Entities
    RFC 2397
    Stuff you probably didn't know you could do
    Or if you really want to get in deep (though they seem to have ditched much of the older documentation)
    More than you wanted to know
  • --
    perl -pe "s/\b;([st])/'\1/mg"