
Yesterday, while casting around for 'the right way' (Yes! I know, TMTOWTDI! So let's say 'a right way') to do some things, I encountered a link here that led me to read this. Now, without getting into a debate about the merits of the author's style or motivations, or even the specific details of the article, it did serve to highlight several weaknesses in my own treatment of 'external input' and my attempts to 'sanitise' and untaint it.

I've looked around PM for a 'standard', tested way of achieving this (what we might call b0iler proofing, at the risk of collective ire for a bad pun and of conferring undue notoriety).

It seems to me that what is needed (or at least what I need) is a subroutine that takes a string and removes all 'unsafe' (meta) characters and character sequences.

My thoughts on writing this are:

  • Don't use regexes for the parsing. I also recently discovered that even experienced monks can have trouble getting these right.
  • Don't do it in a way that would allow embedded escapes (or nulls etc.) to be processed.
  • Allow as many others as possible to review the code, in the hope of it becoming 'well refined'.

I've had a couple of attempts at doing this. I started by looping over the string, inspecting each char individually using ord() and comparing it against a list of 'known values'.
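
For illustration, the character-by-character approach might look something like the sketch below. The 'known values' are treated here as an allow-list, which is an assumption, as are the contents of that list and the name `filter_safe` (it is not an existing library function). Note also that under `-T` the result is still tainted: perlsec documents regex captures and hash keys as the ways to clear the taint flag, so this only filters.

```perl
use strict;
use warnings;

# Assumed allow-list: ASCII letters, digits, and a few punctuation marks.
# Tailor the set to whatever the input will actually be used for.
my %SAFE = map { ord($_) => 1 } ('a' .. 'z', 'A' .. 'Z', '0' .. '9', '_', '-', '.');

# filter_safe is a hypothetical name used only for this sketch.
sub filter_safe {
    my ($input) = @_;
    my $out = '';
    for my $i (0 .. length($input) - 1) {
        my $ch = substr($input, $i, 1);      # one character, no regex involved
        $out .= $ch if $SAFE{ ord $ch };     # keep only known-good code points
    }
    return $out;
}

# The slashes and the embedded null are dropped; the dots survive.
print filter_safe("../../etc/passwd\0.txt"), "\n";   # prints "....etcpasswd.txt"
```

As the sample output suggests, silently deleting characters can leave a string that still looks odd, so rejecting the input outright when anything is removed may be the safer design.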

I then thought of unpack()ing the string to ensure that Perl wouldn't do any magical escaping.
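
A minimal sketch of the unpack() idea, assuming the input is a plain byte string (for example, raw request data) and that an allow-list of byte values is acceptable. The name `filter_bytes` and the chosen byte ranges are hypothetical. As with any non-regex filter, the returned string remains tainted under `-T`.

```perl
use strict;
use warnings;

# Assumed allow-list of byte values: digits, upper and lower case ASCII
# letters, plus '-', '.' and '_'.
my @SAFE;
$SAFE[$_] = 1 for 48 .. 57, 65 .. 90, 97 .. 122, 45, 46, 95;

# filter_bytes is a hypothetical name used only for this sketch.
sub filter_bytes {
    my ($input) = @_;
    # 'C*' yields one unsigned integer per byte, so nothing in the string
    # is interpolated or treated as an escape sequence along the way.
    my @kept = grep { $SAFE[$_] } unpack 'C*', $input;
    return pack 'C*', @kept;                 # rebuild a string from kept bytes
}

print filter_bytes("a\0b/c"), "\n";          # prints "abc"
```

Working on integers keeps the comparison independent of quoting rules; the cost is that multi-byte encodings such as UTF-8 would need decoding first, which this sketch does not attempt.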

But my Perl skills so far are such that I'm reluctant to trust my own code (and even more reluctant to offer it here for public review again :o), so...

My question:

Would you kind people care to share your code to achieve the aims above, or point me at code that will achieve those aims?

Or offer your input on extending those aims?


Edit by dws to fix tags

In reply to Untainting safely. (b0iler proofing?) by BrowserUk
