My interpretation of the referenced article was that the biggest problems were caused by several factors:

  • Not all programmers are, nor could be, 'experts' in understanding all the cracks in the individual or collective armour of Perl's calls to system functions, or those of all the systems that Perl runs on.
  • Even experts make mistakes.
  • Reuse (and reusability) is good. CPAN is one of Perl's great strengths. The problem is that when the programmer passes values to modules, he does not know exactly how those values will be used. Sure, he can read the source, but that negates half the benefit of reuse.
  • Doing sufficient screening (or just enough allowing) for the required use of input is fine, but what happens in 6 months or a year when the functionality needs to be extended? Will the maintenance programmer know or understand what sanitising was done and why? The article had an interesting section that showed how consecutive filtering was order dependent. Programmer 1 gets input, does appropriate filtering for the planned use, and uses it. Programmer 2 comes along a month or 6 later with a requirement for new functionality, goes in, grabs the value already untainted, and uses it for a purpose the original filtering never anticipated.

    I don't think "only employ competent programmers" cuts it here!
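To make the order-dependence point concrete, here is a minimal sketch. The two filters, strip_dotdot() and url_decode(), are hypothetical helpers invented for this illustration (they are not taken from the article), and the input string is made up:

```perl
use strict;
use warnings;

# Hypothetical filter 1: remove literal "../" sequences (single pass).
sub strip_dotdot {
    my ($s) = @_;
    $s =~ s{\.\./}{}g;
    return $s;
}

# Hypothetical filter 2: decode %XX URL escapes.
sub url_decode {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

my $input = '%2E%2E/%2E%2E/etc/passwd';

# Filter first, decode second: the filter finds no literal "../" to remove,
# and the decode step then recreates the traversal sequence.
my $unsafe = url_decode( strip_dotdot($input) );

# Decode first, filter second: the traversal sequence is visible to the filter.
my $safer = strip_dotdot( url_decode($input) );

print "filter then decode: $unsafe\n";
print "decode then filter: $safer\n";
```

The same two filters give different results depending purely on the order they are applied in. Note that the second ordering is only "safer", not safe: a single-pass strip like this has weaknesses of its own, which is rather the point about DIY filtering.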

As an example, my current project (it's only a learning exercise at this point, so please don't pollute the thread by critiquing the project design) uses .xml files to describe 'things', and these are used to build the HTML for displaying. The user selects the 'thing' of interest by clicking on a menu. The identifier of the 'thing' is passed as a URL search parameter and then that identifier is used to build the filename of the .xml file that is opened. The path/filename constructed is then passed to XML::Simple to read and process. The menus that the user clicks on are themselves generated by processing input from readdir().

The aim is that new things and groups of things can be added to the site by simply dropping a new .xml file in the appropriate directory and creating new directories respectively. Updates to existing things would be done by editing the .xml (and then validating before putting (back) into the production environment). The idea being that you don't need to code new HTML to add/delete/update 'thing' pages - you edit the .xml, validate it against a custom DTD and move/copy it to the appropriate place and all the layout of the HTML is taken care of by an intelligent Perl script. (Once I get up to speed enough to write such a beast.:)

This simple, small sample of the project implementation raises (at least) the following questions:

The following are only example questions; I am not seeking answers to them here!

  • What does XML::Simple do with the parameter I pass?
  • How restrictive should I be on allowable values for filename chars? What happens if later on this is seen as "too restrictive" and the rules get modified?
  • If the user embeds a URL-escaped null after the product identifier and I haven't checked for embedded null chars (I hadn't!) and I pass this along to XML::Simple, will it be used in a way that could be vulnerable?

I know there are more questions.
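For what it's worth, the kind of check I have in mind for the identifier looks something like the sketch below. The sub name thing_id_to_path(), the 'things' directory and the particular whitelist are all assumptions made up for illustration; it says nothing about what XML::Simple does internally:

```perl
use strict;
use warnings;

# Hypothetical helper: accept only a conservative whitelist of characters
# and build the filename from the captured value. Under -T the captured
# value is untainted; anything else (nulls, slashes, dots) is rejected.
sub thing_id_to_path {
    my ($id, $dir) = @_;
    return unless defined $id;

    # \A and \z anchor the whole string; unlike $, \z does not
    # tolerate a trailing newline.
    my ($clean) = $id =~ /\A([A-Za-z0-9_-]{1,64})\z/
        or return;

    return "$dir/$clean.xml";
}

for my $id ('widget-42', "widget\0.txt", '../../etc/passwd') {
    my $path = thing_id_to_path($id, 'things');
    print defined $path ? "ok: $path\n" : "rejected\n";
}
```

Because the pattern lists what is allowed rather than what is forbidden, the embedded null is rejected without my ever having thought of it specifically. Whether that whitelist is "too restrictive" is exactly the question above.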

In answer to Merlyn's statement, "...I don't understand why you are even trying to do it...", and other statements about context:

It struck me that there are only a limited (and relatively small) number of modes of possible exploitation/failure for external data. However, these points of exploitation/failure can be spread throughout a project. Many of you monks will have already "rolled your own" solutions to some or all of these, perhaps many times.

It seemed to make sense, from both factorisation and maintenance points of view, to handle the screening and untainting of 'external input' in a centralised manner. That way if (when?) new failures and exploits are described or occur, the required modifications only need to be done once, in one place.

To this end, I thought a sub (module?) called (for example) sanitise() (or sanitize() if you prefer:) that "takes care" of this was appropriate.

The input would be the tainted string, and the return value the sanitised string as appropriate. Given the discussion thus far regarding context, perhaps a second parameter would be a constant chosen to define what type of sanitisation was required, e.g.

  • PATH - perhaps RELATIVE_PATH | ABSOLUTE_PATH would be necessary?
  • HTML
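A rough sketch of the interface I am imagining follows. Everything in it is an assumption for the sake of discussion: the constant values, the dispatch table, and especially the rules each handler applies are placeholders, not vetted sanitisers:

```perl
use strict;
use warnings;

# Hypothetical context constants; the names follow the list above.
use constant {
    RELATIVE_PATH => 'relative_path',
    HTML          => 'html',
};

# One handler per context, kept in one place so that a newly
# described exploit needs fixing only once.
my %SANITISER = (
    RELATIVE_PATH() => sub {
        my ($s) = @_;
        # Components of letters, digits, _ . and -, none starting with
        # a dot, separated by single slashes; no leading slash.
        my $part = qr/[A-Za-z0-9_-][A-Za-z0-9_.-]*/;
        return $s =~ m{\A($part(?:/$part)*)\z} ? $1 : undef;
    },
    HTML() => sub {
        my ($s) = @_;
        my %ent = (
            '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
            '"' => '&quot;', "'" => '&#39;',
        );
        $s =~ s/([&<>"'])/$ent{$1}/g;
        return $s;
    },
);

# Returns the sanitised string, or undef if the input is unacceptable
# for the requested context. Dies on an unknown context.
sub sanitise {
    my ($input, $context) = @_;
    my $handler = defined $context ? $SANITISER{$context} : undef;
    die "sanitise: unknown context\n" unless $handler;
    return unless defined $input;
    return $handler->($input);
}

print sanitise('docs/thing.xml', RELATIVE_PATH), "\n";
print sanitise('<b>bold</b>', HTML), "\n";
```

The PATH handler returns a regex capture, so under -T its result is untainted; the HTML handler only escapes for output and does not untaint anything, which already hints at how much the meaning of "sanitised" depends on context.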

Maybe I am "looking in the wrong places" or "don't understand the problem"? Maybe I am trying to factor something that is either too complicated or too trivial to be factored? Maybe I am reacting in paranoia, or just 'knee jerking' in response to the referenced article. But I felt that the strongest conclusion to draw from the article was that DIY sanitisation of external input (even by security experts) was the biggest source of vulnerabilities and exploits, and I thought that this was an appropriate way of solving that problem.

Sorry this got so long, but re-reading it, there is nothing that I feel should be left out!

In reply to Re: Untainting safely. (b0iler proofing?) by BrowserUk
in thread Untainting safely. (b0iler proofing?) by BrowserUk
