
Re: Untainting safely. (b0iler proofing?)

by BrowserUk (Pope)
on Jun 25, 2002 at 22:40 UTC ( #177228=note )

in reply to Untainting safely. (b0iler proofing?)

My interpretation of the referenced article was that the biggest problems were caused by several factors:

  • Not all programmers are, or could be, 'experts' who understand all the cracks in the individual or collective armour of Perl's calls to system functions, or of all the systems that Perl runs on.
  • Even experts make mistakes.
  • Reuse (and reusability) is good. CPAN is one of Perl's great strengths. The problem is that when the programmer passes values to modules, he does not know exactly how those values will be used. Sure, he can read the source, but that negates half the benefit of reuse.
  • Doing sufficient screening (or just enough allowing) for the required use of input is fine, but what happens in six months or a year when the functionality needs to be extended? Will the maintenance programmer know or understand what sanitising was done, and why? The article had an interesting section showing that consecutive filtering is order dependent. Programmer 1 gets the input, does appropriate filtering for the planned use, and uses it. Programmer 2 comes along a month or six later with a requirement for new functionality, goes in, grabs the already-untainted value, and uses it.

    I don't think "only employ competent programmers" cuts it here!
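The order dependence mentioned in the last point can be illustrated with a minimal sketch. The two filters below, url_decode() and strip_dotdot(), are hypothetical stand-ins invented for this example (they are not taken from the article), and the single-pass strip is deliberately naive:

```perl
use strict;
use warnings;

# Hypothetical filter 1: decode %XX URL escapes.
sub url_decode {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# Hypothetical filter 2: remove literal "../" sequences (naive, single pass).
sub strip_dotdot {
    my ($s) = @_;
    $s =~ s{\.\./}{}g;
    return $s;
}

my $input = '%2e%2e/%2e%2e/etc/passwd';

# Programmer 1 filters first; a later decode undoes the protection.
my $filter_then_decode = url_decode( strip_dotdot($input) );

# The same two filters in the other order give a different result.
my $decode_then_filter = strip_dotdot( url_decode($input) );

print "filter, then decode: $filter_then_decode\n";
print "decode, then filter: $decode_then_filter\n";
```

With the filter applied before decoding, the encoded dots slip past and the result is `../../etc/passwd`; with decoding first, the result is `etc/passwd`. Neither programmer's code is wrong in isolation; the combination is.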

As an example, my current project (it's only a learning exercise at this point, so please don't pollute the thread by critiquing the project design) uses .xml files to describe 'things', and these are used to build the HTML for displaying them. The user selects the 'thing' of interest by clicking on a menu. The identifier of the 'thing' is passed as a URL search parameter, and that identifier is then used to build the filename of the .xml file that is opened. The constructed path/filename is then passed to XML::Simple to read and process. The menus that the user clicks on are themselves generated by processing input from readdir().

The aim is that new things and groups of things can be added to the site by simply dropping a new .xml file in the appropriate directory and creating new directories respectively. Updates to existing things would be done by editing the .xml (and then validating before putting (back) into the production environment). The idea being that you don't need to code new HTML to add/delete/update 'thing' pages - you edit the .xml, validate it against a custom DTD and move/copy it to the appropriate place and all the layout of the HTML is taken care of by an intelligent Perl script. (Once I get up to speed enough to write such a beast.:)

This simple, small sample of the project implementation raises (at least) the following questions:

The following are only example questions; I am not seeking answers to them here!

  • What does XML::Simple do with the parameter I pass?
  • How restrictive should I be about the characters allowed in filenames? What happens if later on this is seen as "too restrictive" and the rules get modified?
  • If the user embeds a URL-escaped null after the product identifier, and I haven't checked for embedded null chars (I hadn't!), and I pass this along to XML::Simple, will it be used in a way that could be vulnerable?
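As one possible way of approaching the last two questions, here is a minimal sketch of an allow-list check on the identifier before the filename is built. The sub name thing_file(), the directory in $XML_DIR and the permitted character set are all assumptions made for illustration; the real project's layout and rules are not given above.

```perl
use strict;
use warnings;

# Hypothetical directory, for illustration only.
my $XML_DIR = '/var/www/things';

# Build the .xml filename for a 'thing' identifier.
# Returns undef (in scalar context) if the identifier contains anything
# outside a deliberately narrow set of characters.
sub thing_file {
    my ($id) = @_;
    return unless defined $id;

    # \A and \z anchor the whole string; unlike $, \z does not tolerate
    # a trailing newline. An embedded NUL, '/' or '.' cannot match the
    # character class, so such identifiers are rejected outright.
    # Under -T, the captured group is what untaints the value.
    my ($clean) = $id =~ /\A([A-Za-z0-9_-]{1,64})\z/
        or return;

    return "$XML_DIR/$clean.xml";
}

for my $id ( 'widget-42', "widget\0", '../../etc/passwd' ) {
    my $file = thing_file($id);
    print defined $file ? "ok: $file\n" : "rejected\n";
}
```

This only says which identifiers are let through; it makes no claim about what XML::Simple does with the resulting filename. If the rules are later judged "too restrictive", the character class is the one place that would change.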

I know there are more questions.

In answer to merlyn's statement "...I don't understand why you are even trying to do it..." and other statements about context:

It struck me that there are only a limited number (relatively few) of modes of possible exploitation/failure for external data. However, these points of exploitation/failure can be spread throughout a project. Many of you monks will have already "rolled your own" solutions to some or all of these, perhaps many times.

It seemed to make sense, from both factorisation and maintenance points of view, to handle the screening and untainting of 'external input' in a centralised manner. That way, if (when?) new failures and exploits are described or occur, the required modifications only need to be done once, in one place.

To this end, I thought a sub (module?) called (for example) sanitise() (or sanitize() if you prefer:), that "takes care" of this was appropriate.

The input would be the tainted string, the return value the sanitised result, as appropriate. Given the discussion thus far regarding context, perhaps a second parameter would be a constant chosen to define what type of sanitisation was required, e.g.

  • PATH - perhaps RELATIVE_PATH | ABSOLUTE_PATH would be necessary?
  • HTML
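A minimal sketch of what such an interface might look like, assuming a dispatch table keyed on the context constant. Only the name sanitise() and the PATH/HTML contexts come from the idea above; the constant values, the rules inside each handler and the choice to return undef on rejection are assumptions for illustration, not a vetted implementation.

```perl
use strict;
use warnings;

# Context constants (values are arbitrary labels).
use constant {
    RELATIVE_PATH => 'relative_path',
    HTML          => 'html',
};

my %HTML_ESCAPE = (
    '&' => '&amp;',  '<' => '&lt;', '>' => '&gt;',
    '"' => '&quot;', "'" => '&#39;',
);

# One handler per context; new contexts or fixes go in this one place.
my %HANDLER = (
    # Accept only '/'-separated components that each start with an
    # alphanumeric, so '.', '..', a leading '/' and NULs are rejected.
    # The capture untaints the value under -T.
    RELATIVE_PATH() => sub {
        my ($s) = @_;
        my ($clean) = $s =~ m{\A([A-Za-z0-9][A-Za-z0-9_.-]*(?:/[A-Za-z0-9][A-Za-z0-9_.-]*)*)\z}
            or return;
        return $clean;
    },
    # Encode for HTML output. Note this escapes rather than untaints.
    HTML() => sub {
        my ($s) = @_;
        $s =~ s/([&<>"'])/$HTML_ESCAPE{$1}/g;
        return $s;
    },
);

sub sanitise {
    my ( $value, $context ) = @_;
    my $handler = defined $context ? $HANDLER{$context} : undef;
    die "sanitise: unknown context\n" unless $handler;
    return unless defined $value;
    return scalar $handler->($value);
}

print sanitise( 'widgets/blue.xml', RELATIVE_PATH ), "\n";
print sanitise( '<b>Tom & "Jerry"</b>', HTML ), "\n";
```

Forcing the caller to name a context at least keeps the "safe for what?" question visible at every call site, which is where the discussion about context seems to point.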

Maybe I am "looking in the wrong places" or "don't understand the problem"? Maybe I am trying to factor something that is either too complicated or too trivial to be factored? Maybe I am reacting in paranoia, or just 'knee jerking' in response to the referenced article. But I felt that the strongest conclusion to draw from the article was that DIY sanitisation of external input (even by security experts) was the biggest source of vulnerabilities and exploits, and I thought that this was an appropriate way of solving that problem.

Sorry this got so long, but re-reading it, there is nothing that I feel should be left out!

•Re: Re: Untainting safely. (b0iler proofing?)
by merlyn (Sage) on Jun 26, 2002 at 00:09 UTC
    Again, I've got to emphasize. There is no such thing as "unsafe data". Merely "data used unsafely". So a hypothetical sanitize routine could at best be written as:
    sub sanitize { die "If you had to call me, you've lost already"; }
    You must fix the behavior of your code, not wrestle your data to the floor.

    -- Randal L. Schwartz, Perl hacker
