PerlMonks  

variable I expect to be tainted isn't: possible explanations?

by kudra (Vicar)
on May 21, 2002 at 12:01 UTC ( #168090=perlquestion )
kudra has asked for the wisdom of the Perl Monks concerning the following question:

In the course of testing a function that manipulated taint, I tried to verify that some data was tainted. From this I noticed something rather odd with taint (or, as Ovid put it, "This is totally screwy."). Taint should affect all data derived from outside the program itself, but I was seeing user-supplied arguments as untainted.

This quick test should show $match, $two and $ENV{PATH} as tainted:

#!/usr/bin/perl -T
use strict;
use CGI;
use Getopt::Long;

my $match = CGI::param('a');
my $two;
GetOptions('b' => \$two);

print "Running...\nPerl version: $]\nOsname: $^O\nExecutable name: $^X\n\n";

my @data = ( 'zot', $match, $two, $ENV{PATH} );
foreach my $data ( @data ) {
    my $result = is_tainted($data)
        ? "$data is tainted\n"
        : "$data is not tainted\n";
    print $result;
    $result = is_tainted_two($data)
        ? "$data is tainted\n"
        : "$data is not tainted\n";
    print $result;
}

# Camel, 2nd edition (p. 358) taint check
sub is_tainted {
    return not eval { join("",@_), kill 0; 1; }
}

# Camel, 3rd edition (p. 561) taint check
sub is_tainted_two {
    my $arg = shift;
    my $nada = substr($arg, 0, 0);
    local $@;
    eval { eval "# $nada" };
    return length($@) != 0;
}

However, when I tested it only $ENV{PATH} was found to be tainted. I tested this on four different computers with two different operating systems and a total of four different perl versions, and that was always the result.

Here is the output (I've left off $ENV{PATH} because it was too long), called with the -b flag:

# version 5.005_03 built for i386-freebsd
# FreeBSD our 4.4-RC FreeBSD 4.4-RC #7: Sun Aug 26 09:54:54 CET 2001 i386
# AND
# FreeBSD ns1 4.5-RELEASE FreeBSD 4.5-RELEASE #0: Mon Jan 28 14:31:56 GMT 2002 i386

(offline mode: enter name=value pairs on standard input)
a=foo
Running...
Perl version: 5.00503
Osname: freebsd
Executable name: /usr/bin/perl

zot is not tainted
zot is not tainted
foo is not tainted
foo is not tainted
1 is not tainted
1 is not tainted

######################################

# v5.7.3 built for i686-linux-64int
# Linux gremlin 2.4.2-2 #1 Sun Apr 8 20:41:30 EDT 2001 i686 unknown
# For some reason, this one didn't prompt me to enter the CGI arg
# in offline mode.

Running...
Perl version: 5.007003
Osname: linux
Executable name: /root/perl/bin/perl5.7.3

zot is not tainted
zot is not tainted
 is not tainted
 is not tainted
1 is not tainted
1 is not tainted

######################################

# v5.6.0 built for i386-linux
# Linux gremlin 2.4.2-2 #1 Sun Apr 8 20:41:30 EDT 2001 i686 unknown

(offline mode: enter name=value pairs on standard input)
a=foo
Running...
Perl version: 5.006
Osname: linux
Executable name: /usr/bin/perl

zot is not tainted
zot is not tainted
foo is not tainted
foo is not tainted
1 is not tainted
1 is not tainted

######################################

# perl5 (revision 5.0 version 6 subversion 1)
# linux funky 2.4.17-0.13smp #1 smp fri feb 1 10:30:48 est 2002 i686 unknown

Running...
Perl version: 5.006001
Osname: linux
Executable name: /usr/bin/perl

zot is not tainted
zot is not tainted
 is not tainted
 is not tainted
1 is not tainted
1 is not tainted

Ovid tested an earlier version of this test program which didn't use Getopt::Long or is_tainted_two (and had another string 'Ovid') and got this result:

D:\cygwin\home\Ovid>perl -T taint.pl a=1
zot is not tainted
1 is tainted
Ovid is not tainted

I was only able to think of a few possible explanations:

  1. I'm using a bad check for taintedness--but I've used two different published methods for checking it,
  2. The taint checks don't behave the way I think they do--but they are described as tests of whether a variable contains tainted data,
  3. The method of deriving the data is at fault--but I've tried two different methods of getting user input,
  4. Taint doesn't behave the way I think it does--but that would still leave a problem because Ovid's output differs from mine,
  5. I've written some test code which has a bug in it I haven't seen (in which case I plan to blame Ovid, since we were tossing this back and forth ;), or
  6. This is a bug in perl--but it would be a very long-standing one to exist in both 5.5.3 and 5.7.3.
I admit to not being very adept with searching the bug database, so the fact that I couldn't find a mention of this is not necessarily meaningful. I would find it hard to believe this could be a perl bug and have existed unnoticed this long.

I am wondering if anyone is able to provide a sensible explanation for what I've noted.

Update: Following a suggestion sent by /msg, I turned on warnings to see whether the 'too late for -T' error appeared, but there were only the expected 'use of uninitialized value' warnings.
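One way to cross-check the two Camel routines used in the test script is a third, independent test. The sketch below assumes Scalar::Util is available (it is bundled with current perls, though probably not with the 5.005 systems listed above) and relies on its tainted() function. The helper name taint_report is made up for illustration. It is meant to be run as perl -T script.pl some-argument.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: build a report line like the ones in the test
# script, but let Scalar::Util::tainted() make the decision.
sub taint_report {
    my ($label, $value) = @_;
    return tainted($value)
        ? "$label is tainted\n"
        : "$label is not tainted\n";
}

# ${^TAINT} is true only when perl was started with -T (or -t).
print "Taint mode is ", (${^TAINT} ? "on" : "off"), "\n";

print taint_report('literal',    'zot');
print taint_report('$ARGV[0]',   $ARGV[0]);
print taint_report('$ENV{PATH}', $ENV{PATH});
```

Without -T on the command line nothing is tainted, so every line should read "is not tainted"; that makes the first output line a quick sanity check that taint mode is actually on.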

Re: variable I expect to be tainted isn't: possible explanations?
by derby (Abbot) on May 21, 2002 at 12:31 UTC
    Your $two isn't tainted because it is not really user data or derived from user data (but it is set based upon the presence of the option). Try changing it to accept a string parameter to see what happens:

    GetOptions('b=s' => \$two);
    ...
    ./script.pl -b foo
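A minimal sketch of that distinction, assuming a Getopt::Long recent enough to export GetOptionsFromArray and a perl that bundles Scalar::Util. The helper name parse_b is hypothetical. With a bare 'b' spec the stored value is a constant 1 supplied by Getopt::Long itself; with 'b=s' the stored value is the text the user typed.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);
use Scalar::Util qw(tainted);

# Hypothetical helper: parse the same arguments twice, once with -b as
# a bare flag and once with -b taking a string value.
sub parse_b {
    my @args = @_;
    my ($flag, $string);
    GetOptionsFromArray([@args], 'b'   => \$flag);    # value is a constant 1
    GetOptionsFromArray([@args], 'b=s' => \$string);  # value is the user's text
    return ($flag, $string);
}

my ($flag, $string) = parse_b(@ARGV);
for my $pair (['flag', $flag], ['string', $string]) {
    my ($name, $value) = @$pair;
    next unless defined $value;
    printf "%s value '%s' is %s\n", $name, $value,
        tainted($value) ? 'tainted' : 'not tainted';
}
```

Run as perl -T script.pl -b foo. The results reported in this thread suggest the flag value comes back untainted and the string value tainted, but that depends on the Getopt::Long version in use.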

    I'm not sure why your CGI param is not tainted, mine is (CGI.pm version 2.752).

    -derby

    update: As for the CGI param not being tainted: when you run in "offline mode", CGI reads from STDIN and passes the data to shellwords (shellwords.pl). shellwords parses the data with regexes and builds its return value from the regex matches, effectively untainting the param. As others have shown, passing the param on the command line (instead of in offline mode) shows the param as tainted.
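The untainting effect described above can be reproduced in a few lines. This is only a sketch, not the code in shellwords.pl: first_word is a made-up helper, Scalar::Util is assumed to be available, and the tainted input is simulated by appending a zero-length string derived from %ENV.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: copy the first word out of $input through a
# capture group, the way a regex-based parser would.
sub first_word {
    my ($input) = @_;
    return $input =~ /^(\S+)/ ? $1 : undef;
}

# Under -T everything from %ENV is tainted, so this empty string is too.
my $env_taint = substr(join('', values %ENV), 0, 0);

my $input = 'a=foo' . $env_taint;
my $word  = first_word($input);

printf "input is %s\n",   tainted($input) ? 'tainted' : 'not tainted';
printf "capture is %s\n", tainted($word)  ? 'tainted' : 'not tainted';
```

Under perl -T the input should be reported as tainted and the capture as not tainted, since perlsec documents subpattern captures as the standard way to launder data. Without -T both lines report "not tainted".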

      You're right of course. I added the command-line check in quickly later.

      Making the change suggested by derby shows $two to be tainted (on one of the systems).

      I tested with Sifmole's syntax (previously I'd just used offline mode) and that shows the variable to be tainted.

      So this appears to be applicable to just CGI parameter gathering, and only in offline mode. And now derby's provided a nice logical explanation--thanks all!

      I'm still not convinced it should be leaving them untainted rather than explicitly retainting them, but at least now I know why this is happening.

      (CGI version is 2.56 with perl 5.6.0 and 2.80 with perl 5.7.3, which is the system I tested the second time.)

      Update: too many updates to mention... this node was almost like the chatterbox.

        kudra wrote: I'm still not convinced it should be leaving them untainted rather than explicitly retainting them, but at least now I know why this is happening.

        I think you're right. These variables should be left tainted. The following hack will leave them tainted.

        sub shellwords {
            package shellwords;
            local($_) = join('', @_) if @_;
            my $tainted = substr $_,0,0 if defined; # give me a tainted empty string
            local(@words,$snippet,$field);

            s/^\s+//;
            while ($_ ne '') {
                $field = '';
                for (;;) {
                    if (s/^"(([^"\\]|\\.)*)"//) {
                        ($snippet = $1) =~ s#\\(.)#$1#g;
                    }
                    elsif (/^"/) {
                        die "Unmatched double quote: $_\n";
                    }
                    elsif (s/^'(([^'\\]|\\.)*)'//) {
                        ($snippet = $1) =~ s#\\(.)#$1#g;
                    }
                    elsif (/^'/) {
                        die "Unmatched single quote: $_\n";
                    }
                    elsif (s/^\\(.)//) {
                        $snippet = $1;
                    }
                    elsif (s/^([^\s\\'"]+)//) {
                        $snippet = $1;
                    }
                    else {
                        s/^\s+//;
                        last;
                    }
                    $field .= $snippet;
                }
                push(@words, $field);
            }
            # this loop will retaint the variables
            foreach ( @words ) {
                $_ .= $tainted if defined;
            }
            @words;
        }

        The only problem with this is that if something calls shellwords.pl with several variables, but only one is tainted, then *all* returned variables will be tainted. Is this a problem? I shouldn't think so, but I'm not sure. Also, who the heck would I submit this to? There's no name in the script and it looks like it's part of the standard distribution.
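The all-or-nothing effect mentioned above can be seen in isolation. This is a sketch under stated assumptions, not part of shellwords.pl: retaint_all is a made-up helper that mirrors the re-tainting loop, Scalar::Util is assumed to be available, and the tainted argument is simulated with a zero-length string derived from %ENV.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper mirroring the re-tainting loop: take a zero-length
# substring of $source (empty, but tainted if $source is) and append it
# to every word.
sub retaint_all {
    my ($source, @words) = @_;
    my $taint = substr($source, 0, 0);
    return map { $_ . $taint } @words;
}

# Under -T everything from %ENV is tainted, so this empty string is too.
my $env_taint = substr(join('', values %ENV), 0, 0);

my $clean = 'clean';
my $dirty = 'dirty' . $env_taint;

# shellwords joins its arguments into one string before parsing,
# so a single tainted argument taints the whole source.
my @out = retaint_all(join('', $clean, $dirty), $clean, $dirty);
printf "%s: %s\n", $_, tainted($_) ? 'tainted' : 'not tainted' for @out;
```

Under perl -T both words should come back tainted, including the one that started out clean. That errs on the side of caution, which is the direction taint mode is meant to err in.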

        Update: chromatic suggested that it could be submitted to Perl 5 Porters. Will do.

        Update 2: Benjamin Goldberg replied that my goal was good, but suggested using the 're' pragma. I resubmitted the patch to p5p as follows:

        --- shellwords.pl.orig Tue May 21 10:04:07 2002
        +++ shellwords.pl Tue May 21 11:12:45 2002
        @@ -17,6 +17,7 @@
             while ($_ ne '') {
                 $field = '';
                 for (;;) {
        +            use re 'taint'; # leave strings tainted
                     if (s/^"(([^"\\]|\\.)*)"//) {
                         ($snippet = $1) =~ s#\\(.)#$1#g;
                     }
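For anyone who has not met the pragma, here is a small sketch of what use re 'taint' changes. It is not the patched shellwords.pl: first_word_keep_taint is a made-up helper, Scalar::Util is assumed to be available, and the tainted input is simulated with a zero-length string derived from %ENV.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: same capture as a regex-based parser would do,
# but with the pragma in effect for this block.
sub first_word_keep_taint {
    my ($input) = @_;
    use re 'taint';    # lexically scoped: captures keep the input's taint
    return $input =~ /^(\S+)/ ? $1 : undef;
}

# Under -T everything from %ENV is tainted, so this empty string is too.
my $env_taint = substr(join('', values %ENV), 0, 0);

my $word = first_word_keep_taint('a=foo' . $env_taint);
printf "capture is %s\n", tainted($word) ? 'tainted' : 'not tainted';
```

Under perl -T the capture should be reported as tainted. Because the pragma is lexically scoped, code outside the sub still launders through captures as usual, which keeps a one-line patch like the one above from affecting anything else.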

        Cheers,
        Ovid


Re: variable I expect to be tainted isn't: possible explanations?
by Sifmole (Chaplain) on May 21, 2002 at 12:42 UTC
    With a run of ./foo.pl a=123 -b 456, I get:
    Running...
    Perl version: 5.006001
    Osname: linux
    Executable name: /usr/local/bin/perl

    zot is not tainted
    zot is not tainted
    123 is tainted
    123 is tainted
    456 is tainted
    456 is tainted
    [PATH] is tainted
    [PATH] is tainted
    Update:
    $CGI::revision = '$Id: CGI.pm,v 1.49 2001/02/04 23:08:39 lstein Exp $'
    $CGI::VERSION='2.752';
Re: variable I expect to be tainted isn't: possible explanations?
by iamcal (Friar) on May 21, 2002 at 12:53 UTC
    activeperl :
    C:\WINNT\Profiles\CalHenderson\Desktop\lang>perl -T taint.pl a=123 -b 456
    Running...
    Perl version: 5.006001
    Osname: MSWin32
    Executable name: C:\Perl\bin\Perl.exe

    zot is not tainted
    zot is not tainted
    123 is tainted
    123 is tainted
    1 is not tainted
    1 is not tainted
    [path] is tainted
    [path] is tainted

    C:\WINNT\Profiles\CalHenderson\Desktop\lang>perl -v

    This is perl, v5.6.1 built for MSWin32-x86-multi-thread
    (with 1 registered patch, see perl -V for more detail)

    Copyright 1987-2001, Larry Wall

    Binary build 628 provided by ActiveState Tool Corp. http://www.ActiveState.com
    Built 15:41:05 Jul 4 2001
