A fix for (leave tainted variables tainted)

by Ovid (Cardinal)
on May 21, 2002 at 16:27 UTC (#168177)

in reply to (kudra: getopt correction) Re2: variable I expect to be tainted isn't: possible explanations?
in thread variable I expect to be tainted isn't: possible explanations?

kudra wrote: I'm still not convinced it should be leaving them untainted rather than explicitly retainting them, but at least now I know why this is happening.

I think you're right. These variables should be left tainted. The following hack will leave them tainted.

    sub shellwords {
        package shellwords;
        local($_) = join('', @_) if @_;
        my $tainted = substr $_,0,0 if defined; # give me a tainted empty string
        local(@words,$snippet,$field);

        s/^\s+//;
        while ($_ ne '') {
            $field = '';
            for (;;) {
                if (s/^"(([^"\\]|\\.)*)"//) {
                    ($snippet = $1) =~ s#\\(.)#$1#g;
                }
                elsif (/^"/) {
                    die "Unmatched double quote: $_\n";
                }
                elsif (s/^'(([^'\\]|\\.)*)'//) {
                    ($snippet = $1) =~ s#\\(.)#$1#g;
                }
                elsif (/^'/) {
                    die "Unmatched single quote: $_\n";
                }
                elsif (s/^\\(.)//) {
                    $snippet = $1;
                }
                elsif (s/^([^\s\\'"]+)//) {
                    $snippet = $1;
                }
                else {
                    s/^\s+//;
                    last;
                }
                $field .= $snippet;
            }
            push(@words, $field);
        }
        # this loop will retaint the variables
        foreach ( @words ) {
            $_ .= $tainted if defined;
        }
        @words;
    }
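For anyone who wants to see the retainting trick in isolation, here is a minimal sketch. The `retaint` helper is a hypothetical name of mine and not part of shellwords.pl. It assumes, as the hack above does, that a zero-length `substr` of a tainted string is itself tainted, so appending it to a word marks the word as tainted without changing its value. `tainted` comes from Scalar::Util, and `$^X` is only there as a convenient source of tainted data when perl runs with -T.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper (not in shellwords.pl): copy the taint of $source
# onto every defined word by appending a zero-length piece of $source.
sub retaint {
    my ($source, @words) = @_;
    my $marker = substr($source, 0, 0);   # empty string, tainted if $source is
    return map { defined($_) ? $_ . $marker : undef } @words;
}

# Under -T, $^X is tainted, so anything built from it is tainted too.
my $source = "ls -l $^X";
my @words  = retaint($source, 'ls', '-l');

for my $word (@words) {
    printf "%-3s %s\n", $word, tainted($word) ? 'tainted' : 'clean';
}
```

Run without -T and nothing is tainted, so every word reports as clean. Run with `perl -T` and the words should report as tainted even though their text is unchanged.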

The only problem with this is that if a caller passes several arguments and only one of them is tainted, then *all* of the returned words will be tainted, because the arguments are joined into a single string before they are split. Is this a problem? I shouldn't think so, but I'm not sure. Also, who the heck would I submit this to? There's no author named in the script, and it looks like it's part of the standard distribution.
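A rough illustration of that side effect, under the same assumptions as the hack: the arguments are joined first, so a single tainted argument taints the joined string, and the empty marker cut from it taints everything it is appended to. `taint_report` is a made-up name for this sketch, and `$^X` again stands in for tainted input under -T.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical demo: for each argument, record whether it was tainted
# before and whether it is tainted after the join-and-retaint step.
sub taint_report {
    my @args   = @_;
    my $joined = join '', @args;           # tainted if ANY argument is
    my $marker = substr($joined, 0, 0);    # empty, carries $joined's taint
    my @report;
    for my $arg (@args) {
        my $before = tainted($arg) ? 1 : 0;
        my $copy   = $arg . $marker;
        my $after  = tainted($copy) ? 1 : 0;
        push @report, [ $arg, $before, $after ];
    }
    return @report;
}

# One clean argument, one that is tainted under -T.
for my $row (taint_report('ls', " $^X")) {
    my ($arg, $before, $after) = @$row;
    print "[$arg] before=$before after=$after\n";
}
```

Under -T the clean 'ls' should go from before=0 to after=1, which is exactly the over-tainting described above. It errs on the safe side, since data is never untainted by accident.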

Update: chromatic suggested that it could be submitted to Perl 5 Porters. Will do.

Update 2: Benjamin Goldberg replied that my goal was good, but suggested using the 're' pragma. I resubmitted the patch to p5p as follows:

    --- Tue May 21 10:04:07 2002
    +++ Tue May 21 11:12:45 2002
    @@ -17,6 +17,7 @@
         while ($_ ne '') {
             $field = '';
             for (;;) {
    +            use re 'taint';    # leave strings tainted
                 if (s/^"(([^"\\]|\\.)*)"//) {
                     ($snippet = $1) =~ s#\\(.)#$1#g;
                 }
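To show what the pragma changes, here is a small sketch that is separate from the patch itself. Both subs are hypothetical names. The first uses the default behaviour, where a capture such as `$1` comes out untainted even when the matched string was tainted. The second adds `use re 'taint'`, which is lexically scoped, so the capture should keep the taint of its source.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Default behaviour: capturing with a regex launders the data.
sub first_word_default {
    my ($string) = @_;
    my $word;
    if ($string =~ /^(\S+)/) {
        $word = $1;
    }
    return $word;
}

# With the pragma in scope, captures from a tainted string stay tainted.
sub first_word_keep_taint {
    my ($string) = @_;
    use re 'taint';
    my $word;
    if ($string =~ /^(\S+)/) {
        $word = $1;
    }
    return $word;
}

my $input = "ls -l $^X";    # tainted under -T because of $^X
for my $pair ( [ default => first_word_default($input) ],
               [ pragma  => first_word_keep_taint($input) ] ) {
    my ($label, $word) = @$pair;
    printf "%-8s %s is %s\n", $label, $word, tainted($word) ? 'tainted' : 'clean';
}
```

This is why the one-line patch is enough: with the pragma in effect inside the loop, the `$1` copied into `$snippet` is never untainted in the first place, so no retainting pass is needed and a clean input stays clean.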


