Taint checking, File::Find and Cwd

by ncw (Friar)
on Sep 25, 2000 at 15:53 UTC

ncw has asked for the wisdom of the Perl Monks concerning the following question:

I've noticed that File::Find doesn't work under taint checks. Eg :-
    #!/usr/bin/perl -wT
    use strict;
    use File::Find;
    $ENV{PATH} = "/bin:/usr/bin";
    $ENV{ENV} = "";
    # This produces "Insecure dependency in chdir while running with -T switch"
    find( sub { print }, "." );
A bit of rooting about in the source code comes up with the problem - File::Find uses Cwd::cwd which produces tainted results. Eg:-
    #!/usr/bin/perl -wT
    use strict;
    use Cwd;
    $ENV{PATH} = "/bin:/usr/bin";
    $ENV{ENV} = "";
    # This produces "Insecure dependency in system while running with -T switch"
    system "ls " . cwd();
Now I can see that it might be advantageous for setuid scripts to treat the result of cwd() as tainted data, but unfortunately this means that it isn't possible to use File::Find with -T set, because the call to Cwd is internal to File::Find and can't be monkeyed with.

Any ideas on how to get around this?

PS This may be a un*x-only problem; I don't know. I tested this with 5.005_03 on Linux.

PPS The code for Cwd::cwd() looks like chop(`pwd`) which is rather unpleasant in my opinion because it is calling the shell which starts another process, takes time etc. (Remove the PATH in the Cwd example above and it will fail with can't find pwd!)

Replies are listed 'Best First'.
Re: Taint checking
by merlyn (Sage) on Sep 25, 2000 at 17:11 UTC
    This is a known, appropriate restriction of File::Find for 5.5.3, and it is unlikely to be changed no matter how many times you report it: the directory name has to stay tainted because the value is untrusted.

    The new versions of File::Find include a user-controllable "I trust this" parameter for managed untainting, but you use these at your own risk:

    `untaint'
        If find is used in taint-mode (-T command line switch or if EUID != UID or if EGID != GID) then internally directory names have to be untainted before they can be cd'ed to. Therefore they are checked against a regular expression *untaint_pattern*. Note that all names passed to the user's *wanted()* function are still tainted.

    `untaint_pattern'
        See above. This should be set using the `qr' quoting operator. The default is set to `qr|^([-+@\w./]+)$|'. Note that the parentheses are vital.

    `untaint_skip'
        If set, directories (subtrees) which fail the *untaint_pattern* are skipped. The default is to 'die' in such a case.
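To illustrate why the parentheses matter: in Perl, a value is untainted by capturing it with a regular expression, so the captured group is what gets trusted. The following is only a minimal sketch of that mechanism using the default pattern quoted above; `untaint_dir` is a hypothetical helper name, not part of File::Find.

```perl
use strict;
use warnings;

# Default pattern quoted in the documentation excerpt above.
my $untaint_pattern = qr|^([-+@\w./]+)$|;

# Hypothetical helper: returns the captured name (which is untainted
# when running under -T), or undef when the name does not match.
sub untaint_dir {
    my ($dir) = @_;
    return $dir =~ $untaint_pattern ? $1 : undef;
}

my $ok  = untaint_dir("/usr/bin");      # only characters the pattern allows
my $bad = untaint_dir("/tmp/my dir");   # contains a space, so no match

print defined $ok  ? "accepted: $ok\n"  : "rejected\n";
print defined $bad ? "accepted: $bad\n" : "rejected\n";
```

Without the capture group there would be no `$1` to hand back, and the original string would remain tainted.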

    -- Randal L. Schwartz, Perl hacker

      Yes untaint is what I want here - I want to be able to say that I trust '/usr/bin' and just let it get on with it.

      It is a bit disappointing that this untaint stuff isn't mentioned in the File::Find documentation, since it is obviously a well-known stumbling block.

      Is there any way to upgrade File::Find for perl 5.5.3 without upgrading to perl 5.6.0? I run 5.6.0 on my personal machine just to stay ahead, but I prefer 5.5.3 on the servers for its proven track record!

Re: Taint checking
by t0mas (Priest) on Sep 25, 2000 at 16:57 UTC
    Check out this link from perl5-porters list for a similar problem...

    Try setting untaint => 1 in the options hash you pass to find; it should help (unless you have unsafe characters in the directory names, like spaces, in which case you have to mess with the untaint_pattern).
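A rough sketch of what that call might look like, assuming a File::Find new enough to support the untaint options described in this thread. The widened pattern (the default plus a space) and the use of untaint_skip are illustrative choices, not a recommendation; only loosen the pattern for names you really do trust.

```perl
use strict;
use warnings;
use File::Find;

my @seen;

find(
    {
        # Names handed to wanted() are still tainted under -T.
        wanted          => sub { push @seen, $File::Find::name },
        # Let File::Find untaint directory names before it chdir()s.
        untaint         => 1,
        # Assumption: the default pattern with a space added.
        untaint_pattern => qr|^([-+@\w./ ]+)$|,
        # Skip subtrees that fail the pattern instead of dying.
        untaint_skip    => 1,
    },
    ".",
);

print scalar(@seen), " entries found\n";
```

Note that the current working directory itself is also checked against the pattern, so the call can still die if you start from a directory with unusual characters in its path.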

    /brother t0mas
/bin/pwd (Re: Taint checking)
by tye (Sage) on Sep 25, 2000 at 20:39 UTC

    The code for Cwd::cwd() looks like chop(`pwd`) which is rather unpleasant in my opinion

    In my opinion as well. Unfortunately there aren't really any less unpleasant alternatives. Unix tracks the current working directory of the process by number (you can think of it as tracking the CWD i-node number or having a handle to opendir() handy -- depending on whether you think like a kernel or a user). So there is no "good" way to tell what your current directory is. Shells usually resort to caching the path used to get to this particular directory because of this.

    So /bin/pwd does stat(".") and opendir("..") then stat("..") and opendir("../.."), etc. trying to build a path to the current directory. But you can cd into a directory where you have execute access but no read access. This means that /bin/pwd would fail, except that it is often set-UID to root for exactly this reason.
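The walk described above can be sketched in Perl roughly as follows. This is only an illustration of the idea, not the actual source of /bin/pwd or Cwd; `pwd_by_walking` is a made-up name, and the sketch assumes a Unix-like filesystem where a directory is identified by its device and i-node numbers.

```perl
use strict;
use warnings;

# Hypothetical helper: rebuild the path of "." by climbing through ".."
# and searching each parent for the entry with a matching device/i-node.
sub pwd_by_walking {
    my @names;
    my $here = ".";
    while (1) {
        my ($dev, $ino) = (stat $here)[0, 1];
        defined $dev or die "stat $here: $!";
        my $parent = "$here/..";
        my ($pdev, $pino) = (stat $parent)[0, 1];
        defined $pdev or die "stat $parent: $!";
        # At the root, ".." is the same directory as ".".
        last if $dev == $pdev && $ino == $pino;

        # Needs read access on the parent, which is where this can fail.
        opendir(my $dh, $parent) or die "opendir $parent: $!";
        my $found;
        while (defined(my $entry = readdir $dh)) {
            next if $entry eq "." or $entry eq "..";
            my ($edev, $eino) = (lstat "$parent/$entry")[0, 1];
            next unless defined $edev;
            if ($edev == $dev && $eino == $ino) {
                $found = $entry;
                last;
            }
        }
        closedir $dh;
        defined $found or die "no path leads to the current directory\n";
        unshift @names, $found;
        $here = $parent;
    }
    return "/" . join("/", @names);
}

print pwd_by_walking(), "\n";
```

The opendir() call is the step that fails in a directory tree where you have execute but not read permission, and the "no path" case is the deleted-directory situation.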

    Unix provides a subroutine, getcwd(), that does the same thing as /bin/pwd. Often, this functionality is provided by fork()ing and exec()ing /bin/pwd and reading the results from a pipe. Sometimes, the Unix supports privileged subroutines so that /bin/pwd can be implemented directly in the subroutine and still succeed more often (it isn't that hard to have a current working directory for which there is no path).

    So it would be nice if Cwd had an XS component so that it could call getcwd() on versions of Unix where that doesn't just run /bin/pwd.

    Or you can just install a real O/S and avoid this whole mess and just call Win32::GetCwd(). (No, I'm not serious. This suggestion is just as useful as the usual "install a real O/S" suggestion that I still hear too often.)

            - tye (but my friends call me "Tye")
      I always thought the tracking of your current directory was shell magic, there just for the convenience of the user. For example (run on Linux):

      $ cd /usr/X11
      $ pwd
      $ /bin/pwd
      $ ls -ld /usr/X11
      lrwxrwxrwx   1 root     root            5 Oct 21  1999 /usr/X11 -> X11R6
      But I see what you are saying in the case of the current directory being a deleted directory, eg :-
      $ mkdir test
      $ cd test
      $ pwd ; /bin/pwd
      $ rmdir .
      $ pwd ; /bin/pwd
      /bin/pwd: cannot get current directory: No such file or directory
      Anyway, getcwd() was what I expected Cwd to use, and I was very surprised to see this /bin/pwd thing!

      Thanks for the explanation. I think from the above experiments /bin/pwd == getcwd() under Linux, so on that platform at least it should be replaced with getcwd(). According to my man pages getcwd() is POSIX, so pretty much any Unix should support it nowadays.
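For what it's worth, a minimal sketch of asking for the directory through the POSIX module rather than shelling out by hand. How POSIX::getcwd() is implemented underneath depends on the Perl version and platform, so this shows the interface only and makes no promise that no pwd process is started; it also assumes a Unix-style absolute path.

```perl
use strict;
use warnings;
use POSIX ();

# Ask for the current working directory via the POSIX module.
my $dir = POSIX::getcwd();
defined $dir or die "getcwd failed: $!";

print "$dir\n";
```

Under -T the result should still be treated as untrusted until it has been checked against a pattern.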
