Surprising behavior of Cwd module on Unix with symlinks

by Anonymous Monk
on Feb 08, 2011 at 04:39 UTC

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Today I noticed a surprising behaviour of the Perl Cwd module with directories that are symbolic links, as demonstrated by the Unix shell commands below:

    # echo $PWD
    /home/devel/devut/build64
    # ls -l $PWD
    ... /home/devel/devut/build64 -> /opt/oflow/librarian/build64
    # pwd
    /home/devel/devut/build64
    # pwd -P
    /opt/oflow/librarian/build64
    # perl -l -MCwd -e 'print Cwd::getcwd()'
    /opt/oflow/librarian/build64
    # perl -l -MCwd -e 'print Cwd::cwd()'
    /opt/oflow/librarian/build64

Since the introduction of symbolic links has broken a number of my automated tests, I'm eager to fix them as simply as possible. Though I could replace the Cwd::cwd() calls with `pwd` (for Unix only), I'd prefer a more portable solution. Ideal would be to somehow tell the Perl Cwd module to behave like the Unix pwd command. Browsing the Cwd docs, however, revealed no obvious way to achieve that. Note that the Unix pwd command provides a -P option to control this behaviour.

I googled for this problem, but found precious little:

Note that these are not especially useful links.

Replies are listed 'Best First'.
Re: Surprising behavior of Cwd module on Unix with symlinks
by ELISHEVA (Prior) on Feb 08, 2011 at 06:15 UTC

    The most portable solution would be to avoid the use of relative paths (and pwd/cwd) altogether. If you really, truly need to go forward with a particular path, you can avoid the "auto-resolving link" problem altogether by constructing the fully qualified path and using that in your program.

    In general, littering your code with relative paths and dependencies on the current working directory is not a good idea. There are several problems that can result:

    • The current working directory might not be what you think it is, not just because the real path is substituted for the symbolic path, but also because cd can fail if you try to switch to a directory for which your effective user (EUID) doesn't have permission.
    • In some cases, particularly with mounted drives, you can get stuck in no-man's-land. If you are going to use cwd, you should avoid fastcwd for that reason: see http://rt.cpan.org/Public/Bug/Display.html?id=13851 (ignore the fact that this is a bug report; the discussion about it is the interesting part as regards portability issues).
    • Your code is unlikely to work in taint mode which does not like relative paths.
    • And in general, your code will not be portable to any web interface because the "current" directory in that context is essentially undefined - it could be anything on the face of the planet (well, web server).

    See File::Spec for a portable way to attach relative paths to a fully qualified starting path.

    About the only situations where I can see a genuine use for cwd are (a) start-up situations, (b) incoming information (but here lie security risks, which is why taint mode doesn't let you use them), and (c) an occasional piece of software that actually requires itself to be run with a particular directory or kind of directory as "current". Most applications do accept fully qualified paths even if the documentation examples only illustrate relative path parameters for convenience.

    In start-up/incoming information situations you may want to find certain files based on the starting directory of a user. However, in my own code, I immediately convert that to a fully qualified path and from there on in use fully qualified paths in the code.
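As a rough illustration of the "convert to a fully qualified path right away" advice above, here is a minimal sketch using File::Spec. The helper name anchor_path and the example paths are made up for illustration. When an absolute base is passed explicitly, rel2abs works on the path strings alone, so a symlinked base directory should stay exactly as written rather than being resolved.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not from the thread): anchor a possibly relative path
# to an explicit base directory, so nothing downstream depends on the
# current working directory.
sub anchor_path {
    my ($path, $base) = @_;
    return File::Spec->rel2abs($path, $base);
}

# The base and file names below are made up for the example.
my $base   = File::Spec->catdir(File::Spec->rootdir, 'home', 'devel', 'devut', 'build64');
my $config = anchor_path(File::Spec->catfile('etc', 'app.conf'), $base);
print "$config\n";
```

A path that is already absolute is returned unchanged (apart from cleanup of redundant separators), so the helper can be applied to every incoming path without checking first.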

Re: Surprising behavior of Cwd module on Unix with symlinks
by ikegami (Patriarch) on Feb 08, 2011 at 05:57 UTC

    Cwd's getcwd simply returns what the system provides. Your shell is actually tracking the working directory separately from the system, but it provides it via $ENV{PWD}. You can use it safely as follows:

    use Cwd qw( );

    # Prefer the shell's logical path in $ENV{PWD}, but only when it refers
    # to the same directory (same device and inode) as the system's answer.
    sub getcwd {
        my $cwd = Cwd::getcwd();
        if (exists($ENV{PWD}) && $ENV{PWD} ne $cwd) {
            my $e = my ($e_dev, $e_node) = stat($ENV{PWD});
            my $c = my ($c_dev, $c_node) = stat($cwd);
            if ($e && $c && $e_dev == $c_dev && $e_node == $c_node) {
                $cwd = $ENV{PWD};
            }
        }
        return $cwd;
    }

    print Cwd::getcwd(), "\n";
    print getcwd(), "\n";

    /tmp/ikegami
    /home/ikegami/tmp

      $ENV{PWD} has the same portability problems as `pwd` does though, and the OP was specifically looking for a portable solution to the problem.

        Getting bash's pwd isn't possible where bash isn't used. What's your point? Are you implying that the code I posted isn't portable? If so, you'd be mistaken.

        Update:

        Every tool that gives you bash's pwd uses $ENV{PWD}.

        $ /bin/pwd              # GNU pwd
        /tmp/ikegami
        $ bash -c pwd           # bash's builtin
        /home/ikegami/tmp
        $ PWD=/ bash -c pwd     # bash's builtin uses $ENV{PWD}
        /tmp/ikegami

        The GNU pwd doesn't have any command line options and always gives the path returned by getcwd (with a fallback on error). Darwin's pwd goes a step further and does something that looks awfully familiar.

        static char *
        getcwd_logical(void)
        {
            struct stat lg, phy;
            char *pwd;

            /*
             * Check that $PWD is an absolute logical pathname referring to
             * the current working directory.
             */
            if ((pwd = getenv("PWD")) != NULL && *pwd == '/') {
                if (stat(pwd, &lg) == -1 || stat(".", &phy) == -1)
                    return (NULL);
                if (lg.st_dev == phy.st_dev && lg.st_ino == phy.st_ino)
                    return (pwd);
            }

            errno = ENOENT;
            return (NULL);
        }

Re: Surprising behavior of Cwd module on Unix with symlinks
by DrHyde (Prior) on Feb 08, 2011 at 11:03 UTC
    FindBin will do what you want:

    $ ls -l symlink
    lrwxrwxrwx 1 david david 22 2011-02-08 10:54 symlink -> foo/bar
    $ cd symlink
    $ pwd
    /home/david/symlink
    $ ls -l ..
    drwxr-xr-x 2 david david 4096 2011-02-08 10:54 bar

    Note that ls shows us the contents of the parent of the *real* directory that we're in, not the contents of the directory containing the symlink. Symlinks are a bit weird.

    $ perl -MFindBin -e 'print "$FindBin::Bin\n"'
    /home/david/foo/bar

    FindBin correctly resolves it, giving us the full path to the directory.

      "FindBin correctly resolves it, giving us the full path to the directory."

      As far as I understand, the OP wants the opposite, i.e. the respective symlink path being returned (/home/david/symlink in your case), not the target of the symlink, which is more generally aka realpath, and is easier to determine.

      As ikegami pointed out, some other process (usually the shell) needs to keep track of the chdir history to be able to provide the info the OP wants.
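To make the distinction in this subthread concrete, here is a small Unix-only sketch that builds a throwaway directory and a symlink to it, then enters the directory through the symlink. The layout (real and link under a temporary directory) is invented for the demonstration. It is meant to show that both Cwd::getcwd and Cwd::abs_path report the resolved target, which is why the symlinked path has to come from somewhere else, such as $ENV{PWD}.

```perl
use strict;
use warnings;
use Cwd qw( getcwd abs_path );
use File::Spec;
use File::Temp qw( tempdir );

# Hypothetical layout for illustration: <tmp>/real and <tmp>/link -> real.
# abs_path() is applied to the temp dir in case it sits behind a symlink itself.
my $base = abs_path(tempdir(CLEANUP => 1));
my $real = File::Spec->catdir($base, 'real');
my $link = File::Spec->catdir($base, 'link');
mkdir $real or die "mkdir failed: $!";
symlink $real, $link or die "symlink failed: $!";

my $orig = getcwd();
chdir $link or die "chdir failed: $!";
my $physical = getcwd();          # what the system reports after entering via the symlink
chdir $orig or die "chdir back failed: $!";

my $resolved = abs_path($link);   # realpath of the symlink itself

print "entered via : $link\n";
print "getcwd says : $physical\n";
print "abs_path    : $resolved\n";
```

Nothing in the process remembers that the directory was entered through the symlink, which matches the point above that a shell (or some other caller) has to track that history.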

Re: Surprising behavior of Cwd module on Unix with symlinks
by Anonymous Monk on Sep 20, 2016 at 08:29 UTC
    I've had exactly the same problem on a Solaris 10 system. It was essential that I get the path including the symlink, not the real path to the current working directory. I found that even though pwd on the command line gave this, when used in a script it did not. After much trial and error and Google searching I found that just using the value of $ENV{PWD} gave me what I needed.
