PerlMonks  

Absolute pathnames from relative?

by grantm (Parson)
on Jan 22, 2003 at 02:22 UTC ( [id://228917]=perlquestion )

grantm has asked for the wisdom of the Perl Monks concerning the following question:

On the face of it, this is a real newbie question that I'm kind of embarrassed to ask, but here goes anyway ...

Given an absolute pathname as a base (say: /usr/local/src) and a relative path (say: ../bin/fnurgle) what module will give the 'canonical' absolute pathname: /usr/local/bin/fnurgle?

I thought this was one of the functions of the File::Spec module but this code ...

perl -MFile::Spec -le 'print File::Spec->rel2abs("../bin/fnurgle", "/usr/local/src")'

... gives this ...

/usr/local/src/../bin/fnurgle

... and so does this ...

perl -MFile::Spec -le 'print File::Spec->canonpath(File::Spec->rel2abs("../bin/fnurgle", "/usr/local/src"))'

I can achieve the desired effect using a regex or even prepending 'file:' and using the URI module but shouldn't this be the default behaviour of File::Spec->canonpath()?

Replies are listed 'Best First'.
Re: Absolute pathnames from relative?
by mojotoad (Monsignor) on Jan 22, 2003 at 06:39 UTC
    This is a feature, not a bug. Or, rather, a symptom of dealing with ambiguity within differing filesystems on different versions of operating systems.

    The short answer is that, without a physical file system check, you can use the no_upwards() method from File::Spec to strip your dots and double dots (or their respective equivalents on other filesystems).

    The somewhat longer answer is that the OS (or filesystem implementation) is the ultimate authority on what the filesystem implementation really does. With a physical system check you can use the Cwd module and a cwd() combined with a pwd() to let the operating system figure out the correct interpretation.

    The reason this is ambiguous is that soft links can span filesystems or volumes. A path involving a soft link across volumes and relative paths presents ambiguity -- if you switched volumes at some point in resolving the path, does the '..' mean backtrack to the prior volume or should it be one level up on the new volume? This is a question typically answered on the OS level.

    Consider two volumes mounted thusly:

    /mnt/disk_a
    /mnt/disk_b

    and the following link/directory structure:

    /mnt/disk_a/hubba
    /mnt/disk_a/opt -> /mnt/disk_b/opt
    /mnt/disk_b/opt
    /mnt/disk_b/hubba

    Now what does /mnt/disk_a/opt/../hubba mean? It could mean either /mnt/disk_a/hubba or /mnt/disk_b/hubba depending on whether you take a tokenized or holistic approach to resolving relative directories. And to complicate matters further, soft links can also have relative paths embedded in them.
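The two readings can be compared with a small experiment. This is only a sketch under assumptions: it runs on a Unix-like system with symlink support, and it stands in for the two volumes with two directories inside a temporary directory. The helper `collapse_lexically` is a hypothetical name, not part of File::Spec.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical layout: two directories stand in for the two volumes.
my $root = abs_path(tempdir(CLEANUP => 1));
mkdir "$root/$_" or die "mkdir $_: $!"
    for qw(disk_a disk_b disk_a/hubba disk_b/opt disk_b/hubba);
symlink("$root/disk_b/opt", "$root/disk_a/opt") or die "symlink: $!";

# Tokenized reading: each ".." simply cancels the preceding component.
sub collapse_lexically {
    my ($path) = @_;
    my @out;
    for my $part (File::Spec->splitdir($path)) {
        if ($part eq File::Spec->updir && @out > 1) { pop @out }
        else                                        { push @out, $part }
    }
    return File::Spec->catdir(@out);
}

my $path     = "$root/disk_a/opt/../hubba";
my $lexical  = collapse_lexically($path);   # ends in disk_a/hubba
my $physical = abs_path($path);             # ends in disk_b/hubba: the link is followed first

print "lexical:  $lexical\n";
print "physical: $physical\n";
```

The string-only version never touches the disk, so it cannot know that `opt` is a link; `abs_path` asks the filesystem and ends up on the other side of it.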

    The real issue, however, is that the "proper" behavior is encoded at the filesystem implementation level. On most Unix variants, you can mount several different filesystem formats with their own interpretations of relative path resolution. But this does not prevent linking across these disparate filesystems. So though it may be more visually pleasing to eliminate those relative paths, they might be necessary to accurately resolve the behavior along OS and filesystem spec lines.

    If you're relatively certain you will be operating on homogenous systems, OS as well as filesystem implementations, then by all means go for it without the physical check.

    Incidentally, punting to the OS in order to let it decide what to do is the difference between cwd() and fastcwd() in the Cwd module.

    Matt

    Update: Also see my response below concerning the same issues regarding mount points across the network: Re^3: Absolute pathnames from relative?

      The somewhat longer answer is that the OS (or filesystem implementation) is the ultimate authority on what the filesystem implementation really does. With a physical system check you can use the Cwd module and a cwd() combined with a pwd() to let the operating system figure out the correct interpretation.

      An easier approach using the Cwd module, albeit with the same limitation of being a filesystem-based implementation, is the abs_path function. For example:

      # perl -MCwd=abs_path -le 'print abs_path("/usr/local/src/../bin/fnurgle")'
      /usr/local/bin/fnurgle

       

      perl -le 'print+unpack("N",pack("B32","00000000000000000000001000100010"))'

        I'm bringing this thread back from the past but I'm doing something similar right now so I searched and found it.

        I already tried using abs_path. It works fine if the argument is a directory but not when the argument is a file.

      Thanks Matt, I was guessing it had something to do with symlinks but hadn't put all the pieces together. Interestingly (or not) the colleague who originally asked me the question was working on Win32 which doesn't do symlinks but even on that platform, File::Spec::Win32 still leaves the '..' parts in place.

      The no_upwards() method doesn't quite seem to do what I need either (I had tried it before I posted originally). It takes a list of pathname components and all it does is eliminate the '.' and '..' components (ie: it doesn't remove the component before the '..') - I'm not sure when that would be useful. Even if it did what I wanted, by the time I'd called splitdir() to provide the right inputs and then catdir() to reassemble the output, the end result would hardly be a clear and concise piece of code.

      I guess I'll stick with my URI solution. Thanks again.
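The no_upwards() behaviour described above can be seen in a few lines. This sketch uses File::Spec::Unix explicitly so that the result does not depend on the platform it runs on; the variable names are just for illustration.

```perl
use strict;
use warnings;
use File::Spec::Unix;

my $fs = 'File::Spec::Unix';

# Split the path, filter it through no_upwards(), and reassemble it.
my @dirs   = $fs->splitdir('/usr/local/src/../bin/fnurgle');
my @kept   = $fs->no_upwards(@dirs);   # drops "." and ".." entries only
my $joined = $fs->catdir(@kept);

print "$joined\n";   # /usr/local/src/bin/fnurgle, not /usr/local/bin/fnurgle
```

The '..' is dropped but 'src' stays, so the result names a different location. The method looks better suited to filtering a readdir() listing than to normalising a path.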

        The no_upwards() behavior sounds like a bug, in my mind. It makes no sense to eliminate '..' without considering the consequences.

        I neglected to mention in my initial writeup that soft links are not the only time this arises -- mount points will do it as well. Mounted disks might be local or remote -- they might even be on different operating systems. Same dynamic, though: system_a:/server_disk/home/hubba could be mounted on system_b:/home/hubba. Typically this depends on the NFS implementation (or whatever the remote disk sharing protocol might be).

        rob_au is correct that the abs_path() method of Cwd is the best way to perform this trick using a physical system check. abs_path() itself, last time I checked, is based on File::Spec and the Cwd methods cwd() and pwd() routines. There is an analogous fast_abs_path() that "does the right thing" that you expect, without a physical check. Entirely reliable, however, if you expect to be on a homogenous system.

        Matt

Re: Absolute pathnames from relative?
by MarkM (Curate) on Jan 22, 2003 at 19:35 UTC

    Symbolic links are very odd, and support for conveniently resolving symbolic links is limited.

    On a WIN32 based operating system, symbolic links don't really exist, and so, code such as the following will suffice:

    use File::Spec;
    use Win32;
    my $absolute_path = Win32::GetFullPathName(File::Spec->join("C:\\Perl\\bin", "..\\lib\\File\\Spec.pm"));

    On UNIX-based operating systems, things get a little strange. The Cwd module itself provides a method of determining the absolute path for a directory.

    use File::Spec;
    use Cwd;
    my $absolute_path = Cwd::abs_path(File::Spec->join("/usr/bin", "../lib"));

    Cwd in Perl 5.8.0 includes an XS portion that uses advanced code that is able to correctly determine the absolute path for a file as well. (The Perl version of the code, that is used if the XS portion cannot be loaded, still has the directory-only limitation)

    The Cwd module uses a frequently used 'trick' that involves recursing backwards from the specified directory through '..' looking for a device/inode match between '..' and the names in '..'. The problem with files is that "/usr/lib/../bin/perl/.." is not a valid accessible path. The normal 'hack' to get around this is to break "/usr/lib/../bin/perl" into "/usr/lib/../bin" and "perl", resolve the "/usr/lib/../bin" to "/usr/bin", and tag "perl" onto the end resulting in "/usr/bin/perl".

    The problem with this approach, is that it does not consider the possibility that "/usr/bin/perl" might itself be a symlink to "/opt/perl5.8.0/bin/perl". The solution to this is usually to do readlink() on the file, and if an absolute path is returned, use it, or else if it is a relative path, substitute the last component ("perl") with it, and recurse, performing the entire process over again on the new path.
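The split, resolve and readlink() steps described above might be sketched roughly as follows. This is an illustration under assumptions, not how Cwd itself is implemented: it assumes a Unix-like system with symlinks, `resolve_file` is a hypothetical helper, and the limit of 32 hops is an arbitrary guard against link loops.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);
use File::Basename qw(fileparse);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: resolve the directory part physically, re-attach the
# file name, and repeat for as long as the result is itself a symlink.
sub resolve_file {
    my ($path, $limit) = @_;
    $limit = 32 unless defined $limit;          # arbitrary loop guard
    while ($limit-- > 0) {
        my ($name, $dir) = fileparse($path);
        my $real_dir = abs_path($dir);
        return unless defined $real_dir;        # directory does not exist
        my $candidate = File::Spec->catfile($real_dir, $name);
        return $candidate unless -l $candidate; # plain file: done
        my $target = readlink $candidate;
        return unless defined $target;
        # Absolute target replaces the path; relative one replaces the last component.
        $path = File::Spec->file_name_is_absolute($target)
              ? $target
              : File::Spec->catfile($real_dir, $target);
    }
    return;                                     # too many levels of links
}

# Demo layout in a temporary directory: bin/perl -> ../opt/perl-real
my $root = abs_path(tempdir(CLEANUP => 1));
mkdir "$root/$_" or die "mkdir $_: $!" for qw(bin lib opt);
open my $fh, '>', "$root/opt/perl-real" or die "open: $!";
close $fh;
symlink('../opt/perl-real', "$root/bin/perl") or die "symlink: $!";

my $resolved = resolve_file("$root/lib/../bin/perl");
print "$resolved\n";                            # ends in opt/perl-real
```

A path with a trailing slash would leave fileparse() with an empty file name, so a real implementation would need to handle that case separately.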

    One could argue that Cwd::abs_path() should use the Win32::GetFullPathName() subroutine on WIN32. Maybe it does in the latest ActiveState Perl Build. I only have build 633 installed (Perl 5.6.1). Perhaps somebody else could verify whether Cwd::abs_path() works correctly on files in build 640?

    Good luck,

Re: Absolute pathnames from relative?
by toma (Vicar) on Jan 22, 2003 at 04:16 UTC
    It seems that you want a module that does something like this:
    use File::Spec;
    my $fn = File::Spec->rel2abs("../bin/fnurgle", "/usr/local/bin");
    print "Before: $fn\n";
    my @arr;   # collected path components
    for (File::Spec->splitdir($fn)) {
        if ($_ ne "..") { push @arr, $_ }
        else            { pop @arr }
    }
    print "After: ", File::Spec->catdir(@arr), "\n";
    but I couldn't find it either. Of course, we could transform the for loop into a map, go on a quest, and golf it!

    It should work perfectly the first time! - toma

      This could be a good addition to File::Spec
      sub File::Spec::un_upwards {
          my($self, $foo) = @_;
          $foo = $self->canonpath( $foo );
          my $updir = $self->updir();
          return $foo if -1 == index $foo, $updir;
          my( $volume, $directories, $file ) = $self->splitpath($foo);
          my @dar;
          foreach($self->splitdir( $directories )){
              if( $_ eq $updir ){
                  if(@dar){ # in case the path is relative, ie ../fo/bar/base
                      pop @dar;
                  } else {
                      push @dar, $_;
                  }
              } else {
                  push @dar, $_;
              }
          }
          return $self->catpath( $volume, $self->catdir( @dar ), $file );
      }

      MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
      I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
      ** The third rule of perl club is a statement of fact: pod is sexy.

Re: Absolute pathnames from relative?
by Gilimanjaro (Hermit) on Jan 22, 2003 at 11:15 UTC
    I myself come from a long family of dentists, and they have always loved toothpicks. I hereby present my solution in the form of an incurable form of falling toothpick syndrome:

    s/\/[^\/]*\/\.\.\//\//g

    Just set this baby on your concatenated paths, and your '..' will magically disappear. You could use

    s/\/\.\//\//g
    to strip your single dots as well...

    This will just give you the path it 'should' be, if there are no weird filesystem thingies going on.

    (yes, I could've used s### or something, but this is more fun.)

      You said:

      s/\/[^\/]*\/\.\.\//\//g

      which is exactly what I was looking for... but it doesn't work if there are two together, as in "foo/bar/../../baz/" So, put it in a loop:

      1 while $foo =~ s/\/[^\/]*\/\.\.\//\//g;

      dan
      www.danheller.com
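Why the loop is needed can be checked with a short test. This is a sketch only: it uses s{}{} delimiters for readability, the two helper names are made up for the example, and the input is an absolute path, since the pattern needs a slash in front of the component it removes.

```perl
use strict;
use warnings;

# One pass of the substitution with /g.
sub collapse_once {
    my ($p) = @_;
    $p =~ s{/[^/]*/\.\./}{/}g;
    return $p;
}

# Repeat the substitution until nothing more matches.
sub collapse_all {
    my ($p) = @_;
    1 while $p =~ s{/[^/]*/\.\./}{/}g;
    return $p;
}

my $once = collapse_once('/foo/bar/../../baz/');
my $all  = collapse_all('/foo/bar/../../baz/');

print "$once\n";   # /foo/../baz/
print "$all\n";    # /baz/
```

A single pass removes "bar/.." but then resumes scanning after the replaced text, so the newly adjacent "foo/.." is only caught on the next pass. A relative input such as "foo/bar/../../baz/" would still be left as "foo/../baz/", because nothing precedes "foo" for the pattern to anchor on.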

      Oy veh! This is a case where hairdressers might write the less frightening code, using curlies:
      s{/[^/]*/+\.{2}/}{/}g;
      (also switched to using positive-lookahead to check for the final "/" in the pattern, so as to avoid the need for a loop; and just in case the input is "goofy but functional", ala "/one/two//../three", I added "+").

      (Update: never mind what I said about positive-lookahead. There seems to be no way to avoid using a loop to make this "work", so no point using "(?=/)" as the last part of the match. This was never intended as a serious solution anyway, given the issues discussed in other replies.)

Re: Absolute pathnames from relative?
by Cody Pendant (Prior) on Jan 22, 2003 at 05:11 UTC
    You need "URI::URL".

    I've used it with HTML::LinkExtor in this way:

    @links = map { $_ = url($_, $base)->abs; } @links;
    where the base URL is the /usr/bin one -- that will probably do what you want.

    Update: I just looked at that again and I realise they're not URLs, they're file locations in your question, so it probably isn't what you want. Minus me. Unless I'm just lucky and it works?
    --
    “Every bit of code is either naturally related to the problem at hand, or else it's an accidental side effect of the fact that you happened to solve the problem using a digital computer.” M-J D

      If you look closely at the end of grantm's post, you'll see that he has thought of using the URI module already and it works - if you prepend file:// to the file path.

      -- Hofmator

Re: Absolute pathnames from relative?
by zentara (Archbishop) on Jan 22, 2003 at 15:24 UTC
    Have you looked at FindBin ? perldoc FindBin.
    SYNOPSIS
            use FindBin;
            use lib "$FindBin::Bin/../lib";
            or
            use FindBin qw($Bin);
            use lib "$Bin/../lib";
    DESCRIPTION
           Locates the full path to the script bin directory to allow
           the use of paths relative to the bin directory.
    

Node Type: perlquestion [id://228917]
Approved by pfaut
Front-paged by thelenm