PerlMonks  

File::Find bummers on an NFS volume.

by petdance (Parson)
on Jul 31, 2001 at 20:24 UTC (#101209, perlquestion)

petdance has asked for the wisdom of the Perl Monks concerning the following question:

I had this little test code:
#!/usr/bin/perl -w
use strict;
use File::Find;

find( \&handler, "/mnt/morkcd" );

sub handler {
    print "$File::Find::name\n";
}
that refused to descend into the subdirectories on the NFS-mounted volume. I asked in the Chatterbox about it, and I was told that I had to add:
$File::Find::dont_use_nlink = 1;
Sure enough, that fixed it, but why? Perl In A Nutshell simply says "Set this variable if you're using the Andrew File System (AFS)", and perldoc says "Set the variable $File::Find::dont_use_nlink if you're using AFS, since AFS cheats." So what's an "nlink", and why is this AFS-specific fix also an NFS-specific fix?

xoxo,
Andy
--
<megaphone> Throw down the gun and tiara and come out of the float! </megaphone>

Replies are listed 'Best First'.
Re: File::Find bummers on an NFS volume.
by tadman (Prior) on Jul 31, 2001 at 20:34 UTC
    From the little I can get out of the oracle, it should've been turned on all along, but for some reason wasn't. Maybe you're using an outdated version of File::Find or what have you.

    If you look at the spec for NFS, RFC 1094, you will see that nlink is "the number of hard links to the file (the number of different names for the same file)". My theory is that without the dont_use_nlink option, File::Find refuses to recurse into what it perceives as a directory with no subdirectories.
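One way to check that theory on a given mount is to compare the link count the filesystem reports for a directory with the number of subdirectories actually found by reading it. The following is only a minimal sketch: `link_count` and `count_subdirs` are hypothetical helper names, and the temporary directory stands in for whatever volume is under suspicion. On a filesystem that follows the traditional Unix convention the link count would be the subdirectory count plus two; on others it can be anything.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Link count as reported by the filesystem (field 3 of stat's return list).
sub link_count {
    my ($dir) = @_;
    my @st = stat($dir) or die "stat $dir: $!";
    return $st[3];
}

# Real number of subdirectories, found by reading the directory itself.
sub count_subdirs {
    my ($dir) = @_;
    opendir( my $dh, $dir ) or die "opendir $dir: $!";
    my $n = grep { $_ ne '.' && $_ ne '..' && -d "$dir/$_" && !-l "$dir/$_" }
        readdir $dh;
    closedir $dh;
    return $n;
}

# Demonstration on a scratch directory holding two subdirectories.
my $top = tempdir( CLEANUP => 1 );
for my $name ( 'a', 'b' ) {
    mkdir "$top/$name" or die "mkdir $top/$name: $!";
}
printf "nlink=%d subdirs=%d\n", link_count($top), count_subdirs($top);
```

If the printed link count is not the subdirectory count plus two, the "nlink" shortcut cannot be trusted on that filesystem.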
      Your theory sounds pretty good, and this code
      if ($nlink == 2 && !$avoid_nlink) { # This dir has no subdirectories.
      seems to back that up.

      So what does one do to go about getting the pod patched to mention NFS and Samba as well as AFS?

      xoxo,
      Andy
      --
      <megaphone> Throw down the gun and tiara and come out of the float! </megaphone>

        Don't just patch the POD. Patch the module! See RE: RE: File::Find. I'm not aware of a single modern system where having the "nlink" trick enabled by default makes sense. Any system that supports CD-ROM drives should not have it turned on by default, for example. And it is all Win32 file systems, not just remote Samba file systems, that are incompatible with it. Most Unix systems support quite a few file system types that don't follow the "nlink" rule.

        Of course, I and others have tried to patch this multiple times and none of the patches have ever been accepted. I think File::Find just has too much bad history behind it and what we really need is a replacement that

        • doesn't require the use of a callback
        • never tries the "nlink" trick
        • (and so) always stats each file just before transferring control to the user's code, so that the user can just stat _ (or -f _, etc.) to make most uses run faster (yes, that's right, the "nlink" trick makes certain trivial uses of File::Find run quite a bit faster, but it also forces non-trivial traversals to be slower!).

        File::Recurse might be such a replacement but I'm worried that it isn't being maintained. So I'd rather people put work into something like that (and getting it bundled with the base Perl distribution) than banging their head against the brick wall that is the "nlink" trick in File::Find.
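The three wishes above can be sketched in a few lines. This is not File::Recurse and not any existing module: `make_walker` is a hypothetical name, and the sketch assumes a breadth-first order and that symbolic links are not followed. It returns an iterator instead of taking a callback, never looks at the link count, and runs lstat on each path just before handing it back, so the caller can test `_` without another system call.

```perl
use strict;
use warnings;

# Hypothetical iterator constructor: takes starting paths, returns a closure
# that yields one path per call and undef when the traversal is finished.
sub make_walker {
    my @queue = @_;
    return sub {
        while (@queue) {
            my $path = shift @queue;
            lstat $path or next;    # fills the special "_" stat buffer
            if ( -d _ ) {
                # Queue the children; opendir/readdir leave "_" untouched.
                if ( opendir( my $dh, $path ) ) {
                    push @queue, map { "$path/$_" }
                        grep { $_ ne '.' && $_ ne '..' } readdir $dh;
                    closedir $dh;
                }
            }
            return $path;           # "_" still describes $path here
        }
        return;
    };
}

# Usage: print only plain files, reusing the stat done by the walker.
if (@ARGV) {
    my $next = make_walker(@ARGV);
    while ( defined( my $path = $next->() ) ) {
        print "$path\n" if -f _;
    }
}
```

Because lstat is used, a symlink to a directory is reported but not entered, which also keeps the sketch from looping on link cycles.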

                - tye (but my friends call me "Tye")
Re: File::Find bummers on an NFS volume.
by grinder (Bishop) on Jul 31, 2001 at 20:55 UTC
    The same thing will occur if you try to use File::Find on Win32 upon a Samba-mounted drive. Reading the source is pretty instructive in this case. In a nutshell, nlink is the reference count of the inode.

    When considering directories, File::Find stats the current directory ('.') to find out how many references the inode has. If it has two, it may be inferred that the inode is linked to only by its own '.' entry and by its name in the parent directory. By extension, it has no subdirectories (each subdirectory's '..' entry would add one more link), so the search does not go any deeper.

    This turns out to be a very useful speed optimisation, especially when finding depth first, because you don't have to go into the leaf directory, stat every directory entry, just to learn that there are no more subdirectories.
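The shortcut can be illustrated with a much-simplified walker. This is a sketch of the idea only, not File::Find's actual code: `looks_like_leaf` and `walk` are hypothetical names, and the second argument plays the role of `$File::Find::dont_use_nlink`.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of File::Find): true when the heuristic is
# enabled and the link count suggests the directory has no subdirectories.
sub looks_like_leaf {
    my ( $nlink, $dont_use_nlink ) = @_;
    return ( !$dont_use_nlink && $nlink == 2 ) ? 1 : 0;
}

# Simplified walker: returns every path found below $dir.
sub walk {
    my ( $dir, $dont_use_nlink ) = @_;
    my $nlink = ( stat $dir )[3];
    my $leaf = looks_like_leaf( defined $nlink ? $nlink : 0, $dont_use_nlink );

    opendir( my $dh, $dir ) or return;
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    my @found;
    for my $entry (@entries) {
        my $path = "$dir/$entry";
        push @found, $path;
        next if $leaf;    # link count trusted: no per-entry stat, no descent
        push @found, walk( $path, $dont_use_nlink )
            if -d $path && !-l $path;
    }
    return @found;
}
```

When the link count is honest, the `next if $leaf` line saves one stat per entry in every leaf directory. When the filesystem reports 2 for a directory that does have subdirectories, the same line silently drops everything beneath them, which matches the behaviour described in the question.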

    Unfortunately, not all filesystems behave this way, notably the Andrew File System (not that I have ever knowingly worked with one) and especially Samba-mounted drives. And, as you say, on NFS shares.

    The code can't determine by itself (or perhaps it would be too expensive to deduce) whether the number of inode links is reliable, which is why you sometimes have to give it a manual hint by way of

      $File::Find::dont_use_nlink = 1;
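Put together with the test script from the question, the hint might look like the sketch below. `list_tree` is a hypothetical wrapper added here only so the traversal can be reused, and the root directory is taken from the command line rather than hard-coded.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Do not trust directory link counts; must be set before calling find().
$File::Find::dont_use_nlink = 1;

# Hypothetical wrapper: collect every name File::Find visits below $root.
sub list_tree {
    my ($root) = @_;
    my @names;
    find( sub { push @names, $File::Find::name }, $root );
    return @names;
}

# Usage: perl list_tree.pl /mnt/morkcd
if (@ARGV) {
    print "$_\n" for list_tree( $ARGV[0] );
}
```

The variable is a package global, so setting it once affects every later `find` call in the program.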

    --

    g r i n d e r
