PerlMonks  

Restrict file search within current filessystem using wanted subroutine

by madparu (Initiate)
on Apr 28, 2016 at 11:41 UTC ( [id://1161756]=perlquestion )

madparu has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I found the following code on PerlMonks to find the 10 largest files in a filesystem. However, it also descends into subdirectories that live on different filesystems. For example, I have separate /var and /var/log filesystems. My requirement is to find the top 10 files under /var, but the code searches through /var/log too. Is there a way to restrict the search to the current filesystem alone in this script, using File::Find and the wanted subroutine?

use File::Find;

File::Find::find(
    {
        wanted => sub {
            return unless -f;
            my $s = -s _;
            return if $min < $s && @z > 10;
            push @z, [ $File::Find::name, $s ];
            @z = sort { $b->[1] <=> $a->[1] } @z;
            pop @z if @z > 10;
            $min = $z[-1]->[1];
        }
    },
    shift || '.'
);

for (@z) {
    print $_->[0], " ", $_->[1], "\n";
}

Replies are listed 'Best First'.
Re: Restrict file search within current filessystem using wanted subroutine
by haukex (Archbishop) on Apr 28, 2016 at 12:15 UTC

    Hi madparu,

    There is a little-known feature of File::Find from find2perl, the variable $File::Find::topdev, which holds the device number of the path currently being searched under, which you can compare with the device number of the current file (stat). Stick the following at the top of your wanted function, and your search will be limited to the filesystems of the paths which you tell File::Find::find to search under.

    if ( $File::Find::topdev != (stat)[0] ) { $File::Find::prune = 1; return }

    Update: Fixed stat vs. stat(_), apologies. My test code was doing a stat beforehand, so stat(_) worked fine. But if the piece of code above is the very first thing in your wanted function, you should stat $_ first, and only then use the special filehandle _. (The exception is when the follow option is set, then File::Find guarantees that an lstat has been called.)
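The difference between stat and the special filehandle _ described in the update can be illustrated with a small standalone sketch. The helper name size_if_plain_file is hypothetical and not part of File::Find; the point is only that an explicit stat on the path has to come first, after which -f _ and -s _ reuse the cached result instead of hitting the filesystem again.

```perl
use strict;
use warnings;

# Hypothetical helper: return the size of $path if it is a plain file,
# otherwise return nothing.
sub size_if_plain_file {
    my ($path) = @_;

    # The explicit stat fills the cached stat buffer behind "_".
    my @st = stat $path;
    return unless @st;      # stat failed, e.g. a dangling symlink

    return unless -f _;     # reuses the buffer, no second system call
    return -s _;            # same buffer again
}

# Report the size of each path given on the command line.
for my $path (@ARGV) {
    my $size = size_if_plain_file($path);
    print "$path ", ( defined $size ? $size : 'not a plain file' ), "\n";
}
```

Calling -f _ without any preceding stat, lstat or file test would read whatever happens to be left in the buffer from an earlier call, which is the mistake the update corrects.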

    And by the way, Use strict and warnings!

    Hope this helps,
    -- Hauke D

      Hello Hauke,

      A little confused... Can you please update the code I posted with your recommended piece of code to understand better.

      Thanks!

        Hi madparu,

        Copy the two lines of code I posted above and paste them into your script between the lines "wanted => sub {" and "return unless -f;". If you have any problems, feel free to ask (and post your code along with any error messages you may be getting).

        Simply put, what the two lines do is compare the device number (see stat) of the path you are searching under to the device number of the file currently being inspected. If they don't match, meaning the file currently being inspected is on a different file system than where the search began, then the code tells File::Find not to descend into that directory by setting $File::Find::prune and to not further process the file (return).
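Put together, the merged script might look roughly like the sketch below. It is not a verbatim copy of the original: it adds strict and warnings, wraps the search in a sub named largest_files with a $limit parameter (both names are made up for this illustration), and rewrites the early-exit test so that it actually skips files too small to make the list. The device check itself is the one from this thread, comparing $File::Find::topdev against the device number from stat.

```perl
use strict;
use warnings;
use File::Find;

# Return the $limit largest regular files under $top as [ path, size ]
# pairs, biggest first, without leaving the filesystem $top is on.
sub largest_files {
    my ( $top, $limit ) = @_;
    my @largest;

    find(
        {
            wanted => sub {
                # stat the current entry; element 0 is the device number.
                my $dev = ( stat $_ )[0];
                return unless defined $dev;

                # Different device than the starting point: do not
                # descend into it and do not count it.
                if ( $dev != $File::Find::topdev ) {
                    $File::Find::prune = 1;
                    return;
                }

                return unless -f _;    # reuse the stat result
                my $size = -s _;

                # Skip files that cannot make the list once it is full.
                return if @largest >= $limit && $size <= $largest[-1][1];

                @largest = sort { $b->[1] <=> $a->[1] }
                    @largest, [ $File::Find::name, $size ];
                pop @largest if @largest > $limit;
            },
        },
        $top
    );

    return @largest;
}

# Print the ten largest files for each directory given on the command line.
for my $top (@ARGV) {
    print "$_->[0] $_->[1]\n" for largest_files( $top, 10 );
}
```

Run it with the mount point as the argument, for example /var. Entries under /var/log should then be pruned as long as /var/log really is a separate mount with its own device number.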

        Regards,
        -- Hauke D

Re: Restrict file search within current filessystem using wanted subroutine
by GotToBTru (Prior) on Apr 28, 2016 at 11:51 UTC

    Simplest way - return if in a directory tree you don't want.

    use File::Find;

    File::Find::find(
        {
            wanted => sub {
                return unless -f;
                return if ($File::Find::dir =~ /^\/var\/log/);
                my $s = -s _;
                return if $min < $s && @z > 10;
                push @z, [ $File::Find::name, $s ];
                @z = sort { $b->[1] <=> $a->[1] } @z;
                pop @z if @z > 10;
                $min = $z[-1]->[1];
            }
        },
        shift || '.'
    );

    for (@z) {
        print $_->[0], " ", $_->[1], "\n";
    }

    Update: a preprocess subroutine seems to be the canonical way to exclude files or directories. But see this post for a caveat I learned about the hard way.
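A preprocess-based version could be sketched as follows. This is only an illustration under a few assumptions: the sub name files_on_same_device is made up, no_chdir is switched on so that the paths built from $File::Find::dir resolve from the original working directory, and the filter compares device numbers rather than matching a hard-coded /var/log pattern. Note that File::Find ignores preprocess when follow or follow_fast is set.

```perl
use strict;
use warnings;
use File::Find;

# Collect [ path, size ] for every regular file under $top, never
# entering a directory that sits on a different device than $top.
sub files_on_same_device {
    my ($top) = @_;
    my $top_dev = ( stat $top )[0];
    die "cannot stat $top: $!" unless defined $top_dev;
    my @files;

    find(
        {
            no_chdir   => 1,    # keep the cwd fixed so built paths resolve
            preprocess => sub {
                # Called once per directory with the names from readdir;
                # whatever is returned is what File::Find goes on to visit.
                return grep {
                    my $dev = ( lstat "$File::Find::dir/$_" )[0];
                    defined $dev && $dev == $top_dev;
                } @_;
            },
            wanted => sub {
                return unless -f $_;    # with no_chdir, $_ is the full path
                push @files, [ $_, -s _ ];
            },
        },
        $top
    );

    return @files;
}

# List every file found for each directory given on the command line.
for my $top (@ARGV) {
    print "$_->[0] $_->[1]\n" for files_on_same_device($top);
}
```

Because entries on another device are dropped before wanted is ever called, wanted needs no pruning logic of its own. The top 10 selection from the original script can then be applied to the returned list.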

    But God demonstrates His own love toward us, in that while we were yet sinners, Christ died for us. Romans 5:8 (NASB)
