Re: Fastest way to recurse through VERY LARGE directory tree

by JavaFan (Canon)
on Jan 21, 2011 at 03:08 UTC

in reply to Fastest way to recurse through VERY LARGE directory tree


It may very well be that whatever you come up with is only faster than File::Find on some combinations of disks, volume managers and filesystems, and slower on others.

You need to benchmark to find out.

Now, in theory, carefully handcrafting something that does exactly what you need is going to be faster than a more general tool like File::Find. But whether the difference is actually measurable is a different question.

So, benchmark.

Of course, as you describe the problem, the bottleneck might very well be your I/O.

Hence, benchmark.

Have I said you should benchmark? No? Well, benchmark!
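For what it's worth, here is a minimal sketch of what such a benchmark could look like, using only core modules. The throwaway tree, the file counts and the two helper subs (count_with_file_find and count_by_hand) are made up for illustration; they are not from the original question. Point $root at your real data to get numbers that mean something.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Find ();
use File::Spec;
use File::Temp qw(tempdir);

# Build a small throwaway tree (5 directories x 20 files) so the sketch
# runs anywhere. This is illustrative data only.
my $root = tempdir(CLEANUP => 1);
for my $d (1 .. 5) {
    my $dir = File::Spec->catdir($root, "dir$d");
    mkdir $dir or die "mkdir $dir: $!";
    for my $f (1 .. 20) {
        my $path = File::Spec->catfile($dir, "file$f.txt");
        open my $fh, '>', $path or die "open $path: $!";
        close $fh;
    }
}

# Candidate 1: File::Find, counting plain files.
sub count_with_file_find {
    my ($top) = @_;
    my $count = 0;
    File::Find::find(sub { $count++ if -f $_ }, $top);
    return $count;
}

# Candidate 2: a handcrafted walk with an explicit work list.
# Symlinked directories are not descended into.
sub count_by_hand {
    my ($top) = @_;
    my $count = 0;
    my @todo  = ($top);
    while (defined(my $dir = shift @todo)) {
        opendir my $dh, $dir or next;    # skip unreadable directories
        while (defined(my $entry = readdir $dh)) {
            next if $entry eq '.' or $entry eq '..';
            my $path = File::Spec->catfile($dir, $entry);
            if    (-d $path && !-l $path) { push @todo, $path }
            elsif (-f $path)              { $count++ }
        }
        closedir $dh;
    }
    return $count;
}

# Make sure both candidates agree before comparing their speed.
my $expected = count_with_file_find($root);
die "walkers disagree" unless count_by_hand($root) == $expected;

# Run each candidate for at least one CPU second and print a comparison.
cmpthese(-1, {
    'File::Find' => sub { count_with_file_find($root) },
    'by hand'    => sub { count_by_hand($root) },
});
```

Keep in mind that repeated runs over the same tree are mostly served from the operating system's cache, so a benchmark like this mainly compares CPU overhead. If I/O is your bottleneck, you also need to time a cold run on the real disks.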

Re^2: Fastest way to recurse through VERY LARGE directory tree
by Anonymous Monk on Feb 08, 2018 at 09:00 UTC
    Currently, the best way to scan a very big directory tree in a lightweight manner is to use the File::Find::Object::Rule library.

    With this library you can create a safe iterator object that does not load the whole scanned tree into memory before it starts to work. Using it in iterator mode is very simple:

    use File::Find::Object::Rule;

    my $rule = File::Find::Object::Rule->new();
    # Placeholder filter names; read the library's examples for the real ones.
    $rule->Some_filter_method_read_library_examples(parameters)->eventually_next_filter();
    # Initialize the iterator here. Don't panic, it will not load the whole big directory structure.
    $rule->start(path_or_array_of_paths);
    while (1) {
        # Read one single item. I prefer to do it here rather than in the while
        # condition, so that a name that evaluates as false cannot break the loop.
        my $item = $rule->match();
        last unless defined $item;    # stop looping after the last element
        # Do anything with $item here; it is a path. For example:
        printf "Fetched [%s]\n", $item;
        if (-l $item) { print "it is symbolic link\n" }
    }
    You can leave this loop in any state and, for example, start the next scan by calling $rule->start(@new_searches). The iterator will be reinitialized; for me it works. Of course, in that situation you'll use the same filters as before. If you want different filters, call .....->new() and $rule->some_filters() again.

    Warning: this is a fork of File::Find::Rule and File::Find, currently unmaintained for a long time, according to a notice I found on metacpan.
