Re^2: Optimizing performance for script to traverse on filesystem

I guess that I'm the "devil's advocate" to the "devil's advocate"?

re: File::Find - I think that we could cooperate and possibly increase internal performance (I'm game for that), but the interface is "spot on" - it works!

My suggested modifications to the OP's code represent a massive simplification of program logic.

There is only one file system operation that happens per $File::Find::name. Maybe File::Find does some more "under the covers"? I'm not sure what you are proposing... But basically, I see no problem with code that makes a single decision based upon a single input.
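A minimal sketch of what "a single decision based upon a single input" can look like with File::Find. The sub name plain_files and the choice of -f as the one test are assumptions for illustration; the OP's actual code is not reproduced here.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return the full names of all plain files under $top.
sub plain_files {
    my ($top) = @_;
    my @files;
    find(
        sub {
            # Exactly one explicit file test per entry. File::Find may make
            # its own lstat calls internally; this sketch does not control that.
            push @files, $File::Find::name if -f $_;
        },
        $top,
    );
    return @files;
}

# Only traverse when a starting directory is given on the command line.
if (@ARGV) {
    print "$_\n" for plain_files($ARGV[0]);
}
```

Inside the callback, $_ holds the entry's base name (File::Find changes into each directory by default), while $File::Find::name holds the full path, so the test uses $_ and the result stores $File::Find::name.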

I'm game to increase the performance of File::Find - are you willing to help me do it?
I think that will be a pretty hard undertaking.
I'm not sure that it is even possible.
But if it is, let's go for it!


Re^3: Optimizing performance for script to traverse on filesystem
by graff (Chancellor) on Feb 03, 2012 at 04:34 UTC

Re^2: Get useful info about a directory tree

(I've seen enough benchmark discussions here at the monastery to know that a proper benchmark can be an elusive creature.)

If that benchmark happens to be a valid comparison of the two approaches, it would also be a good exercise for a debugger or profiler session, to see what's causing the difference.
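A minimal sketch of how such a comparison could be set up with the core Benchmark module. It assumes the two approaches are File::Find and a hand-rolled opendir/readdir recursion; that pairing, the sub names, and the 3-second run time are assumptions for illustration, not taken from the linked node.

```perl
use strict;
use warnings;
use File::Find;
use Benchmark qw(cmpthese);

# Approach A (hypothetical name): count plain files via File::Find.
sub count_with_find {
    my ($top) = @_;
    my $n = 0;
    find(
        sub {
            return if -l $_;    # lstat; skip symlinks
            $n++ if -f _;       # reuse the lstat result from the line above
        },
        $top,
    );
    return $n;
}

# Approach B (hypothetical name): hand-rolled opendir/readdir recursion.
# Symlinks are skipped here too, so both subs count the same set of files.
sub count_with_readdir {
    my ($dir) = @_;
    my $n = 0;
    opendir(my $dh, $dir) or return 0;    # unreadable directory counts as empty
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;
    for my $entry (@entries) {
        my $path = "$dir/$entry";
        next if -l $path;                 # lstat; skip symlinks
        if    (-d _) { $n += count_with_readdir($path) }
        elsif (-f _) { $n++ }
    }
    return $n;
}

# Only run the timing when a starting directory is given.
if (@ARGV) {
    my $top = $ARGV[0];
    # A negative count asks Benchmark to run each sub for at least
    # that many CPU seconds.
    cmpthese(-3, {
        'File::Find' => sub { count_with_find($top) },
        'readdir'    => sub { count_with_readdir($top) },
    });
}
```

Checking that both subs return the same count on the same tree is a cheap sanity test before trusting any timing difference, and the first run over a tree may be slower than later ones because of filesystem caching. For the profiler session, Devel::NYTProf from CPAN is one option.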

In any case, I definitely don't want to dissuade people from using File::Find or its various derivatives and convenience wrappers -- they do make for much easier solutions to the basic problem, and in the vast majority of cases, a little extra run time is a complete non-issue. (It's just that I've had to face a few edge cases where improving run time when traversing insanely large directories made a big difference.)
