PerlMonks  

Re: Re: Re: Re: Re^4 Useful addition to Perl?

by etcshadow (Priest)
on Mar 08, 2004 at 06:32 UTC ( #334719=note )


in reply to Re: Re: Re: Re^4 Useful addition to Perl?
in thread Useful addition to Perl?

Fair criticisms. The particular point that the posted code doesn't handle symlinks is valid, and will be fixed before I submit this. The meta-point is also very true: other issues like this one will inevitably come up, and effort will be wasted fixing them here while the same issues get fixed in parallel over time in File::Find.

The issue, though, from my standpoint, is that File::Find doesn't offer any means (at least as far as I can think of... please tell me if I'm wrong) to turn its use inside-out... that is, if you will, to ask File::Find for a file, rather than be told by File::Find that there is a file.

Ideally, I'd be able to use File::Find::Iterator, but to look at it, it also doesn't reuse File::Find... it's just yet another implementation of directory-tree traversal, and so using it would bring in all the same issues of duplicating the work of File::Find (not me duplicating the work, but the maintainers of File::Find::Iterator). Also, File::Find::Iterator appears to be not very complete (version .3), and not file-system-portable (it assumes a directory-separator, and it, also, doesn't handle symlinks).

In a truly ideal world, File::Find would offer some kind of interface that allowed it to operate in this manner, but I just don't see it / can't think of how to do it. Sadly, it would be easy to build File::Find's interface out of a thin wrapper around File::Find::Iterator's, but not vice-versa, and File::Find is the one that (currently) works :-(

So, I suppose I can reframe the question as: is there a way to build an iterator-like interface out of an event-generator interface? Even doing ugly stuff with goto's, I can't think of how to get past the fact that I'd have to be leapfrogging backward and forward over a couple of stack-frames (or, more precisely, saving a couple of stack-frames off to the side, and then deleting them... then restoring them back later).
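
One possible way around the stack-frame problem on a Unix-like system, offered as a sketch rather than a proven solution: let a child process own File::Find's stack frames and pull the names back through a pipe one at a time. The helper name find_iterator is made up for illustration, and the code assumes the forking form of open ('-|') is available on the platform.

```perl
use strict;
use warnings;
use File::Find;
use POSIX ();

# Hypothetical helper: returns a closure that yields one file path per call
# and undef once the traversal is exhausted.
sub find_iterator {
    my @roots = @_;

    # Forking open: the child's STDOUT is connected to $fh in the parent.
    my $pid = open(my $fh, '-|');
    die "cannot fork: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: run the ordinary callback-style traversal and write each
        # file name, NUL-terminated, down the pipe.
        $| = 1;
        find({ wanted   => sub { print "$_\0" unless -d $_ },
               no_chdir => 1 }, @roots);
        POSIX::_exit(0);    # skip END blocks inherited from the parent
    }

    my $done = 0;
    return sub {
        return if $done;
        local $/ = "\0";    # read one NUL-terminated record
        my $name = <$fh>;
        if (!defined $name) {
            $done = 1;
            close $fh;      # also waits for the child
            return;
        }
        chomp $name;
        return $name;
    };
}
```

Each call to the closure reads one name, so the caller asks for a file instead of being told about it. The child blocks once the pipe buffer fills, which means the whole tree is never held in memory, at the cost of an extra process.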

What would make me really want to use File::Find for this is if the maintainers of File::Find decided to flip things around a little bit so that File::Find was just a thin wrapper around an underlying iterator class that did the real work... and then I could piggy-back off of that same underlying iterator.
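
To make that concrete, here is a rough sketch of the shape I mean. It is not File::Find's actual internals: dir_iterator and find_with_callback are hypothetical names, the walker only yields non-directories, and it does not follow symlinks, so it is nowhere near a File::Find replacement.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical pull-style walker: each call returns the next non-directory
# path under the given roots, or undef when nothing is left.
sub dir_iterator {
    my @pending = @_;    # explicit stack replaces the recursion
    return sub {
        while (@pending) {
            my $path = shift @pending;
            if (-d $path && !-l $path) {
                # Real directory: queue its children and keep looking.
                opendir(my $dh, $path) or next;
                my @kids = map { File::Spec->catfile($path, $_) }
                           File::Spec->no_upwards(readdir $dh);
                closedir $dh;
                unshift @pending, sort @kids;
                next;
            }
            return $path;    # plain file, or a symlink (returned, not followed)
        }
        return;
    };
}

# The callback interface is then just a thin loop over the iterator.
sub find_with_callback {
    my ($wanted, @roots) = @_;
    my $next = dir_iterator(@roots);
    while (defined(my $path = $next->())) {
        $wanted->($path);
    }
}
```

Going in this direction is a few lines; going the other way, from callbacks to an iterator, is the hard part.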

I don't mean to sound closed-minded... I'm not. I'm just trying to figure out a solution to a problem with certain constraints, and to the best I've been able to figure out so far, File::Find won't work within those constraints. I would actually love to figure out how to use File::Find for this, but to the best of my knowledge, it won't. I'd love to hear suggestions about how to fit File::Find into this problem, without violating the two primary constraints that:

  • @ARGV not be blown up to include the entire file tree
  • perl -ne '...' (and any similar looping over <>) works in a completely DWIM fashion.
Thanks for any ideas.
------------ :Wq Not an editor command: Wq

Re: Re: Re: Re: Re: Re^4 Useful addition to Perl?
by demerphq (Chancellor) on Mar 08, 2004 at 11:23 UTC

    The issue, though, from my standpoint, is that File::Find doesn't offer any means (at least as far as I can think of... please tell me if I'm wrong) to turn its use inside-out... that is, if you will, to ask File::Find for a file, rather than be told by File::Find that there is a file.

    Well, to me what you are doing is transforming directories into a list of files, right? So the code would be:

    use File::Find;

    # Collect every non-directory entry found under $dir.
    sub recurse_dir {
        my $dir = shift;
        my @files;
        find({ wanted => sub { push @files, $_ unless -d $_ }, no_chdir => 1 }, $dir);
        return @files;
    }

    Which then makes your code become:

    my @argv=map { -d $_ ? recurse_dir($_) : $_ } @ARGV;

    Note that this code replaces, FWICT, your entire traversal code. It doesn't repeatedly stat files it already has, it is robust and portable, it could obviously be inlined further, and it provides a whole host of filtering and hooks with low effort. All you have to do is wrap your tie logic around it and presto...

    Also, I bypassed the point about not putting the entire tree into the array. I suspect you will find that, in order to prevent circular directory structures blowing you out of the water, you are going to have to store all the visited directories, which essentially means holding the whole lot in memory. Essentially I don't see this as a particularly good idea.
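
    As a rough illustration of the bookkeeping involved (a sketch only, assuming a filesystem where stat reports meaningful device and inode numbers, which is not true everywhere): a guard that remembers each directory it has seen. The name make_dir_guard is hypothetical.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: returns a closure that is true the first time it
    # sees a given directory and false on every later visit, even when the
    # directory is reached through a different path or a symlink.
    sub make_dir_guard {
        my %seen;
        return sub {
            my ($dir) = @_;
            my @st = stat $dir;           # stat follows symlinks
            return 0 unless @st;          # missing or unreadable: skip it
            my $key = "$st[0]:$st[1]";    # device number and inode number
            return $seen{$key}++ ? 0 : 1;
        };
    }

    my $first_visit = make_dir_guard();
    # A traversal would descend into $dir only when $first_visit->($dir) is true.
    ```

    The hash grows by one key for every directory visited, so it has to live for the whole traversal.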


    ---
    demerphq

      First they ignore you, then they laugh at you, then they fight you, then you win.
      -- Gandhi

