Re: Re: Re: Useful addition to Perl?

Re^4 Useful addition to Perl? by etcshadow (Priest) on Mar 05, 2004 at 05:39 UTC
OK... I bothered to finish it. Or at least get it to a working state (I don't really like just grepping out the "." and ".." directories... it feels so non-portable (even though I know it's cool on windows and nix)).

```perl
package r;

use strict;

tie @ARGV, 'r::Tie::RecursiveARGVArray', @ARGV;

sub import { }

package r::Tie::RecursiveARGVArray;

use Tie::Array;
use base 'Tie::Array';
use File::Spec;

sub TIEARRAY {
    my ($classname,@init) = @_;
    bless [@init], $classname;
}

sub FETCH {
    my ($self, $index) = @_;
    $self->_ReplaceDirs($index,$index);
    $self->[$index];
}

sub FETCHSIZE {
    my ($self) = @_;
    scalar @$self;
}

sub STORE {
    my ($self, $index, $value) = @_;
    $self->[$index] = $value;
}

sub STORESIZE {
    my ($self, $count) = @_;
    $#$self = $count - 1;
}

sub SPLICE {
    my ($self,$offset,$length,@list) = @_;
    $self->_ReplaceDirs($offset,$offset+$length-1);
    splice(@$self,$offset,$length,@list);
}

sub POP {
    my ($self,$item) = @_;
    $self->_ReplaceDirs(-1,-1);
    pop(@$self);
}

sub _ReplaceDirs {
    my ($self, $fromindex, $toindex) = @_;

    # as long as the index range contains directories, substitute the directory contents
    my $recursionguard = 0;
    while (my @indices = grep { -d $self->[$_] } ($fromindex..$toindex) and $recursionguard++ < 10000) {
        my $index = $indices[0];

        opendir DIR, $self->[$index] or do {
            warn "Cannot traverse directory $self->[$index]: $!\n";
            splice(@$self, $index, 1, ());    # remove the bad-apple
            next;
        };
        my @contents = readdir DIR or do {
            warn "Cannot read directory $self->[$index]: $!\n";
            splice(@$self, $index, 1, ());    # remove the bad-apple
            closedir DIR or warn "Cannot close directory $self->[$index] (weird): $!\n";
            next;
        };
        closedir DIR or warn "Cannot close directory $self->[$index] (weird): $!\n";

        # if there is any portable way to do this... I'd like to hear it!
        @contents = grep !/^\.{1,2}$/, @contents;

        # convert directory contents to paths by prepending the directory.
        # even be super nice about using catfile or catdir, appropriately
        @contents = map {
            my $asfile = File::Spec->catfile( $self->[$index], $_ );
            -f $asfile ? $asfile : File::Spec->catdir( $self->[$index], $_ );
        } @contents;

        # replace directory with its contents
        splice(@$self, $index, 1, @contents);
    }
}

1;
```

complete with example use:

```
[me@host]$ cat `find d -type f` | wc -l
58040
[me@host]$ perl -mr -lne '$x++; END{print $x}' d*
58040
[me@host]$
```

I guess now I should pod this up and make it my first contribution to cpan :-D

`------------ :Wq Not an editor command: Wq`
Re: Re^4 Useful addition to Perl? by demerphq (Chancellor) on Mar 06, 2004 at 13:53 UTC
I think the only problem with all of this is that you aren't using it as a wrapper to File::Find. You've got a good idea here, but hand rolling a directory traversal is not, in my opinion, smart. Also the way that you do it worries me a touch. It's an interesting implementation of a depth-first traversal, but surely it's quite inefficient? Aren't you repeatedly doing file system checks over the same objects?

I think you should rewrite this as an alternate interface to File::Find, which would get you better portability and a whole host of hooks and options to add.

Overall it's a good idea though. And I'd go with calling it something long and giving it a flexible import() interface. For instance: `use File::Find::ARGV filter=>sub { /\.txt/i }; while (<>){ ... }`

Anyway, it's an interesting idea. ++ to you.

--- demerphq
First they ignore you, then they laugh at you, then they fight you, then you win. -- Gandhi
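For illustration only, here is a rough sketch of what a File::Find-backed expansion with a filter might look like. It is not code from anyone in this thread: `expand_argv` is a hypothetical helper name, and the choice to let non-directory arguments bypass the filter is an assumption. It relies only on `File::Find::find` and its documented `wanted` and `no_chdir` options.

```perl
use strict;
use warnings;
use File::Find ();

# Hypothetical helper (not part of File::Find): replace every directory in the
# argument list with the plain files beneath it that the filter accepts.
sub expand_argv {
    my ($filter, @args) = @_;
    my @files;
    for my $arg (@args) {
        if (-d $arg) {
            File::Find::find({
                no_chdir => 1,    # $_ is the full path; find() does not chdir
                wanted   => sub { push @files, $_ if -f $_ && $filter->($_) },
            }, $arg);
        }
        else {
            push @files, $arg;    # assumption: non-directories skip the filter
        }
    }
    return @files;
}

# Usage sketch:
# @ARGV = expand_argv(sub { $_[0] =~ /\.txt\z/i }, @ARGV);
# while (<>) { ... }
```

Note that this builds the complete file list before the first line is read, and that File::Find makes no promise about the order in which entries arrive.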
Re: Re: Re^4 Useful addition to Perl? by etcshadow (Priest) on Mar 07, 2004 at 21:47 UTC
Well, the problem, as I see it, with writing this as a wrapper for File::Find is that that would be suboptimal for the most important use case, and that is perl one-liners (-pe and -ne). Also, for that matter, what this does and what File::Find does really only partially overlap, in that they both traverse directories... but that's about the end of it. The ultimate intent of this is to DWIM when I say `perl -mr -ne 'print if /foo/' *`, and to not do anything silly in the process, like creating a list of every file on the file-system.

Maybe I'm wrong, here, but I think that this is an important enough goal (both to do and to do well), that it outweighs the importance of reusing File::Find. Granted, I'm not saying that reuse shouldn't be involved... I sure as heck wouldn't want to reimplement File::Spec.

Really, what it comes down to is that File::Find implements a "push" interface from the file-system... that is, File::Find pushes file names into your code (because you give it a code-ref as an entry-point for your code). The thing is, though, that `perl -ne` or `perl -pe` would need a "pull" interface. That is, they translate to `while (<>) { ... }`. Which, itself, is essentially: `while (@ARGV) { $ARGV = shift @ARGV; open ARGV, $ARGV or warn("Couldn't open $ARGV: $!\n"), next; while (<ARGV>) { ... } }`

Now, to look at that code, you can see that it is definitely trying to pull filenames out of @ARGV... so the easiest way to implement an interface on that is to tie a behavior to reading from @ARGV... which is exactly what I've done. Now, it's true that I could make this pulling from @ARGV use File::Find as the behavior which underlies the read-event... but if I did that, then I'd end up reading in the whole file-system tree (or the whole sub-tree that is being accessed)... and if there's no good reason to do it that way, then I'd rather not.

Granted, if File::Find offered a means to essentially say "depth => 1" (that is, give me all the contents of this directory, but don't traverse subdirectories), then that might be worthwhile... as it would save the effort of opendir; readdir; closedir; grep; fix-file-names.... but that's just not what File::Find does. Moreover, I've never been happy with the fact that File::Find actually chdir's into the directory as it goes... that's just ugly. It should use File::Spec to prepend the leading path... but I digress.

Anyway, I hope that explains why I didn't want to use File::Find for this. I did give it serious consideration... but ultimately, I think that the method I arrived at in the end is the best one that I considered. It is simple, elegant, efficient, and useful. And doing it with File::Find just couldn't make it be all of those at once.

`------------ :Wq Not an editor command: Wq`
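The "pull" idea can be shown in isolation, without a tie. What follows is a minimal sketch under stated assumptions, not the module posted in this thread: `make_file_iterator` is a hypothetical name, entries are sorted only to make the order predictable, and `catfile` is used for every entry for brevity. It uses `File::Spec->no_upwards` to drop "." and "..".

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: returns a closure that yields one path per call and
# expands a directory only when it reaches the front of the queue.
sub make_file_iterator {
    my @queue = @_;
    return sub {
        while (@queue) {
            my $path = shift @queue;
            return $path unless -d $path;    # anything that is not a directory is handed back as-is

            opendir my $dh, $path or do {
                warn "Cannot traverse directory $path: $!\n";
                next;                        # skip the unreadable directory
            };
            # no_upwards strips the entries that refer to the current/parent directory
            my @entries = sort { $a cmp $b } File::Spec->no_upwards(readdir $dh);
            closedir $dh;

            # Put the contents where the directory was: at the front of the queue.
            unshift @queue, map { File::Spec->catfile($path, $_) } @entries;
        }
        return;                              # queue exhausted
    };
}

# Usage sketch:
# my $next = make_file_iterator(@ARGV);
# while (defined(my $file = $next->())) { ... }
```

Each call does at most the directory reads needed to reach the next file, so nothing deeper in the tree is touched until the caller asks for it.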
Re: Re: Re: Re^4 Useful addition to Perl? by demerphq (Chancellor) on Mar 07, 2004 at 22:24 UTC
Re: Re: Re: Re: Re^4 Useful addition to Perl? by etcshadow (Priest) on Mar 08, 2004 at 06:32 UTC
Some notes below your chosen depth have not been shown here
Re: Re^4 Useful addition to Perl? by BrowserUk (Patriarch) on Mar 06, 2004 at 11:12 UTC
Why use (open|read|close)dir and then have to concern yourself with removing '.' & '..' and use File::Spec->catfile/catdir to fix up the pathnames, when glob would take care of (most?) of that for you?

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
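A rough sketch of the glob route, for comparison. `dir_contents` is a hypothetical helper, and using `bsd_glob` from File::Glob rather than the core `glob` is an assumption made here because core `glob` splits its pattern on whitespace.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Hypothetical helper: list a directory's entries as ready-made paths.
sub dir_contents {
    my ($dir) = @_;
    # "*" does not match a leading dot, so "." and ".." never show up,
    # but neither do hidden files. Glob metacharacters in $dir itself
    # (*, ?, [, {, ~) would be interpreted rather than taken literally.
    return bsd_glob("$dir/*");
}

# Usage sketch:
# my @paths = dir_contents('d');
```

The two caveats in the comment are probably what the "(most?)" hedges against: dot-files are silently skipped, and unusual directory names need escaping.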
Re: Re^4 Useful addition to Perl? by tsee (Curate) on Mar 06, 2004 at 10:59 UTC
You really should because I'd be the first to download the module since I've been writing one-liners using File::Find far too often. As you all know, File::Find's interface sucks... If you need any support with packaging the module correctly for CPAN, feel free to contact me via email and I'll try to help.

Steffen

