Re: Streaming to Handles (iterator)

by tye (Sage)
on May 05, 2004 at 23:26 UTC

in reply to Streaming to Handles

You don't need a stream; you want an iterator (yes, similar term). To turn this into an iterator in Perl5, you need to keep your own "stack". That is easy to do with an anonymous array (or two) inside your object.

I put file names that I have yet to output into @{ $self->{files} } and output the next one from there the next time the iterator is called. I put directory names that I have yet to read the list of files from into @{ $self->{dirs} } and when there aren't any more file names to return, I read the next directory.

First, here is how you'd use my iterator:

    #!/usr/bin/perl
    use strict;
    use warnings;
    require List;

    my $f= List->new( @ARGV );
    # Test with defined() so that a file named "0" doesn't end the loop early.
    while( defined( my $file= $f->next() ) ) {
        print "$file\n";
    }

And here is the code that implements it:

    package List;       # Terrible name
    use strict;
    use warnings;
    use Cwd qw( cwd );
    require File::Spec;
    use vars qw( $VERSION );
    $VERSION = '0.99';

    sub new {
        my( $class, $path )= @_;
        # Bless first so that look_in() can be called as a method.
        my $self= bless { }, $class;
        # With no path given, look_in() falls back to the current directory.
        $self->look_in( defined $path ? $path : () );
        return $self;
    }

    sub look_in {
        my( $self, $path )= @_;
        $path= cwd()   unless  @_ > 1;
        $path= File::Spec->canonpath($path);
        $self->{path}= $path;
        $self->{dirs}= [$path];     # directories not yet read
        $self->{files}= [];         # file names not yet returned
    }

    sub next {
        my( $self )= @_;
        while( 1 ) {
            if( @{ $self->{files} } ) {
                my $file = shift @{ $self->{files} };
                if( -d $file ) {
                    # Remember subdirectories so they get read later.
                    push @{ $self->{dirs} }, $file;
                }
                return $file;
            }
            if( ! @{ $self->{dirs} } ) {
                return;     # nothing left: end of iteration
            }
            my $dir= shift @{ $self->{dirs} };
            if( opendir( my $dh, $dir ) ) {
                $self->{files}= [
                    map {
                        File::Spec->catfile( $dir, $_ );
                    } File::Spec->no_upwards( readdir($dh) )
                ];
                closedir $dh;
            } else {
                warn "opendir failed, $dir: $!\n";
            }
        }
    }

    1;

I tested it enough to see that it appears to work just fine.

If you had directories with huge numbers of files directly in them (not in subdirectories), then you might want to make the iterator a bit more complicated such that you don't keep a list of file names and instead return each file name (almost) immediately after you get it back from readdir (but I'm not sure I would recommend that).
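
For what it's worth, here is a rough sketch of that more complicated variant, assuming you keep the open directory handle inside the object instead of a list of file names. The package name LazyList and the hash keys dh and cur are made up for this illustration and are not part of the code above; treat it as a sketch, not a tested replacement.

```perl
package LazyList;   # hypothetical name, for illustration only
use strict;
use warnings;
require File::Spec;

sub new {
    my( $class, $path )= @_;
    $path= File::Spec->curdir()   unless  defined $path;
    # dirs: directories not yet opened; dh/cur: the directory being read.
    return bless { dirs => [ File::Spec->canonpath($path) ] }, $class;
}

sub next {
    my( $self )= @_;
    while( 1 ) {
        if( ! $self->{dh} ) {
            # No directory is open: start on the next pending one, if any.
            return   unless  @{ $self->{dirs} };
            my $dir= shift @{ $self->{dirs} };
            if( ! opendir( my $dh, $dir ) ) {
                warn "opendir failed, $dir: $!\n";
                next;
            } else {
                $self->{dh}= $dh;
                $self->{cur}= $dir;
            }
        }
        my $dh= $self->{dh};
        my $name= readdir( $dh );       # scalar context: one entry per call
        if( ! defined $name ) {
            # This directory is exhausted; close it and try the next one.
            closedir $dh;
            delete $self->{dh};
            next;
        }
        # Skip "." and ".." (or the local equivalents).
        my( $keep )= File::Spec->no_upwards( $name );
        next   unless  defined $keep;
        my $file= File::Spec->catfile( $self->{cur}, $name );
        push @{ $self->{dirs} }, $file   if  -d $file;
        return $file;
    }
}

1;
```

It is used the same way as the list-based version: call next() until it returns undef. The trade-off is that a directory handle stays open between calls, which is part of why the simpler version may still be the better choice.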

- tye        

Replies are listed 'Best First'.
Re: Re: Streaming to Handles (iterator)
by crabbdean (Pilgrim) on May 06, 2004 at 01:25 UTC
    Thanks. I haven't tested this yet, but reading through it, it makes sense and appears to be what I'm looking for. BIG GRINS!! ++ I'll test it in the coming days and post back with my results.

    By the way, the package name "List" was only used for this example although I'm unsure of what to call the module. I'm assuming it will come under the "File::" modules and could call it "list" or "DirList" or something. Do you have any good suggestions for a name?

    Thanks once again. :-)

    The Funkster of Mirth
    Programming these days takes more than a lone avenger with a compiler. - sam
    RFC1149: A Standard for the Transmission of IP Datagrams on Avian Carriers
Re: Re: Streaming to Handles (iterator)
by crabbdean (Pilgrim) on May 07, 2004 at 08:04 UTC
    I've written this into my code and it works perfectly! :-) I'll tweak it a bit and get it working with the other features in my module. But that's exactly what I was after. Big ++ !!!

    I'm just running a benchmark now to compare it against the alternative solution of returning files as arrays.

    I intend to leave both methods in the module so it gives the user the choice of streaming or returning via arrays.
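
    One way to offer both is to build the array form on top of the iterator, so there is only one directory walker to maintain. A minimal sketch, assuming an iterator whose next() returns undef when exhausted; the helper name all_files is hypothetical and not taken from the module:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): drain any object that has a
# next() method returning undef when it runs out of items.
sub all_files {
    my( $iter )= @_;
    my @files;
    # defined() keeps a file named "0" from ending the loop early.
    while( defined( my $file= $iter->next() ) ) {
        push @files, $file;
    }
    return @files;
}
```

    Note that an array method built this way cannot be faster than the iterator it wraps, so a separately written array method may still be wanted if speed matters.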

    Here are the benchmark results:
               Rate stream  array
      stream 76.1/s     --   -14%
      array  88.7/s    17%     --

               Rate stream  array
      stream 79.8/s     --    -5%
      array  83.9/s     5%     --

               Rate stream  array
      stream 72.2/s     --   -10%
      array  80.2/s    11%     --

    As you can see, returning via arrays is faster, by roughly 5% to 17% across these three runs.

    Once again a big thank you!

    The Funkster of Mirth
    Programming these days takes more than a lone avenger with a compiler. - sam
    RFC1149: A Standard for the Transmission of IP Datagrams on Avian Carriers
