Here is a method for traversing a directory tree and processing some or all of the data files in it. It runs fast, uses very little memory, and is a relatively easy framework for handling lots of jobs of this kind. It relies on the standard Unix "find" utility (which has been ported for MS Windows users, of course).
# assume you have a $toppath, which is where the traversal starts
chdir $toppath or die "can't cd to $toppath: $!";
open( FIND, "find . -type d -print0 |" ) or die "can't run find: $!";
# find will traverse downward from the current directory
# (i.e. $toppath), and because of the "-type d" option,
# will only list the paths of directories contained here;
# the "-print0" (thanks, etcshadow) sets a null byte as the
# string terminator for each file name (don't rely on "\n",
# which could be part of a file name).
{
    local $/ = "\0";  # added thanks to etcshadow's reply
    while ( my $dir = <FIND> ) {
        chomp $dir;  # strips the trailing null byte, since $/ is "\0"
        unless ( opendir( DIR, $dir )) {
            warn "$toppath/$dir: opendir failed: $!\n";
            next;
        }
        # explicit defined(), so a file named "0" cannot end the loop early
        while ( defined( my $file = readdir( DIR ))) {
            next if ( -d "$dir/$file" );  # outer while loop will handle all dirs
            # do what needs to be done with data files
            # (note: $/ is still "\0" here, so localize $/ again
            # before reading lines from a file)
        }
        closedir DIR;
        # anything else we need to do regarding this directory
    }
}
close FIND;
Comments:
The nice thing about this approach is that the "find" utility is very good at the recursive descent into subdirectories, and that's all it needs to do. Meanwhile, Perl is very good at reading directory contents and manipulating data files, and this is really easy when you're working with the data files in just one directory at a time. Here, Perl can simply skip over any subdirectories that it sees, because the output from "find" will bring those up for treatment in due course.
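As a rough illustration of that division of labor, the same pattern can be wrapped in a sub that takes a callback. This is only a sketch: the name for_each_data_file and the callback interface are made up for this example and are not part of the snippet above. It assumes a Unix-like system, a "find" that supports -print0, and a starting path that does not begin with "-". It also uses the list form of a piped open, so no shell is involved and the starting path needs no quoting.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original snippet): "find" does the
# recursive descent, Perl handles one directory at a time, and
# $callback->($dir, $file) is called for every non-directory entry.
sub for_each_data_file {
    my ( $toppath, $callback ) = @_;

    # List-form piped open: find is run directly, without a shell.
    open( my $find, '-|', 'find', $toppath, '-type', 'd', '-print0' )
        or die "can't run find: $!";

    my $caller_rs = $/;    # remember the caller's record separator
    local $/ = "\0";       # -print0 ends each path with a null byte

    while ( my $dir = <$find> ) {
        chomp $dir;
        my $dh;
        unless ( opendir( $dh, $dir ) ) {
            warn "$dir: opendir failed: $!\n";
            next;
        }
        while ( defined( my $file = readdir($dh) ) ) {
            next if -d "$dir/$file";    # find lists subdirectories itself
            # Give the callback the caller's $/ so its own file reads
            # are not split on null bytes.
            local $/ = $caller_rs;
            $callback->( $dir, $file );
        }
        closedir $dh;
    }
    close $find or warn "find exited with status $?\n";
    return;
}

# Usage: print the path of every data file below the current directory.
for_each_data_file( '.', sub {
    my ( $dir, $file ) = @_;
    print "$dir/$file\n";
} );
```

Restoring the caller's $/ around the callback is a design choice for this sketch; without it, any line-by-line reading done inside the callback would still be using "\0" as its record separator.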
(update: made minor adjustments to comments in the code, added "closedir"; also wanted to point out that the loop over files could be filtered by using "grep ... readdir(DIR)", etc.)
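A minimal sketch of the "grep ... readdir" idea follows. The sub name matching_files and the .txt pattern are only assumptions for the example. Note that readdir in list context returns every entry of the directory at once, so this trades a little memory for brevity compared with the one-entry-at-a-time while loop.

```perl
use strict;
use warnings;

# Hypothetical helper: return the sorted names of the plain files in
# $dir that match $pattern, using grep to filter readdir's list.
sub matching_files {
    my ( $dir, $pattern ) = @_;
    opendir( my $dh, $dir ) or die "$dir: opendir failed: $!";
    # readdir in list context returns all remaining entries;
    # -f keeps plain files only, so ".", ".." and subdirectories drop out.
    my @files = sort grep { /$pattern/ && -f "$dir/$_" } readdir($dh);
    closedir $dh;
    return @files;
}

# Usage: list the .txt files in the current directory.
print "$_\n" for matching_files( '.', qr/\.txt\z/ );
```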