Contributed by shandor on Jun 21, 2000 at 16:01 UTC

Answer: How do I search a directory tree for files?
contributed by QandAEditors

There are two possibilities, depending on how static the directory contents are, and how willing you are to trade speed against memory.

The first solution searches the whole tree every time it runs. This is the solution to go for if the directories themselves are static but their contents are not. This version is slow, but it doesn't consume much space on the hard disk, because it stores nothing between runs.

    #!/usr/bin/perl -w
    use strict;
    use File::Find;

    my @directories = (".", "/home/mp3");
    my @foundfiles;

    # Here, we collect all .mp3 files below each directory in @directories
    # and put them into @foundfiles
    find( sub { push @foundfiles, $File::Find::name if /\.mp3$/ }, @directories );

    # and output them all
    print join("\n", @foundfiles), "\n";

The second version uses a two-step approach. We compute a list of all (interesting) files in the directory tree once, and save it into a file. To check whether a certain file is in the directory tree, we either load this file into a hash, which gives a really fast lookup when we want to look up more than one file, or we go through the file line by line when we are only looking for a single file. This method obviously only works if the directory contents don't change very often, because our file is not always up to date. The code above serves very well to create the list of interesting files: just redirect its output into a file called index.
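If you would rather not depend on shell redirection, the index can also be written from Perl directly. The following is only a sketch: write_index() is a hypothetical helper name, and the added -f test (skip anything that is not a plain file) is an assumption that goes beyond the first script, which matches on the name alone.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# write_index() is a hypothetical helper, not part of File::Find.
# It writes one full path per line for every plain file whose name
# matches $pattern, and returns the number of paths written.
sub write_index {
    my ($index_path, $pattern, @dirs) = @_;

    open( my $out, '>', $index_path )
        or die "Couldn't write $index_path : $!\n";

    my $count = 0;
    find(
        sub {
            # Inside the callback $_ is the bare file name and
            # $File::Find::name is the path including the start directory.
            return unless -f $_ && /$pattern/;
            print {$out} "$File::Find::name\n";
            $count++;
        },
        @dirs
    );

    close $out or die "Couldn't close $index_path : $!\n";
    return $count;
}

# Directories to scan come from the command line;
# nothing happens if none are given.
write_index( 'index', qr/\.mp3$/, @ARGV ) if @ARGV;
```

Saved under a name of your choice and called with the directories to scan as arguments (for example . and /home/mp3), it produces the same one-path-per-line index that the redirected output of the first script would.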

    #!/usr/bin/perl -w
    use strict;
    use File::Basename;

    my %files;
    my @searchfiles = ("foo", "bar", "xxx");

    open( my $index, '<', 'index' ) or die "Couldn't read index : $!\n";

    # Now we read every filename from our index file and put it in the hash.
    # If we are only checking for one file, we could do the check right here
    # in the loop.
    # We also strip the path from the filename, as we will be searching for files
    # (and if we already knew the path to the file, -e would be faster :) )
    # This method doesn't care for duplicates. If we have two files with the
    # same name, only the last file will be reported.
    while ( <$index> ) {
        chomp;    # without this the hash keys would end in "\n" and never match
        $files{ basename( $_ ) } = $_;
    }
    close $index;

    # And now we check if the filenames are in the hash
    foreach (@searchfiles) {
        print "$files{ $_ }\n" if exists $files{ $_ };
    }
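For the single-file case mentioned above, the hash is not needed at all: we can compare each line as we read it and stop at the first hit. This is a minimal sketch, assuming an index file with one path per line; find_in_index() is a hypothetical helper name, and unlike the hash version it reports the first match rather than the last.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;

# find_in_index() is a hypothetical helper used for illustration.
# It returns the first indexed path whose file name equals $wanted,
# or undef if the name does not occur in the index.
sub find_in_index {
    my ($index_path, $wanted) = @_;

    open( my $index, '<', $index_path )
        or die "Couldn't read $index_path : $!\n";

    my $found;
    while ( my $path = <$index> ) {
        chomp $path;
        if ( basename($path) eq $wanted ) {
            $found = $path;
            last;    # stop reading as soon as we have a match
        }
    }
    close $index;
    return $found;
}

# The file name to look for comes from the command line; the index is
# expected in a file called index in the current directory.
if (@ARGV) {
    my $path = find_in_index( 'index', $ARGV[0] );
    print defined $path ? "$path\n" : "$ARGV[0] not found\n";
}
```

Because the loop exits early, a hit near the top of the index costs almost nothing, while a miss still reads the whole file. For more than a handful of lookups the hash version is the better choice.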
