PerlMonks  

Re: How to read files in all subfolders?

by 2teez (Priest)
on Oct 21, 2012 at 11:22 UTC (#1000205)


in reply to How to read files in all subfolders?

Hi coltman,
To open a directory, use opendir, not open, which is used for files and is associated with filehandles.
The code below uses a recursive call to walk through all the folders in the given directory.

#!/usr/bin/perl
use warnings;
use strict;
use Cwd qw(abs_path);

die "no directory provided " unless defined $ARGV[0];
my $path = abs_path $ARGV[0];
search_all_folder($path);

sub search_all_folder {
    my ($folder) = @_;
    if ( -d $folder ) {
        opendir my $dh, $folder or die "can't open the directory: $!";
        while ( defined( my $file = readdir($dh) ) ) {
            next if $file eq '.' or $file eq '..';

            # Test and open by full path, so no chdir is needed and the
            # checks still work after returning from a subdirectory.
            my $full = "$folder/$file";
            search_all_folder($full);    ## recursive call
            read_files($full) if ( -f $full );
        }
        closedir $dh or die "can't close directory: $!";
    }
}

sub read_files {
    my ($filename) = @_;
    open my $fh, '<', $filename or die "can't open file: $!";
    while (<$fh>) {
        print $_, $/;
    }
}
Hope this helps.
UPDATE:
As others have already said, it is half the work and a lot easier to use a module such as File::Find, like so:
use warnings;
use strict;
use Cwd qw(abs_path);
use File::Find qw(find);

die "no directory provided " unless defined $ARGV[0];
my $path = abs_path $ARGV[0];
find( \&search_all_folder, $path );

sub search_all_folder {
    chomp $_;
    return if $_ eq '.' or $_ eq '..';
    read_files($_) if (-f);
}

sub read_files {
    my ($filename) = @_;
    open my $fh, '<', $filename or die "can't open file: $!";
    while (<$fh>) {
        print $_, $/;
    }
}

If you tell me, I'll forget.
If you show me, I'll remember.
If you involve me, I'll understand.
--- Author unknown to me


Re^2: How to read files in all subfolders?
by Lotus1 (Chaplain) on Oct 21, 2012 at 17:59 UTC

    2teez,

    I noticed a few things that could help the performance of your File::Find solution above.

    In the search sub:

    find( \&search_all_folder, $path );

    sub search_all_folder {
        chomp $_;
        return if $_ eq '.' or $_ eq '..';
        read_files($_) if (-f);
    }
    • The chomp isn't needed here since File::Find changes to the subdirectory and returns just the filename in $_.
    • The return if line isn't needed since in the next line the read_files() sub is only called for regular files. '.' and '..' are directories so they won't be included.
    • Since the OP asked to print only text files, why not use the -T file test? The -f test lets the program attempt to print binary files, which is pretty annoying if there are any lurking in a subdirectory.
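
A minimal sketch of the three suggestions above taken together: no chomp, no explicit check for '.' and '..', and a -T test alongside -f. The helper name text_files_under is made up for illustration, and it collects $File::Find::name into a list instead of calling read_files() so it can be tried on its own. Keep in mind that -T is only a heuristic guess based on the first block of the file.

```perl
use strict;
use warnings;
use File::Find qw(find);

# Return the path of every regular file under $top that looks like text.
# (text_files_under is a hypothetical name, not part of the original post.)
sub text_files_under {
    my ($top) = @_;
    my @found;
    find(
        sub {
            # File::Find has already chdir'ed into the current directory,
            # and $_ holds the bare file name, so no chomp is needed.
            # -f rejects directories (including '.'), -T rejects files
            # that look binary.
            push @found, $File::Find::name if -f $_ && -T $_;
        },
        $top
    );
    return @found;
}
```

Something like `print "$_\n" for text_files_under('.');` would then list the candidates before anything is printed in full.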

    In this function:

    sub read_files {
        my ($filename) = @_;
        open my $fh, '<', $filename or die "can't open file: $!";
        while (<$fh>) {
            print $_, $/;
        }
    }
    • Printing $/ when printing each line will cause an extra blank line to appear since you did not use chomp when the line was read.
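
To illustrate the point about $/, here is a rough variant of read_files() that prints each line exactly as it was read, so the newline is not doubled. The optional second argument ($out, an output filehandle defaulting to STDOUT) is an addition of this sketch and is not in the original code; it is only there to make the sub easy to try out.

```perl
use strict;
use warnings;

# Copy the lines of $filename to $out without adding extra newlines.
# $out is a hypothetical extra parameter; it defaults to STDOUT.
sub read_files {
    my ( $filename, $out ) = @_;
    $out ||= \*STDOUT;
    open my $fh, '<', $filename or die "can't open file '$filename': $!";
    while ( my $line = <$fh> ) {
        print {$out} $line;    # $line still ends with its own newline
    }
    close $fh or die "can't close file '$filename': $!";
    return;
}
```

The alternative is to chomp each line first and then print it followed by $/, which gives the same output for files that end in a newline.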

    And one question: what is the reason for using abs_path? I find that File::Find works fine with relative pathnames. Does it help the performance if an absolute pathname is given?
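
For what it's worth, a small sketch showing that find() accepts a relative starting directory. It says nothing about performance; it only shows which names the callback sees. The helper name names_seen is invented for this example.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);
use File::Find qw(find);

# Return $File::Find::name for every regular file under $top.
# (names_seen is a hypothetical helper for illustration.)
sub names_seen {
    my ($top) = @_;
    my @names;
    find( sub { push @names, $File::Find::name if -f $_ }, $top );
    my @sorted = sort @names;
    return @sorted;
}
```

Called as names_seen('some_dir'), the names come back relative to the directory the script was started in, for example some_dir/file.txt. Called as names_seen( abs_path('some_dir') ), they come back as absolute paths. So one visible effect of abs_path is on the reported names rather than on whether the search works.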
