PerlMonks  

Processing multiple files

by mocnii (Novice)
on Dec 12, 2012 at 18:35 UTC (#1008563=perlquestion)
mocnii has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I wrote a simple script to process all text files in a directory using opendir and readdir. I want to know what are alternatives for this task and what would you do differently (better).

#! /usr/bin/perl -w
#test file to learn how to process multiple files at once
use strict;
use warnings;
use Cwd; #module to get the current working directory

#first get the path for current directory
my $dir = getcwd;

#declare the filehandeles outside the foreach loop
my $text_file_fh_in;
my $text_file_fh_out;

#open and read only the .txt files
opendir (DIR, $dir) or die $!;
my @textFiles = grep /\.txt/, readdir DIR;

#open each file for further processing
foreach my $text_file (@textFiles) {
    open ($text_file_fh_in, "<", $text_file) || die "$!"; #open for reading

    #create the names of the new files
    my $text_file_out = $text_file;
    $text_file_out =~ s/\.txt//;
    $text_file_out = $text_file_out . '_new.txt';
    open ($text_file_fh_out, ">", $text_file_out) || die "$!"; #open for writing

    #tests to see if it works
    #print "$text_file\n"; print "$text_file_fh_in\n";
    #print "$text_file_out\n"; print "$text_file_fh_out\n";

    MAIN: {
        while (<$text_file_fh_in>){
            chomp;
            if (/^(.*?)\.(.*?)\ (.*?)\.(.*?)\ (.*?)$/){
                print $text_file_fh_out "$1\t$2\t$4\t$5\n";
            }
        }
    }

    #close the filehandles
    close $text_file_fh_in || die "Could not close $text_file_fh_in";
    close $text_file_fh_out || die "Could not close $text_file_fh_out";
}
closedir DIR || die "Could not close DIR";

UPDATE: Thank you for your replies.

Re: Processing multiple files
by blue_cowdawg (Prior) on Dec 12, 2012 at 18:47 UTC
        I want to know what are alternatives for this task and what would you do differently (better).

    That's a pretty wide open question the answer to which depends on a lot of factors. More factors than I'd care to enumerate here.

    One of my personal favorites for working with files (although I use opendir and plain old open more often) is Tie::File, which allows you to treat a file as if it were an array. Pretty slick if you ask me.
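For readers who have not used Tie::File, here is a minimal sketch of the "file as an array" idea. It is only an illustration, not code from the thread: the scratch file, its contents, and the variable names are all made up for the demo.

```perl
use strict;
use warnings;
use Tie::File;                  # core module
use File::Temp qw(tempfile);    # scratch file, used only for this demo

# Create a small throwaway file to work on (hypothetical contents).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "alpha\nbeta\ngamma\n";
close $tmp_fh or die "Cannot close '$path': $!";

# Tie the file to an array: each element is one line, without its newline.
tie my @lines, 'Tie::File', $path or die "Cannot tie '$path': $!";

$lines[1] = uc $lines[1];       # rewrite the second line in place
push @lines, 'delta';           # append a new last line
my $line_count = @lines;        # number of lines now in the file

untie @lines;                   # flush changes and release the file

# Read the file back the ordinary way to see what was written.
open my $in, '<', $path or die "Cannot open '$path': $!";
my @written = <$in>;
close $in or die "Cannot close '$path': $!";

print @written;
```

Changes made through the array are written to the file itself, so this suits in-place edits better than the read-one-file, write-another pattern in the question.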

    That said I am an adherent to the old adage "right tool for the job." If open does the job, and it does really well, then use it. Don't get fancy unless you have a real need to.

    There is a real temptation when you spot something "neat" that you end up having a solution in search of a problem. A good example of that is when I got my first router and router table for wood working. Pretty soon I was doing all sorts of fancy edging work on every wood project I had. I realized I'd overstepped the utility of it when I used a coving bit to round off the edges really pretty for something that was going to be out of sight anyway.

    Writing code is like that too. We get tempted to use that fancy programming technique when brute force is a quicker way to get the job done.

    Just some thoughts...


    Peter L. Berghold -- Unix Professional
    Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg
Re: Processing multiple files
by Anonymous Monk on Dec 12, 2012 at 18:56 UTC
Re: Processing multiple files
by tobyink (Abbot) on Dec 12, 2012 at 19:15 UTC

    I'd probably do it something like this. (Untested code...)

    use strict;
    use warnings;
    use aliased 'Path::Class::Rule';
    use Path::Class qw( dir file );

    my $files = Rule->new->file->name(qr{\.txt$})->iter;

    while ( my $file = $files->() ) {
        (my $newfile = $file) =~ s/\.txt$/_new.txt/;
        $newfile = file($newfile);

        print "$file -> $newfile\n";

        # join the lines into one string so spew gets a single content argument
        $newfile->spew(
            join '',
            map { s/^(.*?)\.(.*?)\ (.*?)\.(.*?)\ (.*?)$/$1\t$2\t$4\t$5/; $_ }
            $file->slurp
        );
    }
    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
Re: Processing multiple files
by aitap (Chaplain) on Dec 12, 2012 at 19:38 UTC

    Generally, your code is OK, but I would like to point out some things which can be improved.

    #declare the filehandeles outside the foreach loop
    Why? You can write less code (e.g. open my $filehandle, ...) and, if you declare it inside the loop, Perl will close the filehandle automatically as soon as the current iteration of the block finishes executing.
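A small sketch of the point about loop-scoped handles. The temporary directory, file names, and variable names are invented for the demo. The read handle is never closed explicitly; it is released when `$in` goes out of scope at the end of each iteration. For handles opened for writing, an explicit close with an error check is still worth keeping, because that is where a failed write may be reported.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Set up two throwaway files (hypothetical names) in a scratch directory.
my $dir = tempdir(CLEANUP => 1);
my @paths;
for my $name (qw(a.txt b.txt)) {
    my $path = File::Spec->catfile($dir, $name);
    open my $out, '>', $path or die "Cannot open '$path': $!";
    print {$out} "contents of $name\n";
    close $out or die "Cannot close '$path': $!";   # check writes explicitly
    push @paths, $path;
}

my @first_lines;
for my $path (@paths) {
    # $in is declared inside the loop, so it exists only for this iteration.
    open my $in, '<', $path or die "Cannot open '$path': $!";
    my $line = <$in>;
    chomp $line;
    push @first_lines, $line;
    # No close here: Perl closes $in when it goes out of scope.
}

print "$_\n" for @first_lines;
```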

    opendir (DIR, $dir) or die $!;
    my @textFiles = grep /\.txt/, readdir DIR;
    For simple matching of directory contents, the glob operator (angle brackets around a pattern) does the same job with less code: my @files = <*.txt>; with no need to open or close any directories.
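The glob suggestion can be tried with a sketch like the one below. It assumes a Unix-like system and a scratch directory whose path contains no whitespace, since the built-in glob splits its argument on spaces. The file names are made up. The pattern `*.txt` only matches names that end in `.txt`, whereas the unanchored `/\.txt/` in the question would also match a name such as `notes.txt.bak`.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;
use File::Basename qw(basename);

# Scratch directory with two .txt files and one other file (hypothetical names).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(one.txt two.txt notes.md)) {
    my $path = File::Spec->catfile($dir, $name);
    open my $fh, '>', $path or die "Cannot create '$path': $!";
    close $fh or die "Cannot close '$path': $!";
}

# glob returns each match with its directory part, so strip it for display.
# With the current directory as the target, <*.txt> or glob('*.txt') would do.
my @text_files = sort map { basename($_) } glob("$dir/*.txt");

print "$_\n" for @text_files;
```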

    By the way, you can open just "." instead of using Cwd. "." means "current directory" in modern operating systems.

    my $text_file_out = $text_file;
    $text_file_out =~ s/\.txt//;
    $text_file_out = $text_file_out . '_new.txt';
    This can be done in one operation: (my $text_file_out = $text_file) =~ s/\.txt$/_new.txt/; Another option: my $text_file_out = $text_file =~ s/\.txt$/_new.txt/r; (the /r modifier is available since Perl 5.14; see Regexp Quote-Like Operators in perlop).
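Both forms side by side, as a quick sketch. The file name `report.txt` and the variable names are made up for illustration. In either form the original variable is left untouched.

```perl
use strict;
use warnings;
use 5.014;    # the /r modifier needs Perl 5.14 or later

my $text_file = 'report.txt';    # hypothetical input name

# Form 1: copy first, then substitute on the copy.
(my $out_copy = $text_file) =~ s/\.txt$/_new.txt/;

# Form 2: /r returns the modified string instead of changing $text_file.
my $out_r = $text_file =~ s/\.txt$/_new.txt/r;

print "$out_copy\n";    # report_new.txt
print "$out_r\n";       # report_new.txt
print "$text_file\n";   # report.txt (unchanged)
```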

    Sorry if my advice was wrong.
Re: Processing multiple files
by jwkrahn (Monsignor) on Dec 12, 2012 at 19:56 UTC
    what would you do differently
    #!/usr/bin/perl
    use warnings;
    use strict;

    #test file to learn how to process multiple files at once

    #first get the path for current directory
    my $dir = '.'; # current working directory

    #open and read only the .txt files
    opendir my $DIR, $dir or die "Cannot open '$dir' because: $!";
    my @textFiles = grep /\.txt\z/, readdir $DIR;
    closedir $DIR or die "Could not close '$dir' because: $!";

    #open each file for further processing
    foreach my $text_file ( @textFiles ) {
        open my $text_file_fh_in, '<', $text_file or die "Cannot open '$text_file' because: $!"; #open for reading

        #create the names of the new files
        my $text_file_out = $text_file;
        $text_file_out =~ s/(?=\.txt\z)/_new/;
        open my $text_file_fh_out, '>', $text_file_out or die "Cannot open '$text_file_out' because: $!"; #open for writing

        #tests to see if it works
        #print "$text_file\n"; print "$text_file_fh_in\n";
        #print "$text_file_out\n"; print "$text_file_fh_out\n";

        while ( <$text_file_fh_in> ) {
            if ( /^(.*?)\.(.*?)\ (.*?)\.(.*?)\ (.*?)$/ ) {
                print $text_file_fh_out "$1\t$2\t$4\t$5\n";
            }
        }

        #close the filehandles
        close $text_file_fh_in or die "Could not close '$text_file' because: $!";
        close $text_file_fh_out or die "Could not close '$text_file_out' because: $!";
    }
