
Batch processing files in a directory

by newbie1991 (Acolyte)
on Feb 13, 2013 at 14:49 UTC
newbie1991 has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks! I have another question that is probably quite simple, but I would like some help since I'm new to the concept. My objective is to take all the files in a directory with the extension .ptseq and process them so that there are two output files for each input file. If the file in the directory is A.ptseq, I should get A1.txt and A2.txt as outputs, so that I can keep A1 and A2 for later use. Is the following code enough?
@files = <*.ptseq>; foreach $file (@files) {...}
I've been reading about opendir and readdir as well, so I don't know what my starting point should be. I appreciate all the help you can provide because I am here to learn :)

Re: Batch processing files in a directory
by blue_cowdawg (Monsignor) on Feb 13, 2013 at 14:55 UTC

    my approach?

    my $dir = "/path/to/my/dir";
    my @files = ();
    opendir(my $dh, $dir) or die "$dir: $!";
    while (defined(my $fname = readdir($dh))) {
        # keep only names ending in .ptseq (a leading * is not valid in a regex)
        push @files, $fname if $fname =~ m@\.ptseq$@;
    }
    closedir($dh);
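
    One thing to watch with a readdir loop like this: readdir returns bare entry names, not paths, so each name has to be joined with the directory before it can be opened from another working directory. A minimal sketch of that step, assuming a hypothetical helper named ptseq_paths (the name is not from this thread):

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not from the thread): return the full paths of the
# *.ptseq entries in $dir, sorted by name.
sub ptseq_paths {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "$dir: $!";
    # readdir yields bare names such as "A.ptseq"; filter them first.
    my @names = sort grep { /\.ptseq$/ } readdir($dh);
    closedir($dh);
    # Prepend the directory so each result can be passed straight to open().
    return map { File::Spec->catfile($dir, $_) } @names;
}
```

    It would be called as my @files = ptseq_paths("/path/to/my/dir");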

        process them such that there are two output files for each input file.

    This part of your requirements is not clear to me. Process them how?


    Peter L. Berghold -- Unix Professional
    Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg
      Well, the ptseq file has description and sequence information. A1 would contain only descriptions, and A2 would have the sequence information.
            Well, the ptseq file has description and sequence information. A1 would contain only descriptions, and A2 would have the sequence information.

        Maybe I'm stupid, but that means nothing to me.


        Peter L. Berghold -- Unix Professional
        Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg
Re: Batch processing files in a directory
by tmharish (Friar) on Feb 13, 2013 at 15:23 UTC
    ls *.ptseq | perl -Mstrict -nwe '
        my $file = $_;
        chomp( $file );
        my ( $name, $dont_care ) = split( /\./, $file );
        print "cp $file $name" . "1" . ".txt && cp $file $name" . "2" . ".txt\n"
    ' > run.sh && sh run.sh && rm run.sh
Re: Batch processing files in a directory
by 7stud (Deacon) on Feb 13, 2013 at 17:58 UTC

    I've been reading about opendir and readdir as well, so I don't know what my starting point should be.

    opendir() and readdir() will return every entry in a directory--but you only want the .ptseq files, so why start with a bigger set of files than you want? I recommend globbing as your starting point. However, when globbing, don't use the line input operator (<>)--use glob() instead.

    use strict;
    use warnings;
    use 5.010;
    use File::Basename;

    my $target_file_ext = '.ptseq';
    my $target_file_pattern = "*$target_file_ext";

    for my $in_name (glob $target_file_pattern) {
        my ($name) = fileparse($in_name, ($target_file_ext) );
        my $out_1_name = "${name}1.txt";
        my $out_2_name = "${name}2.txt";

        open my $OUT_1, '>', $out_1_name
            or die "Couldn't open $out_1_name: $!";
        open my $OUT_2, '>', $out_2_name
            or die "Couldn't open $out_2_name: $!";
        open my $INFILE, '<', $in_name
            or die "Couldn't open $in_name: $!";

        while (my $line = <$INFILE>) {
            chomp $line;
            #Some $line processing here:
            say {$OUT_1} "$line ($out_1_name)";
            say {$OUT_2} "$line ($out_2_name)";
        }

        close $INFILE;
        close $OUT_1;
        close $OUT_2;
    }

    Also, note that filenames can look like this: one.two.three.ptseq (which may not apply in your case), so I used File::Basename to extract the filename without the extension.
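
    To make the difference concrete, here is a small sketch comparing fileparse with splitting on every dot, using the hypothetical file name one.two.three.ptseq. The suffix is passed as qr/\.ptseq/ here because fileparse treats suffixes as regex patterns; that escaping is my choice for the sketch, not something taken from the code above.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

my $in_name = 'one.two.three.ptseq';   # hypothetical file name

# fileparse strips only the trailing suffix; the pattern escapes the dot
# so that it matches a literal "." rather than any character.
my ($base) = fileparse($in_name, qr/\.ptseq/);

# Splitting on every dot keeps only the text before the first dot.
my ($first) = split /\./, $in_name;

print "fileparse: ${base}1.txt\n";
print "split:     ${first}1.txt\n";
```

    The first line names one.two.three1.txt, while the split version names one1.txt, which would collide with the output of any other input file whose name starts with "one.".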
Re: Batch processing files in a directory
by Kenosis (Priest) on Feb 13, 2013 at 20:47 UTC

    Here's another option:

    use strict;
    use warnings;

    for my $file (<*.ptseq>) {
        my ($baseName) = $file =~ /^(.+)\.[^.]+$/;
        my $textFile1 = $baseName . '1.txt';
        my $textFile2 = $baseName . '2.txt';
        print "$file ->\t$textFile1\t$textFile2", "\n";
    }

    Output from my dir:

    fileA.ptseq -> fileA1.txt fileA2.txt
    fileB.ptseq -> fileB1.txt fileB2.txt
    fileC.ptseq -> fileC1.txt fileC2.txt
    fileD.ptseq -> fileD1.txt fileD2.txt
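
    For the actual split into descriptions (A1) and sequences (A2), the thread never says what a .ptseq file looks like, so the following is only a sketch under a loud assumption: description lines start with '>' and every other line is sequence data, as in FASTA files. The helper name split_ptseq is hypothetical as well.

```perl
use strict;
use warnings;

# ASSUMPTION (not stated in the thread): description lines start with '>'
# and every other line is sequence data, as in FASTA files.
# split_ptseq is a hypothetical helper name.
sub split_ptseq {
    my ($in_name) = @_;
    my ($base) = $in_name =~ /^(.+)\.ptseq$/
        or die "Not a .ptseq file: $in_name";
    my ($desc_name, $seq_name) = ("${base}1.txt", "${base}2.txt");

    open my $in,   '<', $in_name   or die "Couldn't open $in_name: $!";
    open my $desc, '>', $desc_name or die "Couldn't open $desc_name: $!";
    open my $seq,  '>', $seq_name  or die "Couldn't open $seq_name: $!";

    while (my $line = <$in>) {
        # Route each line to the description file or the sequence file.
        print { $line =~ /^>/ ? $desc : $seq } $line;
    }

    close $in;
    close $desc or die "Couldn't close $desc_name: $!";
    close $seq  or die "Couldn't close $seq_name: $!";
    return ($desc_name, $seq_name);
}
```

    Run from the directory that holds the inputs, split_ptseq($_) for glob '*.ptseq'; would process every file. If the real format marks descriptions differently, only the /^>/ test needs to change.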
