Opening multiple files

by mcasspj (Initiate)
on Jan 29, 2012 at 20:32 UTC ( #950649=perlquestion )

mcasspj has asked for the wisdom of the Perl Monks concerning the following question:

Back to Perl after a long, too long, break! I wish to make a CSV data file (for R), and I have a series of files consisting of lists of single items. The obvious approach is to open the first file and output its first line to a new (output) file, then open the second file, read its first item and append it to the first line of the output file, and so on until I run out of input files, then do the same for the second line. I'm sure there must be a more elegant way. I know the files all have the same number of lines, so I don't need to worry about that. Somehow I'm guessing the solution uses
while (<>) {}
but I'm not sure how to handle the handles (no pun intended!). Cheers, Paul

Replies are listed 'Best First'.
Re: Opening multiple files
by Eliya (Vicar) on Jan 29, 2012 at 21:41 UTC

    You could do something like this:

    #!/usr/bin/perl -w
    use strict;

    my @fh;

    # open all files
    for my $i (0..$#ARGV) {
        open $fh[$i], "<", $ARGV[$i] or die $!;
    }

    while (1) {
        my @lines;
        # read one line from each file
        push @lines, scalar readline $fh[$_] for 0..$#ARGV;
        last unless defined $lines[0];
        chomp @lines;
        print join(",", @lines), "\n";
    }

    $ ./ infile1 infile2 infile3 ... > outfile

    (Note that you have to use readline with "complex" file handle expressions like $fh[$i], as <$fh[$i]> wouldn't work here.)
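
The note above can be tried in isolation. This is a minimal sketch, assuming in-memory strings stand in for the real input files (the names `@data`, `$from_a` and `$from_b` are made up for illustration). It shows two ways to read from a handle stored in an array element: calling `readline` directly, or copying the handle into a simple scalar first so that `<...>` is not parsed as a glob pattern.

```perl
use strict;
use warnings;

# In-memory "files" keep the sketch self-contained (an assumption for
# illustration; real code would open the names given in @ARGV).
my @data = ("a1\na2\n", "b1\nb2\n");
my @fh;
for my $i (0 .. $#data) {
    open $fh[$i], '<', \$data[$i] or die "Cannot open in-memory file: $!";
}

# Option 1: call readline() directly on the array element.
my $from_a = readline $fh[0];

# Option 2: copy the handle into a simple scalar, then use <...>.
my $h = $fh[1];
my $from_b = <$h>;

chomp($from_a, $from_b);
print "$from_a,$from_b\n";
```

Either form should behave the same; `readline` just avoids the temporary variable.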


      Maybe I'm still asleep, but will this ever return out of the while loop?

      -Kiel R Stirling
        will this ever return out of the while loop?

        Yes, due to the

        last unless defined $lines[0];

        where $lines[0] becomes undef when EOF of the first file has been reached.
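
The termination condition described above can be checked on its own. A minimal sketch, assuming in-memory strings stand in for the input files and that `read_row` is a hypothetical helper (it is not part of the posted script): `readline` returns `undef` once a file is exhausted, so the loop ends as soon as the first file has no more lines.

```perl
use strict;
use warnings;

# Hypothetical helper: read one line per handle and return the joined row,
# or undef once the first file is exhausted (mirroring the
# `last unless defined $lines[0]` test).
sub read_row {
    my @lines = map { scalar readline($_) } @_;
    return unless defined $lines[0];
    chomp @lines;
    return join ',', @lines;
}

# In-memory "files" are an assumption made to keep the sketch runnable.
my @data = ("a1\na2\n", "b1\nb2\n");
my @fh = map { open my $h, '<', \$_ or die "Cannot open: $!"; $h } @data;

while (defined(my $row = read_row(@fh))) {
    print "$row\n";
}
```

Because only the first file is tested, this relies on all files having the same number of lines, as the original question states.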

Re: Opening multiple files
by kielstirling (Scribe) on Jan 30, 2012 at 01:42 UTC

    If I understand correctly what you want to do, this is how I would do it.

    Have fun !!

    -Kiel R Stirling

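    #!/usr/bin/perl -w
    use strict;
    use IO::File;

    # the last argument is the output file; the rest are inputs
    my $output = pop @ARGV;
    my @merge;

    for my $file (@ARGV) {
        my $input_fh = IO::File->new($file);
        die "failed to open $file\n" unless defined $input_fh;
        # $. is the current line number, so line N of every file
        # is appended to element N-1 of @merge, separated by commas
        while (defined(my $line = <$input_fh>)) {
            chomp $line;
            my $row = $. - 1;
            $merge[$row] = defined $merge[$row] ? "$merge[$row],$line" : $line;
        }
        $input_fh->close;    # closing resets $. for the next file
    }

    my $output_fh = IO::File->new($output, "w");
    die "failed to open $output\n" unless defined $output_fh;
    print $output_fh join("\n", @merge), "\n";
    $output_fh->close;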

    ./ file1 file2 file3 output_file

      That's exactly what I wanted, just tweaked it and it's perfect!

      Many Thanks Paul
Re: Opening multiple files
by Anonymous Monk on Jan 29, 2012 at 20:48 UTC

    Um, paste -d, one two three four > onetofour.csv

    $ perl -e " for my $l ( a..c ){ open $fh, '>', $l; print $fh qq[$l$_\n] for 1 .. 3; } "
    $ paste -d, a b c
    a1,b1,c1
    a2,b2,c2
    a3,b3,c3

      #!/usr/bin/perl --
      use strict;
      use warnings;
      use Text::CSV;
      use autodie qw/ open close /;

      Main( @ARGV );
      exit( 0 );

      sub Main {
          return Usage() unless @_ > 2;
          my( $outfile, @infiles ) = @_;
          my $csv_out = Text::CSV->new(
              {
                  always_quote => 1,
                  binary       => 1,
                  eol          => $/,
              }
          ) or die Text::CSV->error_diag();

          open $outfile, '>:raw', $outfile; # autodie
          $csv_out->print( $outfile, \@infiles ) or die $csv_out->error_diag;

          @infiles = map { open my $fh, '<:raw', $_; $fh } @infiles;

          while( not( grep eof, @infiles ) ){
              my @lines = map { scalar readline( $_ ) } @infiles;
              #~ chomp( @lines );
              #~ no warnings 'uninitialized';
              #~ s/[\r\n]+$// for @lines;
              s/[\r\n]+$// for grep defined, @lines;
              $csv_out->print( $outfile, \@lines ) or die $csv_out->error_diag;
          }
          undef $csv_out;
          close $outfile; # autodie
      } ## end sub Main

      sub Usage {
          print <<"__USAGE__";
      $0
          $0 outFile.csv inOne.csv inTwo.csv inThree.csv ...

      EXAMPLE SESSION
      \$ perl out.csv ta.csv tb.csv tc.csv td.csv
      \$ cat out.csv
      "ta.csv","tb.csv","tc.csv","td.csv"
      "a1","b1","c1","d1""st(i)n,ker"""
      "a2","b2","c2","d2'st(i)n,ker'"
      "a3","b3","c3","d3""'st(i)n,ker'"""
      ,,,"d4 stinker unquoted "
      __USAGE__
      } ## end sub Usage


        -  while( not( grep eof, @infiles ) ){

        +  while( not( grep eof($_), @infiles ) ){

      Cheers, the reasons I didn't want to use paste are:
      1. The people who may use this won't be on Linux/Unix boxes or have Cygwin.
      2. I want to include some pre-processing in the program: the input is Russian election data scraped from the web and needs a little cleaning, i.e. numbers are formatted with spaces.
      3. To learn some Perl.
      Many Thanks
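
The pre-processing mentioned in point 2 could look something like the following. This is only a sketch under stated assumptions: `clean_number` is a hypothetical helper, and it assumes that a field made up solely of digits and whitespace is a number with spaces as thousands separators. The real scraped data may need different rules.

```perl
use strict;
use warnings;

# Hypothetical helper: remove whitespace used as a thousands separator,
# e.g. "1 234 567" becomes "1234567". Fields containing anything other
# than digits and whitespace are returned unchanged.
sub clean_number {
    my ($field) = @_;
    $field =~ s/\s+//g if $field =~ /^\s*\d[\d\s]*$/;
    return $field;
}

print clean_number("1 234 567"), "\n";
print clean_number("Moscow Oblast"), "\n";
```

A call such as `@lines = map { clean_number($_) } @lines;` just before the `join` would apply it to each field of a row.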
