reading N files

by gri6507 (Deacon)
on Jul 19, 2006 at 14:30 UTC ( #562297=perlquestion )

gri6507 has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

I have a need to read N files one line at a time and then manipulate those individual lines depending on their content. I was hoping to do something like this:

    use strict;
    use warnings;
    use English;

    print "Usage: $0 output input1 input2 ...\n";

    my $outfile = shift @ARGV;
    open(OUT, ">$outfile") || die "Can't open $outfile for writing: $!\n";

    my @infile;
    foreach (@ARGV) {
        open(IN, $_) || die "can't open $_ for reading: $!\n";
        push @infile, \*IN;
    }

    foreach (@infile) {
        my $i = <$_>;
        print "Got: $i";
    }

where the @infile array contains all the open file handles so I could read from each one individually. Unfortunately, my test loop at the bottom seems to only read the contents of the last input file specified on the command line. I have this nagging feeling that my problem has something to do with closures (a concept I do not completely comprehend yet). What am I doing wrong?

Replies are listed 'Best First'.
Re: reading N files
by Fletch (Chancellor) on Jul 19, 2006 at 14:33 UTC
    for (@ARGV) {
        open( my $in, "<", $_ ) or die "Can't open '$_': $!\n";
        push @infile, $in;
    }

    Presuming a recent enough Perl (5.6 or later, where open can autovivify a lexical filehandle), of course. See perlopentut and the open docs. On older Perls you could use IO::File instead in a similar fashion.

      Thank you. That was exactly it!
Re: reading N files
by polettix (Vicar) on Jul 19, 2006 at 14:42 UTC
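    The problem is that the IN you open is the same at each iteration, because it is the filehandle slot of the IN symbol (typeglob) in package main. Every \*IN you push is a reference to that one glob, so each new open replaces the previous file and all the array elements end up reading from the last one. In your case, it is the same as using a global variable: you always access the same variable, and you keep overwriting it.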
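A small sketch of that point, for illustration only. The helper name `same_handle` is made up here, and the lexical handles read from an in-memory string rather than real files so the snippet needs no input:

```perl
use strict;
use warnings;

# same_handle is a hypothetical helper: true when both arguments
# refer to the very same underlying glob.
sub same_handle {
    my ($x, $y) = @_;
    return $x == $y;    # references compare by address in numeric context
}

# Two references to the package glob *IN are one and the same handle.
print "bareword: ", (same_handle(\*IN, \*IN) ? "same" : "different"), "\n";

# Two lexical handles are distinct, even when opened on the same data.
my $data = "line\n";
open(my $first,  '<', \$data) or die "Can't open in-memory file: $!\n";
open(my $second, '<', \$data) or die "Can't open in-memory file: $!\n";
print "lexical:  ", (same_handle($first, $second) ? "same" : "different"), "\n";
```

This is why pushing `\*IN` N times gives N copies of one handle, while pushing a fresh `my $in` each time gives N independent handles.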

    The solutions are in Fletch's post.

    perl -ple'$_=reverse' <<<ti.xittelop@oivalf

    Don't fool yourself.
Re: reading N files
by GrandFather (Saint) on Jul 19, 2006 at 17:36 UTC

    This smacks of an XY Problem. Unless you are interleaving the files in some fashion, opening the file handles in advance does not seem like a good solution. If this is not a task requiring interleaving you might like to explain what you want to achieve so we can help with the larger problem.

    BTW, you should generally use the three parameter open to avoid surprises. Your two open lines change to:

    open (OUT, '>', $outfile) || ... and open (IN, '<', $_) || ...

    Generally the usage line is better printed only when "required":

    if (! @ARGV) {
        print "Usage: $0 output input1 input2 ...\n";
        exit -1;
    }

    DWIM is Perl's answer to Gödel
      Actually, this was to interleave the input files. I had a problem where I had N files, each with 2 columns: a time column and a value column. I needed to interleave the N files, so that the output would have N+1 columns: one sorted time column and N columns of either empty cells or the corresponding value. I hope this is a bit clearer.
        So, if you happened to be on a *n*x box (or have a windows port of standard unix utilities), you could just do a shell command (that includes a perl one-liner):
        # assuming N files are named in some systematic way,
        # and columns are separated by whitespace:
        paste file.* | perl -pe '($t)=(/^(\S+)/); s/\t$t//g;' > multi-column.file
        The unix "paste" command takes a list of file names and concatenates them "horizontally", line by line; for a list of input files (1..N), its default behavior replaces the newline with a tab for each line of files 1..N-1.

        Assuming that all files in the set have the same series of values in the first column, the perl script removes all but the first occurrence of that value on each line. (If all these assumptions don't apply, then your approach of reading from a set of file handles in a loop is fine, of course.)
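For the case where the time columns differ between files, the loop over a set of file handles might look roughly like the sketch below. It is not from the thread: `merge_by_time` is a hypothetical name, and the code assumes each input is already sorted by a numeric time in column 1, with whitespace-separated columns and tab-separated output.

```perl
use strict;
use warnings;

# merge_by_time is a hypothetical helper. Assumption: every input handle
# yields lines of "time value", already sorted by numeric time.
sub merge_by_time {
    my ($out, @handles) = @_;

    # Read the next [time, value] pair from a handle; undef at end of file.
    my $next = sub {
        my ($fh) = @_;
        while (defined(my $line = <$fh>)) {
            my ($time, $value) = split ' ', $line;
            return [$time, $value] if defined $value;    # skip blank/short lines
        }
        return undef;
    };

    # One pending record per input file.
    my @current;
    $current[$_] = $next->($handles[$_]) for 0 .. $#handles;

    while (grep { defined } @current) {
        # Smallest pending time across all files.
        my ($min) = sort { $a <=> $b } map { $_->[0] } grep { defined } @current;

        my @row = ($min);
        for my $i (0 .. $#handles) {
            if (defined $current[$i] && $current[$i][0] == $min) {
                push @row, $current[$i][1];
                $current[$i] = $next->($handles[$i]);    # advance this file only
            }
            else {
                push @row, '';    # empty cell: no value in this file at $min
            }
        }
        print {$out} join("\t", @row), "\n";
    }
    return;
}

# Usage as in the original post: script output input1 input2 ...
if (@ARGV >= 2) {
    my $outfile = shift @ARGV;
    open(my $out, '>', $outfile) or die "Can't open '$outfile' for writing: $!\n";
    my @in;
    for my $path (@ARGV) {
        open(my $fh, '<', $path) or die "Can't open '$path': $!\n";
        push @in, $fh;
    }
    merge_by_time($out, @in);
    close $out or die "Can't close '$outfile': $!\n";
}
```

Only one pending line per file is held in memory, so this should scale to large inputs; if the files are not sorted, they would need sorting first.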

Re: reading N files
by Solo (Deacon) on Jul 19, 2006 at 15:13 UTC
    I have a need to read N files one line at a time and then manipulate those individual lines

    The diamond operator also works for this, and is much simpler, IMO.

    use strict;
    use warnings;
    use English;

    print "Usage: $0 output input1 input2 ...\n";

    my $outfile = shift @ARGV;
    open(OUT, ">$outfile") || die "Can't open $outfile for writing: $!\n";

    while (<>) {
        # $_ contains line
    }

    Simplicity is in the eye of the beholder, of course. YMMV.

    Update: Later posts make it clear the OP wanted to process each file's line N before moving to each file's line N+1. Obviously, the diamond operator is not very useful for that behavior.
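For contrast with the diamond operator, here is a minimal sketch of the "line N of every file before line N+1" order, using one lexical handle per file. The name `round_robin` and its callback interface are assumptions made for this illustration.

```perl
use strict;
use warnings;

# round_robin is a hypothetical helper: it calls $callback->($index, $line)
# for line 1 of every handle, then line 2 of every handle, and so on.
# A handle is dropped from the rotation once it reaches end of file.
sub round_robin {
    my ($callback, @handles) = @_;
    my @live = (0 .. $#handles);
    while (@live) {
        my @still_live;
        for my $i (@live) {
            my $line = readline($handles[$i]);
            next unless defined $line;    # this file is exhausted
            $callback->($i, $line);
            push @still_live, $i;
        }
        @live = @still_live;
    }
    return;
}

# Open every file named on the command line with its own lexical handle.
my @in;
for my $path (@ARGV) {
    open(my $fh, '<', $path) or die "Can't open '$path': $!\n";
    push @in, $fh;
}
round_robin(sub { my ($i, $line) = @_; print "file $i: $line" }, @in);
```

Unlike `while (<>)`, which drains each file completely before moving to the next, this visits the files in rotation.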


    You said you wanted to be around when I made a mistake; well, this could be it, sweetheart.

Node Type: perlquestion [id://562297]
Approved by McDarren