
printing contents of small files into one larger one

by Angharad (Pilgrim)
on Nov 09, 2005 at 11:25 UTC ( #507030=perlquestion )

Angharad has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to think of a way of solving what seems to be a relatively simple problem, but it's turning out to be a bit trickier than I thought.
I have in a directory 73 text files. Each text file looks something like this.
3.544
3.765
6.2341
8.9756
I want to create another file with all but one of these text files included ... but printed like this
file1datapt1  file2datapt1  file3datapt1
file1datapt2  file2datapt2  file3datapt2
etc ...
My task is to create 73 text files, each containing the data points of all the files in the directory bar one ... the file to be excluded perhaps passed to the program via the command line.
Right now ... because each small file has to be printed in the larger file with the data points going down the page, as it were, the only method my pea brain comes up with is a horrible nested-loop idea that will probably grind my wee computer to a halt.
I'm not looking for code necessarily (although all donations gratefully received) just suggestions of how to tackle this would be appreciated. Thanks
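One way to sketch the overall task described above: for each of the 73 files, write one output file that pastes the remaining files together column-wise. The '*.dat' glob pattern and the 'combined_*' output names here are made up, and small files are assumed so slurping into memory is fine:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @all = glob('*.dat');    # hypothetical pattern; substitute the real one

for my $exclude (@all) {
    my @keep = grep { $_ ne $exclude } @all;

    # slurp every kept file into one column (arrayref of its lines)
    my @columns = map {
        open my $fh, '<', $_ or die "Can't open $_: $!";
        chomp( my @data = <$fh> );
        \@data;
    } @keep;

    # write row $i = data point $i from each kept file, tab-separated
    open my $out, '>', "combined_$exclude"
        or die "Can't write combined_$exclude: $!";
    for my $i ( 0 .. $#{ $columns[0] } ) {
        print {$out} join( "\t", map { $_->[$i] } @columns ), "\n";
    }
    close $out;
}
```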

Replies are listed 'Best First'.
Re: printing contents of small files into one larger one
by holli (Abbot) on Nov 09, 2005 at 13:17 UTC
    Minimum memory footprint solution:
    use strict;
    use warnings;
    use FileHandle;

    my @files   = ( "test1.dat", "test2.dat", "test3.dat" );
    my @handles = map { FileHandle->new($_) } @files;

    my $while = 1;
    while ($while) {
        my @lines;
        for (@handles) {
            $while = 0, last if $_->eof;
            my $line = $_->getline;    # 'my' is required under strict
            chomp $line;
            push @lines, $line;
        }
        print join( "\t", @lines ), "\n" if $while;
    }

    holli, /regexed monk/
Re: printing contents of small files into one larger one
by mlh2003 (Scribe) on Nov 09, 2005 at 11:50 UTC
    Depending on the actual size of your files, you could (this is pseudocode, sorry):

    1. loop through each file (open file, slurp into an array, close file).
    2. Add the array data to a 2-D array (each 'row' of the array would contain the data from each file).
    3. Do a transpose on the 2-D array to get the data in the format you want.

    This method would require 2 loops: The first to read the data from the files and add to the array. The second to transpose (which would be a simple 2-D nested loop).
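    A minimal sketch of those three steps, using made-up file names and assuming the files are all the same length:

```perl
use strict;
use warnings;

my @files = ( 'test1.dat', 'test2.dat', 'test3.dat' );    # hypothetical

# steps 1 and 2: slurp each file into one row of a 2-D array
my @rows;
for my $file (@files) {
    open my $fh, '<', $file or die "Can't open $file: $!";
    chomp( my @data = <$fh> );
    close $fh;
    push @rows, \@data;
}

# step 3: transpose rows into columns and print
for my $i ( 0 .. $#{ $rows[0] } ) {
    print join( "\t", map { $_->[$i] } @rows ), "\n";
}
```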
    Code is untested unless explicitly stated
Re: printing contents of small files into one larger one
by inman (Curate) on Nov 09, 2005 at 12:18 UTC
    The following uses the <> operator to read all the files passed on the command line, a line at a time. Each value is appended to a list in memory, then joined and printed when the internal file handle used by the diamond operator hits end-of-file (see the description of eof in perlop for details).

    #! /usr/bin/perl
    use strict;
    use warnings;

    my @line;
    while (<>) {
        chomp;
        push @line, $_;
    }
    continue {
        if (eof) {
            print join( "\t", @line ), "\n";
            @line = ();
            close ARGV;
        }
    }
    Update: Simplified. No need for an array or any fancy joining!
    while (<>) { chomp; if (eof) { print "$_\n" } else { print "$_\t" } }

    The order that the files are appended is taken from the order in which they are listed in @ARGV. This can be sorted before the files are processed.

      That's cute, but it doesn't quite do what was asked. It prints each file's entries per output row, when what's needed is each file per output column. Unfortunately that transposition greatly complicates the task. Either you need to read all the files into memory in some data structure, or open all the files and for every line that is to be output, read one entry per filehandle (see holli's answer). For the sake of completeness, here's a way to read all the data into a "hash of arrays" data structure. The keys are the names of the data files, and the values are references to the data read from each data file.

      If the only goal here is to print out the data, I would do it holli's way. Reading all the files into memory into one big data structure wouldn't scale very well.

      #!/usr/bin/perl
      use warnings;
      use strict;
      use Data::Dumper;

      my @files = glob('data/*');
      my %filedata;
      for my $file (@files) {
          open my $fh, "<$file" or die "Ack, Can't open $file";
          chomp( my @data = <$fh> );
          close $fh;
          $filedata{$file} = \@data;
      }
      print Dumper(\%filedata);
      $VAR1 = {
          'test/bb' => [ '5.6', '5.7', '5.9' ],
          'test/aa' => [ '1.3', '1.4', '1.5' ]
      };
        Fair enough. I should have read the question!

        I have modified the technique to build an AoA in memory before writing it out. The OP wasn't clear on how files of differing length were to be treated. The code below expands the AoA as necessary to cope with files of different lengths.

        my @lines;
        my $line      = 0;
        my $filecount = 0;
        while (<>) {
            chomp( ${ $lines[$line++] }[$filecount] = $_ );
            if (eof) {
                $line = 0;
                $filecount++;
            }
        }
        foreach (@lines) {
            $#{$_} = $filecount - 1;
            print join( "\t", @{$_} ), "\n";
        }
      What's wrong with slurping each file into an array? As in
      my @line = <FH>;
      (where FH is the filehandle of the currently-opened file). Am I missing something?

      Aside from using a different approach to read the file into an array, the final output still needs to have the data transposed.
      Code is untested unless explicitly stated
Re: printing contents of small files into one larger one
by Fletch (Chancellor) on Nov 09, 2005 at 14:26 UTC

    TMTOWTDI, sometimes not even using Perl: man paste
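    For example (file names and sample data made up; paste prints one column per input file, tab-separated by default):

```shell
# three small one-column data files (sample data)
printf '1.1\n1.2\n' > file1.txt
printf '2.1\n2.2\n' > file2.txt
printf '3.1\n3.2\n' > file3.txt

# paste combines them side by side, one column per file
paste file1.txt file2.txt file3.txt > combined.txt
cat combined.txt
```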

Re: printing contents of small files into one larger one
by Angharad (Pilgrim) on Nov 09, 2005 at 13:42 UTC
    Thanks everyone. You have all been very helpful :)

Approved by spiritway