http://www.perlmonks.org?node_id=1039418

james4545 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I am a newbie to Perl and I am stuck on arranging the output. Here is my problem.

I have N files that I process to get data such as:

file 1 output :

a
b
c
file 2 output:

2
f
4
s
w
I want the final output to be like this (arranged in columns and comma-separated):
a,2
b,f
c,4
,s
,w
Please note that the number of files can be in 100's , so i cant simply use join function. Thanks.

Replies are listed 'Best First'.
Re: arrange output in columns
by kennethk (Abbot) on Jun 17, 2013 at 17:34 UTC
    If you need to output a CSV file, I'd suggest using a module that writes CSV files, such as Text::CSV. As your data grows in complexity, this will scale easily without gotchas.

    Please note that the number of files can be in 100's , so i cant simply use join function.
    I don't follow why you can't populate an array (1 read per file) and then join the results and write them for your simple case. I'd think opening and reading from 100's of files simultaneously would represent a more significant technical challenge.

    So what have you tried? What failed? We're happy to help debug, but we're not a code writing service.


    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re: arrange output in columns
by graff (Chancellor) on Jun 17, 2013 at 18:00 UTC
    In terms of pseudo-code, your task sounds sort of like this:
    declare a "master" array that will hold a set of arrays
    declare a max_array_size variable, initially set to zero
    for each input file {
        read its lines into a new array
        if the size of this array (# of lines in file) > max_array_size {
            update max_array_size to this new value (size of current array)
        }
        push a reference to this array onto the "master" array.
    }
    for each index ( 0 .. max_array_size -1 ) {
        join the values at master[0..$#master][index] into a comma-delimited string
        # note that some of the values in that set may be undef
        print the resulting string
    }
    HTH.
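One possible reading of the pseudocode above in Perl, offered as a sketch rather than as graff's own code. The sub names `read_lines` and `columns_to_rows` are made up for illustration, and the `//` operator assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Read one file and return a reference to an array of its chomped lines.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    chomp( my @lines = <$fh> );
    close $fh;
    return \@lines;
}

# Turn a list of per-file arrays (columns) into comma-joined output rows.
sub columns_to_rows {
    my @master = @_;    # one array reference per input file

    # Track the longest column so the shorter ones can be padded.
    my $max_array_size = 0;
    for my $column (@master) {
        $max_array_size = @$column if @$column > $max_array_size;
    }

    my @rows;
    for my $index ( 0 .. $max_array_size - 1 ) {
        # A file that has run out of lines yields undef; emit it as empty.
        push @rows, join ',', map { $_->[$index] // '' } @master;
    }
    return @rows;
}

# Usage: perl columns.pl file1.txt file2.txt ...
if (@ARGV) {
    print "$_\n" for columns_to_rows( map { read_lines($_) } @ARGV );
}
```

Only one file is open at a time, so the number of input files is limited by memory rather than by open filehandles.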
Re: arrange output in columns
by rjt (Curate) on Jun 17, 2013 at 18:46 UTC

    I could give you hints and try to string you along, but for a self-professed newbie to Perl, sometimes a good example is more instructive than coming up with a sub-optimal solution of your own:

    use warnings; use strict;

    my @fh; # Store filehandles
    open $fh[@fh], '<', $_ or die "Can't open `$_': $!" for <1039418/*.txt>;

    while (1) {
        no warnings qw/uninitialized io/;
        my $line;
        $line .= <$_> // "\n" for @fh;
        last if $line =~ /^\n*$/; # All files finished or blank
        $line =~ s/\n(?!\z)/,/g;  # Delimit with commas, except for last \n
        print $line;
    }

    The challenge to you, james4545, is to take this and understand it. Break it apart. Figure out why each statement does what it does. Put it back together in a different way. Ask specific questions. Thoroughly read the documentation for each of the functions, plus perlop and perlre. Watch an episode of Futurama and then code it (this example, not Futurama) yourself from scratch without looking. Meditate on the beauty of =~ . Have a @π[1..3]. And then so shall you follow the path of a Perlmonk.

    Or, by all means, copy/paste this code and move on. As much as we like to see others walk the monastic path, we monks tend to be a self-satisfied bunch.

Re: arrange output in columns
by hdb (Monsignor) on Jun 18, 2013 at 13:19 UTC
    use strict;
    use warnings;

    my @result;
    my $count = 0;
    for (<"j*.txt">) {
        open my $fh, "<", $_ or die "Cannot open $_ $!\n";
        while(<$fh>){
            chomp;
            $result[$.-1][$count] = $_;
        }
        $count++;
    }
    for my $line (@result) {
        print join( ",", map { $_//"" } @$line ), "\n";
    }
Re: arrange output in columns
by lee_crites (Scribe) on Jun 17, 2013 at 18:28 UTC

    homework???

    This just feels like a homework assignment, so I'll answer in plain English, and let you translate it into Perl. After all, it does you no good to have someone with three dozen years of experience do your homework, right?

    Read the files, one by one. Nothing is mentioned about how you get the list of files, so I assume you know what is needed to get the next file name, open the file, and read the information in.

    I would keep up with the number of the file being processed (0 for first, 1 for second, etc) and the number of the line read in (0 for first, etc). Then build a two-dimensional array with the data ($goo{linenum}{filenum}). This means you have all of your data in one pass through the files.

    Then report the information, going first by "linenum" and "joining" them together. There is a good chance you will have missing items in the two-dimensional array, so that is something you will have to keep in mind.
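A minimal sketch of the structure described above, keyed the way the `$goo{linenum}{filenum}` notation suggests (a hash of hashes rather than a true array). The helper names `store_field` and `report_rows` are hypothetical, and `//` assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Record one field at (linenum, filenum) in the two-level hash.
sub store_field {
    my ( $goo, $linenum, $filenum, $value ) = @_;
    $goo->{$linenum}{$filenum} = $value;
}

# Build the report: one comma-joined string per line number, in order.
sub report_rows {
    my ( $goo, $file_count ) = @_;
    my @rows;
    for my $linenum ( sort { $a <=> $b } keys %$goo ) {
        # Missing (linenum, filenum) entries come back undef; print them empty.
        push @rows, join ',',
            map { $goo->{$linenum}{$_} // '' } 0 .. $file_count - 1;
    }
    return @rows;
}

# Single pass over the files named on the command line.
my %goo;
my $filenum = 0;
for my $path (@ARGV) {
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $linenum = 0;
    while ( my $line = <$fh> ) {
        chomp $line;
        store_field( \%goo, $linenum++, $filenum, $line );
    }
    close $fh;
    $filenum++;
}
print "$_\n" for report_rows( \%goo, $filenum );
```

The numeric sort matters here: hash keys come back in no particular order, and a plain string sort would put line 10 before line 2.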

    I am betting this assignment requires extensive use of for loops, so that is something you should go back into the text and find.

    Once you have something working, and want some help with debugging or whatever, that is something you might want to post for comment.

    Lee Crites
    lee@critesclan.com
      > <h1>homework???</h1>

      morbus headlineritis???

      Cheers Rolf

      ( addicted to the Perl Programming Language)

Re: arrange output in columns
by Anonymous Monk on Jun 17, 2013 at 18:16 UTC
    This is literally a two-dimensional array ... then iterating through the first dimension of the array and using Text::CSV on each arrayref thus found.
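A hedged sketch of that idea: build the two-dimensional array, pad each row, and hand each arrayref to Text::CSV. Text::CSV is a CPAN module and must be installed separately; it is loaded on demand here. The sub names `pad_rows` and `write_csv` are invented for this example, and `//` assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Pad every row to $width fields, replacing missing values with ''.
sub pad_rows {
    my ( $rows, $width ) = @_;
    return map {
        my $row = $_ // [];
        [ map { $row->[$_] // '' } 0 .. $width - 1 ]
    } @$rows;
}

# Write each row (an array reference) as one CSV record.
sub write_csv {
    my ( $fh, @rows ) = @_;
    require Text::CSV;    # CPAN module, not part of core Perl
    my $csv = Text::CSV->new( { binary => 1, eol => "\n" } )
        or die 'Cannot use Text::CSV: ' . Text::CSV->error_diag();
    for my $row (@rows) {
        $csv->print( $fh, $row ) or die 'CSV write failed';
    }
}

# Usage: perl to_csv.pl file1.txt file2.txt ...
if (@ARGV) {
    my @table;         # $table[row][column]
    my $column = 0;
    for my $path (@ARGV) {
        open my $fh, '<', $path or die "Cannot open $path: $!";
        my $row = 0;
        while ( my $line = <$fh> ) {
            chomp $line;
            $table[ $row++ ][$column] = $line;
        }
        close $fh;
        $column++;
    }
    write_csv( \*STDOUT, pad_rows( \@table, $column ) );
}
```

Compared with a plain `join`, the module quotes any field that itself contains a comma or a double quote, which is the kind of gotcha mentioned earlier in the thread.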
Re: arrange output in columns
by space_monk (Chaplain) on Jun 18, 2013 at 12:22 UTC

    Update: Working version :-)

    It's a slow day....
    use strict;
    use warnings;
    use English; # let's talk like pirates....

    my @arr;
    my $fc = 0;
    for my $file (@ARGV) {
        open my $fh, "<$file" or die "We're screwed with $file";
        while (<$fh>) {
            chomp; # maybe s/\s+$// ?
            $arr[$INPUT_LINE_NUMBER][$fc] = $_;
        }
        close $fh;
        $fc++; # inc filecount
    }
    # print
    for (my $i =1; $i <= $#arr; $i++) {
        # fill in the blanks
        print join(',', map { $arr[$i][$_] // '' } 0..$fc-1)."\n";
    }
    If you spot any bugs in my solutions, it's because I've deliberately left them in as an exercise for the reader! :-)
      "I might get the bugs out of this later.... "

      Here's the 3 missing characters you'll need to fix the script  = { }

      but even when fixed it doesn't give the required result for files with different record counts.

      Output  Required
      a,2     a,2
      b,f     b,f
      c,4     c,4
      s       ,s
      w       ,w
      poj
        Thanks for the input. I knocked this up in part of my lunchbreak while munching a burrito, and it was totally untested. :-)

        Update: I spotted the problems you mentioned before you posted and they've now been fixed.

        Anyway, as my footer says, it's polite to ensure that the reader still has to think through some of the issues, especially as this may be a homework assignment! :-)
