PerlMonks  

merging csv files into a third file preserving column & row

by zing (Beadle)
on Apr 01, 2013 at 10:10 UTC ( #1026455=perlquestion )
zing has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I have these 4 csv files:

File_1.csv
target1,48,12,7
target2,17,16,2
target3,22,6,1

File_2.csv
target5,14,12,8,,3,
target6,5,7,9,,,15

File_3.csv
target1,,,,,,13
target2,,,,,,8
target4,,,,11,5,6

File_4.csv
target1,,,,51,8
target2,,,,87,42
target4,22,3,7,,
I want to merge them all so that each value stays in the row and column it came from (preserving the column information as well). My present code runs fine, but it does not preserve the column information for any value. Here is my present script:
use strict;
use warnings;

print ("Now merging \n");
my $filenum = 0;
my ( %row_val, %data );
foreach my $file ( sort glob("*.csv") ) {
    $filenum++;
    open my $fh, "<", $file or die $!;
    while ( my $line = <$fh> ) {
        chomp $line;
        my ( $row_val, @values ) = split /,/, $line;
        $row_val{$row_val} = 1;
        $data{$filenum}{$row_val} = \@values;
    }
    close $fh;
}
foreach my $row_val ( sort keys %row_val ) {
    print $row_val, ",",
      join( ",", map { $data{$_}{$row_val} ? @{ $data{$_}{$row_val} } : ",," } 1 .. $filenum ),
      "\n";
}
The merged output (WRONG) generated by the above code:
target1,48,12,7,,,,,,,,,13,,,,51,8
target2,17,16,2,,,,,,,,,8,,,,87,42
target3,22,6,1,,,,,,,,,
target4,,,,,,,,,,11,5,6,22,3,7
target5,,,,14,12,8,,3,,,,,,
target6,,,,5,7,9,,,15,,,,,,

-----DESIRED/CORRECT OUTPUT-----

target1,48,12,7,51,8,13
target2,17,16,2,87,42,8
target3,22,6,1,,,
target4,22,3,7,11,5,6
target5,14,12,8,,3,
target6,5,7,9,,,15

Re: merging csv files into a third file preserving column & row
by Anonymous Monk on Apr 01, 2013 at 10:21 UTC
Re: merging csv files into a third file preserving column & row
by poj (Curate) on Apr 01, 2013 at 10:51 UTC
    You could change the data structure and merge the values early, at the input stage, like this. It assumes only one value in any one column. Then the print stage becomes more straightforward.
    foreach my $file ( sort glob("*.csv") ) {
        $filenum++;
        open my $fh, "<", $file or die $!;
        while ( my $line = <$fh> ) {
            chomp $line;
            my ( $row_val, @values ) = split /,/, $line;
            $row_val{$row_val} = 1;
            # delete
            # $data{$filenum}{$row_val} = \@values;
            # add these 3 lines
            for my $c (1..@values){
                $data{$row_val}[$c-1] .= $values[$c-1];
            }
        }
        close $fh;
    }
    foreach my $row_val ( sort keys %row_val ) {
        # join is parenthesised so that "\n" is not joined in after a trailing comma
        print join(",", $row_val, @{$data{$row_val}}), "\n";
    }
    poj
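
One detail affects the print stage of the code above: `split /,/` without a limit discards trailing empty fields, so `target5,14,12,8,,3,` yields five values rather than six, and merged rows can end up with different widths. The desired output in the question keeps the trailing empty cells (for example `target3,22,6,1,,,`). The following is a minimal sketch, assuming every row should be padded to the widest row seen. The sub names `merge_lines` and `format_rows` are made up for illustration, and the input is passed as in-memory lines rather than read from files.

```perl
use strict;
use warnings;

# Merge "key,v1,v2,..." lines into one array of cells per key.
# Assumes at most one non-empty value per cell across all inputs;
# if two inputs fill the same cell, the strings are concatenated.
sub merge_lines {
    my @lines = @_;
    my %merged;
    my $width = 0;
    for my $line (@lines) {
        # A limit of -1 makes split keep trailing empty fields.
        my ( $key, @values ) = split /,/, $line, -1;
        $width = @values if @values > $width;
        $merged{$key} //= [];
        for my $i ( 0 .. $#values ) {
            $merged{$key}[$i] .= $values[$i];
        }
    }
    return ( \%merged, $width );
}

# Render each row, padded with empty cells up to $width columns.
sub format_rows {
    my ( $merged, $width ) = @_;
    my @out;
    for my $key ( sort keys %$merged ) {
        my @cells = @{ $merged->{$key} };
        push @cells, ('') x ( $width - @cells );
        push @out, join ',', $key, @cells;
    }
    return @out;
}

my ( $merged, $width ) = merge_lines(
    'target1,48,12,7',
    'target5,14,12,8,,3,',
    'target1,,,,51,8',
    'target1,,,,,,13',
);
print "$_\n" for format_rows( $merged, $width );
```

With these four sample lines the sketch prints `target1,48,12,7,51,8,13` and `target5,14,12,8,,3,`, both six columns wide.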
Re: merging csv files into a third file preserving column & row
by kcott (Abbot) on Apr 01, 2013 at 11:47 UTC

    G'day zing,

    I think you've possibly overcomplicated your solution. You mostly seemed to be on the right track until you got to the innermost loop (of the initial loop); further on, things became very complicated with the second loop that's producing the output. Possibly adding to any confusion are the variables %row_val and $row_val (two instances) giving you code like $row_val{$row_val}.

    A further issue is that you haven't fully described how you want your merge to work. For instance, what happens if two or more files have different (non-zero length) data for the same cell? Your sample input and output are insufficient to glean your intent here.
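
To make that open question concrete, the sketch below merges rows cell by cell and records every case where two inputs supply different non-empty values for the same cell, rather than silently picking one. It is only an illustration: the sub name `merge_with_conflicts`, the first-value-wins policy, and the shape of the conflict records are all assumptions, not something specified in the question.

```perl
use strict;
use warnings;

# Merge "key,v1,v2,..." lines cell by cell.
# Policy (an assumption): the first non-empty value for a cell wins;
# a later, different non-empty value is recorded as a conflict.
sub merge_with_conflicts {
    my @lines = @_;
    my ( %merged, @conflicts );
    for my $line (@lines) {
        # Limit -1 keeps trailing empty fields.
        my ( $key, @values ) = split /,/, $line, -1;
        for my $i ( 0 .. $#values ) {
            my $new = $values[$i];
            my $old = $merged{$key}[$i];
            if ( !defined $old or !length $old ) {
                $merged{$key}[$i] = $new;
            }
            elsif ( length $new and $new ne $old ) {
                push @conflicts,
                  { key => $key, column => $i + 1, kept => $old, dropped => $new };
            }
        }
    }
    return ( \%merged, \@conflicts );
}

my ( $merged, $conflicts ) = merge_with_conflicts(
    'target1,48,12,7',
    'target1,50,,,51,8',
);
print join( ',', 'target1', @{ $merged->{target1} } ), "\n";
printf "conflict in %s column %d: kept %s, dropped %s\n",
  @{$_}{qw(key column kept dropped)}
  for @$conflicts;
```

Whether a conflict should be reported, treated as fatal, or resolved in favour of the later file depends on the real data, so the policy is kept in one small `if`/`elsif` that is easy to change.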

    In the code below: I've added logic to specify how the merge should occur (your real-world application may require different logic); replaced %row_val and %data with the single hash %merged; and greatly simplified the output code.

    $ perl -Mstrict -Mwarnings -E '
        no warnings q{qw};
        my @file_data = (
            [qw{target1,48,12,7 target2,17,16,2 target3,22,6,1}],
            [qw{target5,14,12,8,,3, target6,5,7,9,,,15}],
            [qw{target1,,,,,,13 target2,,,,,,8 target4,,,,11,5,6}],
            [qw{target1,,,,51,8 target2,,,,87,42 target4,22,3,7,,}],
        );
        use warnings q{qw};

        my %merged;

        for my $file (@file_data) {
            for my $line (@$file) {
                my ($row_val, @values) = split /,/ => $line;
                for (0 .. $#values) {
                    if (! defined $merged{$row_val}[$_]
                            or (
                                ! length $merged{$row_val}[$_]
                                and length $values[$_]
                            )
                    ) {
                        $merged{$row_val}[$_] = $values[$_];
                    }
                }
            }
        }

        say join q{,} => $_, @{$merged{$_}} for sort keys %merged;
    '
    target1,48,12,7,51,8,13
    target2,17,16,2,87,42,8
    target3,22,6,1
    target4,22,3,7,11,5,6
    target5,14,12,8,,3
    target6,5,7,9,,,15

    -- Ken
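
The one-liner above embeds the sample data so that it can be run as is. A rough adaptation that reads real files might look like the following. The sub name `merge_csv_files` is invented here, the `*.csv` glob is carried over from the original script, and the plain `split` assumes that fields never contain quoted commas. Rows are padded to the widest row so that trailing empty cells appear as in the desired output.

```perl
use strict;
use warnings;

# Merge the given CSV files into a list of output lines, one per key.
# Same policy as the one-liner: a cell is filled by the first
# non-empty value seen for it.
sub merge_csv_files {
    my @files = @_;
    my %merged;
    my $width = 0;
    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while ( my $line = <$fh> ) {
            chomp $line;
            next unless length $line;    # skip blank lines
            # Limit -1 keeps trailing empty fields.
            my ( $key, @values ) = split /,/, $line, -1;
            $width = @values if @values > $width;
            $merged{$key} //= [];
            for my $i ( 0 .. $#values ) {
                my $old = $merged{$key}[$i];
                $merged{$key}[$i] = $values[$i]
                  if !defined $old or !length $old;
            }
        }
        close $fh;
    }
    my @out;
    for my $key ( sort keys %merged ) {
        my @cells = @{ $merged{$key} };
        push @cells, ('') x ( $width - @cells );    # pad short rows
        push @out, join ',', $key, @cells;
    }
    return @out;
}

# Example call (prints nothing unless *.csv files exist in this directory):
print "$_\n" for merge_csv_files( sort glob '*.csv' );
```

If the merged result is written to a `.csv` file in the same directory, a later run would pick it up through the glob, so writing the output elsewhere is probably the safer choice.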

Re: merging csv files into a third file preserving column & row
by Loops (Hermit) on Apr 01, 2013 at 12:02 UTC
    There have already been lots of good answers, but since I had already thrown this together, here it is for whatever value it might have. Cheers.
    use strict;
    use warnings;
    use feature 'say';    # needed for say()

    sub process_file {
        my ($hash, $file) = @_;
        open my $fh, "<", $file or die $!;
        while (<$fh>) {
            chomp;
            my ($target, @columns) = split /,/;
            $hash->{$target}[0] = $target;
            for my $c (0 .. $#columns) {
                $hash->{$target}[$c+1] .= $columns[$c];
            }
        }
        close $fh;
    }

    $, = ',';
    print ("Now merging \n");
    my $data = {};
    process_file($data, $_) for glob("*.csv");
    say @{$data->{$_}} for sort keys %$data;
