Hello gowthamvels,
When manipulating tabular data, I often like to use the Data::Table module. I noticed a potential problem with your input data: it had two columns named col4. So I edited the header so that the columns you want to sum are named col7 and col8. If your data is in data.csv as follows,
col1,col2,col3,col4,col5,col6,col7,col8
1234,GP,20170715,0,V,97517,24,0.6
5678,Pack,20170715,0,V,97516,88,1.8
1234,GP,20170715,0,V,97517,22,0.6
5678,Pack,20170715,0,V,97517,183,3.9
1234,PRS,20170715,0,S,97517,261,5.4
5678,PRS,20170715,0,M,97517,36,0.9
then the following code produces the output you want.
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Table;
# Load input data from csv file
my $dt = Data::Table::fromCSV('data.csv');
# Make a new table that only contains the relevant columns
my $st = $dt->subTable(undef, [ 'col2', 'col7', 'col8' ]);
# Group by 'col2', calculate sums for 'col7' and 'col8'
my $ot = $st->group(
    ['col2'],                        # Column to group by
    ['col7', 'col8'],                # Columns to perform calculation on
    [ \&sum, \&sum ],                # Apply sum function to 'col7' and 'col8'
    ['sum_of_col7', 'sum_of_col8']   # Put the sums in these columns
);
print $ot->csv, "\n";
sub sum {
    my @data = @_;
    my $sum = 0;
    foreach my $x (@data) {
        next unless $x;
        $sum += $x;
    }
    return $sum;
}
exit;
The output is
col2,sum_of_col7,sum_of_col8
GP,46,1.2
Pack,271,5.7
PRS,297,6.3
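If installing Data::Table is not an option, the same group-and-sum step can be roughly sketched in core Perl with a hash keyed on col2. This is only an illustration, not how Data::Table works internally: sum_by_group is a hypothetical helper name, the column positions are zero-based indexes (col2 is 1, col7 is 6, col8 is 7), and the plain split on commas assumes no field contains a quoted comma.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Table): sums selected columns per group.
# $rows is an array ref of array refs; $key and @$cols are zero-based indexes.
sub sum_by_group {
    my ($rows, $key, $cols) = @_;
    my (%sums, @order);
    foreach my $row (@$rows) {
        my $k = $row->[$key];
        push @order, $k unless exists $sums{$k};    # remember first-seen order
        $sums{$k}[$_] += $row->[ $cols->[$_] ] for 0 .. $#$cols;
    }
    # One array ref per group: the key followed by its sums
    return map { [ $_, @{ $sums{$_} } ] } @order;
}

# Sample rows copied from data.csv above (header line omitted).
# Assumption: fields never contain quoted commas, so a plain split is enough.
my @rows = map { [ split /,/ ] } (
    '1234,GP,20170715,0,V,97517,24,0.6',
    '5678,Pack,20170715,0,V,97516,88,1.8',
    '1234,GP,20170715,0,V,97517,22,0.6',
    '5678,Pack,20170715,0,V,97517,183,3.9',
    '1234,PRS,20170715,0,S,97517,261,5.4',
    '5678,PRS,20170715,0,M,97517,36,0.9',
);

print "col2,sum_of_col7,sum_of_col8\n";
print join(',', @$_), "\n" for sum_by_group(\@rows, 1, [ 6, 7 ]);
```

For real CSV files with quoting, a proper parser such as Text::CSV would be the safer choice over split.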