Re: How can you make this script general?
by davido (Cardinal) on Mar 08, 2014 at 19:03 UTC
|
You can't make it more general until you understand how to use multi-dimensional data structures in Perl, which means learning how to use references. Start with perlreftut, and then if you need more depth, continue to perlref, perllol, and perldsc. These documents will be enlightening.
Eventually you will work toward an implementation where, rather than giving each column its own named array, you have an array of rows, and each row element holds a reference to an anonymous array of that row's column values. Or you might invert it, so that the top-level array represents columns and each column element holds a reference to an anonymous array of row values.
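A minimal sketch of the array-of-rows layout described above, assuming whitespace-separated numeric input. The names `@rows` and `column_sums` are invented for illustration and are not from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a reference to an array of rows, where each
# row is itself a reference to an anonymous array of column values.
sub column_sums {
    my ($rows) = @_;
    my @sums;
    for my $row (@$rows) {
        # $#$row is the last index of the array that $row refers to
        $sums[$_] += $row->[$_] for 0 .. $#$row;
    }
    return @sums;
}

# Build the array of rows: one anonymous array per input line.
my @rows = map { [ split ' ', $_ ] } ("1 2 3", "4 5 6", "7 8 9");

print join(' ', column_sums(\@rows)), "\n";   # prints "12 15 18"
```

Keeping the rows in memory like this is only needed if you want to do more than sum them; for plain totals a single running array is enough.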
| [reply] |
Re: How can you make this script general?
by BrowserUk (Patriarch) on Mar 08, 2014 at 19:17 UTC
|
C:\test>perl -F\t -anle"$sums[$_]+=$F[$_] for 0 .. $#F; }{ printf qq[Column:%u total:%u\n], $_, $sums[$_] for 0 .. $#sums"
1 2 3
4 5 6
7 8 9
^Z
Column:0 total:12
Column:1 total:15
Column:2 total:18
| [reply] [d/l] |
|
# The idea is to read the first line of the file and see how many columns we have, then we proceed accordingly, column-by-column
open (INFILE10, '<', 'ex1.dat') or die "File ex1.dat does not exist!\n";
my $firstLine = <INFILE10>;
close INFILE10;
my @array_firstLine=split(/\t/, $firstLine);
my $total_columns=scalar(@array_firstLine);
print "This file has $total_columns columns in total.\n";
for(my $k=1; $k<=$total_columns; $k++)
{
    print "Calculate sum for column $k\n";
    my $wanted_column_number=$k; # this is the column that we want to sum up each time, until we finish the columns
    my $sum_of_column=0;
    open (INFILE10, '<', 'ex1.dat') or die "File ex1.dat does not exist!\n";
    while( my $line10 = <INFILE10>)
    {
        my @split_line10 = split(/\t/, $line10);
        my $respective_element = $split_line10[$k-1];
        $sum_of_column = $sum_of_column + $respective_element;
    }
    close INFILE10;
    print "The sum for column $k is: $sum_of_column.\n";
}
| [reply] [d/l] |
|
Mate, you're wrong: with Perl you don't have to do any of what you're doing.
while(@a = split /\t/, <DATA>){
    $b[$_] += $a[$_] for 0..$#a;
}
print "@b\n";
__DATA__
1 2 3 4
5 6 7 8
9 10 11 12
| [reply] [d/l] |
Re: How can you make this script general?
by tangent (Parson) on Mar 08, 2014 at 19:12 UTC
|
If all you need to do is calculate the sums then this might help:
my %count;
while (my $line = <DATA>) {
    chomp $line;
    my @cols = split("\t",$line);
    $count{$_} += $cols[$_] for 0 .. $#cols;
}
for my $col_num (sort { $a <=> $b } keys %count) {
    print "Total for column $col_num: $count{$col_num}\n";
}
__DATA__
1 2 3 4
3 4 5 6
6 7 8 9
Output:
Total for column 0: 10
Total for column 1: 13
Total for column 2: 16
Total for column 3: 19
| [reply] [d/l] [select] |
|
| [reply] |
|
"Why use a hash and then have to sort?"
Just a habit really. An array would be more efficient.
| [reply] |
|
Perhaps tangent wants to be prepared for the general case, where the columns have names.
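If the columns did have names, a hash keyed by name would be a natural fit. A rough sketch, assuming the first input line is a header row and that no data row has more fields than the header. The column names and the `sum_named_columns` helper are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: the first line holds column names,
# the remaining lines hold whitespace-separated numbers.
sub sum_named_columns {
    my @lines = @_;
    my @names = split ' ', shift @lines;
    my %total;
    for my $line (@lines) {
        my @cols = split ' ', $line;
        # Assumes every row has at most as many fields as there are names.
        $total{ $names[$_] } += $cols[$_] for 0 .. $#cols;
    }
    return \%total;
}

my $total = sum_named_columns("apples pears plums", "1 2 3", "4 5 6");
print "$_: $total->{$_}\n" for sort keys %$total;
```

Here the sort is by name rather than by number, which is the case where a hash earns its keep over a plain array.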
| [reply] |
|
Re: How can you make this script general?
by AnomalousMonk (Archbishop) on Mar 08, 2014 at 21:11 UTC
|
use 5.010; # for // operator
use warnings;
use strict;
use Data::Dump;
use Test::More
    # tests => ?? + 1  # Test::NoWarnings adds 1 test
    'no_plan'
    ;
use Test::NoWarnings;
my @totals;
while (my $line = <DATA>) {
    chomp $line;
    my @fields = split ' ', $line;
    for my $i (0 .. $#fields) {
        # $totals[$i] = $fields[$i] + (defined($totals[$i]) ? $totals[$i] : 0);
        $totals[$i] = $fields[$i] + ($totals[$i] // 0);
    }
    # dd \@fields; dd\@totals; # FOR DEBUG
}
my $max_input_cols = @totals;
ok $max_input_cols == 5, qq{max number input columns};
is_deeply \@totals, [ 21, 176, 909, 6006, 20002 ], qq{column totals};
printf qq{max cols in input data: %d \n}, $max_input_cols;
print qq{column totals: \n};
printf qq{%6d}, $_ for 0 .. $#totals;
print qq{\n};
for my $col (@totals) {
    printf qq{%6d}, $col;
}
print qq{\n};
__DATA__
1 11
2 22 202 2002 20002
3 33 303
4 44 404 4004
5
6 66
Output:
c:\@Work\Perl\monks\Anonymous Monk\1077543>perl ragged_field_summation_1.pl
ok 1 - max number input columns
ok 2 - column totals
max cols in input data: 5
column totals:
     0     1     2     3     4
    21   176   909  6006 20002
ok 3 - no warnings
1..3
| [reply] [d/l] [select] |
Re: How can you make this script general?
by Laurent_R (Canon) on Mar 08, 2014 at 23:54 UTC
|
No real need for Perl references in my view, nor for any complex data structure. A simple array should do the work (if I understood the requirement well).
use strict;
use warnings;
my @sums;
$sums[$_] = 0 for 0..20;
while (<DATA>) {
    my @fields = split /\s+/, $_;
    for (0..20) {
        $sums[$_] += $fields[$_] if defined $fields[$_];
    }
}
print "@sums", "\n";
__DATA__
1 2 3 4 6
3 4 5 6
6 7 8 9
My only assumption is that the number of columns is equal to or less than 21. This is the resulting output:
$ perl column_sum.pl
10 13 16 19 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
| [reply] [d/l] [select] |
Re: How can you make this script general?
by llancet (Friar) on Mar 10, 2014 at 09:34 UTC
|
Note that Perl arrays grow automatically: if you assign to an index past the end of the array, the array is extended to fit.
So all you need is one array that records the column sums. Each time you read a line, add its fields to the corresponding elements of the sum array.
my @sum;
while (<FH>)
{
    chomp;
    my @F = split /\t/;
    $sum[$_] += $F[$_] for 0..@F-1;
}
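A small sketch of the array growth this relies on. The variable names are illustrative only. Strictly speaking, "autovivification" usually refers to references springing into existence; a plain array simply grows on assignment.

```perl
use strict;
use warnings;

my @sum;                     # starts empty
$sum[3] += 5;                # assigning past the end grows the array
print scalar(@sum), "\n";    # prints 4: indices 0..3 now exist
print defined($sum[1]) ? "defined" : "undef", "\n";   # prints "undef"

my $peek = $sum[10];         # merely reading out of range does not extend it
print scalar(@sum), "\n";    # still prints 4
```

The `+=` on a previously undefined element does not trigger an "uninitialized value" warning, which is why the summing loops in this thread run cleanly under `use warnings` without pre-filling the array with zeros.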
| [reply] [d/l] |