http://www.perlmonks.org?node_id=951846


in reply to Re^4: average a column in tab-delimited file
in thread average a column in tab-delimited file

Here is a solution I came up with. It follows the same general idea, but uses sum from List::Util (part of the Perl core since v5.7.3) to total the two columns before averaging them.
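As a small illustration of the List::Util part, averaging one column of tab-delimited lines might look roughly like this. The sample rows are made up for the sketch and are not the OP's data; it only assumes the value of interest sits at field index 5.

```perl
use strict;
use warnings;
use List::Util qw/ sum /;

# Hypothetical sample rows; the field at index 5 holds the value to average.
my @rows = (
    "a\tchr1\t100\tx\ty\t2\t10",
    "b\tchr1\t100\tx\ty\t4\t20",
);

# sum() adds up the extracted fields; @rows in numeric context is the row count.
my $avg = sum( map { (split /\t/)[5] } @rows ) / @rows;

printf "%.9f\n", $avg;    # prints 3.000000000
```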

To detect the end of a run, I compared the joined (chr, location) pair rather than location alone, so the same location on a different chr starts a new run.
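A quick sketch of why the joined key matters, again with invented rows and assuming chr and location are the fields at indices 1 and 2:

```perl
use strict;
use warnings;

# Hypothetical rows: same location (100) on two different chromosomes.
my @rows = (
    "id1\tchr1\t100\tfoo",
    "id2\tchr2\t100\tbar",
);

# Build a "chr-location" key from the fields at indices 1 and 2.
my @keys = map { join "-", (split /\t/)[1,2] } @rows;

# Location alone would look like one run; the joined key shows two.
print "$_\n" for @keys;    # chr1-100, then chr2-100
```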

#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw/ sum /;

# Rows are read from the DATA handle (the __DATA__ section is not shown here).
chomp(my $line = <DATA>);
my $prev_chr_loc = join "-", (split /\t/, $line)[1,2];
my @lines = $line;

while (my $line = <DATA>) {
    chomp $line;
    my $chr_loc = join "-", (split /\t/, $line)[1,2];

    if ($chr_loc eq $prev_chr_loc) {
        push @lines, $line;
    }
    else {
        # Key changed: the run has ended, so print it and start a new one.
        print "$_\n" for compute_avg( @lines );
        @lines = $line;
    }
    $prev_chr_loc = $chr_loc;
}
print "$_\n" for compute_avg( @lines );    # flush the final run

# Replace the last two columns of every line in a run with the run's averages.
sub compute_avg {
    my @lines = @_;
    my $avg_col5 = sprintf "%.9f", sum(map {(split /\t/)[5]} @lines) / @lines;
    my $avg_col6 = sprintf "%.9f", sum(map {(split /\t/)[6]} @lines) / @lines;

    for (@lines) {
        my ($c5,$c6) = (split /\t/)[5,6];
        # \Q...\E keeps characters such as "." or "+" in the values literal.
        s/\Q$c5\E\t\Q$c6\E$/$avg_col5\t$avg_col6/;
    }
    return @lines;
}

Update: Redid the substitution to follow the spec more closely.