PerlMonks  

Re: average a column in tab-delimited file

by Anonymous Monk
on Jan 28, 2012 at 00:41 UTC (#950474)


in reply to average a column in tab-delimited file

Should the output look like this?

Name10128  1A  33.2  A  Test  0.715650000  -0.011379692
Name4382   1A  34.3  A  Test  0.904300000  -0.035366577
Name1635   1A  34.9  A  Test  0.547452083  -0.004684633
Name10267  1A  34.9  A  Test  0.547452083  -0.004684633
Name10039  1A  34.9  A  Test  0.547452083  -0.004684633
Name22270  1A  44.7  A  Test  0.030666667  0.157264030
Name22285  1A  44.7  A  Test  0.030666667  0.157264030
Name22701  1A  44.7  A  Test  0.030666667  0.157264030
Name10054  1A  46.4  A  Test  0.000000000  0.641618497


Re^2: average a column in tab-delimited file
by garyboyd (Acolyte) on Jan 30, 2012 at 10:36 UTC

    Thanks for all the suggestions, and yes, the output should look like what Anonymous Monk posted above.

    I can read the file into an array, but I wasn't sure how to proceed through the array to average columns 6 and 7.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Getopt::Long;

    my $infile;
    my @fields;
    my $i;
    my $j;

    GetOptions (
        "infile=s" => \$infile,
    );

    open INFILE, "<$infile" or die $!;
    #open RES, ">result.txt" or die $!;

    while (<INFILE>){
        @fields = split(/\t+/, $_);
        my ($name, $chr, $location, $gen, $dom, $pval, $fst) = @fields[0..6];
    }
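The loop above splits each line but does not yet do anything with the fields. One common way to go from there is to keep a running sum and a count per group in two hashes, then divide at the end. The following is only a minimal sketch under assumptions: rows are grouped on column 3 (index 2), the helper name `column_means_by_key` is made up for illustration, and the sample rows are invented rather than taken from the real file.

```perl
use strict;
use warnings;

# Hypothetical helper: mean of column $val_col for each distinct value
# found in column $key_col. $rows is a reference to an array of array
# references, one per split line. Column indexes are 0-based.
sub column_means_by_key {
    my ($rows, $key_col, $val_col) = @_;
    my (%sum, %count, %mean);

    for my $row (@$rows) {
        my $key = $row->[$key_col];
        $sum{$key}   += $row->[$val_col];
        $count{$key} += 1;
    }

    # Every key in %sum also has a count of at least 1.
    for my $key (keys %sum) {
        $mean{$key} = $sum{$key} / $count{$key};
    }
    return \%mean;
}

# Invented sample rows, only to show the call.
my @rows = (
    [ 'NameA', '1A', '34.9', 'A', 'Test', 0.2, 0.1 ],
    [ 'NameB', '1A', '34.9', 'A', 'Test', 0.4, 0.3 ],
    [ 'NameC', '1A', '46.4', 'A', 'Test', 0.0, 0.5 ],
);

my $pval_mean = column_means_by_key(\@rows, 2, 5);   # column 6
my $fst_mean  = column_means_by_key(\@rows, 2, 6);   # column 7

printf "%s\t%.9f\t%.9f\n", $_, $pval_mean->{$_}, $fst_mean->{$_}
    for sort keys %$pval_mean;
```

Because the hashes are keyed on the group value, the rows do not have to be sorted or adjacent for this to work.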
Re^2: average a column in tab-delimited file
by garyboyd (Acolyte) on Jan 31, 2012 at 12:10 UTC

    Once I have my values in an array, how do I find the average for all rows that share the same value in column 3? E.g., how do I find all the rows that have 34.9 and average their values in column 6?

      Please explain in words how you would do it using paper and pencil.

        Well, I've written something that almost does the job, except it goes wrong on the very last line of the file. I read the file into an array and calculate the averages, which I put into hashes; then I iterate through the array again and look up the values from the hashes. I'm sure there's a more elegant way to do it, though.

        #!/usr/bin/perl
        # Usage: perl average_table.pl -i <input file>

        use strict;
        use warnings;
        use Getopt::Long;

        my $infile;
        my @fields;
        my @array;
        my @values;
        my %hash;
        my $add_fst;
        my @out;
        my @count;
        my $i=1;
        my @final;
        my %pval;
        my %fst;
        my $prev_loc;
        my $prev_pval;
        my $prev_fst;
        my $prev_join;

        GetOptions (
            "infile=s" => \$infile,
        );

        open INFILE, "<$infile" or die $!;

        @fields = <INFILE>;

        foreach (@fields){
            @array = split(/\t+/, $_);
            chomp (@array);
            my ($name, $chr, $location, $gen, $dom, $pval, $fst) = @array[0..6];
            my $join = "$chr-$location";
            push (@values, $join, $pval);

            if ($location == $prev_loc){
                $pval = $pval + $prev_pval;
                $fst = $fst + $prev_fst;
                $i++;
            }
            elsif ($location != $prev_loc) {
                my $pval_average = ($prev_pval / $i);
                my $fst_average = ($prev_fst / $i);
                my $join2 = "$prev_join\t$pval_average\t$fst_average";
                $pval{$prev_join} = $pval_average;
                $fst{$prev_join} = $fst_average;
                push (@final, $join2);
                $i = 1;
            }
            $prev_loc = $location;
            $prev_pval = $pval;
            $prev_fst = $fst;
            $prev_join = $join;
        }

        foreach (@fields){
            @out = split(/\t+/, $_);
            chomp (@out);
            my ($name, $chr, $location, $gen, $dom, $pval, $fst) = @out[0..6];
            my $join = "$chr-$location";
            my $new_pval = $pval{$join};
            my $new_fst = $fst{$join};
            print $name."\t".$chr."\t".$location."\t".$gen."\t".$dom."\t".$new_pval."\t".$new_fst."\n";
        }
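A likely cause of the last-line problem, as far as can be told from the posted code: averages are stored in `%pval` and `%fst` only inside the `elsif` branch, that is, when a new location starts. Nothing follows the final group, so it never gets an entry and the lookup in the second loop comes back undefined. The sketch below takes the same "consecutive run" approach but buffers each group and flushes it once more after the loop. It assumes rows with the same chr and location are adjacent, as in the sample output; `average_runs` is a hypothetical name, and the nine-decimal formatting is only a guess based on that sample.

```perl
use strict;
use warnings;

# Hypothetical helper: takes tab-delimited lines and returns them with
# columns 6 and 7 replaced by the averages over each consecutive run of
# rows that share the same chr (column 2) and location (column 3).
sub average_runs {
    my @lines = @_;
    my (@out, @group, $prev_key);

    # Emit every buffered row with the group's averages, then empty the buffer.
    my $flush = sub {
        return unless @group;
        my ($pval_sum, $fst_sum) = (0, 0);
        for my $row (@group) {
            $pval_sum += $row->[5];
            $fst_sum  += $row->[6];
        }
        my $n = @group;
        for my $row (@group) {
            push @out, join "\t", @{$row}[0 .. 4],
                sprintf('%.9f', $pval_sum / $n),
                sprintf('%.9f', $fst_sum / $n);
        }
        @group = ();
    };

    for my $line (@lines) {
        chomp(my $copy = $line);
        my @f   = split /\t+/, $copy;
        my $key = "$f[1]-$f[2]";

        # A change of key closes the previous group.
        $flush->() if defined $prev_key && $key ne $prev_key;

        push @group, \@f;
        $prev_key = $key;
    }

    # The final group has no following row to trigger it, so flush it here.
    $flush->();

    return @out;
}
```

With a filehandle `$in` opened on the input file, something like `print "$_\n" for average_runs(<$in>);` would write the averaged table. The keys are compared with `ne` rather than `!=` because "chr-location" is a string, not a number.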
