summary statistics using Data::Table

by sisterdot (Initiate)
on Dec 11, 2015 at 11:03 UTC ( #1150012=perlquestion )

sisterdot has asked for the wisdom of the Perl Monks concerning the following question:

dear monks,

i wanted to give Data::Table a try for generating summary statistics on a text file from within perl.

given a table such as

#! /usr/bin/perl
use Data::Table;

my $t = new Data::Table(
    [
        ['a', 1000, 2000, 3000, 200, 500],
        ['b', 2000, 1000, 1000, 700, 800],
        ['c', 3000, 3000, 3000, 5, 7],
    ],
    ['Name', 'value1', 'value2', 'value3', 'value4', 'value5'],
    0);

what would be the easiest way to get a new column in the table object with the mean of 'value1', 'value2' and 'value3', and a separate new column with the mean of 'value4' and 'value5'? the group method is obviously not made for the task.

thanks for any ideas!

sisterdot

Replies are listed 'Best First'.
Re: summary statistics using Data::Table
by choroba (Archbishop) on Dec 11, 2015 at 11:37 UTC
    I've never used Data::Table. But I skimmed its documentation, installed it and was able to implement a solution using the colsMap method:
    #! /usr/bin/perl
    use warnings;
    use strict;

    use List::Util qw{ sum };
    use Data::Table;

    my $t = new Data::Table(
        [
            ['a', 1000, 2000, 3000, 200, 500],
            ['b', 2000, 1000, 1000, 700, 800],
            ['c', 3000, 3000, 3000, 5, 7],
        ],
        ['Name', 'value1', 'value2', 'value3', 'value4', 'value5'],
        0);    # 0: the data is given row by row

    # Append an empty column, then fill it row by row.
    # Inside colsMap, $_ is the array ref of the current row.
    $t->addCol(undef, 'v1-3');
    $t->colsMap(sub { $_->[-1] = sum(@$_[1 .. 3]) / 3 });

    $t->addCol(undef, 'v4,5');
    $t->colsMap(sub { $_->[-1] = sum(@$_[4, 5]) / 2 });

    print $t->csv;

    Or, more DRY:

    sub set_avg {
        my @cols = @_;
        sub { $_->[-1] = sum(@$_[@cols]) / @cols }
    }

    $t->addCol(undef, 'v1-3');
    $t->colsMap(set_avg(1 .. 3));

    $t->addCol(undef, 'v4,5');
    $t->colsMap(set_avg(4, 5));
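To see what the closure factory above is doing without Data::Table in the picture, here is a minimal sketch on plain array refs. The names make_avg, @rows, $avg_1_3 and $avg_4_5 are made up for illustration, and unlike set_avg the returned sub takes the row as an argument instead of reading $_, so treat it as a sketch of the idea rather than a drop-in replacement.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical stand-in for the table: one array ref per row, in the
# same column order as the question (Name, value1 .. value5).
my @rows = (
    ['a', 1000, 2000, 3000, 200, 500],
    ['b', 2000, 1000, 1000, 700, 800],
    ['c', 3000, 3000, 3000, 5,   7],
);

# Factory: remembers the column indices and returns a sub that
# averages those columns of the row it is given.
sub make_avg {
    my @cols = @_;
    return sub {
        my ($row) = @_;
        return sum(@{$row}[@cols]) / @cols;   # @cols in scalar context = count
    };
}

my $avg_1_3 = make_avg(1 .. 3);
my $avg_4_5 = make_avg(4, 5);

# Append both means to each row, mirroring the two addCol/colsMap steps.
for my $row (@rows) {
    push @$row, $avg_1_3->($row), $avg_4_5->($row);
}

print join(',', @$_), "\n" for @rows;
```

Each call to make_avg captures its own copy of @cols, which is why the two generated subs can average different column groups without any shared state.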

      Thanks a lot choroba! i like DRY :-)

      here are suggestions i got which use the column names:

      Version1:

      my @mean123 = map(
          ($t->elm($_, 'value1') + $t->elm($_, 'value2') + $t->elm($_, 'value3')) / 3.0,
          0 .. $t->nofRow() - 1);
      $t->addCol(\@mean123, 'Mean123');
      print $t->csv;

      Version2:

      $t->addCol(undef, 'Mean45');    # add a new column
      for (my $i = 0; $i < $t->nofRow; $i++) {
          my $r = $t->rowHashRef($i);    # declared with my so it also runs under strict
          $t->setElm($i, 'Mean45', ($r->{'value4'} + $r->{'value5'}) / 2.0);
      }
      print $t->csv;
      it all works super fine, thanks a lot for your help!
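The two name-based versions can be folded into one helper that takes any list of column names. Below is a rough sketch in plain Perl, assuming the header and rows are available as ordinary arrays; row_mean, %index_of, @header and @rows are hypothetical names for illustration and not part of Data::Table.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Assumed layout: header names plus one array ref per row.
my @header = ('Name', 'value1', 'value2', 'value3', 'value4', 'value5');
my @rows = (
    ['a', 1000, 2000, 3000, 200, 500],
    ['b', 2000, 1000, 1000, 700, 800],
    ['c', 3000, 3000, 3000, 5,   7],
);

# Map each column name to its position once, up front.
my %index_of;
@index_of{@header} = 0 .. $#header;

# Hypothetical helper: mean of the named columns for one row.
sub row_mean {
    my ($row, @names) = @_;
    my @missing = grep { !exists $index_of{$_} } @names;
    die "unknown column(s): @missing\n" if @missing;
    return sum(@{$row}[ @index_of{@names} ]) / @names;
}

my @mean123 = map { row_mean($_, qw(value1 value2 value3)) } @rows;
my @mean45  = map { row_mean($_, qw(value4 value5)) } @rows;

printf "%s,%.2f,%.2f\n", $rows[$_][0], $mean123[$_], $mean45[$_] for 0 .. $#rows;
```

Resolving names to indices in one place means a misspelled column name fails loudly instead of silently averaging undefined values.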
Re: summary statistics using Data::Table
by u65 (Chaplain) on Dec 11, 2015 at 11:27 UTC

    Welcome, sisterdot! We are happy to advise but would appreciate seeing what you have tried first. Be sure to use code blocks and neat formatting in your code. Also, the code is best studied if it is a complete, but short, program.
