PerlMonks  

Re: Consolidating biological data into occurance of each biological unit per sample in table

by Cristoforo (Deacon)
on Jan 13, 2013 at 03:49 UTC (#1013081)


in reply to Consolidating biological data into occurance of each biological unit per sample in table

This applies only to the data you provided: an animal name and its color, coded as they are.
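#!/usr/bin/perl
use strict;
use warnings;
use Text::Table;

# Input lines are read from the __DATA__ section (not reproduced in this post).
my (%animal, %color);
while (<DATA>) {
    chomp;
    # Fields are separated by tabs or semicolons; the first field is ignored.
    my (undef, $sample, $animal, $color) = split /[\t;]/;
    # Strip the one-character type prefix (a letter followed by two underscores).
    s/^\w__// for $animal, $color;
    $animal{$sample}{$animal}++;
    $color{$sample}{$color}++;
}

# Print one table per entity: a row per animal/color, a column per sample.
for my $entity (\%animal, \%color) {
    my @samples = sort keys %$entity;
    my $tb = Text::Table->new( map {title => $_}, " ", @samples);
    my %seen;
    # Collect each distinct key once across all samples.
    my @keys = grep $_ && !$seen{$_}++, map keys %$_, values %$entity;
    for my $key (@keys) {
        $tb->load( [$key, map $entity->{$_}{$key} || 0, @samples] );
    }
    print $tb;
    print "\n\n";
}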
Update: Upon reflection, several points occurred to me. For one thing, this code doesn't deal with any fields beyond 'animal' and 'color'. That could be fixed with some minor changes to the code. Also, using Text::Table for the presentation is probably the wrong choice, given that there may be hundreds of animals/colors, as well as more than a dozen or so 'samples' (A to Z??). I think Text::Table is better suited to smaller datasets that fit on 1 or 2 pages of output.
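On the first point, here is a rough sketch of how the counting could be generalized so that every prefixed field is tallied, not just animal and color. It assumes each line looks like id, tab, sample, tab, then semicolon-separated fields of the form x__value; that layout is inferred from the split pattern above, and the sub name count_fields and the sample lines are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count every prefixed field per sample, keyed by the field's prefix.
# Assumed line layout (inferred, not confirmed by the original data):
#   id <TAB> sample <TAB> a__bear;c__brown;...
sub count_fields {
    my (@lines) = @_;
    my %count;    # $count{prefix}{sample}{value} => occurrences
    for my $line (@lines) {
        chomp $line;
        my (undef, $sample, @fields) = split /[\t;]/, $line;
        for my $field (@fields) {
            # Skip anything that does not look like "x__value".
            next unless $field =~ /^(\w)__(.+)$/;
            $count{$1}{$sample}{$2}++;
        }
    }
    return \%count;
}

# Hypothetical input lines, made up for illustration only.
my $count = count_fields(
    "1\tA\ta__bear;c__brown;h__forest",
    "2\tA\ta__bear;c__black;h__forest",
    "3\tB\ta__wolf;c__grey;h__tundra",
);

# Print one tab-separated line per prefix, sample, and value.
for my $prefix (sort keys %$count) {
    for my $sample (sort keys %{ $count->{$prefix} }) {
        my $by_value = $count->{$prefix}{$sample};
        print join("\t", $prefix, $sample, $_, $by_value->{$_}), "\n"
            for sort keys %$by_value;
    }
}
```

Each $count->{prefix} hash has the same {sample}{value} shape as %animal and %color above, so it could be handed to the same table-printing loop.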

One possible solution would be to have the program write its output as a comma-separated values file that a spreadsheet program like Excel can read. Excel allows you to freeze the header row or the first column (animal or color), so you can scroll through the output and still keep the categories in sight.
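A minimal sketch of the CSV idea, using only core Perl. The sub names csv_field and counts_to_csv and the sample counts are hypothetical; for anything beyond simple names, a dedicated module such as Text::CSV would likely be the safer choice. Rows are sorted here so the output is stable, which differs from the hash-order rows in the program above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Quote a field only when it contains a comma, double quote, or newline.
sub csv_field {
    my ($field) = @_;
    return $field unless $field =~ /[",\r\n]/;
    (my $quoted = $field) =~ s/"/""/g;
    return qq{"$quoted"};
}

# Turn a {sample}{key} => count hash into CSV text:
# a header row of samples, then one row per key.
sub counts_to_csv {
    my ($entity) = @_;
    my @samples = sort keys %$entity;
    my %seen;
    my @keys = sort grep { !$seen{$_}++ } map { keys %$_ } values %$entity;
    my @lines = join ',', map { csv_field($_) } '', @samples;
    for my $key (@keys) {
        push @lines, join ',', map { csv_field($_) }
            $key, map { $entity->{$_}{$key} || 0 } @samples;
    }
    return join '', map { "$_\n" } @lines;
}

# Hypothetical counts, made up for illustration only.
my %animal = (
    A => { bear => 2, wolf => 1 },
    B => { wolf => 3 },
);
print counts_to_csv(\%animal);
```

Redirecting the output to a file (for example, perl script.pl > animals.csv) gives something a spreadsheet can open directly.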

The other possibility would be to load the processed data into a database like SQLite and then query the database for output.
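A rough sketch of the database route, assuming DBI and DBD::SQLite are installed. The table name counts, its columns, the sub name load_counts, and the sample data are all made up for illustration; an in-memory database is used only to keep the example self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;    # needs the DBD::SQLite driver installed

# Insert {sample}{name} => count data into a "counts" table.
# The table layout is an assumption made for this sketch.
sub load_counts {
    my ($dbh, $kind, $entity) = @_;
    $dbh->do(q{
        CREATE TABLE IF NOT EXISTS counts (
            kind   TEXT    NOT NULL,
            sample TEXT    NOT NULL,
            name   TEXT    NOT NULL,
            n      INTEGER NOT NULL
        )
    });
    my $sth = $dbh->prepare(
        'INSERT INTO counts (kind, sample, name, n) VALUES (?, ?, ?, ?)');
    for my $sample (sort keys %$entity) {
        for my $name (sort keys %{ $entity->{$sample} }) {
            $sth->execute($kind, $sample, $name, $entity->{$sample}{$name});
        }
    }
    return;
}

# In-memory database; use a file name instead of :memory: to keep the data.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });

# Hypothetical counts, made up for illustration only.
my %animal = ( A => { bear => 2, wolf => 1 }, B => { wolf => 3 } );
load_counts($dbh, 'animal', \%animal);

# Example query: total occurrences of each animal across all samples.
my $rows = $dbh->selectall_arrayref(q{
    SELECT name, SUM(n) FROM counts
    WHERE kind = ? GROUP BY name ORDER BY name
}, undef, 'animal');
print "$_->[0]\t$_->[1]\n" for @$rows;
```

Storing one row per (kind, sample, name) keeps the table narrow no matter how many samples there are, and per-sample or per-animal views become ordinary queries.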

The output from the above program is:

C:\Old_Data\perlp>perl t4.pl
      A B C
bear  6 2 6
wolf  4 7 5


      A B C
white 1 0 1
black 1 0 1
brown 3 1 4
red   1 1 0
grey  1 5 3