http://www.perlmonks.org?node_id=1013081


in reply to Consolidating biological data into occurance of each biological unit per sample in table

This would apply only to the data you provided - an animal name and its color, coded as they are.
#!/usr/bin/perl
use strict;
use warnings;
use Text::Table;

my (%animal, %color);

# Count each animal and each color per sample.
while (<DATA>) {
    chomp;
    # Fields are separated by a tab or a semicolon; the first one is ignored.
    my (undef, $sample, $animal, $color) = split /[\t;]/;
    # Strip the leading one-character code and double underscore.
    s/^\w__// for $animal, $color;
    $animal{$sample}{$animal}++;
    $color{$sample}{$color}++;
}

# Print one table per entity: a row per animal/color, a column per sample.
for my $entity (\%animal, \%color) {
    my @samples = sort keys %$entity;
    my $tb = Text::Table->new( map {title => $_}, " ", @samples);
    my %seen;
    my @keys = grep $_ && !$seen{$_}++, map keys %$_, values %$entity;
    for my $key (@keys) {
        $tb->load( [$key, map $entity->{$_}{$key} || 0, @samples] );
    }
    print $tb;
    print "\n\n";
}
Update: Upon reflection, several points occurred to me. For one thing, this code doesn't deal with any additional fields beyond 'animal' and 'color'. That could be fixed with some minor changes to the code. Also, using Text::Table for the presentation is probably the wrong choice, given that there may be hundreds of animals/colors as well as more than a dozen or so 'samples' (A to Z??). I think Text::Table is better suited to smaller datasets that fit on 1 or 2 pages of output.
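On the first point, here is a rough sketch of how the counting loop might be generalized to any number of fields. It is only an illustration: the input lines are invented, the layout (an ignored first field, the sample, then any number of prefixed values) is my guess based on the split and substitution used above, and the %count structure is just one way to hold the result.

```perl
use strict;
use warnings;

# Invented input lines, only to illustrate the assumed layout:
# ignored field, sample, then any number of "x__value" fields.
my @lines = (
    "1\tA\ta__bear;c__brown;h__forest",
    "2\tA\ta__bear;c__black;h__tundra",
    "3\tB\ta__wolf;c__grey;h__forest",
);

my %count;    # $count{$position}{$sample}{$value}
for my $line (@lines) {
    my (undef, $sample, @fields) = split /[\t;]/, $line;
    for my $i (0 .. $#fields) {
        # Copy the field, then strip the one-character code and "__".
        (my $value = $fields[$i]) =~ s/^\w__//;
        $count{$i}{$sample}{$value}++;
    }
}

# Dump the counts, one line per field position, sample and value.
for my $i (sort { $a <=> $b } keys %count) {
    for my $sample (sort keys %{ $count{$i} }) {
        for my $value (sort keys %{ $count{$i}{$sample} }) {
            print "field $i, sample $sample, $value: $count{$i}{$sample}{$value}\n";
        }
    }
}
```

Each $count{$i} has the same sample-by-value shape as %animal and %color, so the reporting loop from the original program could iterate over values %count instead of the two hard-coded hashes.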

For larger output, I can think of two possible solutions. One would be to write the program output to a comma-separated values (CSV) file that a spreadsheet program like Excel can read. Excel allows you to freeze the header row or the first column (animal or color), so you can scroll through the output and still keep the categories in sight.
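A minimal sketch of the CSV idea, using only core Perl. The helper names csv_field and counts_to_csv are made up for this example, and the hash holds just the bear/wolf counts for samples A and B from the program output shown further down. For anything beyond simple fields, a CPAN module such as Text::CSV would likely be a safer choice.

```perl
use strict;
use warnings;

# Hypothetical helper: quote a field only when CSV requires it.
sub csv_field {
    my ($value) = @_;
    return $value unless $value =~ /[",\r\n]/;
    (my $quoted = $value) =~ s/"/""/g;
    return qq{"$quoted"};
}

# Hypothetical helper: turn a $counts->{$sample}{$name} hash into CSV text,
# one row per name and one column per sample.
sub counts_to_csv {
    my ($label, $counts) = @_;
    my @samples = sort keys %$counts;
    my %seen;
    my @names = sort grep { !$seen{$_}++ } map { keys %$_ } values %$counts;

    my @lines = ( join(',', map { csv_field($_) } $label, @samples) );
    for my $name (@names) {
        my @row = ($name, map { $counts->{$_}{$name} || 0 } @samples);
        push @lines, join(',', map { csv_field($_) } @row);
    }
    return join("\n", @lines) . "\n";
}

# Same shape as %animal in the program above, reduced to two samples.
my %animal = (
    A => { bear => 6, wolf => 4 },
    B => { bear => 2, wolf => 7 },
);
print counts_to_csv('animal', \%animal);
```

Redirecting that output to a file (for example animal.csv, a name chosen here only for illustration) gives something a spreadsheet can open directly.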

The other possibility would be to load the processed data into a database like SQLite and then query the database for output.
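A rough sketch of the database route, assuming DBI and the DBD::SQLite driver are installed. The table name, the column names and the in-memory database are choices made for this example only, and the hash again holds just a few counts from the program output.

```perl
use strict;
use warnings;
use DBI;    # needs the DBD::SQLite driver installed

# Same shape as %animal in the program above, reduced to two samples.
my %animal = (
    A => { bear => 6, wolf => 4 },
    B => { bear => 2, wolf => 7 },
);

# An in-memory database keeps the sketch self-contained; a file name
# after "dbname=" would make it persistent.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });

$dbh->do('CREATE TABLE counts (category TEXT, sample TEXT, name TEXT, n INTEGER)');

my $insert = $dbh->prepare(
    'INSERT INTO counts (category, sample, name, n) VALUES (?, ?, ?, ?)');
for my $sample (sort keys %animal) {
    for my $name (sort keys %{ $animal{$sample} }) {
        $insert->execute('animal', $sample, $name, $animal{$sample}{$name});
    }
}

# Query the stored counts back, one row per name and sample.
my $rows = $dbh->selectall_arrayref(
    'SELECT name, sample, n FROM counts WHERE category = ? ORDER BY name, sample',
    undef, 'animal');
print join("\t", @$_), "\n" for @$rows;

$dbh->disconnect;
```

With the counts stored one row per sample and name, questions such as "which samples contain more than five bears" become a WHERE clause instead of more Perl code.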

The output from the above program is:

C:\Old_Data\perlp>perl t4.pl
      A B C
bear  6 2 6
wolf  4 7 5


      A B C
white 1 0 1
black 1 0 1
brown 3 1 4
red   1 1 0
grey  1 5 3