http://www.perlmonks.org?node_id=1210290

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

This is surely an easy task, but I need some help. I have a tab-separated file with three columns, and I need to organize it into a data structure like this: for each distinct $tag3, the structure should hold its frequency (count) and all the unique $tag1 values that appear on the same row as that $tag3. ($tag2 is only used as a filter.) I have written the following script, which builds a multidimensional hash. The counting works correctly, but what would be the best way to save all unique values of $tag1? In my script only the last $tag1 seen is kept.

Furthermore, I get an ugly "Use of uninitialized value in addition (+) at .\myscript.pl line 14" warning the first time a $tag3 is inserted into the hash. How can I elegantly prevent it?

Example of the desired result: for $tag3='conference' the counter should be 3 (the script gets this right), but I also need to record the 2 unique $tag1 values, "conference" and "conferences" (this is the part I don't know how to do well).

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);

my $line = <DATA>;
my %hash;

while($line){
    my ($tag1, $tag2, $tag3) = split(/\t/, $line);

    if ($tag2 =~/NN/) {
        $hash{$tag3}{frequency} = (($hash{$tag3}{frequency})+1);
        $hash{$tag3}{variants} = $tag1;
    }
    $line = <DATA>;
}

print Dumper \%hash;

__DATA__
The	DT	the
International	NN	International
for	IN	for
well	NN	well
preparation	NN	preparation
preparation	NN	preparation
in	IN	in
conference	NN	conference
conference	NN	conference
conferences	NN	conference
good	VVG	good
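For reference, here is a minimal sketch of one way the structure described above could be built. It is only an illustration under some assumptions, not a definitive answer: the helper name build_index and the @sample list are hypothetical (they are not in the script above), the input is assumed to be tab-separated, and each line is chomped so that $tag3 does not carry a trailing newline. The idea is to store the variants as keys of a nested hash, so duplicates collapse on their own, and to count with ++, which does not trigger the "uninitialized value" warning on a fresh key.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): takes a list of
# tab-separated "tag1 <TAB> tag2 <TAB> tag3" lines and returns a hash ref.
sub build_index {
    my @lines = @_;
    my %hash;

    for my $line (@lines) {
        chomp $line;    # drop the newline so it does not end up in $tag3
        my ($tag1, $tag2, $tag3) = split /\t/, $line;

        # Skip malformed rows and rows whose control tag is not NN
        next unless defined($tag3) && $tag2 =~ /NN/;

        # ++ on a not-yet-existing value yields 1 without a warning
        $hash{$tag3}{frequency}++;

        # Hash keys act as a set, so each $tag1 is stored only once;
        # the value counts how often that variant was seen
        $hash{$tag3}{variants}{$tag1}++;
    }
    return \%hash;
}

# Sample rows, assumed to match the __DATA__ section above
my @sample = (
    "The\tDT\tthe",
    "International\tNN\tInternational",
    "for\tIN\tfor",
    "well\tNN\twell",
    "preparation\tNN\tpreparation",
    "preparation\tNN\tpreparation",
    "in\tIN\tin",
    "conference\tNN\tconference",
    "conference\tNN\tconference",
    "conferences\tNN\tconference",
    "good\tVVG\tgood",
);

my $index = build_index(@sample);

# Print each $tag3 with its count and its unique $tag1 variants
for my $tag3 (sort keys %$index) {
    my @variants = sort keys %{ $index->{$tag3}{variants} };
    printf "%s: %d [%s]\n", $tag3, $index->{$tag3}{frequency},
        join(', ', @variants);
}
```

With the sample rows, the entry for 'conference' has a frequency of 3 and the two variants "conference" and "conferences". If only the list of variants matters, keys %{ $hash{$tag3}{variants} } gives it directly; the per-variant counts are a side effect of using ++ and can be ignored.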