PerlMonks  

Find number of unique values in hash

by rethaew (Sexton)
on Jan 19, 2010 at 21:51 UTC (#818301)
rethaew has asked for the wisdom of the Perl Monks concerning the following question:

Monks, please help. I need to find the number of unique values in a hash of hashes. For example, given the hash below, I need to know how many kids are 12 years old.

%hoh = (
    kevin  => { age => "12", favorite_color => "blue",   gender => "boy",  },
    john   => { age => "11", favorite_color => "green",  gender => "boy",  },
    lisa   => { age => "11", favorite_color => "pink",   gender => "girl", },
    sara   => { age => "13", favorite_color => "purple", gender => "girl", },
    shelly => { age => "12", favorite_color => "purple", gender => "girl", },
);

How many kids are 12? In this case 2. Is there a quick way to get this number without looping through each hash and counting?

Re: Find number of unique values in hash
by zwon (Monsignor) on Jan 19, 2010 at 22:03 UTC
    my $i = 0;
    for (values %hoh) {
        $_->{age} == 12 && $i++;
    }

    PS: ah, you want to do it "without looping through each hash and counting." No, I don't think that's possible.

      rethaew want[s] to do it "without looping through each hash and counting." No, I don't think that's possible.

      I agree. Even a solution like
          my $age12 = grep $_->{age} == 12, values %hoh;
      implicitly loops through the hash. rethaew will have to wait for the arrival of the personal desktop quantum computer to solve the problem without looping.

Re: Find number of unique values in hash
by saberworks (Curate) on Jan 19, 2010 at 22:04 UTC
    Here are two ways to do it.
    use warnings;
    use strict;

    my %hoh = (
        kevin  => { age => "12", favorite_color => "blue",   gender => "boy",  },
        john   => { age => "11", favorite_color => "green",  gender => "boy",  },
        lisa   => { age => "11", favorite_color => "pink",   gender => "girl", },
        sara   => { age => "13", favorite_color => "purple", gender => "girl", },
        shelly => { age => "12", favorite_color => "purple", gender => "girl", },
    );

    my $num_12  = grep { $_ == 12 } map { $hoh{$_}->{'age'} } keys %hoh;
    my $num_alt = grep { $hoh{$_}->{'age'} == 12 } keys %hoh;

    print "Num 12: $num_12\n";
    print "Num 12 alt: $num_alt\n";

      Improving on your idea:
      print scalar grep{$_->{'age'} == 12} values %hoh;

      CountZero

      "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

        I like yours better.
Re: Find number of unique values in hash
by kennethk (Monsignor) on Jan 19, 2010 at 22:07 UTC
    If this were simply finding the number of unique value entries in a flat hash, this would be a fairly straightforward issue: simply use reverse to invert the hash and use keys in either list or scalar context, depending on whether you want the list of values or just the count.
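
A minimal sketch of the flat-hash case described above. The hash %age_of and its contents are hypothetical (a flattened stand-in for the question's data), not something from the original post:

```perl
use strict;
use warnings;

# Hypothetical flat hash: name => age (not the nested %hoh from the question).
my %age_of = (
    kevin  => 12,
    john   => 11,
    lisa   => 11,
    sara   => 13,
    shelly => 12,
);

# reverse swaps the key/value order of the flattened list, so the ages
# become keys and duplicate ages collapse into a single entry.
my %name_by_age = reverse %age_of;

my @unique_ages  = sort { $a <=> $b } keys %name_by_age;   # list context: the values
my $unique_count = keys %name_by_age;                       # scalar context: the count

print "Unique ages: @unique_ages\n";
print "Count: $unique_count\n";
```

Note that the inverted hash keeps only one name per age, so this answers "which ages occur" but not "how many people have each age".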

    However, if what you want to know is how many people are of age 12, as per your example, you have to collect the data by key. Since this means you need to look at each entry, how would you expect to not have to examine each node in a loop? You can use built-ins like map to reduce the total number of characters, but that will likely make the code more difficult to read in the future. The clearest way to determine how many people are 12 would be to loop over the hash and collect counts in a hash:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my %hoh = (
        kevin  => { age => "12", favorite_color => "blue",   gender => "boy",  },
        john   => { age => "11", favorite_color => "green",  gender => "boy",  },
        lisa   => { age => "11", favorite_color => "pink",   gender => "girl", },
        sara   => { age => "13", favorite_color => "purple", gender => "girl", },
        shelly => { age => "12", favorite_color => "purple", gender => "girl", },
    );

    my %counts;
    my $key = 'age';
    foreach my $person (values %hoh) {
        $counts{$person->{$key}}++;
    }

    use Data::Dumper;
    print Dumper(\%counts);

Re: Find number of unique values in hash
by CountZero (Bishop) on Jan 19, 2010 at 22:07 UTC
    You will need to employ some form of loop, but why shouldn't you? It can be as easy as:
    $ages{$_->{'age'}}++ foreach values %hoh;
    And you will find the number of children with the same age as the values keyed by the age in the hash %ages:
    { 11 => 2, 12 => 2, 13 => 1 }
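
As a rough illustration of how the %ages tally above could be read back, here is a self-contained sketch. The trimmed-down %hoh (ages only) and the variable names $twelve and $distinct are assumptions made for the example:

```perl
use strict;
use warnings;

# Trimmed copy of the question's data; only the age attribute is kept here.
my %hoh = (
    kevin  => { age => 12 },
    john   => { age => 11 },
    lisa   => { age => 11 },
    sara   => { age => 13 },
    shelly => { age => 12 },
);

# Tally how many kids share each age.
my %ages;
$ages{ $_->{age} }++ foreach values %hoh;

# Look up a single age; fall back to 0 if nobody has that age.
my $twelve = $ages{12} || 0;

# keys in scalar context gives the number of distinct ages.
my $distinct = keys %ages;

print "Kids aged 12: $twelve\n";
print "Distinct ages: $distinct\n";
```

The same tally answers both readings of the question: the count for one particular age, and the number of unique ages overall.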

    CountZero


Re: Find number of unique values in hash
by ww (Bishop) on Jan 19, 2010 at 22:11 UTC

    This really begs for an "RTFM" answer. Have you done so?

    If that hasn't (doesn't) make it clear to you, then Data Type: Hash should help (special attention to the content and links in planetscape's offering).

    In either case, you really need to show that you've made some effort to find your own answer to the problem: where's the code for what you've tried so far and how does it fail? See, please, On asking for help and How do I post a question effectively?.

Re: Find number of unique values in hash
by jethro (Monsignor) on Jan 19, 2010 at 22:11 UTC
    I don't see a faster way for this. But if you need to count often, you could generate a second HoH for a one-time cost and have that and all other counts for no cost at all after that.

    my %counts = ();
    foreach my $kid (values %hoh) {
        foreach my $attribute (keys %$kid) {
            $counts{$attribute}{ $kid->{$attribute} }++;
        }
    }
    ...
    print $counts{'age'}{12};

    Sadly this "one-time" cost has to be paid whenever the data changes (unless you also adjust the counts in the second HoH whenever you change anything).
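
One way the "adjust the counts whenever you change anything" idea could look, as a sketch only. The helper set_attribute is hypothetical and not part of the original post, and the data is trimmed to two attributes:

```perl
use strict;
use warnings;

my %hoh = (
    kevin  => { age => 12, gender => "boy"  },
    john   => { age => 11, gender => "boy"  },
    lisa   => { age => 11, gender => "girl" },
    sara   => { age => 13, gender => "girl" },
    shelly => { age => 12, gender => "girl" },
);

# One-time cost: tally every attribute value across all kids.
my %counts;
for my $kid (values %hoh) {
    $counts{$_}{ $kid->{$_} }++ for keys %$kid;
}

# Hypothetical helper: change one attribute and keep %counts in step,
# so the full tally never has to be rebuilt.
sub set_attribute {
    my ($name, $attribute, $new_value) = @_;

    my $old_value = $hoh{$name}{$attribute};
    if (defined $old_value) {
        $counts{$attribute}{$old_value}--;
        # Drop the entry once nobody has this value any more.
        delete $counts{$attribute}{$old_value}
            unless $counts{$attribute}{$old_value};
    }

    $hoh{$name}{$attribute} = $new_value;
    $counts{$attribute}{$new_value}++;
}

set_attribute('john', 'age', 12);
print $counts{age}{12}, "\n";
```

This only stays consistent if every change to %hoh goes through the helper; a direct assignment into %hoh would leave %counts stale.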

    UPDATE: Fixed the bug in the first line thanks to warnings from johngg and chromatic. It was too much to hope that I could write even a 4-liner without a trivial bug.

      my %counts={};

      I think you need parentheses rather than curlies there. As it is, you are assigning a single hash reference to the hash, which it will then stringify and use as a key with no corresponding value. Also, if you use warnings;, I think you will get one complaining about odd number of elements in hash assignment, or words to that effect. Trying it I get this.

      $ perl -MData::Dumper -Mstrict -wle '
      > my %h = {};
      > print Data::Dumper->Dump( [ \ %h ], [ qw{ *h } ] );'
      Reference found where even-sized list expected at -e line 2.
      %h = (
             'HASH(0x817f880)' => undef
           );
      $

      I hope this is useful.

      Cheers,

      JohnGG
