http://www.perlmonks.org?node_id=932674

deMize has asked for the wisdom of the Perl Monks concerning the following question:

Question: How to find all occurrences of a key in a deeply nested structure?

Here's what the object might look like in a Data::Dumper output:

$VAR1 = { 'foo' => { 'child1' => { 'gchild1' => { 'values' => { 'bar1' => 1, 'bar2' => 1 } } }, 'child2' => { 'gchild1' => { 'values' => { 'bar1' => 1, 'bar9' => 1 } }, 'gchild2' => { 'values' => { 'bar2' => 1, 'bar3' => 1 } } }, 'child3' => { 'values' => { 'bar1' => 1, 'bar2' => 1, 'bar4' => 1, 'bar5' => 1 } } } };

So the goal is to scan the hash of hashes and return all the references of 'values' most likely as an array of hashrefs. Notice 'values' can be a child of a grandchild, or the grandchild itself (as in child3). There may also be great-great-...-grandchildren. Basically, it could be anywhere in the chain.




Demize

Replies are listed 'Best First'.
Re: How to find all occurrences of a key in a deeply nested structure?
by Util (Priest) on Oct 20, 2011 at 15:07 UTC

    The Data::Search module is a perfect fit for your problem.

    Working, tested code:

    use strict; use warnings; use Data::Search; use Data::Dumper; $Data::Dumper::Useqq=1; $Data::Dumper::Terse=1; my $href = { foo => { child1 => { gchild1 => { values => { bar1 => 1, bar2 => 1 } } }, child2 => { gchild1 => { values => { bar1 => 1, bar9 => 1 } }, gchild2 => { values => { bar2 => 1, bar3 => 1 } }, }, child3 => { values => { bar1 => 1, bar2 => 1, bar4 => + 1, bar5 => 1 } }, }, }; for my $wanted ( qw( bar5 bar3 bar2 ) ) { my @results = datasearch( data => $href, search => 'keys', find => qr{ \A $wanted \z }msx, return => 'container', ); print "Found key '$wanted' in these hashrefs: ", Dumper @results; }

    Output:

    Found key 'bar5' in these hashrefs: {
      "bar1" => 1,
      "bar4" => 1,
      "bar2" => 1,
      "bar5" => 1
    }
    Found key 'bar3' in these hashrefs: {
      "bar2" => 1,
      "bar3" => 1
    }
    Found key 'bar2' in these hashrefs: {
      "bar2" => 1,
      "bar3" => 1
    }
    {
      "bar1" => 1,
      "bar4" => 1,
      "bar2" => 1,
      "bar5" => 1
    }
    {
      "bar1" => 1,
      "bar2" => 1
    }
    

      Thank you. Right now, I'm going to use Data::Walk, because of my familiarity and its resemblance to File::Find. However, I'm definitely going to look into this, since it seems to have some additional useful features (less typing). This looks like something that I can use with another project as well.

      Edit: I just tested this out and it works well. Using this, I could loop through the results array, and add the keys of that hash to another array/hash to get one list of values.


      Cheers,

      Demize
Re: How to find all occurrences of a key in a deeply nested structure?
by suaveant (Parson) on Oct 20, 2011 at 14:46 UTC
    There are a couple of promising modules on CPAN when searching "traverse" Data::Walk and Data::Traverse

                    - Ant
                    - Some of my best work - (1 2 3)

      Thanks, this is something I was looking for. I didn't want to roll my own, because I thought there would be something like this out there.

      Cheers,

      Demize
Re: How to find all occurrences of a key in a deeply nested structure?
by moritz (Cardinal) on Oct 20, 2011 at 14:29 UTC

    What have you tried?

    I'd write a subroutine that receives a hash ref as an argument, and stores the right value if the key in question is there. Then it recurses into all the values that are hashrefs.

      First: I'm not sure why asking a valid question is worth -1 rep, but I guess that's someone's opinion. Anyhow, Tye came out with Data::Diver, so I was hoping there would be an easier way than to build my own recursive "find" tool. I can see how to set values with his module, but not sure how to scan for all occurrences.


      Demize
        It's worth -1 because we're not here to do all your work - or your homework! - for you. We can help you more if you show us what you've already tried and explain how its results differ from what you expected.
Re: How to find all occurrences of a key in a deeply nested structure?
by Anonymous Monk on Oct 20, 2011 at 15:13 UTC

    Here using Data::Walk, Data::Visitor::Callback

    #!/usr/bin/perl -- use strict; use warnings; use Data::Dumper (); Main( @ARGV ); exit( 0 ); sub Main { my $root = { 'foo' => { 'child1' => { 'gchild1' => { 'values' => { 'bar1' => 1, 'bar2' => 1 } } }, 'child2' => { 'gchild1' => { 'values' => { 'bar1' => 1, 'bar2' => 1 } }, 'gchild2' => { 'values' => { 'bar1' => 1, 'bar2' => 1 } } }, 'child3' => { 'values' => { 'bar1' => 1, 'bar2' => 1 } } } }; use Data::Visitor::Callback; my $vorp = sub { my @v; my $v = Data::Visitor::Callback->new( ignore_return_values => 1, hash_entry => sub { my ( $self, $k, $v ) = @_; push @v, $v if $k eq 'values' and $v; return; }, ); $v->visit( @_ ); return @v; }; print DD( $root, [ $vorp->( $root ) ], ), '#' x 33 , "\n"; use Data::Walk(); $vorp = sub { my @v; Data::Walk::walk( sub { ## same result as 2nd if block below #~ no warnings 'uninitialized'; #~ if( 'HASH' eq $Data::Walk::type and ref $_){ #~ my $v = $_->{values}; #~ push @v, $v if $v; #~ } if( $_ eq 'values'){ push @v, $Data::Walk::container->{values}; } return; }, @_ ); return @v; }; print DD( $root, [ $vorp->( $root ) ], ), '#' x 33 , "\n"; } sub DD { scalar Data::Dumper->new( \@_ )->Indent(1)->Useqq(1)->Dump; } __END__ $VAR1 = { "foo" => { "child2" => { "gchild2" => { "values" => { "bar1" => 1, "bar2" => 1 } }, "gchild1" => { "values" => { "bar1" => 1, "bar2" => 1 } } }, "child3" => { "values" => { "bar1" => 1, "bar2" => 1 } }, "child1" => { "gchild1" => { "values" => { "bar1" => 1, "bar2" => 1 } } } } }; $VAR2 = [ $VAR1->{"foo"}{"child2"}{"gchild2"}{"values"}, $VAR1->{"foo"}{"child2"}{"gchild1"}{"values"}, $VAR1->{"foo"}{"child3"}{"values"}, $VAR1->{"foo"}{"child1"}{"gchild1"}{"values"} ]; ################################# $VAR1 = { "foo" => { "child2" => { "gchild2" => { "values" => { "bar1" => 1, "bar2" => 1 } }, "gchild1" => { "values" => { "bar1" => 1, "bar2" => 1 } } }, "child3" => { "values" => { "bar1" => 1, "bar2" => 1 } }, "child1" => { "gchild1" => { "values" => { "bar1" => 1, "bar2" => 1 } } } } }; $VAR2 = [ $VAR1->{"foo"}{"child2"}{"gchild2"}{"values"}, $VAR1->{"foo"}{"child2"}{"gchild1"}{"values"}, $VAR1->{"foo"}{"child3"}{"values"}, $VAR1->{"foo"}{"child1"}{"gchild1"}{"values"} ]; #################################

    Data::DPath might also be a candidate

      Very helpful. I was on my way of doing the same thing, but didn't see the needed container. If I were to try and get a unique list of all the values, does this make sense:
      my $struct = {...}; # the hash of hashes in the question my $found; Data::Walk::walk \&wanted, $struct; print join(" ",sort keys %$found) ,"\n"; sub wanted{ if (/^values$/){ foreach ( keys %{$Data::Walk::container->{$_}} ){ $found->{$_}=1; } } }



      Demize
        No, don't do that, do it the way I did it :) except only push if not $Data::Walk::seen