http://www.perlmonks.org?node_id=350697


in reply to Re: Optimizing quickly hash tables by checking multiple conditions
in thread Optimizing quickly hash tables by checking multiple conditions

Hello,
I am not that experienced with Perl, and I am not a programmer by nature. Although I have been using Perl for years, I still have no idea how to use map, grep and bar inside hash tables. I can use them on arrays, but that is about it. I have tried to make it work the easy-to-understand way by using foreach loops and if statements and then pushing everything into a new hash, but got totally stuck, as you can imagine. I could not find much documentation on the different ways of handling hash tables, although I rarely use anything other than hash tables in my code. If anybody has good documentation or examples on how to use functions like map and grep with hash tables, that would be great.

Re: Re: Re: Optimizing quickly hash tables by checking multiple conditions
by revdiablo (Prior) on May 05, 2004 at 16:37 UTC
    I still have no idea on how to use map,grep ... inside hash tables

    The basic technique in my example is fairly simple. First we filter out the hash keys we want, with grep. Once that list of wanted keys is obtained, we have to turn it back into a hash. This is what the map is for. The foo and bar were just examples. Those are your selection criteria (i.e. how you decide which keys you want to toss and which you want to keep).

    Here it is laid out a bit more clearly. Remember, since the pipeline goes from right to left, read the comments from bottom to top:

    my %newhash =                    # keys/values into a hash
        map  { $_ => $hash{$_} }     # a list of keys/values
        grep { foo($hash{$_})        # just the keys we want
               and bar($hash{$_}) }
        keys %hash;                  # a list of keys in %hash
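
The same right-to-left pipeline can also be written as three separate statements, which may be easier to follow when map and grep are still new. This is only a sketch: the data and the `> 2` test are made up for illustration and stand in for whatever foo and bar check.

```perl
use strict;
use warnings;

# Made-up data, only for illustration.
my %hash = ( one => 1, two => 2, three => 3, four => 4 );

# Step 1: a list of all keys in %hash.
my @all_keys = keys %hash;

# Step 2: keep only the keys whose value passes the test.
my @wanted_keys = grep { $hash{$_} > 2 } @all_keys;

# Step 3: turn the wanted keys back into key/value pairs for a new hash.
my %newhash = map { $_ => $hash{$_} } @wanted_keys;

print join(',', sort keys %newhash), "\n";
```

Chaining the three calls, as in the one-statement version, simply avoids naming the intermediate lists.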

    Now, if there are a lot of keys in your hash (as you seem to imply), building all these big lists might be rather slow. I would try it before ruling it out, though. Also note that while we're making a whole new hash, the sub-hashes are not going to all get copied. The references will get copied, but they'll be referring to the same anonymous hashes as before. This shouldn't be a big bottleneck (but, again, without trying it's hard to say for sure).
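
A small sketch of what "the references will get copied" means in practice. The data is made up; the point is only that both hashes end up pointing at the same anonymous sub-hash.

```perl
use strict;
use warnings;

# Made-up data, only for illustration.
my %hash = (
    first  => { a => 2 },
    second => { a => 5 },
);

# Same map pattern as above, here keeping every key.
my %newhash = map { $_ => $hash{$_} } keys %hash;

# Comparing two references with == compares their addresses.
my $same_ref = $hash{first} == $newhash{first} ? 1 : 0;

# A change made through the new hash is visible through the old one.
$newhash{first}{a} = 99;

print "same reference: $same_ref\n";
print "original now sees: $hash{first}{a}\n";
```

If independent copies of the sub-hashes are needed, each one has to be copied explicitly, for example with `{ %{ $hash{$_} } }` in the map block.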

    I have tried to make it work the easy-to-understand way by using foreach loops and if statements and then pushing everything into a new hash, but got totally stuck

    This might be the way you want to go, if the map and grep solution is still too confusing. Again, if you show us the code you tried, we can try to help figure out why it's not working. You might want to post a new question, then reply here with a link to it. That way you get more people looking at it, but people following the discussion here can still follow the new post.

      my %newhash =
          map  { $_ => $refdes_bom{$_} }
          grep { CPN($refdes_bom{$_}) and ITEM($refdes_bom{$_}) }
          keys %refdes_bom;
      I have tried to make the basic routine you gave me work, but failed, as it will not process the keywords bar and foo. The error message I get is:
      Undefined subroutine &main::CPN called at compal_test.pl line 37.
      Am I missing the obvious basics here?

        What is in your CPN and ITEM subroutines? You have to define them to check for criteria you want. They don't just magically appear out of nowhere. Here is a fully-functional example, hopefully it will make something click for you:

        my %hash = (
            first  => { a => 2, b => 3 },
            second => { a => 3, b => 2 },
            third  => { a => 5, b => 4 },
            fourth => { a => 4, b => 6 },
        );

        my %newhash =
            map  { $_ => $hash{$_} }
            grep { foo($hash{$_}) and bar($hash{$_}) }
            keys %hash;

        print "$_\n" foreach keys %newhash;

        sub foo {
            my ($href) = @_;
            $href->{a} > 3;
        }

        sub bar {
            my ($href) = @_;
            $href->{b} > 2;
        }

        As you can see, in this example foo makes sure the value at a is greater than 3, and bar makes sure the value at b is greater than 2. The end result should print third and fourth, the only keys that pass both criteria. Their order may vary, because Perl does not keep hash keys in any particular order.
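
The criteria do not have to live in named subroutines; the tests can also be written directly inside the grep block. A minimal sketch, assuming records shaped like the ones in this thread: only the field names CPN, REFDES and SIDE come from the posts, while the keys line1 to line3, the values and the chosen criteria are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample data; only the field names come from the thread.
my %refdes_bom = (
    line1 => { CPN => 'SB57002020P', REFDES => 'Q20', SIDE => 'bottom' },
    line2 => { CPN => 'SB57002020P', REFDES => 'Q82', SIDE => 'top'    },
    line3 => { CPN => 'XX00000000X', REFDES => 'R1',  SIDE => 'bottom' },
);

# Keep only the records with the wanted part number on the bottom side.
my %newhash =
    map  { $_ => $refdes_bom{$_} }
    grep {     $refdes_bom{$_}{CPN}  eq 'SB57002020P'
           and $refdes_bom{$_}{SIDE} eq 'bottom' }
    keys %refdes_bom;

print join(',', sort keys %newhash), "\n";
```

Named subroutines become more attractive once the same test is needed in several places.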

      This might be the way you want to go, if the map and grep solution is still too confusing. Again, if you show us the code you tried, we can try to help figure out why it's not working. You might want to post a new question, then reply here with a link to it. That way you get more people looking at it, but people following the discussion here can still follow the new post.

      Well, I have made it work using foreach loops, but I think most people would find it very unorthodox. First I convert the hash table into a table I can process more easily, and I concatenate all the states into one string for easy comparison. At the end I compare the strings to find matches.
      my %new_hash;
      foreach my $line (keys %refdes_bom) {    # keys only; a bare %refdes_bom would also loop over the values
          foreach my $states (sort keys %{$refdes_bom{$line}->{STATES}}) {
              $new_hash{$refdes_bom{$line}->{CPN}}
                  ->{$refdes_bom{$line}->{REFDES}}
                  ->{SIDE} = $refdes_bom{$line}->{SIDE};
              $new_hash{$refdes_bom{$line}->{CPN}}
                  ->{$refdes_bom{$line}->{REFDES}}
                  ->{DESC} = $refdes_bom{$line}->{DESC};
              push (@{$new_hash{$refdes_bom{$line}->{CPN}}
                  ->{$refdes_bom{$line}->{REFDES}}
                  ->{STATES}}, $states);
              $new_hash{$refdes_bom{$line}->{CPN}}
                  ->{$refdes_bom{$line}->{REFDES}}
                  ->{STRING} .= $states;
          }
      }
      The result is a hash table that looks as follows:
      'SB57002020P' => HASH(0x197d78c)
         'Q20' => HASH(0x195a120)
            'DESC' => 'S TR 2N7002DW 2N SOT-363'
            'SIDE' => 'bottom'
            'STATES' => ARRAY(0x195a150)
               0  88002
            'STRING' => 88002
         'Q82' => HASH(0x1985f08)
            'DESC' => 'S TR 2N7002DW 2N SOT-363'
            'SIDE' => 'bottom'
            'STATES' => ARRAY(0x1985f38)
               0  88001
               1  88003
            'STRING' => 8800188003
         'Q83' => HASH(0x19afd24)
            'DESC' => 'S TR 2N7002DW 2N SOT-363'
            'SIDE' => 'bottom'
            'STATES' => ARRAY(0x19afd54)
               0  88001
               1  88003
            'STRING' => 8800188003
      Now I still need to make all the matching conditions work. As you can see in the sample above, two of the Q keys match and one does not. This is probably where I go the crazy but easy-to-read way. The condition is: all the Q.. entries that have the same side and the same states should be bundled together.
      my $count = 1;
      my $new_count;
      my $row;
      my %bundled_hash;
      foreach my $CPN (keys %new_hash) {    # keys only; a bare %new_hash would also loop over the values
          foreach my $REFDES (sort keys %{$new_hash{$CPN}}) {
              $new_count = length $count;
              $row = $count;
              my $i;
              for ($i = $new_count; $i <= 4; ++$i) {    # left-pad $row with zeros to five digits
                  $row = '0'.$row;
              }
              if (not defined $new_hash{$CPN}->{$REFDES}->{MATCH}) {
                  push (@{$bundled_hash{$row}->{$CPN}->{REFDES}},$REFDES);
                  $bundled_hash{$row}->{$CPN}->{SIDE}   = $new_hash{$CPN}->{$REFDES}->{SIDE};
                  $bundled_hash{$row}->{$CPN}->{DESC}   = $new_hash{$CPN}->{$REFDES}->{DESC};
                  $bundled_hash{$row}->{$CPN}->{STATES} = $new_hash{$CPN}->{$REFDES}->{STATES};
                  foreach my $REFDES1 (sort keys %{$new_hash{$CPN}}) {
                      if ($new_hash{$CPN}->{$REFDES}->{STRING} eq $new_hash{$CPN}->{$REFDES1}->{STRING}
                          && $new_hash{$CPN}->{$REFDES}->{SIDE} eq $new_hash{$CPN}->{$REFDES1}->{SIDE}
                          && $REFDES ne $REFDES1) {
                          #$bundled_hash{$row}->{$CPN}->{REFDES} = $REFDES;
                          push (@{$bundled_hash{$row}->{$CPN}->{REFDES}},$REFDES1);
                          $new_hash{$CPN}->{$REFDES1}->{MATCH}=1;
                      }
                  }
              }
              $count++;
          }
      }
      First of all, I need a counter because some keys can come back twice with different values; with a counter I am able to create multiple keys. After this I loop twice through the same hash. Once a match is found, I need to make sure that the matching key will not be used again later in the loop. I first tried to delete the key, but this does not work: it seems that Perl loads the whole loop into memory, and deleting the key gives bad results. So I added a marker (MATCH) and check for its existence. If it is set, I can skip that key. Eventually I got the result. But I am sure that you REAL PROGRAMMERS will say I am nuts. :-)
      Result hash:
      00167 => HASH(0x19c39a8)
         'SB57002020P' => HASH(0x19c39c0)
            'DESC' => 'S TR 2N7002DW 2N SOT-363'
            'REFDES' => ARRAY(0x19c39d8)
               0  'Q20'
            'SIDE' => 'bottom'
            'STATES' => ARRAY(0x1980a44)
               0  88002
      00168 => HASH(0x19c5404)
         'SB57002020P' => HASH(0x19c541c)
            'DESC' => 'S TR 2N7002DW 2N SOT-363'
            'REFDES' => ARRAY(0x19c5434)
               0  'Q82'
               1  'Q83'
            'SIDE' => 'bottom'
            'STATES' => ARRAY(0x1989a10)
               0  88001
               1  88003
      As you can see in the result, I have reduced the entries from three to two.
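
For comparison, the same bundling can be sketched in a single pass by grouping on a composite key. This is a different technique from the counter and MATCH marker above, not a cleanup of that code. It assumes input shaped like %new_hash in this thread; the sample values are copied from the dump, and the `|` and `,` separators are an arbitrary choice.

```perl
use strict;
use warnings;

# Input in the shape of %new_hash from this thread (DESC omitted for brevity).
my %new_hash = (
    SB57002020P => {
        Q20 => { SIDE => 'bottom', STATES => [88002] },
        Q82 => { SIDE => 'bottom', STATES => [88001, 88003] },
        Q83 => { SIDE => 'bottom', STATES => [88001, 88003] },
    },
);

# Group REFDES names under a key built from CPN, SIDE and the joined STATES.
my %bundles;
for my $cpn (sort keys %new_hash) {
    for my $refdes (sort keys %{ $new_hash{$cpn} }) {
        my $rec = $new_hash{$cpn}{$refdes};
        my $key = join '|', $cpn, $rec->{SIDE}, join(',', @{ $rec->{STATES} });
        push @{ $bundles{$key} }, $refdes;
    }
}

print "$_ => @{ $bundles{$_} }\n" for sort keys %bundles;
```

Entries that share a key land in the same list automatically, so no second loop or marker is needed. Joining the states with a separator also avoids the case where two different state lists concatenate to the same string.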