
Need to find unique values in hash

by dipit (Acolyte)
on Feb 04, 2019 at 11:00 UTC ( #1229343=perlquestion )

dipit has asked for the wisdom of the Perl Monks concerning the following question:

I have the following hash (Dumper output below) and need to find the unique values per key. I am able to identify the unique values across the whole hash by iterating over it, but the problem is that they are then no longer mapped to their particular key. I need to print the number of duplicate values per key. Please help.

$VAR1 = '33|srv2';
$VAR2 = [
          'users',
          'users',
          'users',
          'admin',
          'admin',
          'admin',
          'manager',
          'manager',
          'manager'
        ];
$VAR3 = '27|rufserv3';
$VAR4 = [
          'system'
        ];
$VAR5 = '16|lbapp0112';
$VAR6 = [
          'admin (priv1',
          ' priv2)'
        ];
$VAR7 = '34|srv2';
$VAR8 = [
          'users',
          'users',
          'users',
          'admin',
          'admin',
          'manager'
        ];

Replies are listed 'Best First'.
Re: Need to find unique values in hash
by Veltro (Friar) on Feb 04, 2019 at 11:56 UTC

    Hi dipit,

    We were talking about your question on the chat since it is not entirely clear what you want.

    I suggested that you may need to change the way you use Dumper

    use strict ;
    use warnings ;
    use Data::Dumper ;

    my %h = (
        a => 1,
        b => [ 1, 2, 3],
    ) ;

    print Dumper(%h) ;


    $VAR1 = 'b';
    $VAR2 = [
              1,
              2,
              3
            ];
    $VAR3 = 'a';
    $VAR4 = 1;


    use strict ;
    use warnings ;
    use Data::Dumper ;

    my %h = (
        a => 1,
        b => [ 1, 2, 3],
    ) ;

    print Dumper(\%h) ; # Note the additional \


    $VAR1 = {
              'b' => [
                       1,
                       2,
                       3
                     ],
              'a' => 1
            };

    Hope this helps.

    PS. Discipulus brought up  map{ $hash{$_}=[ uniq(@{$hash{$_}}) ] } keys %hash using List::Util
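    As a minimal sketch of that suggestion, the script below applies uniq to each value list. The sample data is the small example hash from this thread; it assumes a List::Util recent enough to export uniq, which older Perl installations may not have.

```perl
use strict;
use warnings;
use List::Util qw(uniq);    # assumption: this List::Util is new enough to provide uniq

# Sample data (the small example hash from this thread)
my %hash = (
    k1 => [qw(ab cd ab gh cd)],
    k2 => [qw(mn jk jk ab op mn)],
);

# Replace each value list with its unique elements, keeping first-seen order
$hash{$_} = [ uniq @{ $hash{$_} } ] for keys %hash;

# Print the de-duplicated lists, one key per line
print "$_: @{ $hash{$_} }\n" for sort keys %hash;
```

    A plain for loop is used here instead of map because the return value of map would be thrown away anyway.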

      Thank you for the answer. Actually it's a simple hash, not an anonymous hash. I was trying to find a way without using the List::Util module because the script has to run on different systems. The structure is:

      %hash = (k1 => [qw(ab cd ab gh cd)], k2 => [qw(mn jk jk ab op mn)]);

        There is a Perl module, Module::CoreList that provides a utility called corelist. It can be used to ask Perl the question of whether a module is in the Perl core, and if so, when it got added to Perl's core set of modules. Example:

        $ corelist List::Util

        Data for 2019-01-20
        List::Util was first released with perl v5.7.3

        This means that List::Util has been in the Perl core distribution since Perl 5.7.3. The Perl documentation page "perlhist" can be used to find when a particular version of Perl was released.

        $ perldoc perlhist |grep 5.7.3
                  5.7.3 2002-Mar-05
                  5.7.3 3299 85 4295 537 2196 300 2176 626 4171 120
         t t/**/*(.) (for 1-5.005_56) or **/*.t (for 5.6.0-5.7.3)
         5.6.0 5.6.1 5.6.2 5.7.3
         Jarkko 5.8.0 2002-Jul-18 1205 31 471 From 5.7.3

        That's a little cryptic since we're grabbing lines at their textual value, out of context. You could add some more context by using the -C 3 switch on the grep, but we have enough to work with here. What this is telling us is that Perl 5.7.3 was released in March of 2002, and that in July of 2002 it was replaced by Perl 5.8.0. So List::Util has been bundled with a mainline Perl version for over sixteen years.

        For what it's worth, corelist Module::CoreList tells us that the Module::CoreList "corelist" utility has also been bundled with Perl for a long time, beginning with Perl 5.8.9, which was released in December of 2008 (over ten years ago).
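        The same question can also be asked from inside a script. The sketch below uses the first_release class method of Module::CoreList; the answers depend on the Module::CoreList data installed on the machine where it runs, so no output is shown.

```perl
use strict;
use warnings;
use Module::CoreList;

# Ask Module::CoreList which perl release first bundled each module
for my $module (qw(List::Util Module::CoreList)) {
    my $perl = Module::CoreList->first_release($module);
    print "$module first shipped with perl ",
        ( defined $perl ? $perl : '(not in core)' ), "\n";
}
```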

        One should generally not be shy of using modules that are bundled with Perl. We don't give a second thought to putting use strict; and use warnings; at the top of our scripts. Those pragmas are also bundled with the core Perl distribution. If we're hesitant to use List::Util only because we are concerned it won't exist on the target system, we must be concerned that we have such a broken Perl that use warnings may also be suspect.

        What is less clear is whether or not List::Util provided uniq contemporarily with the version of Perl that may be installed on your target systems. By reading the Changes file that ships with List::Util we can see that uniq was added in March of 2016, which means the earliest that feature could have gotten into the mainline Perl core would have been Perl 5.24, released in May 2016. But looking at perldoc perl5240delta we see that the version of List::Util bundled with Perl at that time was 1.42, which did not include uniq (added in 1.44). The first mainline Perl version that would have had a version of List::Util greater or equal to 1.44 was Perl 5.26.0 (it bundled List::Util 1.46), released in May 2017. So while it may seem simple to suggest that List::Util has been a part of Perl for so long that it's ancient history, the fact is the feature referred to in this thread was only added to core Perl a little less than two years ago.
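        One way to cope with that uncertainty is to probe for uniq at run time instead of comparing version numbers. This is only a sketch: the variable name $uniq is made up for illustration, and the fallback assumes every value is defined.

```perl
use strict;
use warnings;
use List::Util ();    # load the module without importing anything

# Use the bundled uniq if this List::Util has one, else a small fallback.
# Assumption: the fallback only has to deal with defined values.
my $uniq = List::Util->can('uniq') || sub {
    my %seen;
    return grep { !$seen{$_}++ } @_;
};

my @values = $uniq->(qw(users users admin admin manager));
print "@values\n";    # prints "users admin manager"
```

        Either branch keeps the first occurrence of each value, so the calling code does not need to know which one it got.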

        However, perlfaq4 has discussed finding unique keys for as long as I can remember.


        As others have said, List::Util is a core module, so you should be able to rely on it being available (assuming you don't need to support Perl older than 5.7.3). If not, it's really easy to implement it yourself:

        sub uniq (@) {
            my %seen;
            my $undef;
            my @uniq = grep defined($_) ? !$seen{$_}++ : !$undef++, @_;
            @uniq;
        }

        "I was trying to find a way without using List::Util module because the script has to run on different systems."
        List::Util is a standard module, so it should presumably be available on all your various systems.

        That said, as shown by poj, it is also fairly easy to roll your own code, if you so wish.

        Not sure which way you want the grouping: count1 or count2.

        #!perl
        use strict;
        use Data::Dumper;

        my %hash = (
            '33|srv2' => [
                'users','users','users',
                'admin','admin','admin',
                'manager','manager','manager'
            ],
            '27|rufserv3' => ['system'],
            '16|lbapp0112' => [ 'admin (priv1', ' priv2)' ],
            '34|srv2' => [
                'users', 'users', 'users',
                'admin', 'admin', 'manager'
            ]
        );
        #print Dumper %hash;

        my %count1;    # $count1{value}{key} = occurrences
        my %count2;    # $count2{key}{value} = occurrences
        for my $key (keys %hash){
            ++$count1{$_}{$key} for @{$hash{$key}};
            ++$count2{$key}{$_} for @{$hash{$key}};
        }
        print Dumper \%count1;
        print Dumper \%count2;

        "Actually it's a simple hash but not anonymous hash."
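        Since the original question asked for the number of duplicate values per key, the count2 layout (key first, then value) can be turned into a short report. A rough sketch, using part of the sample data and a made-up output format:

```perl
use strict;
use warnings;

my %hash = (
    '33|srv2'     => [qw(users users users admin admin admin manager manager manager)],
    '27|rufserv3' => ['system'],
    '34|srv2'     => [qw(users users users admin admin manager)],
);

# $count{key}{value} = occurrences, the same layout as %count2
my %count;
for my $key (keys %hash) {
    ++$count{$key}{$_} for @{ $hash{$key} };
}

# Report only the values that occur more than once under each key
for my $key (sort keys %count) {
    my @dups = grep { $count{$key}{$_} > 1 } sort keys %{ $count{$key} };
    my $report = @dups
        ? join( ', ', map { "$_ x$count{$key}{$_}" } @dups )
        : 'no duplicates';
    print "$key: $report\n";
}
```

        With this data the report lists admin, manager and users three times each for 33|srv2, admin twice and users three times for 34|srv2, and no duplicates for 27|rufserv3.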

        I have expanded on Veltro's response.

        Given the simple structure provided, I was able to back-engineer the hash. I then used the Data::Dumper->Dump method that allows names to be stored, following the documentation perldoc Data::Dumper.

        There are slight differences in the way the storage occurs. The named hash gets stored as a hash, whereas the VAR structures are stored anonymously. Notice the arguments to Data::Dumper->Dump are each in anonymous arrays. You may need to play around with the sigils to the arguments a little to get the desired output.

        #!/usr/bin/perl -T
        # use v5.22 for <<$datafh>>
        use strict;
        use warnings;
        use feature qw/state/;
        use Data::Dumper;

        my $data = get_input_data();
        open my $datafh, '<', \$data or die 'not getting it';

        my %HASH;
        while(my $line = <$datafh> ){
            state $kv;
            state $current_key;
            chomp($line);
            if( $line =~ s/\A\$VAR\d+\s\=\s(\'|\[)// ){
                $kv = $1;
                if( $kv eq '\'' ){
                    $line =~ s/\'\;\Z//;
                    $current_key = $line;
                }
                next;
            }else{
                next if $line =~ m/\A\s+\]\;\Z/;
                $line =~ s/\A\s+\'(.*)\'\,?\Z/$1/x;
                push @{ $HASH{ $current_key } }, $line;
            }
        }

        print 'Dumper with VAR',"\n";
        print Dumper(\%HASH);

        =head output1
        $VAR1 = {
                  '3|1' => [
                             'user',
                             'user',
                             'user',
                             'admin',
                             'admin',
                             'manager'
                           ],
                  '2|7' => [
                             'system'
                           ]
                };
        =cut

        print 'Dumper with names',"\n";
        print Data::Dumper->Dump([\%HASH],[qw(*HASH)]);

        =head output2
        %HASH = (
                  '3|1' => [
                             'user',
                             'user',
                             'user',
                             'admin',
                             'admin',
                             'manager'
                           ],
                  '2|7' => [
                             'system'
                           ]
                );
        =cut

        sub get_input_data{
        q{$VAR1 = '3|1';
        $VAR2 = [
                  'user',
                  'user',
                  'user',
                  'admin',
                  'admin',
                  'manager'
                ];
        $VAR3 = '2|7';
        $VAR4 = [
                  'system'
                ];
        };
        }

        Fun note: after a few manual runs, output1 and output2 always list the keys in the same order as each other, although which key comes first varies from run to run. Perhaps an internal optimisation?
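        A likely explanation is that Perl returns the keys of an unmodified hash in the same order every time within a single run, while the order itself can change from one run to the next. If reproducible output matters, Data::Dumper can be asked to sort the keys. A small sketch using the same sample data:

```perl
use strict;
use warnings;
use Data::Dumper;

# Emit hash keys in sorted order so that repeated runs give identical output
$Data::Dumper::Sortkeys = 1;

my %HASH = (
    '3|1' => [qw(user user user admin admin manager)],
    '2|7' => ['system'],
);

my $dump = Data::Dumper->Dump( [ \%HASH ], [qw(*HASH)] );
print $dump;
```

        With Sortkeys set, the '2|7' entry is printed before '3|1' on every run.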

Node Type: perlquestion [id://1229343]
Approved by marto