PerlMonks  

extracting only those keys with uniq values in array list

by v15 (Sexton)
on Apr 19, 2021 at 23:08 UTC

v15 has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone, I am dealing with a hash of arrays in Perl. Here is my code so far:

    while (<>) {
        chomp;
        my @s      = split /\t/, $_;
        my $r_name = $s[0];
        my $seq    = $s[9];
        push @{ $read2seq{$r_name} }, substr( $seq, 0, 12 );
    }

For every key ($r_name), I have 2 values. I want to print the names of those keys ($r_name) where the 2 values for that key are the same. If the 2 values for a key are different from each other, then I do not want to output that key. I cannot use List::MoreUtils qw(uniq) since that module is not installed on my cluster. How can I do this?

Replies are listed 'Best First'.
Re: extracting only those keys with uniq values in array list
by choroba (Archbishop) on Apr 19, 2021 at 23:14 UTC
    List::Util is core and also exports uniq (since List::Util version 1.45).
    use List::Util qw( uniq );

    Or, write your own:

    sub uniq { my @values = @_; my %uniq; @uniq{@values} = (); return keys %uniq }

    or, if you want to keep the order:

    sub uniq { my @values = @_; my %seen; return grep ! $seen{$_}++, @values }
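
    As a rough sketch of how either version could be applied to the question: a key qualifies when its list of values collapses to a single distinct value. The contents of %read2seq and the key names below are made up for illustration, and uniq_values is only the order-preserving sub above under a different (hypothetical) name so that it cannot clash with an imported uniq.

```perl
use strict;
use warnings;

# Order-preserving de-duplication, same idea as the sub above.
# The name uniq_values is an illustration only.
sub uniq_values {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Made-up sample data standing in for the hash built in the question.
my %read2seq = (
    read1 => [ 'ACGTACGTACGT', 'ACGTACGTACGT' ],    # identical pair
    read2 => [ 'ACGTACGTACGT', 'TTTTACGTACGT' ],    # different pair
);

# Keep the keys whose values reduce to exactly one distinct string.
my @same = sort( grep {
    my @distinct = uniq_values( @{ $read2seq{$_} } );
    @distinct == 1;
} keys %read2seq );

print "$_\n" for @same;
```

    With this sample data only read1 is printed. A key holding a single value would also pass the test, so an extra length check may be wanted if that can happen.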

Re: extracting only those keys with uniq values in array list
by davido (Cardinal) on Apr 20, 2021 at 15:50 UTC

    In your question you should post an example of the data structure you are working with. Your description says that you are dealing with a hash of arrays, and that for every key there are two values. In some cases those two values are the same, and in other cases they are not. Since you're using substr in your example, I'll assume that "the same" means stringwise equality.

    I'm imagining that your data structure looks like this:

    my %hash = (
        foo  => ['this', 'that'],    # Two different values
        bar  => ['those', 'those'],  # Two identical values
        baz  => ['the', 'other'],    # Not the same
        buzz => ['same', 'same'],    # The same
    );

    In this structure you would want to print the name of the bar key and buzz key, since the values in the array ref for those keys are the same. You didn't suggest that there could ever be more or fewer than two values, so I won't solve the problem that isn't stated. Here's an example of how to accomplish what you're after given this data structure:

    my %hash = (
        foo  => ['this', 'that'],    # Two different values
        bar  => ['those', 'those'],  # Two identical values
        baz  => ['the', 'other'],    # Not the same
        buzz => ['same', 'same'],    # The same
    );

    while (my ($key, $aref) = each %hash) {
        if ($aref->[0] eq $aref->[1]) {
            print "$key\n";
        }
    }

    The output should be 'bar' and 'buzz', in no particular order, since each does not guarantee any ordering of the keys.

    If there could be fewer than two elements to compare, you could do this:

    if (2==@$aref && $aref->[0] eq $aref->[1]) ...

    If there could be more than two, and you want to assure that they are all equal, you could do this:

    while (my ($key, $aref) = each %hash) {
        my %seen;
        $seen{$_}++ for @$aref;
        next if 1 != keys %seen;
        print "$key\n";
    }

    If these scenarios don't match the problem you are trying to solve, please refine your question to remove the ambiguity that has resulted in a diverse selection of answers.


    Dave

      Very well represented example and explanation. Thank you
Re: extracting only those keys with uniq values in array list
by BillKSmith (Prior) on Apr 20, 2021 at 13:45 UTC
    Every reply you have received so far has a different interpretation of your requirements. If you are lucky, someone will correctly guess what you intend and post a solution. However, I recommend that you restate the problem in a way that we can all understand without guessing. Note that the word 'unique' can be ambiguous. It can mean 'values that only occur once' or 'the first occurrence of each value'. The function List::Util#uniq which you reference assumes the latter.
    Bill
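
    To make the two readings concrete, here is a small sketch on a made-up list; the variable names are arbitrary and only the core language is used.

```perl
use strict;
use warnings;

my @values = qw( a b a c b d );    # made-up sample list

# Count how often each value occurs.
my %count;
$count{$_}++ for @values;

# Reading 1: values that occur exactly once in the list.
my @only_once = grep { $count{$_} == 1 } @values;

# Reading 2: the first occurrence of each value (what uniq returns).
my %seen;
my @first_seen = grep { !$seen{$_}++ } @values;

print "only once:  @only_once\n";     # c d
print "first seen: @first_seen\n";    # a b c d
```

    The same input gives two different answers, which is why the wording of the requirement matters.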
      I agree with you. I edited the question so hopefully it makes more sense. Thanks
        Thanks for the clarification. Build the hash exactly as you show. Process the keys with grep:

        use strict;
        use warnings;

        my %r_name = (
            a => [ 1, 2 ],
            b => [ 3, 3 ],
            c => [ 1, 6 ],
        );
        my @result = grep { $r_name{$_}[0] eq $r_name{$_}[1] } keys %r_name;
        print @result, "\n";

        Result

        b

        Note for next time: It is considered impolite to edit your post in a way that invalidates existing replies. Leave the existing text. Add a clearly marked UPDATE section.

        Bill
Re: extracting only those keys with uniq values in array list
by hippo (Chancellor) on Apr 20, 2021 at 14:14 UTC
    For every key($r_name) , I have 2 values. I want to print the names of those keys($r_name) where the 2 values for a particular key($r_name) are same. If the 2 values for the particular key are different from each other, then I do not want to output that key.
    use strict;
    use warnings;
    use Test::More tests => 1;

    my %read2seq = (
        foo => [123, 123],
        bar => [456, 789]
    );
    my @want = ('foo');
    my @samevals = grep { $read2seq{$_}->[0] eq $read2seq{$_}->[1] } keys %read2seq;
    is_deeply \@samevals, \@want;

    🦛

Re: extracting only those keys with uniq values in array list
by Cristoforo (Curate) on Apr 20, 2021 at 01:45 UTC
    I don't know if you have to have a hash of lists, but this could be solved using a hash of hashes. I couldn't find a .tsv file to try this out (but did try it on a .csv and it seemed to work).
    while (<>) {
        chomp;
        my @s      = split /\t/;
        my $r_name = $s[0];
        my $seq    = substr $s[9], 0, 12;
        $read2seq{$r_name}{$seq}++;
    }

    my @keys_with_2_unique_vals = grep 2 == keys %{$read2seq{$_}}, keys %read2seq;
    UPDATE: To account for the change in OP's post. (He wants keys where both values are the same) Change the grep line to:

    my @keys_with_equal_pair_of_vals = grep 1 == keys %{$read2seq{$_}}, keys %read2seq;
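
    For anyone without a .tsv file at hand, the following sketch runs the same idea on made-up in-memory records. The record contents and the filler fields are assumptions chosen only so that the name lands in field 0 and the sequence in field 9.

```perl
use strict;
use warnings;

# Made-up records: name in field 0, sequence in field 9,
# fields 1-8 are filler.
my @lines = map { join "\t", $_->[0], ('x') x 8, $_->[1] } (
    [ read1 => 'ACGTACGTACGTAAAA' ],
    [ read1 => 'ACGTACGTACGTCCCC' ],    # same first 12 characters
    [ read2 => 'ACGTACGTACGTAAAA' ],
    [ read2 => 'TTTTACGTACGTAAAA' ],    # different first 12 characters
);

# Hash of hashes: name => { 12-character prefix => count }.
my %read2seq;
for my $line (@lines) {
    my @s = split /\t/, $line;
    $read2seq{ $s[0] }{ substr( $s[9], 0, 12 ) }++;
}

# One distinct prefix per name means both values were the same.
my @equal = sort( grep { 1 == keys %{ $read2seq{$_} } } keys %read2seq );
print "$_\n" for @equal;
```

    Only read1 is printed here. Note that a name appearing on a single line would also have exactly one inner key, so checking that the stored count is 2 may be needed if such names can occur.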
Re: extracting only those keys with uniq values in array list
by tybalt89 (Prior) on Apr 20, 2021 at 15:45 UTC
    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11131485
    use warnings;

    @ARGV = 'd.11131485'; # FIXME used for testing

    my %seen;
    while( <> )
      {
      my ($r_name, $seq) = (split /\t/)[0, 9];
      $seq =~ s/.{20}\K.*//; # keeps the first 20 characters (the question's code keeps 12)
      $seen{"$r_name\t$seq"}++ and print "$r_name\n";
      }

    Works with the data file I had to fake up because you did not provide a sample data file.

Re: extracting only those keys with uniq values in array list
by salva (Canon) on Apr 20, 2021 at 13:51 UTC
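    my %seen;
    while (<>) {
        chomp;
        my ($k, $v) = split /\t/, $_;
        # the defined check avoids comparing against undef the first time a key is seen
        print("$k\t$v\n") if defined $seen{$k} && $seen{$k} eq $v;
        $seen{$k} = $v;
    }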
Re: extracting only those keys with uniq values in array list
by perlfan (Vicar) on Apr 20, 2021 at 01:52 UTC
    Hash keys are unique, so generally you do the reverse:

    my %stuff = ();
    while (<>) {
        chomp;
        my @s      = split /\t/, $_;
        my $r_name = $s[0];
        my $seq    = $s[9];
        my $key    = substr( $seq, 0, 12 );
        push @{ $stuff{$key} }, $r_name;
    }
    # now %stuff is keyed on the value you wish to remain unique,
    # which is easily determined because the array ref attached to
    # it has a size of exactly 1 element
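
    A possible way to read the result back out of %stuff is sketched below on made-up data. The first grep follows the comment above (prefixes seen exactly once). The second loop is an assumption about how the same structure could serve the edited question, where a name qualifies when it appears twice under one prefix.

```perl
use strict;
use warnings;

# Made-up stand-in for %stuff: sequence prefix => names that carry it.
my %stuff = (
    AAAACCCCGGGG => [ 'read1', 'read1' ],
    TTTTCCCCGGGG => [ 'read2' ],
    GGGGCCCCAAAA => [ 'read2' ],
);

# Prefixes whose array ref holds exactly one name were seen only once.
my @seen_once = sort( grep { @{ $stuff{$_} } == 1 } keys %stuff );

# Names that occur more than once under the same prefix,
# i.e. both of their values were identical.
my %same_pair;
for my $prefix ( keys %stuff ) {
    my %n;
    $n{$_}++ for @{ $stuff{$prefix} };
    $same_pair{$_} = 1 for grep { $n{$_} > 1 } keys %n;
}
my @same_pair = sort keys %same_pair;

print "seen once: @seen_once\n";
print "same pair: @same_pair\n";
```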
