Re^2: Idiom: hashes as sets

by Tanktalus (Canon)
on Jul 03, 2008 at 16:00 UTC (#695380)

in reply to Re: Idiom: hashes as sets
in thread Idiom: hashes as sets

Ok, you've piqued my curiosity ... I've looked at both the Perl and the XS versions of uniq, and can't discern the difference between them and the idiomatic "my %seen; @uniq = grep {not $seen{$_}++} @random_values". Though I have to grant that it took me an inordinate amount of time to figure out the difference in the XS code between the scalar-context codepath and the list-context codepath, so I could easily be missing a subtle nuance somewhere. Care to be more explicit about the way in which the idiomatic method is wrong?
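
For reference, here is a minimal sketch of the grep idiom quoted above, spelled out as a runnable script. The sample data in @random_values is made up for illustration; everything else is the idiom as written.

```perl
use strict;
use warnings;

# Hypothetical sample data; any list of plain strings or numbers works.
my @random_values = qw(b a b c a);

# %seen counts occurrences. The post-increment returns the old count,
# so only the first occurrence of each value passes the grep.
my %seen;
my @uniq = grep { not $seen{$_}++ } @random_values;

print "@uniq\n";    # prints "b a c"
```

Note that the first-seen order of the input is preserved, since grep walks the list in order.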

Re^3: Idiom: hashes as sets
by kyle (Abbot) on Jul 03, 2008 at 16:14 UTC

    Here's uniq from List::MoreUtils:

    sub uniq (@) { my %h; map { $h{$_}++ == 0 ? $_ : () } @_; }

    I think this is equivalent to your "my %h; @uniq = grep { ! $h{$_}++ } @in;". What's important about both of them is that they don't do this:

    sub bad_uniq { my %h; @h{@_} = (); return keys %h; }

    ...which may be more succinctly (idiomatically) written as "keys %{{ map { $_ => 1 } @in }}".

    The difference is that keys will not return what was in the list to begin with, but rather whatever those things come out as after being stringified. The solutions using map and grep also stringify the stuff you feed them (to build the hash keys), but it's the original stuff that's returned, and not the stringy leftovers that were used to tell which were duplicates.

    Update with a demonstration:

    use List::MoreUtils 'uniq';
    use Data::Dumper;

    my $aref = [ 1, 2 ];
    my @aref_duplicated = ( $aref, $aref, $aref );

    my @u1 = uniq( @aref_duplicated );
    my @u2 = keys %{{map {$_=>1} @aref_duplicated}};

    print Data::Dumper->Dump( [ \@u1, \@u2 ], [ '*from_uniq', '*from_keys' ] );

    __END__
    @from_uniq = ( [ 1, 2 ] );
    @from_keys = ( 'ARRAY(0x8153c28)' );

      Yes, that is an excellent point. I forgot to mention in the OP that I have only been using the idiom for sets of strings and integers. If I were storing references, I would be using a different system entirely -- or selecting an indexable attribute and making a separate hash out of that.
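
      A rough sketch of that last option, under assumptions: the records and the "id" field below are hypothetical, chosen only to illustrate keying a hash on an attribute while keeping the original reference as the value.

```perl
use strict;
use warnings;

# Hypothetical records; "id" is an assumed indexable attribute.
my @records = (
    { id => 7, name => 'alice' },
    { id => 9, name => 'bob' },
    { id => 7, name => 'alice (duplicate)' },
);

# Key the set on the attribute, but store the original reference as the
# value, so nothing is lost to stringification of the reference itself.
my %by_id;
for my $rec (@records) {
    # The first record seen with a given id wins.
    $by_id{ $rec->{id} } = $rec unless exists $by_id{ $rec->{id} };
}

# Pull the unique records back out, ordered by id.
my @unique = map { $by_id{$_} } sort { $a <=> $b } keys %by_id;

print scalar(@unique), "\n";    # prints 2
```

      The keys are still strings, but here they are only an index; the values handed back are the real hash references.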

      print "Just Another Perl Adept\n";
