
Re: Perl Idioms Explained - keys %{{map{$_=>1}@list}}

by thinker (Parson)
on Aug 04, 2003 at 15:46 UTC ( #280714=note )

in reply to Perl Idioms Explained - keys %{{map{$_=>1}@list}}

Hi broquaint,

Or, to keep the order in which the items were inserted (i.e. the order in which the first instance of each value is encountered):

my %seen; my @uniq = grep ! $seen{$_}++, @list;
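
A minimal sketch of the difference, assuming a small made-up @list (the sample data is illustrative and not from the thread). The grep version keeps first-seen order, while the keys-of-a-hash idiom returns the same values in an unspecified order.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for illustration.
my @list = qw(pear apple pear fig apple kiwi);

# grep keeps an element only the first time it is seen:
# $seen{$_}++ returns the old count, which is false on first sight.
my %seen;
my @uniq = grep { !$seen{$_}++ } @list;

# keys on a temporary hash also removes duplicates,
# but the order of the result is unspecified.
my @uniq_keys = keys %{{ map { $_ => 1 } @list }};

print "grep: @uniq\n";        # first-seen order: pear apple fig kiwi
print "keys: @uniq_keys\n";   # same four values, order not guaranteed
```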



Re: Re: Perl Idioms Explained - keys %{{map{$_=>1}@list}}
by LanceDeeply (Chaplain) on Aug 04, 2003 at 18:58 UTC
    This is the way I've usually seen it done. I was curious, so I ran a benchmark against the two.
    use Benchmark;

    my @list;
    for ( 0..9999 ) {
        push @list, sprintf "%d", 100 * rand;
    }

    timethese( 1000, {
        'keys_map'  => sub { my @uniq = keys %{{ map {$_ => 1} @list }}; },
        'grep_seen' => sub { my %seen; my @uniq = grep ! $seen{$_}++, @list; },
    } );
    Yields the following output.

    Benchmark: timing 1000 iterations of grep_seen, keys_map...
     grep_seen: 13 wallclock secs (11.17 usr + 0.00 sys = 11.17 CPU) @ 89.52/s (n=1000)
      keys_map: 30 wallclock secs (29.28 usr + 0.00 sys = 29.28 CPU) @ 34.15/s (n=1000)
      'keys_map_undef' => sub { my @uniq = keys %{{ map {$_ => undef} @list }}; },
      to test the undef suggestion, it turns out to be 15-20% faster than using 1.

      grep still wins, though.

Re: Re: Perl Idioms Explained - keys %{{map{$_=>1}@list}}
by Jasper (Chaplain) on Aug 04, 2003 at 22:21 UTC
    You can also grep a list for items that occur a certain number of times (for example, if you wanted only the items that were in a list twice):
    @doubles = grep ++$seen{$_} == 2, @list;

      That would be 2 or more times, right? Once ++$seen{$_} == 2 is true, you are immediately copying $_ to @doubles. So if it appeared a third time, you couldn't magically remove it again. Your code would be clearer if you used >= to emphasize that.
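
A small sketch of the behaviour described here, using made-up sample data (an assumption, not from the thread). The test is true only at a value's second occurrence, so a value that occurs three times is still copied, and copied only once.

```perl
use strict;
use warnings;

# Hypothetical sample data: 'a' occurs three times, 'b' twice, 'c' once.
my @list = qw(a b a c b a);

my %seen;
# The test is true exactly once per value: at its second occurrence.
my @doubles = grep { ++$seen{$_} == 2 } @list;

print "@doubles\n";   # a b  ('a' is included although it occurs three times)
```

Note that changing the test to >= 2 would alter the result for this data, since 'a' would be copied again at its third occurrence.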

      The following would probably do if you only wanted entries with exactly 2 occurrences:

      @doubles = grep $seen{$_} == 2, grep !$seen{$_}++, @list;

      But it is not pretty, or obvious. The first grep counts all occurrences and only passes the first found entry to the next grep, which will check to see how many were actually found.

      There must be a better way!

      - Cees

      No, that won't fly. You need
      $seen{$_}++ for @list; my @doubles = grep $seen{$_} == 2, @list;
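
A minimal sketch of this count-first approach, with made-up sample data. Because the second pass walks @list itself, a value that occurs exactly twice shows up twice in the result; the %reported hash is an assumption added here for the case where each value is wanted only once.

```perl
use strict;
use warnings;

# Hypothetical sample data: 'a' occurs three times, 'b' twice, 'c' once.
my @list = qw(a b a c b a);

# Pass 1: count every occurrence.
my %seen;
$seen{$_}++ for @list;

# Pass 2: keep the elements whose final count is exactly 2.
my @doubles = grep { $seen{$_} == 2 } @list;
print "@doubles\n";          # b b  (one entry per occurrence)

# Assumption: if each value should be reported only once,
# a second hash can drop the repeats while keeping order.
my %reported;
my @doubles_once = grep { $seen{$_} == 2 && !$reported{$_}++ } @list;
print "@doubles_once\n";     # b
```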

      Makeshifts last the longest.

Re^2: Perl Idioms Explained - keys %{{map{$_=>1}@list}}
by Aristotle (Chancellor) on Aug 04, 2003 at 20:12 UTC
    Note that this is slightly broken as is. To be entirely correct, you have to say
    my (%seen, $seen_undef); my @uniq = grep defined() ? !$seen{$_}++ : !$seen_undef++, @list;
    Of course, if you're fiddling with objects which cannot be compared for equality by stringification, it is still broken.
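
A short sketch of the undef case, assuming a made-up list that contains both undef and the empty string. As a hash key, undef would stringify to '' (with a warning under warnings), so the two would be treated as the same value; the separate counter keeps them apart.

```perl
use strict;
use warnings;

# Hypothetical sample data containing both undef and the empty string.
my @list = ('x', undef, '', 'x', undef, '');

# undef is tracked in its own counter, so it is never used as a hash key
# and is not confused with ''.
my (%seen, $seen_undef);
my @uniq = grep { defined($_) ? !$seen{$_}++ : !$seen_undef++ } @list;

# Show undef explicitly when printing.
print join(', ', map { defined($_) ? "'$_'" : 'undef' } @uniq), "\n";
# 'x', undef, ''
```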

    Makeshifts last the longest.

      Of course, if you're fiddling with objects which cannot be compared for equality by stringification, it is still broken.

      Well, so is every uniquification based on the keys of a hash! The point still stands that grep is faster than the original idiom presented, by a factor of roughly 2.6 in the benchmark above, if still as memory-hungry.

      We are the carpenters and bricklayers of the Information Age.

      The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

      Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.

        I didn't dispute that. :) I just pointed out some of the more subtle points to keep in mind here.

        Makeshifts last the longest.
