Re: Remove Duplicates from Array

by dragonchild (Archbishop)
on Oct 31, 2008 at 21:03 UTC ( #720786=note )

in reply to Remove Duplicates from Array

Don't do that. Use List::MoreUtils and the uniq() function. uniq() will work in cases where yours won't, such as lists of objects and other edge cases. Plus, List::MoreUtils has a ton of really good functions to have on hand.

My criteria for good software:
  1. Does it work?
  2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?

Replies are listed 'Best First'.
Re^2: Remove Duplicates from Array
by mpeever (Friar) on Oct 31, 2008 at 22:59 UTC
    Additionally, List::MoreUtils uniq() preserves your list order. Building a temporary hash and reading back its keys is tempting, but it will almost certainly change the order of the elements.
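To illustrate the ordering point, here is a minimal sketch in plain Perl. The sample data and variable names are made up for the example. It contrasts reading the keys back out of a hash with using the hash only as a "seen" filter.

```perl
use strict;
use warnings;

# Hypothetical sample data with duplicates.
my @input = qw(pear apple pear fig apple kiwi);

# Approach 1: put everything into a hash and read the keys back.
# The duplicates are gone, but keys() returns them in no particular order.
my %as_keys  = map { $_ => 1 } @input;
my @via_keys = keys %as_keys;

# Approach 2: use the hash only to remember what was already seen while
# walking the input in order, so each first occurrence keeps its place.
my %seen;
my @ordered = grep { !$seen{$_}++ } @input;

print "keys(): @via_keys\n";   # some permutation of the four fruits
print "grep:   @ordered\n";    # pear apple fig kiwi
```

So it is the keys() round trip that loses the order, not the use of a hash as such.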

      And List::MoreUtils::uniq can be faster, because it tries to load a compiled library via DynaLoader to implement its functionality. If that fails, it falls back to a plain Perl implementation.

      In my test (Linux, perl 5.8.8, List::MoreUtils 0.21), the original List::MoreUtils::uniq is about 400% faster than my Perl implementation.

      If I rename the library so that List::MoreUtils must rely on its Perl implementation, my solution is about 20% to 25% faster.

      I don't want to argue against List::MoreUtils; but now I wonder about these two (perl) solutions:

      # presented in perlfaq4 - How can I remove duplicate elements from a list or array?
      sub my_uniq {
          my %h;
          grep { !$h{$_}++ } @_;
      }

      # vs.

      # List::MoreUtils::uniq
      sub LM_uniq {
          my %h;
          map { $h{$_}++ == 0 ? $_ : () } @_;
      }

      I can't see an advantage in using map and the ternary operator.
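For anyone who wants to repeat the comparison, a rough sketch using the core Benchmark module might look like this. The test data is an assumption (random integers), not what was used for the numbers above, and timings will vary with perl version and input.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# perlfaq4 style: grep keeps an element the first time its key is seen.
sub my_uniq { my %h; grep { !$h{$_}++ } @_ }

# Pure-Perl form quoted above: map returns the element or an empty list.
sub LM_uniq { my %h; map { $h{$_}++ == 0 ? $_ : () } @_ }

# Hypothetical test data: 10,000 integers drawn from at most 1,000 values.
my @data = map { int(rand(1000)) } 1 .. 10_000;

# Both subs should return the same elements in the same order.
my @from_grep = my_uniq(@data);
my @from_map  = LM_uniq(@data);
print "same result: ", ("@from_grep" eq "@from_map" ? "yes" : "no"), "\n";

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    grep_uniq => sub { my @r = my_uniq(@data) },
    map_uniq  => sub { my @r = LM_uniq(@data) },
});
```

The equality check comes first on purpose: a speed difference only matters if the two subs really agree on the output.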

      edit: text refined

        Well looky there... you know, it never occurred to me to put an empty list into a list with map to skip an entry. I was thinking in terms of Lisp, where adding '() gives you a nil entry. I assumed the result would have been to add 0 or something, even though I knew that map flattens the lists its block returns:

        map { ( 0..$_ ) } ( 1..3 );
        (0, 1, 0, 1, 2, 0, 1, 2, 3)
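A small sketch of the same effect, with made-up data: returning the empty list from the map block contributes nothing to the result, whereas returning undef (closer to the Lisp nil expectation) leaves a placeholder behind.

```perl
use strict;
use warnings;

my @nums = (1 .. 6);

# An empty list from the block adds no elements, so map acts as a filter.
my @odd_via_map  = map  { $_ % 2 == 1 ? $_ : () } @nums;
my @odd_via_grep = grep { $_ % 2 == 1 } @nums;

# Returning undef instead keeps one slot for every skipped element.
my @with_undef = map { $_ % 2 == 1 ? $_ : undef } @nums;

print "@odd_via_map\n";           # 1 3 5
print "@odd_via_grep\n";          # 1 3 5
print scalar(@with_undef), "\n";  # 6
```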


        # List::MoreUtils::uniq
        sub LM_uniq {
            my %h;
            map { $h{$_}++ == 0 ? $_ : () } @_;
        }
        This seems like a strange construction (and yet, as you point out, it's what's in the List::MoreUtils source). Surely $h{$_}++ == 0 is the same as ! $h{$_}++ in this setting? I thought the main point of using the List::MoreUtils uniq was to avoid edge cases, but this seems to do nothing but replace a grep with an essentially equivalent map. (In particular, it doesn't do anything to avoid stringification of objects.)
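A quick sketch of the stringification point. The Point class is hypothetical and exists only for this example. Both forms use the element as a hash key, so two distinct objects with the same string form collapse into one, while plain references with equal contents are both kept because their addresses differ.

```perl
use strict;
use warnings;

# Hypothetical class whose objects stringify to their coordinates.
package Point;
use overload '""' => sub { my $self = shift; "($self->{x}, $self->{y})" };
sub new { my ($class, %args) = @_; return bless {%args}, $class }

package main;
use Scalar::Util qw(refaddr);

sub grep_uniq { my %h; grep { !$h{$_}++ } @_ }
sub map_uniq  { my %h; map  { $h{$_}++ == 0 ? $_ : () } @_ }

my $p1 = Point->new(x => 1, y => 2);
my $p2 = Point->new(x => 1, y => 2);   # distinct object, same string form
my $p3 = Point->new(x => 3, y => 4);

# Hash keys are strings, so $p1 and $p2 count as duplicates in both forms.
my @g = grep_uniq($p1, $p2, $p3);
my @m = map_uniq($p1, $p2, $p3);
print scalar(@g), " ", scalar(@m), "\n";   # 2 2

# Unblessed references stringify to their address, so equal contents
# are not detected as duplicates.
my @plain = grep_uniq({ x => 1 }, { x => 1 });
print scalar(@plain), "\n";                # 2
```

The elements that come back are still the original objects, not strings; only the comparison goes through stringification.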
