PerlMonks  

Re: Remove Duplicates from Array

by dragonchild (Archbishop)
on Oct 31, 2008 at 21:03 UTC


in reply to Remove Duplicates from Array

Don't do that. Use List::MoreUtils and the uniq() function. uniq() will work in cases where yours won't, such as lists of objects and other edge cases. Plus, List::MoreUtils has a ton of really good functions to have on hand.


My criteria for good software:
  1. Does it work?
  2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?


Re^2: Remove Duplicates from Array
by mpeever (Friar) on Oct 31, 2008 at 22:59 UTC
    Additionally, List::MoreUtils uniq() will maintain your list order. Building a temporary hash and returning its keys is tempting, but it will almost certainly change the order of elements.
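To make the ordering point concrete, here is a minimal sketch. The sample data and the names `keys_uniq` and `ordered_uniq` are illustrative and not from the thread. It assumes the order is lost by taking `keys` of the hash; a temporary hash used only as a "seen" marker keeps first-seen order.

```perl
use strict;
use warnings;

my @input = qw(pear apple pear fig apple banana);

# Order-losing approach: collect the keys of a temporary hash.
# Perl makes no promise about the order in which keys() returns them.
sub keys_uniq {
    my %seen;
    @seen{@_} = ();
    return keys %seen;
}

# Order-preserving approach: keep an element only the first time it is seen.
sub ordered_uniq {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

my @unordered = keys_uniq(@input);
my @ordered   = ordered_uniq(@input);

print "keys: @unordered\n";    # the same four words, order unspecified
print "grep: @ordered\n";      # pear apple fig banana
```

Both return the same set of elements; only the second guarantees that they come back in their original order.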

      And List::MoreUtils::uniq can be faster, because it tries to load a compiled library via DynaLoader to implement its functionality. If that fails, it falls back to a plain Perl implementation.

      In my test (Linux, perl 5.8.8, List::MoreUtils 0.21) the original List::MoreUtils::uniq is about 400% faster than my Perl implementation.

      If I rename the library so that List::MoreUtils must rely on its Perl implementation, my solution is about 20% to 25% faster.

      I don't want to argue against List::MoreUtils; but now I wonder about these two (perl) solutions:

      # presented in perlfaq4 - How can I remove duplicate elements from a list or array?
      sub my_uniq {
          my %h;
          grep { !$h{$_}++ } @_;
      }

      # vs.

      # List::MoreUtils::uniq
      sub LM_uniq {
          my %h;
          map { $h{$_}++ == 0 ? $_ : () } @_;
      }

      I can't see any advantage in using map and the ternary operator.
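One way to check the timing claims for the two pure-Perl versions is the core Benchmark module. This is only a sketch: the test data (10,000 random integers with at most 1,000 distinct values) and the labels `grep_uniq` and `map_uniq` are assumptions, and the numbers will vary with the perl version and the data. It does not measure the compiled version of List::MoreUtils::uniq.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# The two candidates quoted above.
sub my_uniq { my %h; grep { !$h{$_}++ } @_ }
sub LM_uniq { my %h; map { $h{$_}++ == 0 ? $_ : () } @_ }

# Hypothetical test data: 10_000 values drawn from 1_000 distinct integers.
my @data = map { int rand 1_000 } 1 .. 10_000;

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    grep_uniq => sub { my @u = my_uniq(@data) },
    map_uniq  => sub { my @u = LM_uniq(@data) },
});
```

The table shows the rate of each candidate and the relative difference between them in percent.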

      edit: text refined

        Well looky there... you know, it never occurred to me to put an empty list into a list with map to skip an entry. I was thinking in terms of Lisp, where adding '() gives you a nil entry. I assumed the result would have been to add 0 or something, even though I knew that

        map { ( 0..$_ ) } ( 1..3 );
        yields
        (0, 1, 0, 1, 2, 0, 1, 2, 3)

        Cool.
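A small sketch of the flattening behaviour described here. The variable names and the even-number filter are illustrative only. Whatever list the map block returns is spliced into the result, so an empty list contributes nothing, while undef still occupies a slot.

```perl
use strict;
use warnings;

# Each block result is flattened into the output list.
my @ranges = map { ( 0 .. $_ ) } ( 1 .. 3 );
print "@ranges\n";    # 0 1 0 1 2 0 1 2 3

# Returning the empty list contributes nothing, so the element is skipped.
my @evens = map { $_ % 2 == 0 ? $_ : () } ( 1 .. 6 );
print "@evens\n";     # 2 4 6

# Returning undef, by contrast, leaves a placeholder element behind.
my @with_undef = map { $_ % 2 == 0 ? $_ : undef } ( 1 .. 6 );
print scalar(@with_undef), "\n";    # 6
```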

        # List::MoreUtils::uniq
        sub LM_uniq {
            my %h;
            map { $h{$_}++ == 0 ? $_ : () } @_;
        }
        This seems like a strange construction (and yet, as you point out, it's what's in the List::MoreUtils source). Surely $h{$_}++ == 0 is the same as ! $h{$_}++ in this setting? I thought the main point of using the List::MoreUtils uniq was to avoid edge cases, but this seems to do nothing but replace a grep with an essentially equivalent map. (In particular, it doesn't do anything to avoid stringification of objects.)
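The stringification point can be illustrated with a short sketch. The `Named` class, its overloaded string conversion, and the names `grep_uniq` and `map_uniq` are hypothetical and exist only for this example. Both variants use the element as a hash key, so they compare elements by their string form and behave the same way.

```perl
use strict;
use warnings;

sub grep_uniq { my %h; grep { !$h{$_}++ } @_ }
sub map_uniq  { my %h; map { $h{$_}++ == 0 ? $_ : () } @_ }

{
    # Hypothetical class whose objects stringify to their name.
    package Named;
    use overload '""' => sub { $_[0]{name} };
    sub new { my ($class, $name) = @_; return bless { name => $name }, $class }
}

my $first  = Named->new('alice');
my $second = Named->new('alice');    # distinct object, same string form
my $other  = Named->new('bob');

my @from_grep = grep_uniq($first, $second, $other);
my @from_map  = map_uniq($first, $second, $other);

# Both versions key the hash on the string form, so $second is dropped.
print scalar(@from_grep), " ", scalar(@from_map), "\n";    # 2 2

# Plain references stringify to their address, so two separate array
# references with equal contents are both kept.
my @refs = grep_uniq([1], [1]);
print scalar(@refs), "\n";    # 2
```

Whether treating objects with equal string forms as duplicates is right depends on the class; neither pure-Perl variant changes that.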
