PerlMonks

### How do I find if an array has duplicate elements, if so discard it?

by Zombie toddprof (Initiate) on Jul 13, 2002 at 21:33 UTC

#### Description:

```
@A = (1, 2, 3, 4, 5) ;
@B = (6, 7, 7, 8, 9) ;
@C = (10, 11, 12, 13, 15) ;

@D = ( @A , @B , @C );
```

Since @B has duplicates (7, 7), it should not appear in @D. In the end, @D should contain only the sets @A and @C.

#### Answer: contributed by DamnDirtyApe

```
use strict ;
use warnings ;
use Data::Dumper ;

# Return true as soon as any element of the referenced array repeats.
sub has_dups {
    my $arr = shift ;
    my %counter ;
    foreach ( @$arr ) {
        return 1 if $counter{$_}++ ;
    }
    return 0 ;
}

my @A = (1, 2, 3, 4, 5) ;
my @B = (6, 7, 7, 8, 9) ;
my @C = (10, 11, 12, 13, 15) ;

# Keep only the references to arrays that have no duplicates.
my @D = grep { !has_dups($_) } ( \@A, \@B, \@C ) ;

print Dumper( \@D ) ;
```

#### Answer: contributed by Zombie gba

This is some code I applied to an array: @legit is an array of elements, and @uniq is the resulting array of unique elements.

```
my ( %seen, @uniq );
foreach (@legit) {
    # Keep an element only the first time it is seen.
    unless ( $seen{$_}++ ) {
        push( @uniq, $_ );
    }
}
```

#### Answer: contributed by Zombie maraist

I think, as others have suggested, that this depends largely on your data structures. Are we talking about small amounts of data that are processed infrequently? Massive amounts of data that are rarely updated? Or massive data that's updated regularly?

```
# Here's a chart:
# -----------
# small data, rarely updated, order not important
# want simplicity
sub remove_redundant {
    my %hdata = map { ( $_, 1 ) } @_;
    return keys %hdata;
}
```

Note that you might just want to save things in a hash to begin with (and use "keys %data" whenever you want it as an array).

```
# ------------
# small data, often updated (more reads than writes)
# Keep in a hash and convert to a temp array when needed.
$data{$val} = 1;
for my $el ( keys %data ) { ... }
# You can even create an overloaded array object which is really just a hash.

# ------------
# for large data that's rarely updated
push @data, $val if ! grep { $_ eq $val } @data;
```

It's a linear search on every insert, but if we're talking megs of data here, this is MUCH better than building a HUGE intermediate hash and then having to garbage-collect it afterwards.
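The linear-search insert above can be wrapped in a small helper. This is only a sketch: the name `push_unless_present` is hypothetical, and it assumes a `List::Util` recent enough to export `any`, which stops at the first match, whereas `grep` always scans the whole array.

```perl
use strict;
use warnings;
use List::Util qw(any);

# Hypothetical helper (not from the thread): append $val to the array
# behind $data_ref only if an equal string is not already stored there.
sub push_unless_present {
    my ( $data_ref, $val ) = @_;

    # Linear scan that returns as soon as a match is found.
    return 0 if any { $_ eq $val } @$data_ref;

    push @$data_ref, $val;
    return 1;
}

my @data = ( 'a', 'b' );
push_unless_present( \@data, $_ ) for qw(b c a d);
print "@data\n";    # a b c d
```

Each insert still costs a scan of the existing data, so this only pays off when inserts are rare compared with the size of the array.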
```
# -----------
# for large data that's often updated (more reads than writes)
```

It's worthwhile building an overloaded class that stores the data as a hash. Ideally, just use a hash (except that that'll waste a lot of memory). It's better to waste this memory up front than to fragment your dynamic memory pool by constantly generating intermediate scratch pads.

There is one final solution: maintain sorted data, then implement a C function to perform insertions efficiently. You can use either a red-black tree or an insertion sort. You could get very creative, and the result would obviously be generally useful (since you're storing scalars). There might be CPAN modules for this already. The red-black tree is a good compromise between performance and memory use, whereas the insertion sort will give you the best memory utilization.

```
# =-=-=-=-=-=-==-
# One possible mechanism for using a hash for the array is the following:

our %hash_cache;

sub hash_array_push {
    my ( $hash_name, @vals ) = @_;
    for my $val ( @vals ) {
        $hash_cache{$hash_name}{$val} = 1;
    }
}

sub hash_array_del {
    my ($hash_name) = @_;
    delete $hash_cache{$hash_name};
}

sub hash_array_getref {
    my ( $hash_name ) = @_;
    # Note that this returns a reference to the stored hash;
    # the caller can use a reference more efficiently than a copied list.
    return $hash_cache{$hash_name};
}
```

Yes, there's additional function-call overhead for these functions, but you'd have that with just about any solution aside from manually maintaining the hash. Alternatively, you could write it as OO, but OO in Perl adds slightly more function-call overhead (which is probably offset by not needing the symbolic hash-table name lookup).

#### Answer: contributed by hossman

A one-liner from "Effective Perl Programming"...
@uniq = sort keys %{ { map { $_, 1 } @list } };

#### Answer (Clarify): contributed by BrowserUk

Depending on how you're building your data structure, it might make more sense to add an array to @D only if that array has no duplicates. It is easier not to add it than to remove it later.

#### Answer: contributed by rjimlad

Err, okay, my answer, short and sweet: use a hash and array slices:

```
my @a=(1 .. 2);
my @b=(2 .. 3);
my %c;
@c{@a,@b}=(@a,@b);
# A hash slice in scalar context yields its last element, not a count,
# so compare the number of distinct keys with the number of elements.
warn "Duplicates exist!" if keys(%c) != (@a+@b);
```

Of course, in this instance you don't want the last line. Instead, you'd probably want something like:

```
my @d=keys %c;
```

...which will be out of order, but guaranteed duplicate-free.

#### Answer: contributed by snoopy

```
use YAML;
use List::MoreUtils;

my @A = (1, 2, 3, 4, 5) ;
my @B = (6, 7, 7, 8, 9) ;
my @C = (10, 11, 12, 13, 15) ;

# uniq() in scalar context returns the number of distinct elements,
# so an array passes only if that count equals its length.
my @D = grep { List::MoreUtils::uniq(@$_) == @$_ } (\@A, \@B, \@C);

print YAML::Dump(@D);
```
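Several answers remove duplicate elements, while the question asks to discard a whole array when it contains any. A minimal sketch that combines the hash-based check with the "only add it if it is clean" idea, and that flattens the survivors into @D as the question's `@D = ( @A , @B , @C )` suggests, might look like this. It uses no modules. The flattening is an assumption about what the asker wanted, and the loop variable `$set` is illustrative.

```perl
use strict;
use warnings;

# True if the referenced array contains a repeated element.
# Elements are compared as strings, because they are used as hash keys.
sub has_dups {
    my ($arr) = @_;
    my %seen;
    for my $item (@$arr) {
        return 1 if $seen{$item}++;
    }
    return 0;
}

my @A = ( 1, 2, 3, 4, 5 );
my @B = ( 6, 7, 7, 8, 9 );
my @C = ( 10, 11, 12, 13, 15 );

# Assumption: @D should hold the flattened elements of the clean arrays.
# Each candidate is appended only if it has no duplicates.
my @D;
for my $set ( \@A, \@B, \@C ) {
    next if has_dups($set);
    push @D, @$set;
}

print "@D\n";    # 1 2 3 4 5 10 11 12 13 15
```

If references to the surviving arrays are wanted instead of a flat list, replace `push @D, @$set` with `push @D, $set`.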
