
Merge arrays - seeking slicker approach

by puterboy (Scribe)
on Dec 21, 2010 at 14:47 UTC (#878267=perlquestion)
puterboy has asked for the wisdom of the Perl Monks concerning the following question:

I need to merge the elements of several arrays without repeats. Currently, I am using hash slices as follows:

    my %temphash;
    @temphash{@array1, @array2} = ();
    my @mergedkeys = keys %temphash;

I would love to eliminate the temporary hash by using anonymous hashes. But I seem to be missing the right magical combination of braces, ampersands, and percent signs to make it work. For example, I tried to no avail various things like:

 my @mergedkeys = keys %{{@array1, @array2}}

Similarly, is there any way to eliminate the temporary hash when I want to delete any elements of array2 appearing in array1 (along with eliminating any duplicates)? Currently, I am using the following code with a temporary hash:

    my %temphash;
    @temphash{@array2} = ();
    delete @temphash{@array1};
    my @newarray = keys %temphash;

Replies are listed 'Best First'.
Re: Merge arrays - seeking slicker approach
by kennethk (Abbot) on Dec 21, 2010 at 15:00 UTC
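    The issue with my @mergedkeys = keys %{{@array1, @array2}} is that the anonymous hash constructor reads its list as key/value pairs, and your list does not include values for the keys, so every second element is taken as a value instead. This can be fixed using map: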

    my @mergedkeys = keys %{{map {$_ => 1} @array1, @array2}};
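A quick sketch of the difference, using made-up sample data and two hypothetical helper names (keys_as_pairs and keys_as_set are not from the thread). The keys are sorted only to make the output predictable, since hash order is arbitrary:

```perl
use strict;
use warnings;

# Keys of an anonymous hash built directly from the list: the elements
# are consumed as key/value pairs, so every second one becomes a value.
sub keys_as_pairs {
    my $hash = { @_ };
    return sort keys %$hash;
}

# Keys of an anonymous hash in which every element gets an explicit value.
sub keys_as_set {
    my $hash = { map { $_ => 1 } @_ };
    return sort keys %$hash;
}

# Sample data, made up for illustration.
my @array1 = qw(a b);
my @array2 = qw(b c);

print join(' ', keys_as_pairs(@array1, @array2)), "\n";   # a b
print join(' ', keys_as_set(@array1, @array2)),   "\n";   # a b c
```

With the bare list, 'c' ends up as the value stored under 'b' and never shows up as a key.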

    However: assuming this code will have to be maintained, consider that what is slick today may be wholly obscure next month. Rather than using this, I'd recommend using List::MoreUtils' uniq to accomplish this. Its intent will be much more obvious on casual inspection. As well, it will respect order, if that matters to your application.

    On the topic of "eliminat[ing] the temporary hash", you can use scoping to get Perl to do the work for you. Just declare your temporary hash with my in a code block and it will go out of scope and be garbage collected once you exit the block. Something like:

    TEMP_HASH_BLOCK: {
        my %temphash;
        @temphash{@array2} = ();
        delete @temphash{@array1};
        my @newarray = keys %temphash;
    }

    or use do (or a subroutine) if you need to export:

    my @newarray = do {
        my %temphash;
        @temphash{@array2} = ();
        delete @temphash{@array1};
        keys %temphash;
    };

Re: Merge arrays - seeking slicker approach
by roboticus (Chancellor) on Dec 21, 2010 at 15:12 UTC


    I don't know of a way to do so, but List::MoreUtils has a handy function uniq to handle the first case:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use List::MoreUtils qw(uniq);

    my @a = qw(alpha beta gamma);
    my @b = qw(delta kappa gamma);
    my @c = uniq(@a, @b);
    print join(", ", @c), "\n";

    You'll also find some interesting items in List::Util, Hash::Util and Scalar::Util.

    But if you find the temporary a distraction and don't want to include a module for just that purpose, just hide it in a subroutine and go ahead and use the temporary in your subroutine. After all, a temporary isn't necessarily inefficient. So by writing a subroutine with a good name, you can make the code cleaner and self-documenting at the same time, as in the above example. I find that some code can be difficult to read when there's "too much action" happening in a statement, and you may find it difficult to maintain if the statement is too hard to create. So just relax, write a subroutine to do your dirty work, and use it--or include an appropriate module, where someone has already done the dirty work for you!
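A minimal sketch of that "hide it in a subroutine" idea, reusing the hash-slice technique from the original question. The names merge_unique and without are hypothetical, chosen here for illustration, and neither sub preserves order:

```perl
use strict;
use warnings;

# Hypothetical helper: merge any number of lists, dropping repeats.
# The temporary hash lives only inside the sub.
sub merge_unique {
    my %temp;
    @temp{@_} = ();
    return keys %temp;
}

# Hypothetical helper: the unique elements of @$keep that do not
# appear in @$drop.
sub without {
    my ($keep, $drop) = @_;
    my %temp;
    @temp{@$keep} = ();
    delete @temp{@$drop};
    return keys %temp;
}

my @array1 = qw(alpha beta gamma);
my @array2 = qw(delta kappa gamma);

# Sorted only so the output is predictable; hash order is arbitrary.
my @merged   = sort { $a cmp $b } merge_unique(@array1, @array2);
my @newarray = sort { $a cmp $b } without(\@array2, \@array1);

print "@merged\n";     # alpha beta delta gamma kappa
print "@newarray\n";   # delta kappa
```

The calling code now reads as a statement of intent, and the temporaries never leak into the caller's scope.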


    When your only tool is a hammer, all problems look like your thumb.

Re: Merge arrays - seeking slicker approach
by eff_i_g (Curate) on Dec 21, 2010 at 15:01 UTC
Re: Merge arrays - seeking slicker approach
by ikegami (Pope) on Dec 21, 2010 at 18:19 UTC

    I use the following (from perlfaq4):

    my %seen; my @mergedkeys = grep !$seen{$_}++, @array1, @array2;

    It's a bit of a doozie to understand, but it's just one of those patterns you keep seeing and you memorise.

    Bonus: it preserves order.
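For anyone meeting the pattern for the first time, here is a sketch that puts the one-liner next to a spelled-out loop doing the same thing. The sub names uniq_grep and uniq_loop and the sample data are made up for illustration:

```perl
use strict;
use warnings;

# The idiom itself, wrapped in a sub so %seen stays private.
sub uniq_grep {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# The same logic, one step at a time.
sub uniq_loop {
    my %seen;
    my @result;
    for my $item (@_) {
        my $first_time = !$seen{$item};   # true if not counted yet
        $seen{$item}++;                   # record this sighting
        push @result, $item if $first_time;
    }
    return @result;
}

my @array1 = qw(alpha beta gamma);
my @array2 = qw(delta kappa gamma);

print join(", ", uniq_grep(@array1, @array2)), "\n";   # alpha, beta, gamma, delta, kappa
print join(", ", uniq_loop(@array1, @array2)), "\n";   # alpha, beta, gamma, delta, kappa
```

The trick is that the post-increment returns the old count, so the negation is true only the first time a value is seen.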

Re: Merge arrays - seeking slicker approach
by Marshall (Abbot) on Dec 21, 2010 at 18:36 UTC
    I think that roboticus has the ticket with the "uniq" function for question #1. List::MoreUtils is an XS module and should run very fast.

    As a thought for question #2, here is some Perl 5.10 code using the smart match operator (~~).

    #!/usr/bin/perl -w
    use strict;
    use 5.10.0;

    my @a = qw(alpha beta gamma zulu);
    my @b = qw(delta alpha kappa gamma beta);

    my @c = grep{!($_ ~~ @a)}@b;
    print "@c"; #prints: delta kappa

    # @c is @b after removing any element
    # that was contained in @a

    This is my first program with this new Perl 5.10 "~~" operator and I have no idea of how "smart" or not "smart" this new operator really is, especially in a loop like above. I would have to run some benchmarks, but since this is one line of code, I leave that to the OP to experiment with and present this as just an idea.
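On the "how smart" question: matching a scalar against an array with ~~ has to scan the array, so the grep above does up to one pass over @a for every element of @b. It is also worth knowing that smartmatch was marked experimental as of Perl 5.18. Below is a sketch of a hash-based alternative that avoids both issues. The sub name minus is hypothetical, and unlike the smartmatch version it also drops repeats within @b, which is what the original question asked for:

```perl
use strict;
use warnings;

# Hypothetical helper: elements of @$from that are not in @$exclude,
# in their original order, with repeats dropped.
sub minus {
    my ($from, $exclude) = @_;
    my %seen = map { $_ => 1 } @$exclude;   # pre-mark everything to exclude
    return grep { !$seen{$_}++ } @$from;    # keep first sighting of the rest
}

my @a = qw(alpha beta gamma zulu);
my @b = qw(delta alpha kappa gamma beta);

my @c = minus(\@b, \@a);
print "@c\n";   # delta kappa
```

Pre-seeding the "seen" hash with the excluded values lets a single grep handle both the exclusion and the de-duplication.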
Re: Merge arrays - seeking slicker approach
by ikegami (Pope) on Dec 21, 2010 at 18:20 UTC

    I would love to eliminate the temporary hash by using anonymous hashes.

    We often get such questions, yet Perl provides a mechanism to hide away complex code: subs! Move your code into a sub, and you won't have temporaries lying around. (As mentioned, this has already been done for you as List::MoreUtils' uniq.)

Re: Merge arrays - seeking slicker approach
by JavaFan (Canon) on Dec 21, 2010 at 17:41 UTC
    I consider this to be an idiom:
    my @mergedkeys = do {my %t; grep !$t{$_}++, @array1, @array2};
    There's a temporary hash, but its scope is limited.

    I think I first saw it used by Jon Orwant (perhaps in Mastering Algorithms with Perl), and later in the Perl Cookbook. I think it's also in the perlfaq. I must have used it in a thousand programs.

Node Type: perlquestion [id://878267]
Approved by kennethk