Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Re^2: matching values in arrays

by Excalibor (Pilgrim)
on Dec 14, 2004 at 11:25 UTC ( [id://414681]=note: print w/replies, xml ) Need Help??


in reply to Re: matching values in arrays
in thread matching values in arrays

This solution is nice, and the hash is probably the way to go with it, but if the arrays contain repeated numbers, (i.e. their intersection is not void) it will probably fail (as it will always mark the last array, instead of the first one, or all of them)...

Expanding your solution a bit will yield this:

my %which; do { push @{ $which{$_} } => "array1" } for @array1; do { push @{ $which{$_} } => "array2" } for @array2; do { push @{ $which{$_} } => "array3" } for @array3; my @pairs = @array4; while (my ($one,$two) = splice(@pairs,0,2)) { print "$one and $two are in (@{$which{$one}}) and (@{$which{$two} +}) respectively\n"; }

which is more generic and works even if the numbers are shared by any of the arrays...

Great idea, anyway!

--
our $Perl6 is Fantastic;

Replies are listed 'Best First'.
Re^3: matching values in arrays
by Animator (Hermit) on Dec 14, 2004 at 15:46 UTC

    Yeah, but it uses a lot more memory...

    Another approach would be to use grep, loop the fourth array and in each iteration use grep on the other arrays...

      It uses pretty little space (not just in this particular example, but in general...) considering what's doing... Think of it as an index on the joins of the three arrays...

      For really big arrays using greps on every iteration would make the operation really cumbersome and memory would be used anyway for the temporaries for grep on each of the arrays... This approach allows logarithmic access to the key once the index is done, yours would be at least of order O(N^2)...

      Example:

      # some are shared by others... @array1 = (1 .. 100001); @array2 = (100001 .. 200001); @array3 = (200001 .. 300000); for $i ( 1 .. 100000, 1 .. 10000 ) { $a = int( rand( 3 ) ) + 1; # dark magic, beware... push @array4, ${*{"array$a"}}[$i]; } my %which; do { push @{ $which{$_} } => "array1" } for @array1; do { push @{ $which{$_} } => "array2" } for @array2; do { push @{ $which{$_} } => "array3" } for @array3; my @pairs = @array4; while (my ($one,$two) = splice(@pairs,0,2)) { print "." if "$one and $two are in (@{$which{$one}}) and (@{$which{ +$two}}) respectively\n"; } __END__ This renders (on my box): 4.05user 0.15system 0:04.31elapsed 97%CPU (0avgtext+0avgdata 0maxresid +ent)k 0inputs+0outputs (0major+22995minor)pagefaults 0swaps

      with bigger arrays it will grow bigger, of course, but if you have that many data you will have a lot of RAM... memory is cheap nowadays, cheaper than CPU cycles, I think...

      best regards

      --
      our $Perl6 is Fantastic;

        I only suggested an other approach... in case the person doesn't have much memory but has enough time to run it...

        The orignal arrays (accoring to Devel::Size::total_size) uses 152 bytes, the @which-array (posted by ysth), uses 629 bytes. Your hash of array references uses 2200 bytes.

        Using the same code but whit your example arrays: 2_000_072 bytes (for the first two and 2_000_052 for the last one), @which-array: 11_397_188, hash of array references: 44_886_169 bytes, which is 3,9 times more then the array. And that is in my opinion a lot more.

        Used code:

        Something that would be better would be to use a hash of a single scalar and append the text ("array1 ", "array2 ", ...) to it (instead of using an anonymous array).

        It will use less memory and it should be faster.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://414681]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others lurking in the Monastery: (3)
As of 2024-03-19 04:42 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found