
Re: Best method to diff large array

by LanX (Bishop)
on Nov 25, 2013 at 04:29 UTC

in reply to Best method to diff very large array efficiently

Using hash slices as you demonstrated is the fastest way I know (but you didn't show us your data).

But I'm confused about the sorts.

a) Why do you think you need them? Sorting is comparatively slow!

b) Do you really have numeric data? Otherwise <=> won't help!
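
To illustrate point b), here is a small sketch of how the two comparison operators differ. The sample values are made up for illustration; they are not from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample data (not from the original post).
my @ids = qw(10 9 100 2);

# <=> compares numerically, so it only makes sense for numeric data.
my @numeric = sort { $a <=> $b } @ids;

# cmp compares as strings, character by character.
my @stringy = sort { $a cmp $b } @ids;

print "numeric: @numeric\n";   # numeric: 2 9 10 100
print "string:  @stringy\n";   # string:  10 100 2 9
```

With non-numeric strings, <=> treats the values as 0 (and warns under "use warnings"), so the resulting order is not meaningful.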

For completeness:

If your data consists only of scalars that stringify uniquely (i.e. no references), you don't need to populate the values; just assign an empty list to the slice, @hash{@arr1} = (), and use the keys.
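
A minimal sketch of that idiom, assuming plain string data. The array contents and the names %seen and @missing_from_arr1 are made up for illustration, since the original data was not shown.

```perl
use strict;
use warnings;

# Hypothetical sample data (the original post did not show its arrays).
my @arr1 = qw(apple banana cherry);
my @arr2 = qw(banana cherry date elderberry);

# Hash slice assigned an empty list: every element of @arr1 becomes a key
# whose value is undef. Only the existence of the key matters.
my %seen;
@seen{@arr1} = ();

# Elements of @arr2 that do not occur in @arr1, in their original order.
my @missing_from_arr1 = grep { !exists $seen{$_} } @arr2;

print "@missing_from_arr1\n";   # date elderberry
```

Because the values are undef, membership has to be tested with exists rather than by checking the truth of $seen{$_}.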

I also think you want the symmetric difference, in which case @arr2 \ @arr1 is missing from your code.
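
For reference, the symmetric difference could be sketched like this. It assumes uniquely stringifying scalars, and all data and variable names are illustrative.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my @arr1 = qw(a b c d);
my @arr2 = qw(c d e f);

# One lookup hash per array; only the keys are used.
my (%in1, %in2);
@in1{@arr1} = ();
@in2{@arr2} = ();

my @only_in_1 = grep { !exists $in2{$_} } @arr1;   # @arr1 \ @arr2
my @only_in_2 = grep { !exists $in1{$_} } @arr2;   # @arr2 \ @arr1

# The symmetric difference is the union of both one-sided differences.
my @sym_diff = (@only_in_1, @only_in_2);

print "@sym_diff\n";   # a b e f
```

If either array contains duplicates, they will show up repeatedly in the result; taking the keys of a further hash would remove them.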

Cheers Rolf

(addicted to the Perl Programming Language)

PS: Maybe of interest: Using hashes for set operations...

Re^2: Best method to diff large array
by newbieperlperson (Acolyte) on Nov 25, 2013 at 05:10 UTC
    Hi Rolf,

    Thank you for responding.

    Good point on the sort; it is not required, so I will remove it from my example.

    The goal for the code is to find what data is missing from @arr_1.

