http://www.perlmonks.org?node_id=1064180


in reply to Best method to diff very large array efficiently

Using hash slices as you demonstrated is the fastest way I know. (But you didn't show us your data.)

But I'm confused about the sorts.

a) Why do you think you need them? Sorting is comparatively slow!

b) Do you really have numeric data? Otherwise <=> won't help!
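To illustrate point b), here is a small sketch with made-up data (the word list is purely hypothetical). Non-numeric strings numify to 0, so <=> sees them all as equal, while cmp compares them as strings:

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the original post
my @words = qw(pear apple fig);

# String comparison gives a meaningful order
my @by_string = sort { $a cmp $b } @words;
print "@by_string\n";    # apple fig pear

# Numeric comparison of non-numeric strings: both sides numify to 0
my $cmp_result = do {
    no warnings 'numeric';    # silence "isn't numeric" warnings for the demo
    "pear" <=> "apple";
};
print "$cmp_result\n";   # 0, i.e. "equal" as far as <=> is concerned
```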

For completeness:

If your data consists only of scalars that stringify uniquely (i.e. no references), you don't need to populate the values. Just assign an empty list to the slice, @hash{@arr1} = (), and work with the keys.
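A minimal sketch of that idea, assuming plain string data (the array contents and the names %seen and @missing are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample data
my @arr1 = qw(a b c d);
my @arr2 = qw(b c d e f);

# Create the keys only; every value stays undef
my %seen;
@seen{@arr1} = ();

# Use exists, not truth, because the values are undef
my @missing = grep { !exists $seen{$_} } @arr2;
print "@missing\n";    # e f
```

Because the values are undef, the lookup has to use exists; testing $seen{$_} for truth would report every element as missing.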

I also think you want to calculate the symmetric difference. In that case the other direction, @arr2 \ @arr1 (the elements of @arr2 that are not in @arr1), is still missing from your code.
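If the symmetric difference is really what is wanted, it could be sketched like this. The data and the variable names are hypothetical, and the sketch assumes plain scalars:

```perl
use strict;
use warnings;

# Hypothetical sample data
my @arr1 = qw(a b c d);
my @arr2 = qw(b c d e f);

# One lookup hash per array, keys only
my (%in1, %in2);
@in1{@arr1} = ();
@in2{@arr2} = ();

my @only_in_1 = grep { !exists $in2{$_} } @arr1;    # @arr1 \ @arr2
my @only_in_2 = grep { !exists $in1{$_} } @arr2;    # @arr2 \ @arr1

# The symmetric difference is the union of both one-sided differences
my @sym_diff = (@only_in_1, @only_in_2);
print "@sym_diff\n";    # a e f
```

Note that grep over the arrays keeps any duplicates they contain; grepping over keys %in1 and keys %in2 instead would give unique results, at the cost of losing the original order.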

Cheers Rolf

(addicted to the Perl Programming Language)

PS: Maybe of interest Using hashes for set operations...

Re^2: Best method to diff large array
by newbieperlperson (Acolyte) on Nov 25, 2013 at 05:10 UTC
    Hi Rolf,

    Thank you for responding.

    Good point on the sort. It is not required, so I will remove it from my example.

    The goal for the code is to find what data is missing from @arr_1.

    AJ