Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

Speed of comparison of two one-dimensional arrays

by rsFalse (Hermit)
on Mar 12, 2019 at 14:10 UTC ( #1231163=perlquestion: print w/replies, xml ) Need Help??

rsFalse has asked for the wisdom of the Perl Monks concerning the following question:

Hello.

Yesterday I was comparing pairs of arrays of integers (0 < x < 1e9), if they contain the same multiset of integers. Every pair consisted of arrays of the same size, so before comparison I sorted them alphabetically (numeric sort subroutine seems to be redundant here).

Firstly, I tried to compare by smartmatch '~~' operator, as I thought it could be the fastest implemented way.
@{ $h{ $key } } ~~ @{ $g{ $key } } and ... ;
Secondly, I compared arrays by string equality, by interpolating upto-9-digit integers into strings.
"@{ $h{ $key } }" eq "@{ $g{ $key } }" and ... ;
The second way have shown ~1.6x faster times.

Shouldn't smartmatch outrun string interpolation at that task? And which methods are fastest to accomplish comparison?

Replies are listed 'Best First'.
Re: Speed of comparison of two one-dimensional arrays
by hippo (Chancellor) on Mar 12, 2019 at 14:59 UTC
    Shouldn't smartmatch outrun string interpolation at that task?

    I'd say probably not. Smartmatch is as generic as possible and without knowing the implementation details my expectation would be that the more generic method would be slower. Anyway, since smartmatch is experimental I wouldn't be using it in code intended to be stable. YMMV.

    And which methods are fastest to accomplish comparison?

    That will almost certainly depend on your corpus. If your arrays differ in the first element then any iterative comparison will beat a whole-array one hands down, for example. As usual, the way to be sure is to benchmark against representative sample data.

Re: Speed of comparison of two one-dimensional arrays
by thanos1983 (Parson) on Mar 12, 2019 at 14:20 UTC

    Hello rsFalse,

    I am not an expert on this area so I will leave another Monk more experienced to answer your question.

    I found this older question (fastest way to compare two arrays) that contains a few suggestions. Would you give it a try and time them also?

    Hope this helps, BR.

    Seeking for Perl wisdom...on the process of learning...not there...yet!
Re: Speed of comparison of two one-dimensional arrays
by tobyink (Canon) on Mar 12, 2019 at 16:23 UTC

    I'm not especially surprised that string interpolation is faster. If you want the absolute fastest way, I imagine that a custom function written in XS would be fastest. I might have a play with that idea and see what I can find out.

      An Inline::C implementation runs about three times faster than the string interpolation version for integer comparison, and about 2.5 times faster for string comparisons.

      use v5.16; use strict; use warnings; use Inline 'C'; sub arrays_same_pp { "@{$_[0]}" eq "@{$_[1]}"; } our @x = map { int(rand() * 1_000_000) } 1..10; our @y = @x; use Benchmark 'cmpthese'; cmpthese -1, { PP => '::arrays_same_pp(\@::x, \@::y)', XS => '::arrays_same_i(\@::x, \@::y)', }; __END__ __C__ int arrays_same_i (SV* x, SV* y) { AV* arrx = (AV*)SvRV(x); AV* arry = (AV*)SvRV(y); if (arrx == arry) { return 2; } int len = av_len(arrx); if (len != av_len(arry)) { return 0; } int i; for (i=0; i<=len; i++) { SV** elemx = av_fetch(arrx, i, 0); SV** elemy = av_fetch(arry, i, 0); int ix = SvIV(*elemx); int iy = SvIV(*elemy); if (ix != iy) { return 0; } } return 1; } int arrays_same_s (SV* x, SV* y) { AV* arrx = (AV*)SvRV(x); AV* arry = (AV*)SvRV(y); if (arrx == arry) { return 2; } int len = av_len(arrx); if (len != av_len(arry)) { return 0; } int i; for (i=0; i<=len; i++) { SV** elemx = av_fetch(arrx, i, 0); SV** elemy = av_fetch(arry, i, 0); STRLEN dummy; char* strx = SvPV(*elemx, dummy); char* stry = SvPV(*elemy, dummy); if (strcmp(strx, stry) != 0) { return 0; } } return 1; }

      I might clean this up and release to CPAN if there's nothing similar already there.

        Can you say a few words as to how this compiles?

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1231163]
Approved by marto
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others contemplating the Monastery: (4)
As of 2020-01-21 05:33 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    Notices?