PerlMonks  

Re^2: Comparing two arrays

by baxy77bax (Chaplain)
on Dec 15, 2013 at 12:58 UTC ( #1067231=note )


in reply to Re: Comparing two arrays
in thread Comparing two arrays

Thank you so much for the code and the benchmark; after seeing this I'll try to implement the strategy. However, what I'm wondering now is where the speed comes from. I remember reading somewhere that finding a given bit in a bit-string means iterating through the memory block, whereas accessing an array element is constant time. Is it possible that these constants are so large that it is cheaper to scan linearly through memory blocks, or did I mix something up (which is probably the case)? Could you please educate me a "bit" :)

Thank you

baxy


Re^3: Comparing two arrays
by BrowserUk (Pope) on Dec 15, 2013 at 13:47 UTC
    i'm wondering now is where does the speed come from.

    Perhaps the simplest way to demonstrate the difference is to look at the number of opcodes generated to compare and count two sets of 64 bits stored as: two arrays; two strings of ASCII 1s and 0s; two bitstrings of 64 bits each. You don't need to understand the opcodes to see the reduction.

    Moving as much of the work (the looping) as possible into the optimised, compiled-C opcodes saves huge swaths of time and processor cycles:

    1. Arrays:
      C:\test>perl -MO=Terse -E"@a=map{int rand 2}1..64;@b=map{int rand 2}1..64; for my$a(@a){ for my $b(@b){ $a==$b and ++$count }}"
      LISTOP (0x34e7c58) leave [1]
      OP (0x34eec40) enter
      COP (0x34e7c98) nextstate
      BINOP (0x34e7d00) aassign [9]
      UNOP (0x34e7d70) null [142]
      OP (0x34e7d40) pushmark
      LOGOP (0x34e7e90) mapwhile [8]
      LISTOP (0x34e7f00) mapstart
      OP (0x34e7ed0) pushmark
      UNOP (0x34e7e58) null
      UNOP (0x34e7f40) null
      LISTOP (0x34e80d0) scope
      OP (0x34e8110) null [177]
      UNOP (0x34e8178) int [4]
      UNOP (0x34e81b0) rand [3]
      SVOP (0x34e81e8) const [7] IV (0x33cca88) 2
      UNOP (0x34e7f78) rv2av
      SVOP (0x34e7e20) const [26] AV (0x33c7570)
      UNOP (0x34e7de0) null [142]
      OP (0x34e7db0) pushmark
      UNOP (0x34e8220) rv2av [2]
      PADOP (0x34e8258) gv GV (0xa76c8) *a
      COP (0x34e7660) nextstate
      BINOP (0x34e76c8) aassign [18]
      UNOP (0x34e7738) null [142]
      OP (0x34e7708) pushmark
      LOGOP (0x34e7858) mapwhile [17]
      LISTOP (0x34e78c8) mapstart
      OP (0x34e7898) pushmark
      UNOP (0x34e7820) null
      UNOP (0x34e7908) null
      LISTOP (0x34e7a98) scope
      OP (0x34e7ad8) null [177]
      UNOP (0x34e7b40) int [13]
      UNOP (0x34e7b78) rand [12]
      SVOP (0x34e7bb0) const [16] IV (0x33c6e30) 2
      UNOP (0x34e7940) rv2av
      SVOP (0x34e77e8) const [27] AV (0x33c6830)
      UNOP (0x34e77a8) null [142]
      OP (0x34e7778) pushmark
      UNOP (0x34e7be8) rv2av [11]
      PADOP (0x34e7c20) gv GV (0x33c6f40) *b
      COP (0x34eecb0) nextstate
      BINOP (0x34eed18) leaveloop
      LOOP (0x34eee30) enteriter [19]
      OP (0x34eee88) null [3]
      UNOP (0x34eef28) null [142]
      OP (0x34eeef8) pushmark
      UNOP (0x34ef568) rv2av [21]
      PADOP (0x34e75b8) gv GV (0xa76c8) *a
      UNOP (0x34eed58) null
      LOGOP (0x34eed90) and
      OP (0x34eee00) iter
      LISTOP (0x34eef68) lineseq
      COP (0x34eefa8) nextstate
      BINOP (0x34ef010) leaveloop
      LOOP (0x34ef128) enteriter [22]
      OP (0x34ef180) null [3]
      UNOP (0x34ef220) null [142]
      OP (0x34ef1f0) pushmark
      UNOP (0x34ef4c8) rv2av [24]
      PADOP (0x34ef500) gv GV (0x33c6f40) *b
      UNOP (0x34ef050) null
      LOGOP (0x34ef088) and
      OP (0x34ef0f8) iter
      LISTOP (0x34ef260) lineseq
      COP (0x34ef2a0) nextstate
      UNOP (0x34ef308) null
      LOGOP (0x34ef340) and
      BINOP (0x34ef428) eq
      OP (0x34ef498) padsv [19]
      OP (0x34ef468) padsv [22]
      UNOP (0x34ef380) preinc
      UNOP (0x34ef3b8) null [15]
      PADOP (0x34ef3f0) gvsv GV (0x33c5ed0) *count
      OP (0x34ef0c8) unstack
      OP (0x34eedd0) unstack
      -e syntax OK
    2. Strings:
      C:\test>perl -MO=Terse -E"$a=join'',map{int rand 2}1..64;@b=map{int rand 2}1..64; $count=($a&$b)=~tr[1][]"
      LISTOP (0x3447bc0) leave [1]
      OP (0x344f178) enter
      COP (0x3447c00) nextstate
      BINOP (0x3447c68) sassign
      LISTOP (0x3447cd8) join [8]
      OP (0x3447ca8) pushmark
      SVOP (0x3448118) const [22] PV (0x332ca20) ""
      LOGOP (0x3447d88) mapwhile [7]
      LISTOP (0x3447df8) mapstart
      OP (0x3447dc8) pushmark
      UNOP (0x3447d50) null
      UNOP (0x3447e38) null
      LISTOP (0x3447fc8) scope
      OP (0x3448008) null [177]
      UNOP (0x3448070) int [3]
      UNOP (0x34480a8) rand [2]
      SVOP (0x34480e0) const [6] IV (0x332cb58) 2
      UNOP (0x3447e70) rv2av
      SVOP (0x3447d18) const [23] AV (0x3327640)
      UNOP (0x3448150) null [15]
      PADOP (0x3448188) gvsv GV (0xa76a8) *a
      COP (0x34475c8) nextstate
      BINOP (0x3447630) aassign [17]
      UNOP (0x34476a0) null [142]
      OP (0x3447670) pushmark
      LOGOP (0x34477c0) mapwhile [16]
      LISTOP (0x3447830) mapstart
      OP (0x3447800) pushmark
      UNOP (0x3447788) null
      UNOP (0x3447870) null
      LISTOP (0x3447a00) scope
      OP (0x3447a40) null [177]
      UNOP (0x3447aa8) int [12]
      UNOP (0x3447ae0) rand [11]
      SVOP (0x3447b18) const [15] IV (0x3326f00) 2
      UNOP (0x34478a8) rv2av
      SVOP (0x3447750) const [24] AV (0x3326900)
      UNOP (0x3447710) null [142]
      OP (0x34476e0) pushmark
      UNOP (0x3447b50) rv2av [10]
      PADOP (0x3447b88) gv GV (0x3327010) *b
      COP (0x344f1e8) nextstate
      BINOP (0x344f250) sassign
      UNOP (0x344f290) null
      BINOP (0x344f3e8) bit_and [21]
      UNOP (0x344f498) null [15]
      PADOP (0x34474e0) gvsv GV (0xa76a8) *a
      UNOP (0x344f428) null [15]
      PADOP (0x344f460) gvsv GV (0x3327010) *b
      PVOP (0x344f3b0) trans
      UNOP (0x3447518) null [15]
      PADOP (0x3447550) gvsv GV (0x33262d0) *count
      -e syntax OK
    3. Bits:
      C:\test>perl -MO=Terse -E"$a=int rand 2**64;$b=int rand 2**64; $count = unpack '%32b*', $a & $b"
      LISTOP (0x33e7460) leave [1]
      OP (0x33e6e60) enter
      COP (0x33e74a0) nextstate
      BINOP (0x33e7508) sassign
      UNOP (0x33e7548) int [4]
      UNOP (0x33e7580) rand [3]
      SVOP (0x33e75b8) const [13] NV (0x32ca498) 1.84467440737096e+019
      UNOP (0x33e76a0) null [15]
      PADOP (0x33e76d8) gvsv GV (0x107668) *a
      COP (0x33e71f0) nextstate
      BINOP (0x33e7258) sassign
      UNOP (0x33e7298) int [8]
      UNOP (0x33e72d0) rand [7]
      SVOP (0x33e7308) const [14] NV (0x32ca5a0) 1.84467440737096e+019
      UNOP (0x33e73f0) null [15]
      PADOP (0x33e7428) gvsv GV (0x32ca510) *b
      COP (0x33e6ed0) nextstate
      BINOP (0x33e6f38) sassign
      LISTOP (0x33e6fa8) unpack
      OP (0x33e6f78) null [3]
      SVOP (0x33e7108) const [15] PV (0x32ca600) "%32b*"
      BINOP (0x33e6fe8) bit_and [12]
      UNOP (0x33e7098) null [15]
      PADOP (0x33e70d0) gvsv GV (0x107668) *a
      UNOP (0x33e7028) null [15]
      PADOP (0x33e7060) gvsv GV (0x32ca510) *b
      UNOP (0x33e7140) null [15]
      PADOP (0x33e7178) gvsv GV (0x32ca5d0) *count
      -e syntax OK
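
    To check that the three representations really count the same thing, here is a minimal sketch (not taken from the thread's benchmark). It builds one pair of random 64-flag sets and counts the positions set in both, three ways. The variable names and the use of pack 'b*' to build the bitstrings are assumptions for illustration, and the array version compares position by position rather than with a nested loop.

```perl
use strict;
use warnings;

# Two random sets of 64 flags, kept as plain arrays of 0/1.
my @x = map { int rand 2 } 1 .. 64;
my @y = map { int rand 2 } 1 .. 64;

# 1. Arrays: one Perl-level block evaluation per position.
my $count_array = grep { $x[$_] && $y[$_] } 0 .. $#x;

# 2. Strings of ASCII '0'/'1': string AND, then count the '1' characters.
my ( $sx, $sy ) = ( join( '', @x ), join( '', @y ) );
my $anded        = $sx & $sy;
my $count_string = ( $anded =~ tr/1// );

# 3. Bitstrings: 8 bytes each; AND them and let unpack's checksum count set bits.
my ( $bx, $by ) = ( pack( 'b*', $sx ), pack( 'b*', $sy ) );
my $count_bits = unpack '%32b*', $bx & $by;

print "$count_array $count_string $count_bits\n";
```

    All three numbers printed should agree; only the amount of work done at the Perl level differs.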

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re^3: Comparing two arrays
by hdb (Parson) on Dec 15, 2013 at 14:03 UTC

    Just be careful to create your data as bitstrings in the first place. If you create arrays and then turn them into bitstrings to do the comparison, then it is not that fast:

    use strict;
    use warnings;
    use Benchmark 'cmpthese';

    sub create { map {rand() < $_[1] ? 1 : 0} 1..$_[0] }

    sub compare2a { # first find 1s in x, then check in ys
        my $x = shift;
        my $n = shift;
        my @nxs = grep { $x->[$_] } 0..$n-1;
        return map { scalar grep {$_} @{$_}[@nxs] } @_;
    }

    sub compare4 { # bitstrings
        my $x = shift;
        $x = pack 'b*', join '', @$x;
        return map { unpack '%32b*', ( $x & pack 'b*', join'',@$_ ) } @_;
    }

    my $n  = 15000;
    my $p  = 0.005;
    my $ny = 10;

    my @x  = create $n, $p;
    my @ys = map { [ create $n, $p ] } 1..$ny;

    my @r2a = compare2a \@x, $n, @ys;
    my @r4  = compare4 \@x, @ys;

    print "compare2a: @r2a\n";
    print "compare4: @r4\n";

    cmpthese( -5, {
        compare2a => sub{ compare2a \@x, $n, @ys },
        compare4  => sub{ compare4 \@x, @ys },
    } );
    Result:
                Rate  compare4 compare2a
    compare4   246/s        --      -55%
    compare2a  543/s      120%        --
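
    The conversion cost can be moved out of the comparison by packing each array once, up front. A minimal sketch, assuming the same create helper as in the benchmark; to_bits and compare_packed are hypothetical names that are not part of the code above, and no timing is claimed for it here.

```perl
use strict;
use warnings;

# Same generator as in the benchmark: $_[0] flags, each 1 with probability $_[1].
sub create { map { rand() < $_[1] ? 1 : 0 } 1 .. $_[0] }

# Hypothetical helper: pack an array of 0/1 flags into a bitstring, once.
sub to_bits { pack 'b*', join '', @{ $_[0] } }

# Hypothetical helper: count the bits set in both $x and each remaining bitstring.
sub compare_packed {
    my $x = shift;
    return map { unpack '%32b*', $x & $_ } @_;
}

my ( $n, $p, $ny ) = ( 15000, 0.005, 10 );
my @x  = create $n, $p;
my @ys = map { [ create $n, $p ] } 1 .. $ny;

# Pay the conversion cost once, outside any hot loop.
my $xb  = to_bits( \@x );
my @ybs = map { to_bits($_) } @ys;

my @counts = compare_packed( $xb, @ybs );
print "packed: @counts\n";
```

    Whether this pays off depends on how often each set is compared after it has been packed.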
      If you create arrays and then turn them into bitstrings [ everytime ] to do the comparison, then it is not that fast:

      No shit Sherlock :)


