Beefy Boxes and Bandwidth Generously Provided by pair Networks
P is for Practical
 
PerlMonks  

fast bit twiddling

by spurperl (Priest)
on May 02, 2004 at 05:44 UTC ( #349759=perlquestion: print w/ replies, xml ) Need Help??
spurperl has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

I have a string of 0s and 1s of some length (lets say "1010101011"), and a list of indices (lets say 2, 5, 8). I should return TRUE if the bits at these indices are the negation of the bits before them. In our example:

Bit 2 is NOT bit 1, so OK. (indices range 0 to string length - 1).
Bit 5 is NOT bit 4, so OK.
Bit 8 is NOT bit 7, so OK.
All 3 OK, so I return TRUE. Had one of these been false, I'd return FALSE.

Pretty simple, but I must make it as fast as possible. The basic approach would be just a loop on these indices and checking with substr() for each of them whether it's the NOT of the previous. Or a split and then array operations (I've yet to time them against each other).

Is there a quicker way ? Remember: the input is a string "..." and a list of indices into this string. For simplicity sake assume correctness of all arguments.

Comment on fast bit twiddling
Re: fast bit twiddling
by Zaxo (Archbishop) on May 02, 2004 at 05:51 UTC

    See 'perldoc -f vec' for a faster, better way of handling bitfields. All the rest is bitwise operator magic.

    { vec(my $str, 2, 1) = 1; vec($str, 5, 1) = 1; vec($str, 8, 1) = 1; sub mytest { my $foo = @_ ? shift : $_; # was 'shift || $_' $foo <<= 1; $str & $foo == $str; } }
    That code is structured as a closure to keep $str private, but that can be loosened.

    Update: Oops, wrong tests. that checks for equality. Here's the real thing,

    { vec(my $str, 2, 1) = 1; vec($str, 5, 1) = 1; vec($str, 8, 1) = 1; sub mytest { my $foo = @_ ? shift : $_; # was 'shift || $_' my $bar = $foo << 1; not ($foo & $str) ^ ($bar & $str); } }
    ( hope that's right! ;-) The Ancient Philosophy And Programming Languages thread should have said something about Aristotle's formal logic.

    Update2: modified mytest's argument handling to behave properly with a numerically false argument.

    After Compline,
    Zaxo

Re: fast bit twiddling
by exussum0 (Vicar) on May 02, 2004 at 06:01 UTC
Re: fast bit twiddling
by japhy (Canon) on May 02, 2004 at 06:26 UTC
    Well, because it's late and I'm addicted, here's a regex to do it:
    sub rx_bits { my $str = shift; my $l = 0; $str =~ ( "^" . join "", map "[01]" x ($_ - ($l+0,$l=$_)[0] - 2) . "(?:01|10 +)", @_ ); }
    I don't say it's fast, it's just a regex.
    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a job (NYC-area)
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;
Re: fast bit twiddling
by matija (Priest) on May 02, 2004 at 06:36 UTC
    Everybody else seems to be handling bits, but you DID specify the input was a string. I think substr is a good solution, but with a wrinkle: grab both characters at the same time, and usee them as index into a hash, like this:
    my %res=("00"=>0, "01"=>1,"10"=>1,"11"=>0); my $tmpres=1; foreach (@list) { $tmpres=$tmpres && $res{substr($string,$_-1,2)}; } return tmpres;
Re: fast bit twiddling
by BrowserUk (Pope) on May 02, 2004 at 09:16 UTC

    Assuming that the description a string of 0s and 1s of some length (lets say "1010101011") means string of bytes where each byte is either '0' or '1', and that the list of indices is subject to change with each call, then I'd try something like this.

    sub buk2{ my $s = shift; substr $s, $_-1, 1 eq substr $s, $_, 1 and return for @_; return 1; }

    It fails quickly if possible, and avoids creating lots of intermediate scalars or arrays.

    If the '1's and '0's are bits encoded in a numeric value and the list of indices are reused, then zaxo's method is probably much quicker.


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
      There's a mistake here, since the eq operator has an higher precedence than the comma.
      The sub should be written this way:
      sub buk2{ my $s = shift; substr($s, $_-1, 1) eq substr($s, $_, 1) and return for @_; return 1; }
      Oddly enough, after the correction above this solution proves to be the fastest (probably because it reaches the first return more frequently.)
      See here for the updated benchamark results.

      Cheers, Emanuele.

        Nice pickup.++


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail
Re: fast bit twiddling
by ysth (Canon) on May 02, 2004 at 09:56 UTC
    sub ok { (" ".shift) =~ /^@{[map "(?=.{$_}(?:01|10))",@_]}/x }
Re: fast bit twiddling
by bart (Canon) on May 02, 2004 at 10:07 UTC
    I'm not too sure this will be the fastest way for your particular goal, but it's an interesting alternative route, so I'll present it anyway.

    Make a copy of your first string, insert an extra "0" on the front, and do bytewise XOR. You'll end up with a string of consisting of chr(0) and chr(1), the latter if both "bits" were different. If you OR with a string of "0" characters, you'll turn that into "0" and "1" respectively. Like this:

    my $str = "1010101011"; my $second = "0$str"; my $xor = $str ^ $second; my $result = $xor | "0" x length $xor;
    After this, $result contains 11111111101. The first and last character are rather meaningless. But you can test whether bits with position 2, 5 and 8 pass, by extracting those characters from $result using substr.

    Now, moving on to the second part of your question, You can use AND. First, make a mask, consisting of "0" characters and with the same length, setting it to "1" for those positions you're interested in. AND this with $result, and if this result is the same as $mask itself, you have a match.

    my $mask = "0" x length $result; substr($mask, $_, 1) = "1" foreach 2, 5, 8; print "It passes!" if ($mask & $result) eq $mask;
    and it does pass.
Re: fast bit twiddling
by BUU (Prior) on May 02, 2004 at 10:39 UTC
    And it's benchmark time!

    The results are directly below, the code used to generated the results is in the readmore tags. Note that I was a tad confused at exactly how to use a couple (namely ysth's!) so I had to guess and I might not be using them correctly. Also note that I took the easiest way to try to cram it in to a benchmarking format and I might have done something that screwed up the results. If you want to submit another bit of code in the form name => sub { } that I can more easily/effectively bench, I'd be glad to rebenchmark.

    Results:
    Rate japhy ysth matija zaxo browser +uk2 japhy 8361/s -- -37% -74% -82% - +87% ysth 13309/s 59% -- -58% -71% - +79% matija 31862/s 281% 139% -- -31% - +49% zaxo 46230/s 453% 247% 45% -- - +26% browseruk2 62527/s 648% 370% 96% 35% + --
      Try:
      ysth => sub { my $s = "1010101011"; my $test = sub { (" $s") =~ /^@{[map "(?=.{$_}(?:01|10))" +,@_]}/ +x }; $test->(2,5,8); }
      And properly speaking, the ones that don't support multiple bits should be $test->(2) && $test->(5) && $test->(8). Also not sure why you are using anonymous subs in the code to test.

      Like ysth's, mine, matija's and japhy's are all intended to take a string, and a list of bits to test as described in the OP. Zaxo's and sporty's seem to be addressing a different (though not necessarily wrong) interpretation of the OP and would need adapting before they were comparable.

      I also found your method of benchmarking somewhat eclectic:), so I did my own which also tests some failing cases and a few different sets of bits. I believe the included cases are all comparable, but I'm open to the possibility that zaxo & sporty interpreted the problem more correctly.


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "Think for yourself!" - Abigail
        Well, if they are bits, then the number can be done as binary. So translating it from a string to a number isn't evil. :)

        If this were trinary or pentary (base 3 or base 5), it'd be a little more difficult and using a string representation may be better. I have no interest in base 3 or 5, so I haven't tested. ;)

Re: fast bit twiddling
by runentertain (Initiate) on May 03, 2004 at 01:33 UTC
    bit 1 = bit 5 = bit -1
Re: fast bit twiddling
by esialb (Initiate) on May 03, 2004 at 13:09 UTC
    sub stest { my $str = shift; my $last_index=1; my @sorted_indexes = sort { $a <=> $b } @_; die "bad index" if( $sorted_indexes[0] < 1 ); my $offset; do { $last_index += ( ( $offset = ( (shift @sorted_indexes || return true) - $last_index ) || next ) ); substr($str, 0, $offset, ""); } while( $str=~/^01|^10/ ); return false; }
    USE: stest( $bit_string, @indexes ) Should be ridiculously fast, because it takes every opportunity to jump out of the main loop as soon as the relevant data is available, and often before it is stored. There is no recursion, and extremely tiny memory footprint. The only potential bottleneck is that it requires the bit list to be sorted.

      A couple of problems. 'true' and 'false' are not keywords in perl, and using next in a do while loop is a no-no, resulting, in this case, in the subroutine returning with the error message: "Exiting subroutine via next at..."


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "Think for yourself!" - Abigail
        oof, yes, i seem to have lost some of my perl-fu. i've recently been drawn to the dark side, aka functional languages.
        But I shortened my program even more
        sub stest { my $str = shift; foreach (@_) { substr($str,$_-1,2) =~ /^(10|01)/ || return; } return 1; } #USE: stest( $bit_string, @indexes )
Re: fast bit twiddling
by emazep (Priest) on May 03, 2004 at 19:42 UTC
    Inspired by BrowserUK's 2nd algorithm but slightly faster, since it calls substr just once in exchange for an extra assignment:
    sub emazep { my $s = shift; my $tmp; ( ($tmp = substr $s, $_-1, 2) eq '11' or $tmp eq '00' ) and return +for @_; return 1; }
    Using exactly the same benchmark provided here by BrowserUK, it gives:
    Rate ysth japhy matija buk2 buk1 emazep ysth 1966/s -- -17% -87% -89% -89% -91% japhy 2361/s 20% -- -84% -87% -87% -89% matija 14719/s 648% 523% -- -19% -20% -32% buk2 18088/s 820% 666% 23% -- -1% -17% buk1 18363/s 834% 678% 25% 2% -- -16% emazep 21753/s 1006% 821% 48% 20% 18% --

    Update:

    added esialb's (corrected) solution to the BrowserUK's benchamrk:
    Rate ysth japhy matija esialb buk1 buk2 emazep ysth 1975/s -- -17% -87% -88% -89% -89% -91% japhy 2374/s 20% -- -84% -85% -87% -87% -89% matija 15000/s 659% 532% -- -8% -17% -18% -30% esialb 16299/s 725% 586% 9% -- -10% -11% -23% buk1 18156/s 819% 665% 21% 11% -- -1% -15% buk2 18277/s 825% 670% 22% 12% 1% -- -14% emazep 21295/s 978% 797% 42% 31% 17% 17% --

    2nd Update:

    I've been misleaded by the BrowserUK solution, which contains an error which also slowed down his sub (see here.) After the correction his sub proves to be the fastest (dani_l's very elegant solution also included in the benchmark):
    Rate ysth japhy esialb matija buk1 dani_l emazep buk2* ysth 1845/s -- -22% -85% -87% -88% -89% -90% -91% japhy 2366/s 28% -- -81% -84% -84% -86% -87% -88% esialb 12530/s 579% 430% -- -14% -16% -28% -31% -38% matija 14566/s 689% 516% 16% -- -2% -16% -20% -28% buk1 14885/s 707% 529% 19% 2% -- -14% -18% -27% dani_l 17357/s 841% 634% 39% 19% 17% -- -5% -14% emazep 18181/s 885% 669% 45% 25% 22% 5% -- -10% buk2* 20275/s 999% 757% 62% 39% 36% 17% 12% --
    buk2* is the buk2 sub as corrected by me here.

    Cheers, Emanuele.
      Refinement of emazep's:
      sub dani_l { my $s = shift; substr($s, $_ - 1, 2) =~ tr/1// == 1 or return 0 for @_; return 1; }

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://349759]
Approved by davido
Front-paged by Old_Gray_Bear
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others perusing the Monastery: (11)
As of 2014-09-18 21:29 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    How do you remember the number of days in each month?











    Results (124 votes), past polls