Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"
 
PerlMonks  

Implementing a signed/sign-propagating right bitwise shift

by LonelyPilgrim (Beadle)
on Feb 16, 2012 at 17:07 UTC ( #954284=perlquestion: print w/replies, xml ) Need Help??

LonelyPilgrim has asked for the wisdom of the Perl Monks concerning the following question:

Hi again. Building, kind of, from my post the other day (I am moving in some form of logic here) — I am still fiddling with binary and bitwise operations. Here, I am trying to implement, or at least mimic, the JavaScript >> operator — signed/sign-propagating right bitwise shift. Perl's >> seems to be unsigned/zero-filling, behaving the same as the JavaScript >>>.

What I have below seems to give me what I want. It spits out the same results as the equivalent JavaScript — though my working knowledge of JavaScript is more that of a fiddler than a serious programmer. The below presumes (as I think the JavaScript operators do?) that I'm working with signed 32-bit integers; and for this instance I think that is probably fine. (What I'm trying to do with all this is implement Paj's JavaScript MD5 library in Perl, so my Perl script will encrypt my password appropriately and login to a website correctly.)

So, my questions:

  1. Am I reinventing the wheel here? Are there Perl modules already that do what Paj's does? (If there are, I will not complain, because at least I am learning a lot.)
  2. Is there a way to change the behavior of Perl's bitwise operators, to do what I need them to do? I seem to need to use both the JavaScript >> (signed/sign-propagating) and >>> (unsigned/zero-filling) in the same scope.
  3. Or, is there an (easy) way to define an >>> operator in Perl (as of v5.14) that behaves as I need it to? (I know that I could overload the >> operator, but that without the other wouldn't help me.)
  4. Would what I have below work in every case?
  5. Is there a more elegant way to do what I'm doing below?

#!/usr/bin/perl use strict; use warnings 'all'; use Config; my $longsize = $Config{longsize}; my $firstbit = ($longsize * 8) - 1; sub rshift { # Should be signed/sign-propagating right shift my ($x, $y) = @_; foreach (my $i = 0; $i < $y; $i++) { my $lbit = (($x & (1 << $firstbit)) != 0); $x = ($x >> 1) | ((1 << $firstbit) * $lbit); } return unpack("l>", pack("l>", $x)); } sub bin_test_print { printf("%032b = %d\n", $_[0], $_[0]); } my $x = 9; bin_test_print($x); $x = rshift($x, 2); bin_test_print($x); print "\n\n"; $x = -9; bin_test_print($x); $x = rshift($x, 2); bin_test_print($x); print "\n\n"; $x = 12345; bin_test_print($x); $x = rshift($x, 2); bin_test_print($x); print "\n\n"; $x = -12345; bin_test_print($x); $x = rshift($x, 2); bin_test_print($x);

UPDATE: Thanks, everybody, for all your help. Through repeatedly beating my head against this bitwise problem, and spending a whole lot of time dissecting code and playing with encryption routines -- I have figured out that, yes, I was, in fact, re-inventing the wheel. Not only did the Digest::MD5 package take care of what Paj's MD5 JavaScript module does, but I realized that the website I'm working with doesn't even use MD5 encryption for its login -- just simple Base64 encoding. The issue that was throwing me was that there were two passwords being sent with the form, an encrypted one (in fact just Base64) and a seemingly doubly-encrypted one -- which turned out, after painstakingly working through the bitwise math of Paj's whole module, figuring out exactly what he was doing to the bits at every step, merely to be the same thing -- the Base64 encoding of the password, only expanded to UTF16. D'oh! And once I realized this, I re-created it in Perl in five minutes with just the little bit of code below. But at least, now, and finally, I can say that I understand bitwise operations and binary representation of integers. Many thanks.

use strict; use warnings 'all'; use MIME::Base64; sub b64_password { return MIME::Base64::encode($_[0]); } sub b64_utf16_password { my @ascii = unpack("C*", $_[0]); my $utf16 = pack("v*", @ascii); my $base64_utf16 = MIME::Base64::encode($utf16); return $base64_utf16; }

Replies are listed 'Best First'.
Re: Implementing a signed/sign-propagating right bitwise shift
by BrowserUk (Pope) on Feb 16, 2012 at 18:35 UTC

    Make negative numbers positive, shift them and make them negative again:

    sub SRshift { $_[0] < 0 ? -( -$_[0] >> $_[1] ) : $_[0] >> $_[1] }

    Should work on any platform regardless of native integer size.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      Thanks. This is the most elegant way I've seen so far, and I think I now understand the math of what exactly it is I'm trying to do. (I would vote this up but I'm now out of votes for the day; I only got 2.)
Re: Implementing a signed/sign-propagating right bitwise shift
by jethro (Monsignor) on Feb 16, 2012 at 17:19 UTC

    "(1 << $firstbit)" seems to be a constant. Extract it from the loop and it should be faster.

    You could remove the loop completely by (in the case of 32 bit) checking the high-bit, then shifting for $y, then or-ing with $FFFFFFFF shifted left 32-$y

      Slowly, please. :-) Bitwise anything is new to me, and I struggle with even understanding what it means.

      Yes, I can see how (1 << $firstbit) is a constant. But its purpose is to get at that first bit, 0b10000000000000000000000000000000, i.e. 2**31. What is a better way to do that? I originally had "2**$firstbit" in place of "(1 << $firstbit)". Would that be better here? Is that what you mean by 'extracting' it?

      As for your second paragraph, you'll have to slow down and explain how to do what you're saying, as you lost me completely with everything beginning with "checking the high-bit."

      In any case, thanks!

        He was suggesting that you do something like:

        my $firstbit = ($longsize * 8) - 1; my $firstbit_mask = 1 << $firstbit;

        then use $firstbit_mask instead of 1 << $firstbit_mask. This cuts down on the number of shifts to do.


        As for the second paragraph:

        use constant NUM_BITS = $Config{ivsize} * 8; use constant SIGN_BIT = 1 << ( NUM_BITS - 1 ); use constant ALL_BITS = -1; sub sr_signed { my ($p, $q) = @_; if ($p & SIGN_BIT) { return ($p >> $q) | ($ALL_BITS << (NUM_BITS - $q)); } else { return $p >> $q; } }

        Example:

        Shift 10101111...10101100 by 3 10101111...10101100 >> 3 |||||||| ||||| \\\\\\\\ \\\\\ \\\\\\\\ \\\\\ |||||||| ||||| 00010101111...10101 11111111...11111111 << (NUM_BITS-3) ||| _____________/// /// ||| 11100000...00000000 00010101111...10101 | 11100000...00000000 ------------------- 11110101111...10101

        (Note that ivsize should be used, not longsize.)

        a) if you do something in a loop that is executed 32 times, that something is done 32 times. If you do it before the loop it is only done once. That should make anything faster. So you could do this:

        my $highbitset= (1<< $firstbit); foreach ... ... $x & $highbitset) != 0);

        As you can see the bitshift is only done once and inside the loop the constant contents of the variable is used instead.

        b) What you do is (for negative 32bit numbers): loop $y times: shift by one, set the high bit, shift by one, set the high bit ...

        What you could do: No loop, just shift by $y. set all high bits at once (With high bits I mean all the bits shifted in as 0 instead of 1).

        And the value you need to set all high-bits at once would be what? 0b10000... if $y==1, 0b11000.... if $y==2, and so on to 0b1111... if $y=32. This value is just 0b111111.... shifted left 32-$y times, i.e. 0b1111.... << 32-$y.

        More generally I would specify that as -1 << ($longsize*8 - $y)

Re: Implementing a signed/sign-propagating right bitwise shift
by Eliya (Vicar) on Feb 16, 2012 at 19:17 UTC
    (What I'm trying to do with all this is implement Paj's JavaScript MD5 library in Perl, ...)

    Your bit fiddling question is certainly an interesting one, and I don't want to spoil the fun... but in case you just want to get the job done, you could also use the existing Digest::MD5 core module:

    $ perl -MDigest::MD5=md5_hex -E 'say md5_hex("message digest")' f96b697d7cb7938d525a2f31aaf161d0
Re: Implementing a signed/sign-propagating right bitwise shift
by ikegami (Pope) on Feb 16, 2012 at 17:36 UTC

    The signed shift can be implemented in terms of a division:

    sub sr_i32 { my ($p, $q) = @_; return int( $p / (2<<$q) ); }

    Note that divisions by power of two are lossless.

    Update: Previous implementation only worked with q=1, so provided a different implementation.

Re: Implementing a signed/sign-propagating right bitwise shift
by thundergnat (Deacon) on Feb 16, 2012 at 17:57 UTC

    You could eliminate the loop pretty easily.

    #!/usr/bin/perl use strict; use warnings 'all'; use Config; my $longsize = $Config{longsize}; my $bits = $longsize * 8; sub rshift { # Should be signed/sign-propagating right shift my ($x, $y) = @_; my $mask = 0; my $z = $x >> $y; $mask = (2 ** ($y + 1) -1) << ($bits - $y) if $x < 0; return $z | $mask; }

    It won't be very easy to turn it into an infix operator in Perl 5 without using a source filter... and you really don't want to go there if you don't have to.

    UPDATE:Or if you want to calculate native size a different way:

    #!/usr/bin/perl use strict; use warnings 'all'; my $bits; { my $native = -1; $bits++ while $native >>= 1; } sub rshift { # Should be signed/sign-propagating right shift my ($x, $y) = @_; my $mask = 0; my $z = $x >> $y; $mask = ( -1 >> ($bits - $y) ) << ($bits - $y) if $x < 0; return $z | $mask; }
Re: Implementing a signed/sign-propagating right bitwise shift
by LonelyPilgrim (Beadle) on Feb 16, 2012 at 18:50 UTC

    Thanks for all the helpful replies. Another little question that's bugging me (I'm not sure if anybody will see this down here or not): does my function presume endian-ness of one form or another? Or does that even matter? Whether big-endian or little-endian, the leftmost bit of the leftmost byte will be the sign bit, correct? Will the endian-ness of the integer affect how it is shifted? It doesn't appear to so far, since even on my little-endian Win32 system, I get the correct end result. And it doesn't seem to matter if I remove the ">" from my pack and unpack there at the end -- I get the correct result. As I said, my understanding of the binary system is still pretty new.

      endianness may become important if you store numbers as binary data in a string (with the pack function) or on disk. It is important in languages like C that allow you to use the memory cells your integers are stored in as any other data type.

      Perl (and many other interpreted languages) don't allow this. You can't access the memory cells of your variables directly. If you use your integer variable as a string a conversion will be done by the interpreter. '>>' and other functions know the endianness of the system and use it. That is even true for C.

      Note that you can specify endianness when you use the pack function.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://954284]
Approved by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (5)
As of 2020-12-04 08:09 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    How often do you use taint mode?





    Results (58 votes). Check out past polls.

    Notices?