Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?
 
PerlMonks  

Turning very larger numbers into an array of bits

by gblack (Novice)
on Feb 04, 2017 at 23:49 UTC ( [id://1181114]=perlquestion: print w/replies, xml ) Need Help??

gblack has asked for the wisdom of the Perl Monks concerning the following question:

I need a way (preferably fast) to turn very larger numbers into an array of bits. I tried sprintf and then splitting, but it looks like sprintf can't do a binary print of numbers that large (up to 1<<80) - nor am I confident that was a really "speedy" way to approach the issue. It seems like there should be some easy way to do this.

Essentially if I start with the number 57, I want to end up with the array ( 1, 0, 0, 1, 1, 1 ) (low bit is the first element).

Alternatively, if I just had a good way of checking the xth bit of a number without building a bitmask (unless someone can clue me in on how to do that quickly) that would work.

Speed matters to me because I've got a lot of computations to run here and if I burn too many cycles the program will end up taking an impractical amount of time (months) to run.

Replies are listed 'Best First'.
Re: Turning very larger numbers into an array of bits
by BrowserUk (Patriarch) on Feb 05, 2017 at 03:37 UTC

    Two questions key to helping you:

    1. Where are you getting these 80-bit numbers from? (How are you loading them into Perl?)

      This is important as no version of Perl can deal with 80-bit numbers as numbers.

      You could however, load them in their binary representation as binary strings of 10 bytes, and then accessing the bits directly is relatively simple.

    2. If you are loading them in binary (which is the only way) then you will need to know what format they are in.

      The only 80-bit numbers in common usage are the extended precision 80-bit IEEE 754 floating point values used internally by the X86 & X64 floating point processors. These have a fairly complicated format and are not directly interpretable bit by bit.

      If however, your numbers are simple 80-bit integers stored low bit first, once read from disk or stream, they would be directly interpretable in perl using its bit-wise string operations; or easily converted into an array of bits using unpack. eg:

      $bin80bit = "\x01\x23\x45\x67\x89\xab\xcd\xef\xaa\xff";; ## get an 80- +bit binary integer from somewhere. @bits = unpack '(a1)*', unpack 'B*', $bin80bit;; print scalar @bits, ':', "@bits";; 80 : 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 1 0 1 0 0 0 1 0 1 0 1 1 0 0 1 1 1 1 + 0 0 0 1 0 0 1 1 0 1 0 1 0 1 1 1 1 0 0 1 1 0 1 1 1 1 0 1 1 1 1 1 0 1 +0 1 0 1 0 1 1 1 1 1 1 1 1

      (I just do not know of any source of 80-bit integers.)

    With answers to those questions, a solution to your question may be possible.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Turning very larger numbers into an array of bits
by kcott (Archbishop) on Feb 05, 2017 at 08:23 UTC

    G'day gblack,

    Welcome to the Monastery.

    Here's one way to do this using Math::BigInt::GMP.

    #!/usr/bin/env perl use 5.010; use strict; use warnings; use Math::BigInt lib => 'GMP'; for my $number (57, '886545601050728061451195', '0x' . 'b' x 20) { say "Number: $number"; say "Array: @{get_array($number)}"; say 'Bits: ', scalar @{get_array($number)}; for (0 .. 5) { say "Bit $_: ", is_bit_set($number, $_); } } { my %cache; sub get_array { $cache{$_[0]} // gen_array($_[0]) } sub is_bit_set { ($cache{$_[0]} // gen_array($_[0]))->[$_[1]] } sub gen_array { $cache{$_[0]} = [ reverse split //, substr Math::BigInt::->new($_[0])->as_bi +n(), 2 ] } }

    I tested with your example number (57) and an 80-bit number in stringified, decimal (886545601050728061451195) and hexidecimal (0xbbbbbbbbbbbbbbbbbbbb) representations:

    Number: 57 Array: 1 0 0 1 1 1 Bits: 6 Bit 0: 1 Bit 1: 0 Bit 2: 0 Bit 3: 1 Bit 4: 1 Bit 5: 1 Number: 886545601050728061451195 Array: 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 +1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 + 0 1 1 1 0 1 1 1 0 1 1 1 0 1 Bits: 80 Bit 0: 1 Bit 1: 1 Bit 2: 0 Bit 3: 1 Bit 4: 1 Bit 5: 1 Number: 0xbbbbbbbbbbbbbbbbbbbb Array: 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 +1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 + 0 1 1 1 0 1 1 1 0 1 1 1 0 1 Bits: 80 Bit 0: 1 Bit 1: 1 Bit 2: 0 Bit 3: 1 Bit 4: 1 Bit 5: 1

    If you're not going to be doing multiple operations on the same number, you might be better off without the cache.

    I see others have suggested solutions. I'd tailor these to your specific requirements and then Benchmark.

    — Ken

Re: Turning very larger numbers into an array of bits
by johngg (Canon) on Feb 05, 2017 at 00:47 UTC

    A better way using vec in 8-bit chunks to store the modulo of the number and 256. This places the least significant bit first as you required.

    use strict; use warnings; use feature qw{ say }; my $val = 107865; # Pre-size the vector for 80 bits. # my $vec; vec( $vec, 79, 1 ) = 0; my $offset = 0; while ( $val >= 256 ) { my $rem = $val % 256; vec( $vec, $offset ++, 8 ) = $rem; $val /= 256; } vec( $vec, $offset, 8 ) = $val; say unpack q{b*}, $vec; say qq{Bit $_: }, vec( $vec, $_, 1 ) ? q{set} : q{not set} for 0 .. 79;

    The output.

    I hope this solution better fits your requirements.

    Update: The condition in the while was incorrect, changed from $val > 256 to $val >= 256 to rectify invalid results for values exactly divisible by 256 to the n.

    Cheers,

    JohnGG

Re: Turning very larger numbers into an array of bits
by johngg (Canon) on Feb 05, 2017 at 00:14 UTC

    Rather than an array you might be able to use vec.

    johngg@shiraz:~/perl > perl -Mstrict -Mwarnings -E ' my $vec; vec( $vec, 0, 8 ) = 57; say qq{Bit $_: }, vec( $vec, $_, 1 ) ? q{set} : q{not set} for 0 .. 7;' Bit 0: set Bit 1: not set Bit 2: not set Bit 3: set Bit 4: set Bit 5: set Bit 6: not set Bit 7: not set

    I hope this is helpful.

    Update: Note that vec can vector sizes up to 64 bits on Perl compiled for 64-bit environments. This doesn't satisfy your 80-bit requirement unfortunately.

    Cheers,

    JohnGG

Re: Turning very larger numbers into an array of bits
by stevieb (Canon) on Feb 06, 2017 at 00:03 UTC
    "if I just had a good way of checking the xth bit of a number without building a bitmask (unless someone can clue me in on how to do that quickly)"

    Don't know if this is what you're after, but it's a quick way to generate bitmasks on the fly.

    Two to the power of the number of bits to include in the mask minus one, left-shifted by the LSB

    perl -E '$bits=40; $lsb=3; printf("%b", ((2**$bits)-1) << $lsb)' 1111111111111111111111111111111111111111000

    Don't left-shift at all if you simply want the mask for the number of bits you're masking exactly.

Re: Turning very larger numbers into an array of bits
by Marshall (Canon) on Feb 06, 2017 at 02:58 UTC
    Essentially if I start with the number 57, I want to end up with the array ( 1, 0, 0, 1, 1, 1 ) (low bit is the first element).

    Alternatively, if I just had a good way of checking the xth bit of a number without building a bitmask (unless someone can clue me in on how to do that quickly) that would work.

    This just sounds so bizarre to me that I can't comprehend what is being attempted. To reverse 57 decimal, 111001b to 100111b is extremely counter intuitive! Yes, there are assembly level instructions that can do that with multiple instructions involving various kinds of masks and shifts but, that would be a wild thing to do. I think this is even more difficult to do in Perl.

    There is no faster way of testing if a bit is set other than using a bit mask and a logical "and" instruction. The lowest level assembly language instructions do it that way.

      there are assembly level instructions that can do that with multiple instructions involving various kinds of masks and shifts but, that would be a wild thing to do. I think this is even more difficult to do in Perl.
      johngg@shiraz:~/perl/Monks > perl -Mstrict -Mwarnings -E ' my @arr = split m{}, do { my $vec; vec( $vec, 0, 8 ) = 57; unpack q{b*}, $vec; }; say for @arr;' 1 0 0 1 1 1 0 0

      Not really. This is Perl after all :)

      Cheers,

      JohnGG

Re: Turning very larger numbers into an array of bits
by gblack (Novice) on Feb 05, 2017 at 18:04 UTC

    The large number is coming from me. In essence I'm trying to look through every combination of a set that size and find an optimal solution.

    I have an array of objects equal to the number of bits and wanted to +1 to try the next combination, evaluate it, and then proceed to the next. Each bit turns that particular element in the corresponding array on/off. I was planning on using bigint to deal with the large numbers.

    I considered at first nesting loops but decided I didn't like that approach as with each new element I'd have to keep rolling back over the elements already in the chain to see if it had been used already in the chain, plus I wondered how much CPU I'd burn with the structure.

    I also figured with a "bit" approach to the problem, it would probably make it easier to start forking off instances to analyze various combinations.

    So at the end of the day here perhaps I'm just making the problem more difficult and I asked the wrong question to begin with.

    How would you quickly iterate over every combination of an array with 80 elements where each element is used a maximum of once in the combination? To make it faster, I've already determined the optimal solution will use at least 7 of the elements and no more than 25.

    Solutions that precompute the combinations before allowing me to evaluate them don't work well as I have to store all of those combinations and I anticipate only keeping a low number of them - hence I want to evaluate each combination as it comes.

      I'm trying to look through every combination of a set that size and find an optimal solution.

      2^80 == 1208925819614629174706176. If you could test 1 million per second, your task will take 38 billion (38,308,547,532) years!

      Even if you only iterate those 80-bit values that have 7 .. 25 bits set, that is still 304202362464000 variations; and would take over 9 years at 1 million per second.

      Update: That is still 636,339,175,131,064,539,743 variations (assuming no ordering requirement) which would take 20,164,371 years at 1 million per second.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
      In the absence of evidence, opinion is indistinguishable from prejudice.
        Yes, BrowserUk you are right... we can go from a super duper wildly humongous number to just a super duper humongous number. There is a heck of a lot that I don't understand about the requirements to this OP.

        There is presumably some "test" that will be done on these combinations, which the OP does not mention. Whatever that "test" is, it is bound to take some MIPS and probably a significant number of them!

        Without some more information as to the actual problem that the OP is trying to solve, I do not see any brute force calculation tactics that can do the required math to produce an "optimal solution". I suspect that a better description of the problem could potentially lead to a plausible, "pretty good" rather than an "optimal solution" with many orders less of processing power. Absent that, I am out of ideas.

      How would you quickly iterate over every combination of an array with 80 elements where each element is used a maximum of once in the combination? To make it faster, I've already determined the optimal solution will use at least 7 of the elements and no more than 25.

      As others have pointed out, that's still an unfeasibly massive number of combinations. Perhaps a considered read through The 10**21 Problem (Part I) (and the subsequent parts) will prove instructive?

      Factorial : There are n! ways of arranging n distinct objects into an ordered sequence.

      With 80 objects,

      n! = 80!
      = 7.156945704 E+118
      =71569457046263802294811533723186532165584657342365752577109445058227039255480148842668944867280814080000000000000000000
      There is no program/computer that can do that many tests. This is a humongous number.

      You have absolutely no chance at all without a strategy to reduce the number of tests. A better explanation of what you are testing will probably yield massive orders of magnitude in reduction of computations. Without that, I see ZERO chance.

        There are n! ways of arranging n distinct objects into an ordered sequence.

        That would only be true if the OP allowed duplicates; Ie. if a set consisting of 80 x bit 0 was allowed, but the OP does state:

        How would you quickly iterate over every combination of an array with 80 elements where each element is used a maximum of once in the combination?

        Thus the maximium number of sets is 2^80.

        He also made no mention of "ordered".


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
        In the absence of evidence, opinion is indistinguishable from prejudice.
      It sounds like you're trying to solve an intractable problem like the knapsack problem. This type of algorithm becomes extremely painful around n=30 -- there are over a million possibilities to run through at that point.
Re: Turning very larger numbers into an array of bits
by Marshall (Canon) on Feb 07, 2017 at 17:30 UTC
    You might want to also look at Quadruple-precision floating-point format. In IEEE 754-2008 the 128-bit base-2 format is officially referred to as binary128.

    As BrowserUk notes: The only 80-bit numbers in common usage are the extended precision 80-bit IEEE 754 floating point values used internally by the X86 & X64 floating point processors. This would not give you 80 bits of integer precession anyway.

    The IEEE 754 binary128 format will yield >100 bits of precision which would meet your 80 bit requirements. Even 265 256 (typo) bit formats have been defined. However the choice of compilers is going to be limited. There can be issues affecting stack alignment which can significantly affect performance depending upon the target processor architecture. However since these are binary formats, shifting and masking albeit across 64 bit words can be done efficiently as opposed to using some kind of BCD (Binary Coded Decimal) format.

    AFAIK, the low level BCD instructions are still part of the x86 family assembly instructions, but I've never used a compiler that could use them, nor have I written any assembly code for these instructions either. In any event, that doesn't sound like what you want. For raw speed, a C or FORTRAN compiler that can handle the 128bit format would seem like a promising approach. I suspect that the Perl larger int options will be slower for the main calculations that you apparently need to do.

Re: Turning very larger numbers into an array of bits
by Anonymous Monk on Feb 05, 2017 at 01:16 UTC
    Where are your big numbers coming from? If you're manipulating them at all, you probably need a multiple-precision math library. If you use Math::GMP, there's gmp_tstbit. (Math::BigInt doesn't seem to have an efficient way to fetch a particular bit.)
      If you use Math::GMP, there's gmp_tstbit

      Yes, I'd be using the gmp library (accessed either via Math::GMPz or Math::GMP).

      Math::BigInt doesn't seem to have an efficient way to fetch a particular bit

      Math::BigInt works with base 10 representations of numbers, which makes it less than ideal for determining values of particular bits.

      Cheers,
      Rob
Re: Turning very larger numbers into an array of bits
by gblack (Novice) on Feb 04, 2017 at 23:59 UTC

    I suppose I could precompute all of the bitmasks and throw them in an array to reference. Still seems like there should be an easier way though.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1181114]
Approved by Corion
Front-paged by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others about the Monastery: (7)
As of 2024-03-28 21:43 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found