Problems? Is your data what you think it is? PerlMonks

### Re: Converting decimals to binary

by ikegami (Pope)
 on Nov 11, 2004 at 15:54 UTC ( #407079=note: print w/replies, xml ) Need Help??

in reply to Converting decimals to binary

There is something call fixed point numbers (as opposed to floating point), and it sounds like that's what you need. In this case, I fixed the point (B-scaling) to 0 (unsigned) integer bits (allowing numbers from 0 to +1, not including +1).

foreach ( 0.750 0.500, 0.400, 0.250, 0.125, ) { my \$num = \$_; my \$num_B0 = \$num * 4294967296; # 0..+1 not incl +1. #my \$num_B1 = \$num * 2147483648; # 0 to just over +1. printf("%f = %s B0\$/", \$num, unpack('B32', pack('N', \$num_B0)), ); } __END__ output ====== 0.750000 = 11000000000000000000000000000000 B0 0.500000 = 10000000000000000000000000000000 B0 0.400000 = 01100110011001100110011001100110 B0 0.250000 = 01000000000000000000000000000000 B0 0.125000 = 00100000000000000000000000000000 B0

.5, .25 and .125 are round numbers cause they're 2^(-1), 2^(-2) and 2^(-3) respectively.

The machine for which I write software doesn't support floating point numbers, so we use fixed point arithmetic (which any integer arithmetic machine can do). It can be a pain at times, since we deal with an extreme number of decimal numbers like valve positions, temperature measurements, etc. (It's the control system of a nuclear power plant.)

Let me know if you want to know how to do math with fixed point numbers.

Replies are listed 'Best First'.
Re^2: Converting decimals to binary
by ketema (Scribe) on Nov 11, 2004 at 16:37 UTC
How does multiplying by 4294967296 not include 1 but multiplying by 2147483648 does?

In response to the first reply, if I just multiply the decimal by the "accuracy" per say then take the integer portion am I losing more accuracy, than the fixed point answer provided by number 2? Are these two solutions the same?

I don't know of any sources, but here's some background. Let's use smaller numbers to demonstrate. Let's say we have 4 bit numb +ers. +---+---+---+---+ | | | | | +---+---+---+---+ As a little refresher, let's forget decimal numbers for a second. +---+---+---+---+ | 0 | 1 | 1 | 1 | 7 = 0*2^3 + 1*2^2 + 1*2^1 + 0*2^0 +---+---+---+---+ I'm going to call that B4, since 4 bits are used by the integer portio +n of the number. +---+---+---+---+ | 0 | 1 | 1 | 1 | 7 B4 +---+---+---+---+ So how would you store a decimal number? Let's take 1.75, for example. +---+---+---+---+ + + | 0 | 0 | 0 | 1 | 1 1 1.75 = 1*2^0 + 1*2^(-1) + 1*2^(-2) +---+---+---+---+ + + Unfortunately, I can't just add bits to my register. What if we shifted the bits? +---+---+---+---+ | 0 | 1 | 1 | 1 | 1.75 * 2^2 = [ 1*2^0 + 1*2^(-1) + 1*2^(-2) ] * 2 +^2 +---+---+---+---+ Another way of phrasing that is in terms of number of integer bits lef +t. +---+---+---+---+ | 0 | 1 | 1 | 1 | 1.75 B2 = [ 1*2^0 + 1*2^(-1) + 1*2^(-2) ] * 2^(4 +-2) +---+---+---+---+ In other words, using smaller B-scalings increases precision: The following show the precision for a given scaling. +---+---+---+---+ | 0 | 0 | 0 | 1 | [ 1*2^0 ] * 2^(4-4) = 1 B4 +---+---+---+---+ +---+---+---+---+ | 0 | 0 | 0 | 1 | [ 1*2^(-1) ] * 2^(4-3) = 0.5 B3 +---+---+---+---+ +---+---+---+---+ | 0 | 0 | 0 | 1 | [ 1*2^(-2) ] * 2^(4-2) = 0.25 B2 +---+---+---+---+ +---+---+---+---+ | 0 | 0 | 0 | 1 | [ 1*2^(-3) ] * 2^(4-1) = 0.125 B1 +---+---+---+---+ +---+---+---+---+ | 0 | 0 | 0 | 1 | [ 1*2^(-4) ] * 2^(4-0) = 0.0625 B0 +---+---+---+---+ Nothing says we have to stop at 0. +---+---+---+---+ | 0 | 0 | 0 | 1 | [ 1*2^(-5) ] * 2^(4--1) = 0.03125 B-1 +---+---+---+---+ The downside is that biggest number we can represent goes down as the precision increases. +---+---+---+---+ | 1 | 1 | 1 | 1 | [ 1*2^3 + 1*2^2 + 1*2^1 + 1*2^0 ] * +2^(4-4) = 15 B4 +---+---+---+---+ +---+---+---+---+ | 1 | 1 | 1 | 1 | [ 1*2^2 + 1*2^1 + 1*2^0 + 1*2^(-1) ] * +2^(4-3) = 7.5 B3 +---+---+---+---+ +---+---+---+---+ | 1 | 1 | 1 | 1 | [ 1*2^1 + 1*2^0 + 1*2^(-1) + 1*2^(-2) ] * +2^(4-2) = 3.75 B2 +---+---+---+---+ +---+---+---+---+ | 1 | 1 | 1 | 1 | [ 1*2^0 + 1*2^(-1) + 1*2^(-2) + 1*2^(-3) ] * +2^(4-1) = 1.875 B1 +---+---+---+---+ +---+---+---+---+ | 1 | 1 | 1 | 1 | [ 1*2^(-1) + 1*2^(-2) + 1*2^(-3) + 1*2^(-4) ] * +2^(4-0) = 0.9375 B0 +---+---+---+---+ +---+---+---+---+ | 1 | 1 | 1 | 1 | [ 1*2^(-2) + 1*2^(-3) + 1*2^(-4) + 1*2^(-5) ] * +2^(4--1) = 0.46875 B-1 +---+---+---+---+ We can go the other way, if need be. +---+---+---+---+ | 1 | 1 | 1 | 1 | [ 1*2^4 + 1*2^3 + 1*2^2 + 1*2^1 ] * +2^(4-5) = 30 B5 +---+---+---+---+ +---+---+---+---+ | 0 | 0 | 0 | 1 | [ 1*2^1 ] * 2^(4-5) = 2 B5 +---+---+---+---+ <blockquote><i>How does multiplying by 4294967296 not include 1 but mu +ltiplying by 2147483648 does?</i></blockquote> Multiplying by 4294967296 (2^32) gives 32 bits for the decimal portion +, leaving no bits (32-32) for the integer portion. In other words, B0. 1 does not fit into no bits. + +---+----- 1 | 0 | ... 1 B0 + +---+----- Multiplying by 2147483648 (2^31) gives 31 bits for the decimal portion +, leaving 1 bit (32-31) for the integer portion. In other words, B1. 1 fits in one bit. +---+---+----- | 1 | 0 | ... 1 B1 +---+---+----- So what's the advantage over floating-point numbers? Floating-point numbers does this the same way, and it does it for you. However, in order to do that, it must save the scaling (called "expone +nt") along with the number. That means floats require more memory to store +than these. So what good is any of this anyway? You can use integer arithmetic on these numbers! X Bi + Y Bi = (X+Y) Bi X Bi - Y Bi = (X-Y) Bi X Bi * Y Bj = (X*Y) B(i+j) X Bi / Y Bj = (X/Y) B(i-j) X Bi >> j = X B(i+j) X Bi << j = X B(i-j) You can also compare numbers of the same scaling. X Bi < Y Bi X Bi > Y Bi X Bi == Y Bi etc. Do be careful about overflow! A = ... # B1 B = ... # B1 C = A + B # POTENTIAL OVERFLOW! # 1 B1 + 1 B1 will overflow, for example There are two ways of avoiding overflow. You can make sure in advance that the numbers won't cause an overflow, + or you can switch to a different B-scaling (at the cost of a bit of preci +sion). A = ... # B1 A = A >> 1 # B2 B = ... # B1 B = B >> 1 # B2 C = A + B # B2 We do lots of works with values in [0.0..1.0]. When we need to calcula +te by how much a valve should be open, we calculate it using values in [0.0. +.1.0]. Later on, we convert them to the right value to send to the Analog Out +put. It sounds like you want to deal with numbers in this same range, so I +thought the following would be very useful to you: A = ... # B1 Must be between 0.0 and 1.0. B = ... # B1 Must be between 0.0 and 1.0. C = A * B # B2 C = C << 1 # B1 Safe, since 1.0 * 1.0 = 1.0. Although we've only dealt unsigned numbers, everything I've said so far applies to signed 2s complement numbers.
Nice work ikegami, I up voted you on this one. You *must* have a lot of free time up your sleeves. ;-)

You can look how it's done in FORTH. FORTH-Users don't have floating point numbers (unless one has written a library that is), so they do all with fixed point arithmetic. And it's very fast. You should find some examples if you google for FORTH.

Create A New User
Node Status?
node history
Node Type: note [id://407079]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (3)
As of 2020-10-25 23:27 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
My favourite web site is:

Results (249 votes). Check out past polls.

Notices?