perlmeditation
Cine
This is not really a Perl-specific subject, but it is nonetheless a stumbling
block for people who are beginning to program.<br><br>
<p>There are four bit operations, <em>and</em>, <em>or</em>, <em>xor</em>
and <em>inverse</em>, which in Perl are accessible as the symbol
operators <em>&</em>, <em>|</em>, <em>^</em> and <em>~</em>.
They are not to be confused with the logical <em>and-</em>,
<em>or-</em> and <em>not-operators</em>, which we will return to
later. Because these are bit operations, they work on the
individual bits of a number, but as we shall see, in Perl they also
work on all the bits of a string.
<p>Except for <em>inverse</em>, they require two arguments, also called
operands: a left operand and a right operand. For none of these
operations does it matter which operand is on the left and which is on
the right; swapping them yields the same result. <em>inverse</em> only
requires a single operand.<br>
<h3>The & operator</h3>
This is the operator you use when you need to know whether two
items are both true. In bits, <em>true</em> and <em>false</em> are
commonly written as 1 and 0, or on and off.<br>
The truth table for <em>and</em> is<br>
<table border=1>
<tr><td>Bit 0<td>Bit 1<td>Bit 0 & Bit 1
<tr><td><em>false</em><td><em>true</em><td><em>false</em>
<tr><td><em>false</em><td><em>false</em><td><em>false</em>
<tr><td><em>true</em><td><em>true</em><td><em>true</em>
<tr><td><em>true</em><td><em>false</em><td><em>false</em>
</table>
<p>Take the numbers 234 and 15, in bit representation 11101010
and 00001111, and compute 234 & 15. For each bit in the left
operand we <em>and</em> it with the corresponding bit in the right operand,
looking up the result in the table each time. The result is 00001010, or 10 in
decimal.<br>
<p>Using characters as an example instead, take "A" and "a", respectively
01000001 and 01100001 in ASCII. If we <em>and</em> them together, we
get 01000001 or "A".
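<p>The two examples above can be tried directly. This is only a minimal sketch: the variable names are mine, and it assumes the classic behaviour in which <em>&</em> works byte by byte when both operands are strings (that is, the newer <em>bitwise</em> feature is not enabled).
<code>
use strict;
use warnings;

# Numeric operands: & works on the bits of the two integers.
my $num = 234 & 15;                       # 11101010 & 00001111
printf "%08b (%d)\n", $num, $num;         # 00001010 (10)

# String operands: & works byte by byte on the two strings.
my $chr = "A" & "a";                      # 01000001 & 01100001
printf "%08b (%s)\n", ord($chr), $chr;    # 01000001 (A)
</code>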
<h3>The | operator</h3>
This operator checks whether either the right operand or the left operand is
true, or whether they are both true. Thus the result is false only when both
operands are false.<br>
The truth table for <em>or</em> is<br>
<table border=1>
<tr><td>Bit 0<td>Bit 1<td>Bit 0 | Bit 1
<tr><td><em>false</em><td><em>true</em><td><em>true</em>
<tr><td><em>false</em><td><em>false</em><td><em>false</em>
<tr><td><em>true</em><td><em>true</em><td><em>true</em>
<tr><td><em>true</em><td><em>false</em><td><em>true</em>
</table>
<p>Again, take our example with 234 and 15, or 11101010 and 00001111. The
result is now 11101111, or 239. In the example with "A" and
"a", or 01000001 and 01100001 in ASCII, the result is 01100001, or
"a".
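<p>The same two examples with <em>|</em>, again as a small sketch with variable names of my own choosing and the same assumption about string operands as for <em>&</em>:
<code>
use strict;
use warnings;

# Numeric operands: a bit is set if it is set in either integer.
my $num = 234 | 15;                       # 11101010 | 00001111
printf "%08b (%d)\n", $num, $num;         # 11101111 (239)

# String operands: the same rule applied to each pair of bytes.
my $chr = "A" | "a";                      # 01000001 | 01100001
printf "%08b (%s)\n", ord($chr), $chr;    # 01100001 (a)
</code>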
<h3>The ~ operator</h3>
The <em>inverse</em> operator only takes a single argument and simply
flips every bit. Thus all true values become false and all false
values become true.<br>
<table border=1>
<tr><td>Bit 0<td>~Bit 0
<tr><td><em>false</em><td><em>true</em>
<tr><td><em>true</em><td><em>false</em>
</table>
<p>This operator is actually also called the <em>not operator</em>, but
most people associate that name with the other operator that carries it,
the <em>! operator</em>. The distinction is that <em>~</em> is the
<em>"bitwise" not operator</em>, whereas the <em>! operator</em>
turns a true value into a false one without looking at the specific
bits. This is also how Perl does it; anything else would surprise
people. Imagine what would happen if !"1" worked
bitwise: <em>1</em> is 00110001, so !"1" would be 11001110, which in
ISO 8859-1 is "LATIN CAPITAL LETTER I WITH CIRCUMFLEX". But in Perl
!"1" is the empty string (0 when used as a number), which is false. Just to
confuse the matter some more, <em>~</em> is also called the (1's)
complement operator, or the bitwise negation operator.<br>
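<p>To see the difference between the two "not" operators on the string "1", here is a small sketch (variable names are mine; it assumes the classic string behaviour of <em>~</em>, without the <em>bitwise</em> feature):
<code>
use strict;
use warnings;

my $logical = !"1";     # logical not: Perl's false value, an empty string
my $bitwise = ~"1";     # bitwise not: one byte with every bit flipped

printf "logical: '%s' (length %d)\n", $logical, length $logical;
printf "bitwise: %08b (0x%02X)\n", ord($bitwise), ord($bitwise);
# logical: '' (length 0)
# bitwise: 11001110 (0xCE)
</code>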
<p>Contrary to what you might expect, ~234 is not 00010101, but
11111111111111111111111100010101. This is simply because Perl's
representation of the number is much wider than 8 bits. Note that
this is the 32-bit result on my machine and my build of Perl; on other
machines or builds it may be longer or shorter (64 bits, for example).<br>
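<p>If you only want the lowest 8 bits, one way is to mask the result with <em>&</em>. This is a sketch of that idea rather than the only way to do it; the names are mine:
<code>
use strict;
use warnings;

my $all  = ~234;           # as wide as your perl's integers
my $byte = ~234 & 0xFF;    # keep only the lowest 8 bits

printf "%b\n", $all;                   # 32 or 64 digits, depending on the build
printf "%08b (%d)\n", $byte, $byte;    # 00010101 (21)
</code>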
<p>Also notice that this does not work on a list. With
<code>@a=(1,0)</code>, <code>print ~@a</code> would not print "01";
instead <code>@a</code> is evaluated in scalar context, which gives the
number of elements, and that number is inverted.
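<p>A short sketch of what happens, together with one possible way to flip each element instead (the <em>map</em> is my own suggestion and assumes the elements are plain true/false values):
<code>
use strict;
use warnings;

my @a = (1, 0);

# ~ puts @a in scalar context, so this inverts the element count (2).
my $inverted_count = ~@a;
print "same as ~2\n" if $inverted_count == ~2;

# Flip the elements one by one instead.
my @flipped = map { $_ ? 0 : 1 } @a;
print join('', @flipped), "\n";           # 01
</code>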
<h3>The ^ operator</h3>
The XOR operator is unlike the other operators in that it has no direct
counterpart in everyday language, mostly because it is a composition of
several operations.<br>
<p>XOR means eXclusive OR. Expressed in other bit operations it is
<code>(a&~b) | (~a&b)</code>. In more human terms, it is this or
that, but not both. <br>
The truth table for <em>^</em> is<br>
<table border=1>
<tr><td>Bit 0<td>Bit 1<td>Bit 0 ^ Bit 1
<tr><td><em>false</em><td><em>true</em><td><em>true</em>
<tr><td><em>false</em><td><em>false</em><td><em>false</em>
<tr><td><em>true</em><td><em>true</em><td><em>false</em>
<tr><td><em>true</em><td><em>false</em><td><em>true</em>
</table>
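<p>The claim that XOR can be composed from the other operations is easy to check by brute force. The sketch below (variable names are mine) compares the composed form with <em>^</em> for every pair of byte values, and also runs our 234 and 15 example:
<code>
use strict;
use warnings;

# Compare the composed form with ^ for every pair of byte values.
my $mismatches = 0;
for my $x (0 .. 255) {
    for my $y (0 .. 255) {
        my $composed = ($x & ~$y) | (~$x & $y);
        $mismatches++ if $composed != ($x ^ $y);
    }
}
print "mismatches: $mismatches\n";        # mismatches: 0

my $xor = 234 ^ 15;                       # 11101010 ^ 00001111
printf "%08b (%d)\n", $xor, $xor;         # 11100101 (229)
</code>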
<p>You may wonder what this operator is used for, but if you think a
little about it, what it actually tells you is "where do the operands
differ?". In many cases we could have used == or eq to tell us
whether they differ at all, but other times we need to know where and what
the difference is. For example, say we have two strings (imagine them
very long), "aaaa" and "aaba", and need to find where they differ. The result of
"aaaa"^"aaba" is "\x00\x00\x03\x00": a null byte wherever the strings
agree and a non-null byte wherever they do not. We can then run
[http://www.perldoc.com/perl5.8.0/pod/perlop.html#Regexp-Quote-Like-Operators|tr]/\x00/1/c
on the result, which replaces every non-null byte with <em>1</em> and
returns the number of bytes that are different, and we can use
[perldoc://index] to search for <em>1</em> in the string to find
the position at which the original strings differ.
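<p>Put together, the steps above might look like this. It is a sketch with variable names of my own, and it assumes both strings have the same length:
<code>
use strict;
use warnings;

my $left  = "aaaa";
my $right = "aaba";

# Null bytes where the strings agree, non-null bytes where they differ.
my $diff = $left ^ $right;

# With /c, tr replaces every byte that is NOT \x00 with "1"
# and returns how many bytes it replaced.
my $count = ($diff =~ tr/\x00/1/c);

# index gives the zero-based offset of the first difference.
my $where = index($diff, "1");

print "$count byte(s) differ, first at offset $where\n";
# 1 byte(s) differ, first at offset 2
</code>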
<h3>The boolean operators</h3>
Boolean operators are much like their bitwise counterparts. We already
looked at the <em>! operator</em>, which turns a true value into a
false one and a false value into a true one. It is as though
we converted the entire expression into a single bit and flipped that
bit.<br>
<p>We also have <em>&&</em> and <em>||</em>. <em>&&</em>
works by evaluating the left operand; if and only if that yields
a true value is the right operand evaluated and its result
returned, otherwise the false left value is returned. <em>||</em> works
similarly, but only evaluates the right operand if the left operand was
false.<br>
<p>Thus, unlike with their bitwise cousins, the order of the operands is
important here. Just think of <code>$a && $b/$a</code> when $a is
0: the division is never attempted.<br>
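<p>The short-circuit behaviour can be observed with a small sketch like the following. The variable names are mine, and <em>eval</em> is only there to catch the error so the script keeps running:
<code>
use strict;
use warnings;

my ($x, $y) = (0, 10);

# The left operand is false, so the division is never evaluated.
my $guarded = $x && $y / $x;
print "guarded: $guarded\n";              # guarded: 0

# With the operands swapped, the division runs first and dies.
my $swapped = eval { $y / $x && $x };
print "swapped died: $@" if $@;

# || returns the left operand if it is true, otherwise the right one.
my $empty = '';
my $name  = $empty || 'anonymous';
print "name: $name\n";                    # name: anonymous
</code>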
<p>At the start I wrote that you should not confuse the <em>and
operator</em> with the <em>& operator</em>. This is because
<em>and</em> is the same as <em>&&</em>. The same goes for
<em>or</em>, which is the same as <em>||</em>, and for <em>not</em>, which
is the same as <em>!</em>. The difference is that the named versions have lower
precedence, which means that the implicit parentheses are placed
differently when Perl parses your code.
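<p>A sketch of where those implicit parentheses end up (all names are made up for the illustration):
<code>
use strict;
use warnings;

my $input = 0;

# || binds tighter than =, so this is $with_symbol = ($input || 5).
my $with_symbol = $input || 5;

# or binds looser than =, so this is ($with_word = $input) or ($ran = 1).
my ($with_word, $ran) = (undef, 0);
$with_word = $input or $ran = 1;

print "with_symbol: $with_symbol, with_word: $with_word, ran: $ran\n";
# with_symbol: 5, with_word: 0, ran: 1

# ! binds tighter than ==, while not binds looser.
my ($p, $q) = (1, 2);
my $bang = (!$p == $q)    ? 1 : 0;        # (!$p) == $q   gives 0
my $word = (not $p == $q) ? 1 : 0;        # not($p == $q) gives 1
print "bang: $bang, word: $word\n";       # bang: 0, word: 1
</code>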
<h3>Conclusion</h3>
I hope you now have a better understanding of how bitwise operations
work, and are able to understand why people sometimes do things like
"onestring" ^ "anotherstring".<br><br>
<p>Here is a small tip, if you ever find yourself with a long, complex
boolean expression:<br>
<code>!a && !b && !c</code> can be written as
<code>!(a || b || c)</code>, and <br>
<code>!a || !b || !c</code> as
<code>!(a && b && c)</code>.<br>
This is also known as [http://www.hyperdictionary.com/computing/demorgan's+theorem|DeMorgan's Theorem].
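<p>If you do not trust the rewrite, it can be checked for every combination of three true/false values. A sketch of such a check, with names of my own choosing:
<code>
use strict;
use warnings;

my $failures = 0;
for my $p (0, 1) {
    for my $q (0, 1) {
        for my $r (0, 1) {
            # Normalise each side to 1 or 0 before comparing.
            my $lhs_and = (!$p && !$q && !$r) ? 1 : 0;
            my $rhs_and = !($p || $q || $r)   ? 1 : 0;
            my $lhs_or  = (!$p || !$q || !$r) ? 1 : 0;
            my $rhs_or  = !($p && $q && $r)   ? 1 : 0;
            $failures++ if $lhs_and != $rhs_and or $lhs_or != $rhs_or;
        }
    }
}
print "failures: $failures\n";            # failures: 0
</code>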
<br><br><small>
<strong>Update: Fixed the errors [liz] pointed out</strong><br>
<strong>Update: Added link to DeMorgan's Theorem</strong><br>
<strong>Update: Rewrote "xxx" to <em>xxx</em></strong><br>
<strong>Update: Changed 0 and 1 to true and false</strong><br>
<strong>Update: Rewrote bits and pieces</strong><br>
<strong>Update: XOR was wrong</strong><br><br>
<font color="green">T</font>
<font color="red">I</font>
<font color="blue">M</font>
<font color="white">T</font>
<font color="cyan">O</font>
<font color="black">W</font>
<font color="magenta">T</font>
<font color="red">D</font>
<font color="green">I</font></small>