note
repellent
Very interesting topic!<br/>
<br/>
I see an overall pattern, which I would like to share :-)<br/>
<br/>
First, a few basic observations:<br/>
<ul>
<li>There seems to be a certain symmetry evident from how <c>R & R</c> results in 8 bits set on average (halfway between 1 and 16) versus <c>R | R</c> resulting in 24 bits set on average (halfway between 16 and 32). I'll be sure to make my hypothesis symmetrical, in that regard.
<li>If we apply AND, the average bits set goes <b>down</b> due to higher likelihood of 0 bits <b>swallowing</b> the 1 bits. Conversely, if we apply OR, the average bits set goes <b>up</b> due to higher likelihood of 1 bits <b>overriding</b> the 0 bits.
<li>Non-associative rules apply for AND and OR:<br/>
<br/>
<ul>
<c>( (A & B) | C ) != ( A & (B | C) )</c>
</ul>
<br/>
So evaluating the expression from <b>left-to-right</b> is important! <b>Update:</b> Perl gives precedence to the <c>&</c> operator over <c>|</c>
</ul>
<br/>
<readmore>
<h4>Hypothesis:</h4>
Each AND/OR operation is evaluated in the following way:<br/>
<ul>
<c>( <left_gained_avg_bits_set> <op> <right_modifier> ) = <resulting_avg_bits_set></c>
</ul>
Whenever an AND/OR <c><op></c> is applied, it induces a change to the already gained average bits on the left-side. That change is a multiplicative factor determined by the average bits set on the right-side:<br/>
<br/>
<ul>
<c>right_mod_factor = right_avg_set_bits / total_bits</c>
</ul>
<br/>
<h4>Cases:</h4>
For <c><op> = AND</c>, we would take <c>right_mod_factor</c> and multiply it with <c>left_avg_bits_set</c> to get <c><resulting_avg_bits_set></c>. Examples:<br/>
<ul>
<c>R & (R & R) => 16 * (8 / 32) = 4 bits set on average</c><br/><br/>
<c>(R | R) & (R & R & R) => 24 * (4 / 32) = 3 bits set on average</c>
</ul>
<br/>
That is, if there are more bits set on average on the AND right-side, there will be less chance of <b>swallowing</b>, so more bits are set on average in the result.<br/>
<br/>
Conversely, for <c><op> = OR</c>, we would take <c>right_mod_factor</c> and multiply it with <c>left_avg_unset_bits</c>. Then, take that result and add it to <c>left_avg_set_bits</c> to get <c><resulting_avg_bits_set></c>. Examples:<br/>
<ul>
<c>R | (R | R) => 16 + 16 * (24 / 32) = 28 bits set on average [second 16 is # of avg unset bits]</c><br/><br/>
<c>(R | R) | (R & R & R) => 24 + 8 * (4 / 32) = 25 bits set on average</c><br/><br/>
<c>R & R & R | R | R = (R & R & R) | (R | R) => 4 + 28 * (24 / 32) = 25 bits set on average</c><br/>
or<br/>
<c>R & R & R | R | R = (R & R & R) | R | R => 4 + 28 * (16 / 32) + 14 * (16 / 32) = 25 bits set on average [got 14 from 32 - ( 4 + 28 * (16 / 32) )]</c>
</ul>
<br/>
That is, if there are more bits set on average on the OR right-side, there will be more chance of <b>overriding</b>, so more bits are set on average in the result.<br/>
<br/>
<h4>Conclusion:</h4>
So, following my pattern hypothesis, I see some inconsistent results in your "full set". <strike>For instance, 7 of <c>( R & R | R & R ) & R</c> evaluates to 5 bits set on average.</strike> Attention has to be paid to precedence of AND/OR evaluations. Also, 22 & 26 are the same, when 22 of <c>R & R | R | R</c> evaluates to 26 bits set on average.<br/>
<br/>
There may be some fallacy in my hypothesis, so feedback is welcomed :)<br/>
</readmore>
<br/>
So, given the following legend:<br/>
<ul>
<c>
n = total # bits per sample
L1 = # bits set on average on left-side of AND/OR
L0 = # bits unset on average on left-side of AND/OR
= n - L1
R1 = # bits set on average on right-side of AND/OR
</c>
</ul>
<br/>
we can derive a set of equations to calculate the resulting average bits set for each AND/OR operation:<br/>
<ul>
<c>
AND : L1 * R1 / n
OR : L1 + L0 * R1 / n = L1 + (n - L1) * R1 / n
= L1 + R1 - L1 * R1 / n
= L1 * ( 1 - R1 / n ) + R1
</c>
</ul>
<br/>
716397
716397