In your diagram above, draw two vertical pencil-lines, 64 bits apart, with at least 64 bits to the left of the leftmost line.   (In other words, this is the second quadword in the value being sought.)

You can now see, within those two lines, all 64 of the possible values, one of which must occur in the buffer, and aligned on a quadword boundary, if that ≥128-bit value occurs in the buffer at all.   Every occurrence of the needle must include one of those 64 quadwords, although you don’t know which one.   The mere occurrence of one of those values also does not guarantee success, but the odds are basically 64 in 2^64 that they do.

Therefore, send the microprocessor on 64 blisteringly-fast searches through the entire buffer, looking in turn for each of those values.   A page fault will never happen.

Then, search byte-wise for a pre-shifted version of what the value would be, again using fast instructions.

Only then must you test the first and last quadwords in the value, using appropriate shifting-and-masking, and doing this at a point when your success is already assured.

If you are going for “raw speed” in a modern microprocessor, you want to give it things that it can pipeline.   A conditional-branch instruction of any sort often causes the predictive-branch caches to be flushed.   Even though my strategy causes the microprocessor to blaze through the entire buffer 64 times in a row, (a) we know that there is RAM-enough to do that without faults, and (b) each time a value is found, success is already likely, and (c) the pipeline is full.   You spend the “pedantic bit-twiddling” only to be certain of what a much-faster algorithm is already almost-certain of.


In reply to Re^9: [OT] The interesting problem of comparing bit-strings. by sundialsvc4
in thread [OT] The interesting problem of comparing bit-strings. by BrowserUk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.