“Bit vector” problems such as this one actually have many different analogs.   (Graphics processing, in particular, comes to mind.)   These problems become expensive and difficult to solve if the strings are short (less than the native word-size of the machine), and/or if there is positively no “crib” that can be meaningfully applied to cut the search-space.

For instance, if you are routinely looking for bit-strings that are (say) only 20-30 bits long, then there really isn’t any alternative but to pedantically growl through the buffer.   If you are looking for a 20-bit-long quantity in a world of 64-bit words, that quantity could be in any one of 64 positions, and within that space of 128 total bits, 108 of them are entropy.   Strategies such as the one which I previously suggested would be of no real value, because the shorter the “window” is, the less likely it is that “a hit” implies the existence of “a solution.”

Even if you try to press strategies such as Moore into service, you find that they are not nearly so effective in this situation, as they are in the text-search applications for which they were intended.   Why?   Because uncompressed text actually has only a very small number of possible bit-patterns within each byte.   The entropy of the data is very small.   If you consider text “one character at a time,” you get a familiar letter-frequency distribution:   ETAOIN....   In that context, Moore’s algorithm might well save some time.   Whereas, if the data is a bit-string, the distribution of byte-wise values is much more likely to be flat, simply because the data is not aligned to any sort of incremental boundary.

Hence, the necessity to “divide and conquer.”   To find some rune by which you can quickly eliminate all places where the data could not be.


In reply to Re: [OT] The interesting problem of comparing bit-strings. by sundialsvc4
in thread [OT] The interesting problem of comparing bit-strings. by BrowserUk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":