BrowserUk:

Are you familiar with the Boyer-Moore string search? I bet you could use the same basic idea of building a state machine that tells you the next address to check. But rather than analyzing the needle byte by byte, do it in parallel for each bit-shifted variant of the needle, something like the below. Note: I'm going to pretend the byte size is 4 bits here, and I don't know Boyer-Moore by heart, so I'm winging it to give you the gist. If you like the idea, you can hash out the annoying details (heh!):

Given

Needle  : 1010 0011 1101 1100
Haystack: 0001 0010 1101 1100 1010 1101 0001 1110 1110 0111

1) Make three more copies of given, each bit-shifted

Needle A: 1010 0011 1101 1100
Needle B: x101 0001 1110 1110 0xxx
Needle C: xx10 1000 1111 0111 00xx
Needle D: xxx1 0100 0111 1011 100x
Column    D    C    B    A

2) Build the BM skip table based on the last complete byte

     D    C    B    A
0000 ---- ---- ---- 4444
0001 ---W -M-- ---- ----
0010 --X- ---- ---- ----
0011 ---W M--- ---- ----
0100 ---- ---M ---- ----
0101 -Y-W ---- ---- ----
0110 --X- ---- ---- ----
0111 ---W ---- ---M --M-
1000 ---- --M- ---- ----
1001 ---W ---- ---- ----
1010 Z-X- ---- ---- ----
1011 ---W ---- ---- ---M
1100 ---- ---- ---- M---
1101 -Y-W ---- M--- ----
1110 --X- ---- -M-- -M--
1111 ---W ---- --M- ----

-  : an entry I didn't try to determine yet
M  : Matched byte position, match previous needle char to previous table column.
Z  : Perfect match found
Y  : Perfect match if byte after position A matches remnant B
X  : Perfect match if byte after position A matches remnant C
W  : Perfect match if byte after position A matches remnant D
1-4: mismatch, skip 1..4 bytes and restart scan at position A
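For anyone who wants to play with step 1, here is a minimal Python sketch that generates the shifted copies. The function name `shifted_variants`, the use of '0'/'1' strings, and the 4-bit "byte" are all assumptions made for illustration; a real version would work on packed bytes.

```python
def shifted_variants(needle_bits, byte_bits=4):
    """Return one row per bit shift (0 .. byte_bits-1).

    Each row is the needle split into byte-sized groups, with 'x'
    marking the don't-care bits introduced by the shift.
    """
    variants = []
    for shift in range(byte_bits):
        # Pad on the left by the shift, then on the right up to a byte boundary.
        padded = "x" * shift + needle_bits
        padded += "x" * (-len(padded) % byte_bits)
        groups = [padded[i:i + byte_bits]
                  for i in range(0, len(padded), byte_bits)]
        variants.append(groups)
    return variants


needle = "1010001111011100"
for name, groups in zip("ABCD", shifted_variants(needle)):
    print(f"Needle {name}: {' '.join(groups)}")
```

Run on the needle above, this prints the same four rows (A through D) as the hand-made list in step 1.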

Geh! I've got to get to work now. I'll update this at lunch time. Anyway, the state table shows how far you can skip ahead, so you don't need to examine every byte in the haystack. Read up on the Boyer-Moore string search and you'll see what I'm talking about. I've drawn four state machines here by giving each table entry four output edges, one per needle we're tracking, but in the final version we'd combine the tables into simpler entries. If you like the idea, we can investigate that a bit. I've got some time this evening, so we could code something up.
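To make the skipping idea concrete, here is a rough Python sketch. It is not the full table above: it is a simplified Horspool-style variant that only looks at the haystack byte lined up with column A, verifies any candidate with a plain bit comparison, and then skips by a precomputed amount. All names (`find_bit_offsets`, `compatible`, and so on) are made up for the example, bits are kept as '0'/'1' strings for readability, and it assumes the needle is at least two bytes long and the haystack is a whole number of bytes.

```python
def shifted_variants(needle_bits, byte_bits=4):
    """Needle split into byte groups for every bit shift; 'x' = don't care."""
    rows = []
    for shift in range(byte_bits):
        padded = "x" * shift + needle_bits
        padded += "x" * (-len(padded) % byte_bits)
        rows.append([padded[i:i + byte_bits]
                     for i in range(0, len(padded), byte_bits)])
    return rows


def compatible(group, byte):
    """True if every non-'x' bit of group equals the matching bit of byte."""
    return all(g in ("x", b) for g, b in zip(group, byte))


def find_bit_offsets(haystack_bits, needle_bits, byte_bits=4):
    """Return every bit offset at which needle_bits occurs in haystack_bits."""
    n = len(needle_bits)
    if n < 2 * byte_bits or len(haystack_bits) % byte_bits:
        raise ValueError("need a needle of >= 2 bytes and a whole-byte haystack")

    hay = [haystack_bits[i:i + byte_bits]
           for i in range(0, len(haystack_bits), byte_bits)]
    variants = shifted_variants(needle_bits, byte_bits)
    last = n // byte_bits - 1      # column A: last byte complete in all variants
    window = last + 1              # window length in bytes, also the max skip

    # Skip table: for each byte value seen at column A, the smallest shift that
    # lines it up with a compatible earlier column of any variant.
    skip = {}
    for value in range(2 ** byte_bits):
        byte = format(value, f"0{byte_bits}b")
        best = window
        for groups in variants:
            for col in range(last):
                if compatible(groups[col], byte):
                    best = min(best, last - col)
        skip[byte] = best

    hits = []
    q = 0                          # byte index of the window start
    while q + last < len(hay):
        byte = hay[q + last]
        for shift, groups in enumerate(variants):
            if groups[last] == byte:
                p = q * byte_bits + shift
                # Full verification; a slice running off the end never matches.
                if haystack_bits[p:p + n] == needle_bits:
                    hits.append(p)
        q += skip[byte]
    return hits


needle = "1010001111011100"
haystack = "0001001011011100101011010001111011100111"
print(find_bit_offsets(haystack, needle))
```

On the example needle and haystack this reports bit offset 21, which is needle B lined up on the sixth haystack byte. The full verification step is the lazy part: the M/W/X/Y/Z entries in the table are what would replace it with byte-at-a-time checks.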

...roboticus

When your only tool is a hammer, all problems look like your thumb.


In reply to Re: [OT] The interesting problem of comparing bit-strings. by roboticus
in thread [OT] The interesting problem of comparing bit-strings. by BrowserUk
