That's fine, I'm in exactly the same position.
Mine seems to work most of the time, but I've occasionally seen long patterns found at offsets other (earlier) than expected.
Given I was using random data, it is a possibility, but with needles of 100s or 1000s of bits (extracted from the randomly generated haystack), you wouldn't expect it to happen with any frequency in a human lifetime -- even in a billion bits of haystack -- and I've seen it half a dozen times already.
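As a rough back-of-envelope check, assuming the haystack bits are independent and uniformly random: every alignment other than the true one matches an n-bit needle with probability 2^-n, so for a haystack of N = 10^9 bits (less than 2^30) and a needle of only n = 100 bits, the expected number of chance hits works out to roughly:

```latex
E[\text{chance hits}] \approx (N - n)\,2^{-n} < 2^{30} \cdot 2^{-100} = 2^{-70} \approx 8.5 \times 10^{-22}
```

Longer needles only make that number smaller, so half a dozen sightings points at a bug rather than bad luck.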
Of course, it only ever happens when both haystack and needle are huge; when, even if I did dump the bits for manual inspection, comparing thousands of 0s & 1s by eye is just too painful. (I did try it once!)
Hence, I went looking for a better test strategy -- De Bruijn sequences -- which took rather longer to get right than I'd like to admit. (Would have been easier on a big-endian processor!)
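For anyone wanting to try the same strategy, here is a minimal sketch of one way to build a binary De Bruijn sequence B(2, n) in C. It is not the code described above: it uses the well-known FKM (Lyndon word) construction and stores one bit per byte for clarity, so packing into 64-bit words (and any endianness handling) is left out. The names `fkm`, `de_bruijn_bits` and `windows_unique` are made up for this illustration. The point for testing is that every n-bit window occurs exactly once in the cyclic sequence, so any needle of n or more bits cut from it has exactly one correct offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Recursive step of the FKM construction over the alphabet {0,1}.
 * a[1..n] holds the current candidate word; p is its period. */
static void fkm(int t, int p, int n, unsigned char *a,
                unsigned char *out, size_t *len)
{
    if (t > n) {
        /* Emit the word only when its period divides n. */
        if (n % p == 0) {
            for (int i = 1; i <= p; i++)
                out[(*len)++] = a[i];
        }
        return;
    }
    a[t] = a[t - p];
    fkm(t + 1, p, n, a, out, len);
    if (a[t - p] == 0) {          /* the only larger symbol is 1 */
        a[t] = 1;
        fkm(t + 1, t, n, a, out, len);
    }
}

/* Writes B(2, n) into out, one bit per byte; out must hold 2^n bytes.
 * Returns the number of bits written (2^n). */
size_t de_bruijn_bits(int n, unsigned char *out)
{
    unsigned char a[32] = {0};
    size_t len = 0;
    assert(n >= 1 && n <= 30);    /* arbitrary limit for this sketch */
    fkm(1, 1, n, a, out, &len);
    return len;
}

/* Returns 1 if every cyclic n-bit window of seq is distinct, else 0. */
int windows_unique(const unsigned char *seq, size_t len, int n)
{
    unsigned char *seen = calloc((size_t)1 << n, 1);
    int ok = (seen != NULL);
    for (size_t i = 0; ok && i < len; i++) {
        size_t w = 0;
        for (int j = 0; j < n; j++)
            w = (w << 1) | seq[(i + j) % len];
        if (seen[w])
            ok = 0;               /* window already seen: not De Bruijn */
        seen[w] = 1;
    }
    free(seen);
    return ok;
}
```

Calling `de_bruijn_bits(10, buf)` with a 1024-byte buffer fills it with a 1024-bit sequence in which every 10-bit pattern has a single, known position, which makes an early false hit easy to spot.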
That -- last night -- allowed me to confirm that there are some circumstances when I get false hits -- it seems to be related to __shiftleft128() treating a shift value of 64 as 0!
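To illustrate the pitfall, here is a small portable C sketch. `shl128_masked` is only a model of the behaviour described above (the shift count reduced modulo 64, as the underlying x64 double-shift instruction does), and `shl128_safe` is a hypothetical wrapper, not anything from the actual code, that handles counts of 64 and above explicitly. Both return the high 64 bits of the 128-bit value hi:lo shifted left.

```c
#include <stdint.h>

/* Portable model of the masked behaviour: the count is taken modulo 64,
 * so a shift of 64 behaves like a shift of 0 and returns hi unchanged. */
uint64_t shl128_masked(uint64_t lo, uint64_t hi, unsigned char shift)
{
    unsigned s = shift & 63u;
    if (s == 0)
        return hi;                       /* also avoids an undefined lo >> 64 */
    return (hi << s) | (lo >> (64 - s));
}

/* Hypothetical wrapper: gives the mathematically expected high word
 * for any shift count, including 64..127. */
uint64_t shl128_safe(uint64_t lo, uint64_t hi, unsigned shift)
{
    if (shift >= 128)
        return 0;                        /* everything shifted out */
    if (shift >= 64)
        return lo << (shift - 64);       /* low word moves into the high word */
    return shl128_masked(lo, hi, (unsigned char)shift);
}
```

With lo = 0xAAAA and hi = 0x5555, a shift of 64 through the masked version hands back 0x5555 (no shift at all), while the wrapper returns 0xAAAA, which is what the calling code presumably wanted.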
So now I'm recoding the entire thing in an OO style so that I don't have to juggle so many different offsets, shifts and counts in the mainline code. But I only just started.
Bottom line: instead of continuing to post "Boyer Moore would be faster!" - "No it won't!" - "Yes it will!", how about we wait until we're both ready and compare our actual code?
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.