in reply to Re: [OT] The interesting problem of comparing (long) bit-strings.
in thread [OT] The interesting problem of comparing bit-strings.

Alright. Here's a demo of alpha skip search on bitstrings. The needle size is fixed, but it should be trivial to follow this with a longer compare. A possibility exists that some off-by-one errors have crept in. If that's the case, just deduct from my paycheck. :-)

Alpha skip is quadratic in the worst case and probably not the best choice when nasty matches are expected. Anyway, I'm curious how this might hold up against the optimized brute-force routines.

Note: to change the needle size, just edit the M and logM values. (Compile with -fno-strict-aliasing, or fix the type-punning).

  • Comment on Re^2: [OT] The interesting problem of comparing (long) bit-strings.
  • Download Code

Replies are listed 'Best First'.
Re^3: [OT] The interesting problem of comparing (long) bit-strings. (Doesn't work!)
by BrowserUk (Pope) on Mar 29, 2015 at 16:29 UTC

    Making minimal changes to your code to allow it compile here, and it simply never finds anything(very quickly), no matter how simple I make the search criteria:

    C:\test\C>cl /W3 bitsearch-oiskuu.c Microsoft (R) C/C++ Optimizing Compiler Version 15.00.21022.08 for x64 Copyright (C) Microsoft Corporation. All rights reserved. bitsearch-oiskuu.c bitsearch-oiskuu.c(49) : warning C4267: '=' : conversion from 'size_t' + to 'U16', possible loss of data bitsearch-oiskuu.c(78) : warning C4267: '=' : conversion from 'size_t' + to 'char', possible loss of data bitsearch-oiskuu.c(80) : warning C4267: '+=' : conversion from 'size_t +' to 'char', possible loss of data Microsoft (R) Incremental Linker Version 9.00.21022.08 Copyright (C) Microsoft Corporation. All rights reserved. /out:bitsearch-oiskuu.exe bitsearch-oiskuu.obj C:\test\C>bitsearch-oiskuu Result -1 Took 1.837746e-004 secs

    I've wasted enough time on it; and won't be expending any more, but here is the modified code that compiles clean (apart from warnings that originate from your posted code):

    I'd suggest you enter it in an obfuscated C competition; could be a winner.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
    In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
Re^3: [OT] The interesting problem of comparing (long) bit-strings.
by BrowserUk (Pope) on Mar 29, 2015 at 09:38 UTC
    'm curious how this might hold up against the optimized brute-force routines.

    It's 3 orders of magnitude slower.

      More or less, yeah.

      [ 0.033410] needle 512, haystack 134217216 bytes [ 0.000030] rot 5 at 8000007; bitoff=64000061 [ 0.000017] needle stuffed! [ 0.000644] search=64000061
      I see your table lists "2.483719e-001" for the corresponding case. Brute force appears to be >300 times slower (0.2483719/0.000644).

      Looks like { M = 16384, logM = 14 } case is the fastestfaster, with tables weighing in at 64KB. Beyond that, slowdown. (0.2784286/0.000236 == ~1180). Even bigger needles/tables may be quicker still.