REPNE SCASD

You cannot search for bits using byte(nor word, nor double-word)-wise instructions unless your happy to miss 75%(87%,94%) of possible hits.

The rest is regurgitated garbage. Probably cribbed from the 'net and then sundial'd into an unreadable mess. Please stop!

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
• Comment on Re^4: [OT] The interesting problem of comparing bit-strings.

Replies are listed 'Best First'.
Re^5: [OT] The interesting problem of comparing bit-strings.
by salva (Canon) on Mar 25, 2015 at 09:47 UTC
You cannot search for bits using byte-wise instructions

But the thinkg is that, actually, you can! Only that you will have to run eight searches instead of one.

But the think thing is that, actually, you can! Only that you will have to run eight searches instead of one.

Hm. Pedantically, that still searching for bytes, but no matter, it still doesn't solve the problem. What if what you are looking for spans a byte boundary?

Eg. The needle below appears twice in the haystack, but no combination of 8 (or any number) of repne scasb instructions will ever find either:

```needle:
00100100

haystack:
+                            *      *       *      *
1         2         3         4         5         6
+7         8         9         0         1         2
0123456789012345678901234567890123456789012345678901234567890123456789
+0123456789012345678901234567890123456789012345678901234567
1111000101001010110110101100101011101001010100010111100010011001010100
+0111111111001000000011100100100111011100100100111000110111

Because, repne scasb scans the string looking for a fixed byte value, but the needle doesn't appear as a byte anywhere in the string.

Both occurrences span byte boundaries, thus to find them, you need to do two compares, using two different masks on two adjacent bytes, against two different byte values, neither of which are or include the needle.

Nor can you pad the needle (on both ends) to 16-bits and use repne scabw to find it, because again, a masking operation is required to extract the relevant 7 bits from the surrounding 16.

The same is true (and then some) for my use of 64-bit comparisons, but I didn't suggest I was going to microcode loop instructions.

So, I'll say it again, with a minor addition, that (I hope) will satisfy you; and will go right over his head:

You cannot search for bits using byte-wise instructions alone

What he pontificated about was utterly pointless; and wrong.

And that before we get into:

1. I'm coding in C not assembler.

And my compiler doesn't support inline assembler.

2. the whole guff about: "If you actually discover a matching 64-bit quantity within the buffer, then the odds are “64 in 2^64” that the entire value-being-sought is present".

We already know how completely screwed his sens eof probabilities are: The odds of an actual collision for a 7-digit random number are so astronomically small that I quite frankly would not bother to check for it.

I mentioned that my haystacks are frequently >1/4GB in size"; that's 33 million quads; the chances that two the same appear somewhere are (by the Birthday Paradox) not so slim in totally random data; but in structured, correlated, meaningful data, quite likely to be much much higher.

And that's before you consider that the patterns I'll be searching for are not quad-aligned, thus every bit in that 1/4GB (bar the first and last 63 bit) is actually a part of 63 (potentially) entirely different quads. Ie. that 33 million just became 2.1 billion.

Beyond that, using bytes is a bad idea for this in any number of ways:

• Byte aligned data is the most expensive size to load/store for a 64 bit processor.

Effectively 64-bits is moved across the data bus, but only 8-bits are used. Even reading from cache, every byte except the low byte of each quad has to be manipulated into the low 8 bits of a register before it can be used.

• <l>Byte operators effectively waste 7/8ths of each register used to hold the byte.

Most of them use AL or CL leaving the rest of the register useless for the duration.

If it is of interest, this is my currently incomplete code (a few edge cases to handle) locating a 1Mi bit needle near the end of a 1GiB buffer of random bits:

```C:\test\C>bvSearch 1745337664 1048576

Found it at 67108800
Took 0.255063793859 secs

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
The idea is that for long needles, you can discard their borders (the unaligned bits), search for the middle string using some efficient algorithm and only when it matches, check the borders.

update: oh, and BTW, I was not endorsing sundialsvc4 response, just stating that this time, besides all his typical non-sense there may actually be also some insight.

A reply falls below the community's threshold of quality. You may see it by logging in.
locating a 1Mi bit needle near the end of a 1GiB buffer of random bits

Have you considered that using random test data may not be a good idea. You are never going to generate the performance-degrading worst cases.

For instance, it is very unlikely that you will find partial needle matches longer than a few bits.

A reply falls below the community's threshold of quality. You may see it by logging in.