Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid
 
PerlMonks  

(OT) A different kind of 'combinatorics'

by BrowserUk (Patriarch)
on Mar 26, 2015 at 13:59 UTC ( [id://1121394]=perlquestion: print w/replies, xml ) Need Help??

BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

Given the 16 x 4-bit patterns: 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111

Is it possible to combine them is such a way that results in a single, length unspecified, binary string that only contains each of those 16 bit patterns (in any-aligned four bit field) exactly once?

Ie. if you start with 0000, then the next pattern must start with a 1; else the 0000 repeats immediately.

That is: 00000001 contains '0000' at offset 0; (0000)0001 but also at offsets: 1 0(0000)001, 2 00(0000)01 and 3 000(0000)1; which is what must be avoided.

So, if you start with 0000 & 1000: 00001000 then the repeated '0000' is avoided; but the no matter what you put next:

000010001xxx 0001 repeats 0(0001)(0001)xxx 000010000xxx 0000 repeats (0000)1(0000)xxx

Beyond a brute force examination of all the permutations, which would be doable for 16 x 4 bits, but not for much bigger, is there some algorithm for performing such a combination?

This reminds me of something I've seen at some point, but the only thing that's coming out of my brain at the moment is Gray Codes, but I can't see how to apply it.

Thoughts?


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

Replies are listed 'Best First'.
Re: (OT) A different kind of 'combinatorics'
by Corion (Patriarch) on Mar 26, 2015 at 14:27 UTC

    It seems that De Bruijn sequences are what you are searching for. Wikipedia even lists an algorithm, albeit in Python.

    Update: As I assume that this is somewhat related to your quest of finding a misaligned sequence, you might be interested in the section "De Bruijn decoding", which hints at specially constructed sequences that can be mapped back to the position/offset in O(n log(n)) if you can't use a lookup table to convert the sequence value back to an offset.

      It seems that De Bruijn sequences are what you are searching for.

      Yes. Thankyou!

      As I assume that this is somewhat related to your quest of finding a misaligned sequence, you might be interested in the section "De Bruijn decoding", which hints at specially constructed sequences that can be mapped back to the position/offset in O(n log(n)) if you can't use a lookup table to convert the sequence value back to an offset.

      Yes. I'm looking for a testing strategy to ensure that all patterns that can be found are.

      And the "De Bruijn decoding" looks very interesting. Thanks again.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
      In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
Re: (OT) A different kind of 'combinatorics'
by salva (Canon) on Mar 26, 2015 at 14:31 UTC
    length unspecified

    It is easy to see that the length must be equal to $number_of_patters + $size_of_pattern - 1 as a new pattern is introduced at every bit offset.

      It is easy to see that the length must be equal to $number_of_patters + $size_of_pattern - 1 as a new pattern is introduced at every bit offset.

      Correct. I'd worked out that for 4, 19 had to be the minimum length, but I wasn't convinced that it could be done in the minimum length.

      Now you've pointed it out, it is becomes obvious that it can only be done in the minimum length; else you'd have dups.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
      In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
Re: (OT) A different kind of 'combinatorics'
by chacham (Prior) on Mar 26, 2015 at 14:23 UTC

    If each set of four were to be repeated, it'd be impossible, because after 0000 or 1111 the next digit must be either 0 or 1, which will be a repeat of 0000, 1111, 0001 or 1110. So, the question would only be valid for a condensed string that happens to includes all combinations but not more than once. (This is probably what you are asking, but i did not see it so clearly at first.)

      If each set of four were to be repeated, it'd be impossible ... This is probably what you are asking, but i did not see it so clearly at first.

      Indeed, though it wasn't obvious to me at first either.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
      In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
Re: (OT) A different kind of 'combinatorics'
by hdb (Monsignor) on Mar 26, 2015 at 14:29 UTC

    I have now no time to code but this is what I would do. My usual approach is to do gradually build it up recursively with backtracking. I would think that you would not see many of the 16! possible combinations in such a procedure. If one would use a string as the underlying data structure, then a regex should be possible to identify the bad patterns. My first thought was m/(....).*\1/ but that would not detect overlapping repetitions. But I am sure such a regex exists.

      My first thought was m/(....).*\1/ but that would not detect overlapping repetitions. But I am sure such a regex exists.

      I was looking to combine lookaheads with captures to check for dups, but then settled for using index to check for existance, and again, with the offset+1 from the first to check for duplicates.

      My brute forcer finds 208 complient patterns in about a minute:

      The second set of numbers is the order of the 4-bit patterns within the 19-bit solutions -- in the vague hope they might give some hint as to a generic algorithm -- now rendered unnecessary as Corion's discovered it is a solved problem.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
      In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

        That was me.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1121394]
Approved by karlgoethebier
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others perusing the Monastery: (4)
As of 2024-04-24 18:39 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found