|Perl: the Markov chain saw|
Re: The golf course looks great, my swing feels good, I like my chances (Part III)by eyepopslikeamosquito (Chancellor)
|on May 28, 2009 at 11:29 UTC||Need Help??|
I could see how the hallvabo-provoked improved Python algorithm could improve my PHP score ... if only I could find an improved magic formula, one with a distinct newline mapping. Unfortunately, my original 70 stroke solution mapped newline to zero, which clashes with I -> 0. I'm happy to report that I did manage to find a very lucky 68 stroker, with a newline -> 10 mapping, as described in detail below.
Running the following PHP program:
Notice that M -> 999, C -> 99, X -> 9, V -> 4 and I -> 0 are all direct hits. And, crucial for the new algorithm, newline has a distinct value (10) thus ensuring it can be distinguished from the roman letters. Now, applying 472969775 % 2119 produces the desired D -> 499 mapping. So far, so good. Just one more miracle is required: a L -> 49 mapping with the same modulo (2119) as the D mapping.
Pop quiz: what should 154047519861 % 2119 produce? Perl, Python and Ruby are unanimous on that score: the correct answer is 157. Surprisingly, however, running the following PHP test program:
The curious thing is that both answers are wrong. The miraculous thing is that the string "answer", $L_str above, produces the desired L -> 49 mapping! So I'm grateful for this (presumably 32-bit related) PHP bug because this lucky hit finally allowed me to break the 70 barrier:
where XXXXXX above is chr(205).chr(141).chr(207).chr(128).chr(206).chr(142). 68 strokes!
Though I admit I never expected to break the 70 barrier, I now believe that a score in the low 60s is a certainty. Here's why.
Suppose you could find a direct hit: M -> 1000, D -> 500, C -> 100, L -> 50, X -> 10, V -> 5, I -> 1, newline -> 0. You could then replace the:
claptrap with simply:
a saving of eight strokes. The comical "*1" above is to work around a PHP idiosyncrasy, to ensure that "1e3" is interpreted as 1000 and not 1.
How long would the magic string need to be to find such a solution? The probability of matching three specific characters followed by [a-f] is (1/16)*(1/16)*(1/16)*(6/16) = 1/10923, which is the rough odds of matching M, D and C: "1e3", "500", "5e2", "100", "1e2". Matching 50 and 10 is more likely: (1/16)*(1/16)*(6/16) = 1/689. Matching five and one is more likely still: (1/16) * (6/16) = 1/42. Finally, matching zero is only: 6/16 = 3/8. So the, admittedly very rough, odds of such a miraculously lucky hit is (1/10923)**3 * (1/689)**2 * (1/42)**2 * (3/8), which comes to the order of one in 10**21. Since there are 180 characters available, a ten character magic string can produce 180**10 = 10**22 distinct combinations, making such a lucky hit a virtual certainty. I therefore declare that a 64 stroke solution is certain and shorter ones possible.
This problem, however, is now no longer one of golf, but of computation. Even with super computers, 10**21 combinations is a daunting challenge and I expect you'd need to set up a cooperative grid computing project to find it. Such a project, of course, is far less worthy than SETI@home or Einstein@Home and would be a horrible waste of CPU cycles. :)
Update 2012: it seems you don't need a cooperative grid computing project anymore, you can just rent the power in the cloud for around $2 per hour. See also Cracking Passwords in The Cloud: Amazon's new EC2 GPU Instances.
Update 2014: See also The 10**21 Problem (Part I), an example of evaluating 10**21 combinations in a C program to find a Python magic formula.