in reply to The golf course looks great, my swing feels good, I like my chances (Part III)

I could see how the hallvabo-provoked improved Python algorithm could improve my PHP score ... if only I could find an improved magic formula, one with a distinct newline mapping. Unfortunately, my original 70 stroke solution mapped newline to zero, which clashes with I -> 0. I'm happy to report that I did manage to find a very lucky 68 stroker, with a newline -> 10 mapping, as described in detail below.

Running the following PHP program:

$m = chr(205).chr(141).chr(207).chr(128).chr(206).chr(142); echo "M " . md5("M".$m) . "\n"; echo "D " . md5("D".$m) . "\n"; echo "C " . md5("C".$m) . "\n"; echo "L " . md5("L".$m) . "\n"; echo "X " . md5("X".$m) . "\n"; echo "V " . md5("V".$m) . "\n"; echo "I " . md5("I".$m) . "\n"; echo "n " . md5("\n".$m) . "\n";
M 999cb80306cbebadbd2d929ba8ed579a D 472969775e593db37081e391c072a85c C 99f4ea53c723dc31fe41bee9bd3f1fcd L 154047519861cb3d7c35ba7d283ce80e X 9fba084f9fdf4d9cd0e480e32b6d26d4 V 004b6563916984013ca50d0ee8add0e7 I daba8b73d682dc51d3c6dfa5ed770c4d n 10a89bcbe90dda3d6bb0e5ee5b8fbe21
Notice that M -> 999, C -> 99, X -> 9, V -> 4 and I -> 0 are all direct hits. And, crucial for the new algorithm, newline has a distinct value (10) thus ensuring it can be distinguished from the roman letters. Now, applying 472969775 % 2119 produces the desired D -> 499 mapping. So far, so good. Just one more miracle is required: a L -> 49 mapping with the same modulo (2119) as the D mapping.

Pop quiz: what should 154047519861 % 2119 produce? Perl, Python and Ruby are unanimous on that score: the correct answer is 157. Surprisingly, however, running the following PHP test program:

$L_num = 154047519861 % 2119; $L_str = "154047519861" % 2119; echo "L_num : $L_num\n"; echo "L_str : $L_str\n";
L_num : -1324 L_str : 49
The curious thing is that both answers are wrong. The miraculous thing is that the string "answer", $L_str above, produces the desired L -> 49 mapping! So I'm grateful for this (presumably 32-bit related) PHP bug because this lucky hit finally allowed me to break the 70 barrier:
where XXXXXX above is chr(205).chr(141).chr(207).chr(128).chr(206).chr(142). 68 strokes!

Though I admit I never expected to break the 70 barrier, I now believe that a score in the low 60s is a certainty. Here's why.

Suppose you could find a direct hit: M -> 1000, D -> 500, C -> 100, L -> 50, X -> 10, V -> 5, I -> 1, newline -> 0. You could then replace the:

claptrap with simply:
a saving of eight strokes. The comical "*1" above is to work around a PHP idiosyncrasy, to ensure that "1e3" is interpreted as 1000 and not 1.

How long would the magic string need to be to find such a solution? The probability of matching three specific characters followed by [a-f] is (1/16)*(1/16)*(1/16)*(6/16) = 1/10923, which is the rough odds of matching M, D and C: "1e3", "500", "5e2", "100", "1e2". Matching 50 and 10 is more likely: (1/16)*(1/16)*(6/16) = 1/689. Matching five and one is more likely still: (1/16) * (6/16) = 1/42. Finally, matching zero is only: 6/16 = 3/8. So the, admittedly very rough, odds of such a miraculously lucky hit is (1/10923)**3 * (1/689)**2 * (1/42)**2 * (3/8), which comes to the order of one in 10**21. Since there are 180 characters available, a ten character magic string can produce 180**10 = 10**22 distinct combinations, making such a lucky hit a virtual certainty. I therefore declare that a 64 stroke solution is certain and shorter ones possible.

This problem, however, is now no longer one of golf, but of computation. Even with super computers, 10**21 combinations is a daunting challenge and I expect you'd need to set up a cooperative grid computing project to find it. Such a project, of course, is far less worthy than SETI@home or Einstein@Home and would be a horrible waste of CPU cycles. :)

Update 2012: it seems you don't need a cooperative grid computing project anymore, you can just rent the power in the cloud for around $2 per hour. See also Cracking Passwords in The Cloud: Amazon's new EC2 GPU Instances.

Update 2014: See also The 10**21 Problem (Part I), an example of evaluating 10**21 combinations in a C program to find a Python magic formula.