When I looked for magic formulae, etc., I would give up after 20 minutes!
 Jasper
Unlike Jasper, I have spent more than 20 minutes searching for magic formulae.
Especially ancient Roman ones.
I first became aware of them back in 2006 during
the Fonality Christmas Golf Challenge
where I was astonished by Ton's ingenuity and deviousness in constructing
his original HART (Hospelian Arabic to Roman Transform) magic formula.
Shortly after the Fonality game, codegolf.com hosted an endless competition
where you must convert the other way, from Roman to Decimal.
By 2009, I had managed to take the lead in all four languages
(Perl, Python, Ruby, PHP), as detailed
in this series of nodes.
And the winning solutions  in all four languages 
all used (quite different) magic formulae!
To refresh your memory, the codegolf game rules were essentially:
Convert a Roman numeral to its integer value.
The input numeral, provided on stdin followed by a single newline, is
guaranteed to be in the range I to MMMCMXCIX (1 to 3999).
For example, given IV on stdin, your program should print 4 to stdout.
As you might expect, a key component of solutions to this game
is mapping individual Roman letters to their decimal equivalents.
To help us focus on that, let's define a simpler spec:
Write a function to convert a single Roman Numeral letter to its decimal equivalent.
The function assumes the Roman letter is in $_
and returns its decimal equivalent.
To clarify, here's a sample (nongolfed) Perl solution:
sub r {
my %h = ( I=>1, V=>5, X=>10, L=>50, C=>100, D=>500, M=>1000 );
return $h{$_};
}
print "$_: ", 0+r(), "\n" for (qw(I V X L C D M));
Running this program produces:
I: 1
V: 5
X: 10
L: 50
C: 100
D: 500
M: 1000
Lookup Table vs Magic Formula
In code golf, there is often a battle between a lookup table
and a magic formula. So it proved here.
When converting from Roman to Decimal, magic formula trumps lookup table.
Here are three attempts to solve this problem using a lookup table:
sub r {%h=(I,1,V,5,X,10,L,50,C,100,D,500,M,1000);$h{$_}}
sub r {M1000D500C100L50X10V5I1=~$_;$'}
sub r {M999D499C99L49X9V4I=~$_+$'}
And here are some of my earliest magic formulae from 2007:
sub r {5**y/VLD//.E.(3*/M/+2*/CD/+/XL/)}
sub r {1 .E.~ord()*41%52%5>>y/VLD//}
sub r {5**y/VLD//.E.ord()x3%75%50%4}
sub r {1 .E.(72^ord)*5/7%5>>y/VLD//}
sub r {5**y/VLD//.E.(42^88*ord)%5}
sub r {1 .E.(3^77%ord)%7>>y/VLD//}
Back then, my magic formulae searching skills were so poor that I
erroneously concluded that the lookup table approach was better. :)
Later, when I tackled this problem in Python, I really needed to use
each Roman letter once only in the formula, which forced me to explore
alternative approaches ... which, in turn, led to still shorter Perl magic formulae,
such as:
sub r {10**(7&69303333/ord)%9995}
sub r {10**(7&5045e8/ord)%2857} # needs 64bit Perl
sub r {IXCMVLD=~$_;"1E@"%9995}
Back in the good old days, the little search programs that uncovered these
solutions took hours to run, not years. :)
The long numbers (such as 69303333 above) that began popping up in these formulae
were an indication that the ord function didn't scale very well as
the required solutions became less probable.
Can we find a builtin function better suited to Roman magic formulae?
In PHP and Python, yes. In Perl and Ruby, probably not.
At least, I don't know of one.
The PHP builtin md5 function is perfect for Roman magic formulae.
Not only does it generate all required digits (09), it generates little else (just af).
Moreover, you can use all 256 characters in a magic formula,
compared to just ten (09) for ord.
To illustrate, in an eight character magic string (for example 69303333),
there are just 10**8 combinations available for ord, while there
are a mindboggling 256**8=1.8*10**19 combinations available for md5!
This is a huge advantage when searching for highly improbable solutions,
as we shall see later.
The Python hash function is also superior to ord,
though not as good as PHP's md5. This is because Python's Unicode
and character escaping claptrap limits you to 125 characters
(namely ord 1..9,11,12,14..127) that can be used as input to the hash
function without special treatment.
Still, 125 is a huge improvement over 10!
One drawback of hash compared to md5 is that it generates
huge numbers, forcing you to waste five strokes with %1001
to trim the generated numbers into the required Roman Numeral range (11000).
In some cases, the Perl crypt builtin could be employed, though
it is not wellsuited to this specific problem because (unlike md5)
it generates many other characters in addition to the desired 09.
To recap, the shortest magic formulae found so far in all four languages
(as detailed in this series of nodes) are:
10**(205558%ord(r)%7)%9995 # Python
hash(r+"magicstrng")%1001 # Python (finding magicstrng is the subjec
+t of this node!)
10**(7&5045e8/ord)%2857 # Perl (64bit)
IXCMVLD=~$_;"1E@"%9995 # Perl
10**(494254%r/9)%4999 # Ruby (no need for explicit ord in this g
+ame)
md5($r.magicstrng) # PHP (finding magicstrng is an unsolved p
+roblem)
md5($r.PQcUv)&uppp # PHP wins due to superb mf properties of
+md5
The 10**21 Problem
There's just no sane regularity in this thing. But in "random" mappings
with a very small result set like this, the shortest solution is often
to make up some magic formula that has no particular meaning, but just
happens to give the wanted result.
 Ton Hospel explaining his original decimaltoroman magic formula
When Ton Hospel invented magic formulae in golf way back in 2004,
he correctly noted that they work best "with a very small result set".
Indeed, if there were only five Roman Numeral letters, rather than seven, we would
have a straightforward 10**15 problem, rather than a (borderline intractable)
10**21 problem. Why 10**21?
Suppose you have a magic formula that produces a random number between 1 and
1000. The probability of scoring a (lucky) hit for one numeral is therefore
1 in 1000. Since probabilities multiply, the probability of hitting five
out of five numerals is 1 in 10**15, while the chance of hitting
all seven Roman Numerals is a daunting 1 in 10**21.
Though 1 in 10**21 sounds improbable in the extreme,
if you could generate 10**21 combinations, you would not need luck,
indeed you would expect to find such an improbable solution.
Yet how long would it take to search 10**21
combinations for the elusive (lucky) one?
Well, if you could perform 10,000 search operations per second, the time
required to search 10**21 combinations is a
mere 10**21/10000 seconds = 10**17 seconds = 3,170,979,198 years!
By the way, this "brute force search infeasibility problem" is why you
are asked to create longer and longer passwords, and with a wider range
of characters in them, as computer speeds improve.
Indeed, Password cracking
and magic formula searching are closely related disciplines,
the lack of a "codegolf magic formula" wikipedia page notwithstanding. :)
To make this theoretical argument more concrete, consider searching
for a Roman to Decimal magic formula using the Python builtin
hash function.
Since this hash function produces very large values, we need to apply %1001
to it so as to produce an essentially random value in the desired 1..1000 range.
We might try searching for such a magic formula using a simple Python bruteforce
search program, such as:
import time
print time.time(), time.clock()
for q0 in range(1, 128):
for q1 in range(1, 128):
for q2 in range(1, 128):
for q3 in range(1, 128):
for q4 in range(1, 128):
for q5 in range(1, 128):
for q6 in range(1, 128):
for q7 in range(1, 128):
for q8 in range(1, 128):
for q9 in range(1, 128):
magic = chr(q0)+chr(q1)+chr(q2)+chr(q3)+chr(q4)+chr(q5)+chr(
+q6)+chr(q7)+chr(q8)+chr(q9)
m = hash("M" + magic) % 1001
if m != 1000: continue
d = hash("D" + magic) % 1001
if d != 500: continue
c = hash("C" + magic) % 1001
if c != 100: continue
l = hash("L" + magic) % 1001
if l != 50: continue
x = hash("X" + magic) % 1001
if x != 10: continue
v = hash("V" + magic) % 1001
if v != 5: continue
i = hash("I" + magic) % 1001
if i != 1: continue
print "bingo!", q0, q1, q2, q3, q4, q5, q6, q7, q8, q9
print time.time(), time.clock()
On my machine, this naive brute force search for a tencharacter magic
string is expected to find a solution in about 52,887,477 years!
Note that, with 127 different values for each character in the string,
we need a 10character magic string to give us the required
127**10 = 1.1e21 = 10**21 combinations.
Rather than give up at this point, I found this
"impossible" 50 million year challenge
strangely irresistible ... and was eager to learn about any high performance computing techniques required to solve it.
I get it  optimization is a fun game ... one can play all day with unrolling loops, peeling away layers of indirection, and so forth to gain cycles, while piddling away time and energy
 davido
Or, as davido puts it, "optimization is a fun game".
Well, I enjoy it.
So I started by rewriting the above search program in C++, then kept on refining
it until I was able to reduce the running time from fifty million years down to
only several years and eventually find a solution.
This new series of articles describes that endeavour.
Please note that this series focuses more on High Performance Computing/Supercomputing/Parallel computing
in C/C++/Intel assembler than code golf.
