Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??

Though I'd long stopped searching for fresh ideas to try in this game, a little while after this ToastyX-provoked epiphany, I could not help but wonder if the string bitwise & PHP golfing trick might somehow be similarly applied to the older Roman to Decimal game.

I started with an inspection of my previous best PHP solution:

There's a lot to dislike about this code. Much nicer is:
While such a solution undoubtedly exists (as argued last time), I don't have a powerful enough computer to find it. The reason such a long (ten character) magic string is likely to be required is because hitting the longer Roman numbers is so improbable; that is, the probability of a random MD5 digest happening to start with 100[a-f], 500[a-f] or 1000[a-f] is too low.

Can good ol' string bitwise & help here? To find out, I started by running a little Perl program:

for my $i (48..57,97..102) { printf " %08b (ord %3d %c)\n", $i, $i, $i; }
to display the bit patterns of all characters that may appear in a MD5 digest:
00110000 (ord 48, "0") 00110001 (ord 49, "1") 00110010 (ord 50, "2") 00110011 (ord 51, "3") 00110100 (ord 52, "4") 00110101 (ord 53, "5") 00110110 (ord 54, "6") 00110111 (ord 55, "7") 00111000 (ord 56, "8") 00111001 (ord 57, "9") 01100001 (ord 97, "a") 01100010 (ord 98, "b") 01100011 (ord 99, "c") 01100100 (ord 100, "d") 01100101 (ord 101, "e") 01100110 (ord 102, "f")
Hmmm, though we clearly cannot bitwise & from [a-f] to [0-9] or vice versa, we can bitwise & all digits to zero. For example:
00111001 (ord 57, "9" character) & 01110000 (ord 112, "p" character) = 00110000 (ord 48, "0" character)
Does that help? Yes! You see, applying &p to the second, third and fourth characters of the MD5 digest transforms any digit in these character positions to zero, making it much more likely to hit the problematic 100, 500 and 1000 Roman numbers.

What about the first character? A simple brute force search program showed that the best we can do here is &u, which transforms [0-9] like so:

0 0 1 1 2 0 3 1 4 4 5 5 6 4 7 5 8 0 9 1
thus making a starting 1 or 5 more likely.

Putting these two together, a &uppp bitwise string operation truncates an MD5 digest string to four characters as follows:

[0145a`de] [0`] [0`] [0`]
The bitwise &uppp operation therefore reduces the possible numbers in a MD5 digest from millions to just thirteen, namely: 0, 1, 4, 5, 10, 40, 50, 100, 400, 500, 1000, 4000, 5000.

The next step was to update my old C search program with an adjusted my_md5 function as follows:

static int my_md5(const unsigned char* inbuf) { int outi = 0; int nib; MD5_CTX mdContext; #ifdef MY_ASM mdContext.A = 0x67452301; mdContext.B = 0xefcdab89; mdContext.C = 0x98badcfe; mdContext.D = 0x10325476; md5_block_asm_data_order(&mdContext, inbuf, 1); #else md5_block_asm_data_order(&mdContext, inbuf); #endif /* 1st byte */ nib = (unsigned char)((mdContext.A >> 4) & 0xF); if (nib > 9) return outi; nib &= 5; // <-- added this line outi = outi * 10 + nib; nib = (unsigned char)(mdContext.A & 0xF); if (nib > 9) return outi; nib = 0; // <-- added this line outi = outi * 10 + nib; /* 2nd byte */ nib = (unsigned char)((mdContext.A >> 12) & 0xF); if (nib > 9) return outi; nib = 0; // <-- added this line outi = outi * 10 + nib; nib = (unsigned char)((mdContext.A >> 8) & 0xF); if (nib > 9) return outi; nib = 0; // <-- added this line outi = outi * 10 + nib; return outi; // <-- only care about first four chars now }
After adjusting M_TARG and related #defines, running this modified search program unearthed a number of five character magic string hits, but none shorter. From the many different five character solutions, I naturally chose one consisting of all alphabetic characters, namely PQcUv.

Because PHP and Perl have essentially the same string bitwise operator behavior, I was able to test these ideas in Perl. You can see more clearly how all this works, by running the following Perl test program:

use Digest::MD5 qw(md5_hex); for my $r (M, D, C, L, X, V, I, "\n") { my $hd = md5_hex( $r . PQcUv ); print "$r $hd ", $hd & uppp, "\n"; }
which produces:
M 993332d3c4fa8b3839761ca4dd480f7b 1000 D 503c3b61b971c24100e97c3297882b22 500` C 112da378c970c5a0ff7769acd85095c7 100` L 58dbff2e42141165a1e04bbc629df030 50`` X 90a86e4f0b96ada8fb44bcb43d16f25e 10`0 V 7c9f70b74e8f19325e8dc62d31a9dd87 5`0` I 9bb0bbb1977969abd73e5f8121e6cff7 1``0 c697df2fcf84f10272369ee4482e5c1c a000
Note that the ` characters above, the result of applying &p to [a-f], terminate the number when it's used in an arithmetic operation. Note too that for PHP we require the newline case to evaluate to zero (as a000 does above) to terminate the controlling while loop (this is not required in Perl because /./g removes the trailing newline).

Here's an example Perl solution to this game:

use Digest::MD5 md5_hex; $\+=$n-2*$n%($n=uppp&md5_hex$_.PQcUv)for<>=~/./g;print
Were md5 a built-in Perl function, this would probably be a winning Perl golf entry. Note, by the way, that it may be possible to replace md5 with Perl's built-in crypt function and be competitive in this game.

As you might expect, having a built-in md5 function made this new approach a winner in PHP, as demonstrated by the following 63 stroker:

Curiously, this turned out to have exactly the same length as the (theoretical) long magic string solution postulated earlier:

The shortest known PHP solution to this game is thus reduced from 68 to 63 strokes, and without requiring a super computer.

A general golfing lesson to take from this little episode is that if you get stuck on one golf game, try another. You may be able to apply tricks learnt in the new game to the one you were previously stuck on.

In reply to Re^2: The golf course looks great, my swing feels good, I like my chances (Part III) by eyepopslikeamosquito
in thread The golf course looks great, my swing feels good, I like my chances (Part III) by eyepopslikeamosquito

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others drinking their drinks and smoking their pipes about the Monastery: (6)
    As of 2021-03-04 19:24 GMT
    Find Nodes?
      Voting Booth?
      My favorite kind of desktop background is:

      Results (107 votes). Check out past polls.