Compression in Golf: Part IIIby eyepopslikeamosquito (Canon)
|on Oct 21, 2012 at 07:22 UTC||Need Help??|
Continuing our 99 bottles of beer reduction from last time, I decided at this point to step back and reconsider some basic assumptions. In particular, though:
has served us well so far, can we do better?
Especially relevant to 99 bottles of beer are these two ideas:
Multiplying the eval string by 99
All the beer bottle algorithms seen so far loop from zero up to 99, building a single large string inside the loop -- and printing it in one go at the end. This (unnatural) algorithm has proven to produce the shortest solutions because, by starting at zero rather than 99, we exploit perl's default (undef) variable initialization, and so avoid costly explicit initializations, such as a prohibitive six stroke penalty for a leading $n=99;. Moreover, we can shorten the plural inflection problem from, say, bottle."s"x!!$n to bottle.$&.
Sadly though, such an algorithm is not well-suited to:
because it requires a terminating action outside the loop, namely to print the string built inside the loop. If you could find a short bottle golf algorithm that just did the same thing 99 times without requiring anything at the beginning or end, that could be a winner.
Unable to find such an algorithm, I resorted to printing the value returned by the eval like so:
Now, since eval returns the value of the last expression, we just have to make sure that its last evaluated expression returns a complete 99 bottles of beer string; for example, this 167-stroker:
which can be tested like so:
If you compare this code to our original algorithm, you will notice that we are spared an until loop, and the associated test for 99, because we know we are executed 99 times via the eval. That saves us 179 - 167 = 12 characters or so in the source string. Yet those twelve source characters are reduced to nine in the output string, courtesy of pack's 3/4 compression. And, as you can see below:
this new way of employing pack is ten strokes longer. That is, our nine stroke saving costs ten.
As proof of concept, running this program generates a working 157 stroke solution:
Note that we (unluckily) lost a stroke because a single quote was generated (near "(g" above), necessitating packaging the string inside q&..&, rather than '...'. That is, this solution is potentially a 156 stroker. So four strokes need to be whittled from the source string to get down to 153 and another four to get to 150. Since this looked unlikely, I reluctantly gave up on this interesting approach.
Translating outside the eval
The following 152 stroke source string:
which can be tested via:
saves a whopping 179 - 152 = 27 characters in the source string. Yet those 27 source characters are reduced to about 20 in the output string, after pack's 3/4 compression. And the cost of this form is a further 10 strokes, as indicated below:
That is, our 20 stroke saving comes at a cost of ... 20 strokes! So I suppose this approach has a chance. However, I found it very difficult to pour this program into any "pack u" shape. So I reluctantly gave up on this interesting approach too.
More radical still is this 148 stroker:
which can be tested via:
This time 179 - 148 = 31 characters are saved in the source string. And those 31 source characters are reduced to about 23 in the output string, after pack's 3/4 compression. Note that the cost of this form is about 23 strokes, as indicated below:
Once again though, I found pouring this program into any "pack u" shape problematic.
Back to the main game
After that interesting, if unsuccessful, diversion, I reverted back to the main game, namely:
I felt that the best chance of getting from 154 to 151 was to change "pack u54" to "pack u", a saving of two strokes, combined with a one-stroke saving via the tactical trick of finding an algorithm ending in $`, thus exploiting pack's use of backtick as the NULL byte. With that approach, I don't need to find a significantly shorter solution, just one that fits the default "pack u" shape like a glove.
Fun with split
I was able to reduce my shortest unformatted solution from 179 strokes to 176 by exploiting a deprecated feature of the Perl split function:
namely that calling split in scalar context has the side-effect of setting @_. As an aside, note that side-effects are frequently very useful in golf. Though this solution only works with Perl 5.10 or earlier (this mis-feature was thankfully (for non-golfers) removed in Perl 5.12) that was ok for this game because codegolf competitions use perl 5.8.8.
This shorter raw solution didn't help very much because I was unable to effectively pour it into a "pack u" shape. Just like my earlier 154-stroke entry, I had more success by focusing on finding an algorithm to fit the required shape.
Stravinsky's inspirational quote notwithstanding, I found the arbitrary constraint of fitting Perl code into three lines of precisely 61 characters in length, all starting with the letter "m", to be frustrating in the extreme. For example, I found many "nearly" solutions, such as:
which has line lengths of 62, 60, 61 when I need them to be 61, 61, 61. Aargh!?!! Other "nearly" solutions were:
I was getting more and more frustrated ... and more and more annoyed at that ugly leading "mx;".
So many "nearly" solutions. If only one little thing was different, they'd work.
Eventually, unable to bear looking at that damned leading "mx;" any longer, I switched to seeking out solutions beginning with m/s/. Well, that was (and remains) the only half-way useful regex I can think of to start a solution with. After doing that, I finally found a perfectly fitting 152-stroke solution:
Yay! The remarkable and non-obvious tactical trick of inserting the "m" length byte into the string -- then later removing it (via s/m//), with the side-effect of usefully setting $' and $` -- I would never have found were it not for the constraints of having to start the first line with m/s/ and having to break it after exactly 61 bytes.
As shown last time, we need to generate a working 152-stroke entry via a little program, such as:
Despite ending with the desired backtick, this solution cannot be reduced to 151 because it relies on the last line starting with the letter "m", the length byte for 61. To get to 151, we need to start the last line with the letter "l", the length byte for 60.
Reverting to the deprecated perl split semantics, I found a number of 151-stroke solutions whose last line begins with the letter "l":
After submitting one of these to tie for the lead I felt I could finally relax ... until the tenacious "dmd" struck back yet again, posting a 149-stroke solution!
I hope I've been able to convey the extra level of complexity that compression adds to golf. As if the game of golf were not already hard enough. To illustrate that extra level of complexity, I quote leading Python golfer hallvabo again:
This reminds of the SHA-256 challenge on codegolf.com. Since Python's built-in compression wasn't available and my source code was over 500 bytes long, I figured I had to roll my own compression scheme to beat Mark Byers leading the Python section with 493 strokes. I started with restricting the source to 64 characters so I could use a homemade 6-bit character encoding (curiously, this only increased the source from 507 to 512 bytes! this was because I couldn't use ~, so tricks like ~- became unavailable). I then golfed the decompressor, getting it down to about 75 strokes. Finally, I recognized that this approach gives a whole set of new tricks to play with, since I could reuse variables from the decompression stage in the sha-256 stage! Of course, this requires the variables to have the correct value after the decompression stage... at this point my brain almost melted :)
That concludes this introductory series on the difficult topic of compression in golf. I hope you've enjoyed it. If you are looking for further challenges, we know those damned beer bottles can be further reduced to 149, perhaps lower. Though be warned, the complexity of this task may melt your brain. :)
Acknowledgement: I'd like to thank mtve and hallvabo for their help in preparing this series.