Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

Compression in Golf: Part III

by eyepopslikeamosquito (Canon)
on Oct 21, 2012 at 07:22 UTC ( #1000179=perlmeditation: print w/ replies, xml ) Need Help??

Continuing our 99 bottles of beer reduction from last time, I decided at this point to step back and reconsider some basic assumptions. In particular, though:

eval lc pack u,'source-string'
has served us well so far, can we do better?

Especially relevant to 99 bottles of beer are these two ideas:

  • Since we need to loop 99 times, why not exploit the "eval we have to do anyway" for that? That is, string multiply the eval string like so: eval lc pack(u,'source-string')x99.
  • In solutions that use v for space, we translate via y/v/\40/. If this translation were done outside the eval, it could be shortened by two strokes to y/v/ /; one stroke could similarly be saved when translating to a hard newline. Though doing that outside the eval loses our 3/4 "pack u" compression, the saving in translation makes it worth considering at least.

Multiplying the eval string by 99

All the beer bottle algorithms seen so far loop from zero up to 99, building a single large string inside the loop -- and printing it in one go at the end. This (unnatural) algorithm has proven to produce the shortest solutions because, by starting at zero rather than 99, we exploit perl's default (undef) variable initialization, and so avoid costly explicit initializations, such as a prohibitive six stroke penalty for a leading $n=99;. Moreover, we can shorten the plural inflection problem from, say, bottle."s"x!!$n to bottle.$&.

Sadly though, such an algorithm is not well-suited to:

eval lc pack(u,'source-string')x99
because it requires a terminating action outside the loop, namely to print the string built inside the loop. If you could find a short bottle golf algorithm that just did the same thing 99 times without requiring anything at the beginning or end, that could be a winner.

Unable to find such an algorithm, I resorted to printing the value returned by the eval like so:

print eval lc pack(u,'source-string')x99
Now, since eval returns the value of the last expression, we just have to make sure that its last evaluated expression returns a complete 99 bottles of beer string; for example, this 167-stroker:
@j=/s/?(take,one,down,an.d,pass,it,around):(go,to,the,store,an.d,buy,s +ome,more);@m=(@z=(++$n,bottle.$&,of,beer),on,the,wall);s/^/,$"@m.\n\n +@m,$"@z.\n\u@j/;/\n+/;$'.$`;
which can be tested like so:
print eval q!@j=/s/?(take,one,down,an.d,pass,it,around):(go,to,the,sto +re,an.d,buy,some,more);@m=(@z=(++$n,bottle.$&,of,beer),on,the,wall);s +/^/,$"@m.\n\n@m,$"@z.\n\u@j/;/\n+/;$'.$`;!x99
If you compare this code to our original algorithm, you will notice that we are spared an until loop, and the associated test for 99, because we know we are executed 99 times via the eval. That saves us 179 - 167 = 12 characters or so in the source string. Yet those twelve source characters are reduced to nine in the output string, courtesy of pack's 3/4 compression. And, as you can see below:
eval lc pack u,'' [17] print eval lc pack(u,'')x99 [27] #23456789012345678901234567
this new way of employing pack is ten strokes longer. That is, our nine stroke saving costs ten.

As proof of concept, running this program generates a working 157 stroke solution:

my $source = <<'PERSEVEROUS'; my@j=/s/?(take,one,down,$m,pass,it,around):(go,to,the,store,$ m=an.d,buy,some,more);@m=(@z=(++$n,bottle.$&,of,beer),on,the, gall^v16);s/^/,$"@m.\n\n@m,$"@z.\n\u@j/;/\n+/;;$'.$`; PERSEVEROUS my $out = unpack 'u', uc($source); open my $fh, '>', 'b.pl' or die "error: open b.pl: $!"; binmode $fh; print $fh "print eval lc pack(u,q&" . $out . "&)x99";
Note that we (unluckily) lost a stroke because a single quote was generated (near "(g" above), necessitating packaging the string inside q&..&, rather than '...'. That is, this solution is potentially a 156 stroker. So four strokes need to be whittled from the source string to get down to 153 and another four to get to 150. Since this looked unlikely, I reluctantly gave up on this interesting approach.

Translating outside the eval

The following 152 stroke source string:

@m=(@z=(++$n,bottle.$&,of,beer),on,the,wall);s/^/s@m,v@z._/;s/s/\utake +vonevdownvandvpassvitvaround,v@m.__/;"$'\ugovtovthevstorevandvbuyvsom +evmore,v@m.";
which can be tested via:
my $prog = <<'PERSEVEROUS'; @m=(@z=(++$n,bottle.$&,of,beer),on,the,wall);s/^/s@m,v@z._/;s/s/\utake +vonevdownvandvpassvitvaround,v@m.__/;"$'\ugovtovthevstorevandvbuyvsom +evmore,v@m."; PERSEVEROUS s//lc $prog x99/ee;y/_v/ /;print
saves a whopping 179 - 152 = 27 characters in the source string. Yet those 27 source characters are reduced to about 20 in the output string, after pack's 3/4 compression. And the cost of this form is a further 10 strokes, as indicated below:
eval lc pack u,'' [17] print eval lc pack(u,'')x99 [27] s//lc pack(u,'')x99/ee;y/v_/ N/;print [37] (N represents newline) #234567890123456789012345678901234567
That is, our 20 stroke saving comes at a cost of ... 20 strokes! So I suppose this approach has a chance. However, I found it very difficult to pour this program into any "pack u" shape. So I reluctantly gave up on this interesting approach too.

More radical still is this 148 stroker:

@m=(@z=(++$n,bottle.$&,of,beer),on,the,wall);s/^/s@m,`@z._/;s/s/]ake`o +ne`down`and`pass`it`around,`@m.__/;"$'^o`to`the`store`and`buy`some`mo +re,`@m.";
which can be tested via:
my $prog = <<'PERSEVEROUS'; @m=(@z=(++$n,bottle.$&,of,beer),on,the,wall);s/^/s@m,`@z._/;s/s/]ake`o +ne`down`and`pass`it`around,`@m.__/;"$'^o`to`the`store`and`buy`some`mo +re,`@m."; PERSEVEROUS s//lc $prog x99/ee;y/]-`/TG /;print
This time 179 - 148 = 31 characters are saved in the source string. And those 31 source characters are reduced to about 23 in the output string, after pack's 3/4 compression. Note that the cost of this form is about 23 strokes, as indicated below:
eval lc pack u,'' [17] s//lc pack(u,'')x99/ee;y/v_/ N/;print [37] s//lc pack(u,'')x99/ee;y/]-`/TGN /;print [40] #234567890123456789012345678901234567890
Once again though, I found pouring this program into any "pack u" shape problematic.

Back to the main game

After that interesting, if unsuccessful, diversion, I reverted back to the main game, namely:

eval lc pack u,'source-string'

I felt that the best chance of getting from 154 to 151 was to change "pack u54" to "pack u", a saving of two strokes, combined with a one-stroke saving via the tactical trick of finding an algorithm ending in $`, thus exploiting pack's use of backtick as the NULL byte. With that approach, I don't need to find a significantly shorter solution, just one that fits the default "pack u" shape like a glove.

Fun with split

I was able to reduce my shortest unformatted solution from 179 strokes to 176 by exploiting a deprecated feature of the Perl split function:

s/^/,$"@m.\n\n@m,$"@z.\n\u@_/,/s/until/99/*split@m=(@z=(++$n,bottle.$& +,of,beer),on,the,wall),/\n+/?take7one7down7and7pass7it7around:go7to7t +he7store7and7buy7some7more;print$'.$` # or s/^/,$"@m.\n\n@m,$"@z.\n\u@_/until/99/*split@m=(@z=(++$n,bottle."s"x/\ +n+/,of,beer),on,the,wall),$&?take7one7down7and7pass7it7around:go7to7t +he7store7and7buy7some7more;print$'.$`
namely that calling split in scalar context has the side-effect of setting @_. As an aside, note that side-effects are frequently very useful in golf. Though this solution only works with Perl 5.10 or earlier (this mis-feature was thankfully (for non-golfers) removed in Perl 5.12) that was ok for this game because codegolf competitions use perl 5.8.8.

This shorter raw solution didn't help very much because I was unable to effectively pour it into a "pack u" shape. Just like my earlier 154-stroke entry, I had more success by focusing on finding an algorithm to fit the required shape.

Constraints

The more constraints one imposes, the more one frees oneself of the chains that shackle the spirit... the arbitrariness of the constraint only serves to obtain precision of execution.

-- Igor Stravinsky, 1882-1971

Stravinsky's inspirational quote notwithstanding, I found the arbitrary constraint of fitting Perl code into three lines of precisely 61 characters in length, all starting with the letter "m", to be frustrating in the extreme. For example, I found many "nearly" solutions, such as:

# 1 2 3 4 5 6 #234567890123456789012345678901234567890123456789012345678901 mx;s//,$"@m.\n\n@m,$"@z.\n\u@_/until/^99.+/sm/split@m=(@z=(++$ m,bottle."s"x@-,of,beer),on,the,wall),@-?take7one7down7and.@ m.pass7it7around:go7to7the7store7and7buy7some7more;print$&.$`
which has line lengths of 62, 60, 61 when I need them to be 61, 61, 61. Aargh!?!! Other "nearly" solutions were:
# 1 2 3 4 5 6 #234567890123456789012345678901234567890123456789012345678901 mx;s//$"@m.\n\n@m,$"@z.\n\u@_,/until/^(?=99)/m/split@m=(@z=($ m+=1,bottle."s"x@+,of,beer),on,the,wall),@+?take7one7down7and.@ m.pass7it7around:go7to7the7store7and7buy7some7more;print$',$` mx;s/^/,$"@l.\n\n@l,$"@m.\n\u@_/,/s/until/^99.*/sm/split@l=(@ m=(++$n,bottle.$&,of,beer),on,the,wall),@+?take7one7down7and.@ l.pass7it7around:go7to7the7store7and7buy7some7more;print$&.$`
I was getting more and more frustrated ... and more and more annoyed at that ugly leading "mx;".

If only...

So many "nearly" solutions. If only one little thing was different, they'd work.

If only, if only ... if only me auntie had bollocks she'd be me uncle

-- David Brent, The Office Season 2, Episode 3

Eventually, unable to bear looking at that damned leading "mx;" any longer, I switched to seeking out solutions beginning with m/s/. Well, that was (and remains) the only half-way useful regex I can think of to start a solution with. After doing that, I finally found a perfectly fitting 152-stroke solution:

# 1 2 3 4 5 6 #234567890123456789012345678901234567890123456789012345678901 m/s/,@m=(@z=(++$n,bottle.$&,of,beer),on,the,wall),s/^/$"@m.\n m@m,$"@z.\n\u@j,/while@j=!s/m//?(go,to,the,store,an.d,buy,so. me,more):(take,one,down,an.d,pass,it,around),$n^99;print$'.$`
Yay! The remarkable and non-obvious tactical trick of inserting the "m" length byte into the string -- then later removing it (via s/m//), with the side-effect of usefully setting $' and $` -- I would never have found were it not for the constraints of having to start the first line with m/s/ and having to break it after exactly 61 bytes.

As shown last time, we need to generate a working 152-stroke entry via a little program, such as:

my $source = <<'PERSEVEROUS'; m/s/,@m=(@z=(++$n,bottle.$&,of,beer),on,the,wall),s/^/$"@m.\n m@m,$"@z.\n\u@j,/while@j=!s/m//?(go,to,the,store,an.d,buy,so. me,more):(take,one,down,an.d,pass,it,around),$n^99;print$'.$` PERSEVEROUS my $out = unpack 'u', uc($source); open my $fh, '>', 'b.pl' or die "error: open b.pl: $!"; binmode $fh; print $fh "eval lc pack u,'" . $out . "'";

Despite ending with the desired backtick, this solution cannot be reduced to 151 because it relies on the last line starting with the letter "m", the length byte for 61. To get to 151, we need to start the last line with the letter "l", the length byte for 60.

Reverting to the deprecated perl split semantics, I found a number of 151-stroke solutions whose last line begins with the letter "l":

# 1 2 3 4 5 6 #234567890123456789012345678901234567890123456789012345678901 m/s/,@m=(@z=(++$n,bottle.$&,of,beer),on,the,wall),s/^/$"@m.\n m@m,$"@z.\n\u@_,/until/99/*split+"l",s/m//?takeloneldownland. lpasslitlaround:goltolthelstorelandlbuylsomelmore;;print$'.$` m/s/,@m=(@z=($".++$n,bottle.$&,of,beer),on,the,wall),s/^/,@m. m@m,@z.\n\u@_/until/99/*split+"l",s/m./\n/?takeloneldownland. lpasslitlaround:goltolthelstorelandlbuylsomelmore;;print$'.$` m/s/,@m=(@z=(",",++$n,bottle.$&,of,beer),on,the,wall),s/^/@m. m@m@z.\n\u@_/until/99/*split+"l",s/m../\n/?takeloneldownland. lpasslitlaround:goltolthelstorelandlbuylsomelmore;;print$'.$`

After submitting one of these to tie for the lead I felt I could finally relax ... until the tenacious "dmd" struck back yet again, posting a 149-stroke solution!

Complexity

I hope I've been able to convey the extra level of complexity that compression adds to golf. As if the game of golf were not already hard enough. To illustrate that extra level of complexity, I quote leading Python golfer hallvabo again:

This reminds of the SHA-256 challenge on codegolf.com. Since Python's built-in compression wasn't available and my source code was over 500 bytes long, I figured I had to roll my own compression scheme to beat Mark Byers leading the Python section with 493 strokes. I started with restricting the source to 64 characters so I could use a homemade 6-bit character encoding (curiously, this only increased the source from 507 to 512 bytes! this was because I couldn't use ~, so tricks like ~- became unavailable). I then golfed the decompressor, getting it down to about 75 strokes. Finally, I recognized that this approach gives a whole set of new tricks to play with, since I could reuse variables from the decompression stage in the sha-256 stage! Of course, this requires the variables to have the correct value after the decompression stage... at this point my brain almost melted :)

That concludes this introductory series on the difficult topic of compression in golf. I hope you've enjoyed it. If you are looking for further challenges, we know those damned beer bottles can be further reduced to 149, perhaps lower. Though be warned, the complexity of this task may melt your brain. :)

References

Acknowledgement: I'd like to thank mtve and hallvabo for their help in preparing this series.

Comment on Compression in Golf: Part III
Select or Download Code
Re: Compression in Golf: Part III (Zone, Bubble or Hole?)
by Anonymous Monk on Oct 21, 2012 at 09:32 UTC

    I felt bad to see another one of your golf meditations met with no replies, so I've got a query, how do you describe this state you're in when you write these golf exposes?

    Golf Zone, Golf Bubble or Golf Hole?

    I thank you

      Hey, writing a meditation with no replies is not unlike 90% of blogs out there. :) I think merlyn once said that blogging is like making a speech to a big empty hall. :)

      As for getting no replies, I don't stress over that. I enjoy the act of writing. Plus, I've received enjoyable feedback from serious golfers (such as hallvabo, mtve, Jasper, J_-_L, emiltin, rhebus, o0lit3, murky-satyr, primo, grimy) which indicates that my exposes are read and appreciated. The people who get the most out of them are probably serious golfers who have actually played in the challenges and who are mostly not members of Perl monks. So why write these golf articles on Perl Monks? Why not start my own blog instead? Ironically, my "no replies" nodes notwithstanding, I feel I am likely to get more replies on Perl monks than I would if I wrote some obscure blog. Plus Perl Monks is a very comfortable place for me.

      As for the state I'm in when writing my golf exposes, from your three choices, I would choose "golf zone". While I find my "writing zone" to be pretty much the same mental state no matter which topic I'm writing on, I feel that my "writing zone" is a different mental state to my "programming zone", which is different to my "playing golf zone".

      When writing, I am focusing on story-telling, trying not to bore the reader. Trying not to be too dry by throwing in a few jokes, but also trying not to irritate the reader by using too much humour. That is a very different mental state to programming or golfing.

      Though I enjoy programming, golfing, and writing, I usually enjoy writing more than the other two. I'm getting too old for golf nowadays.

Re: Compression in Golf: Part III
by grizzley (Chaplain) on Oct 22, 2012 at 11:31 UTC
    It is really tempting... to use your ideas for my codegolf solutions... and finally get into position 10... ahhh... I shouldn't do it... shut up, my conscience!...

      While simply cutting and pasting my 151-stroke solution may bother your conscience, how about applying the compression techniques learned here to other golfs? I see you've played in 27 challenges, so you should be able to apply compression techniques to some of the longer ones without troubling your conscience. At least, I don't see a problem with doing that.

        It's just a little joke, nothing more :) Of course I could do that too! But I'm right now busy with www.projecteuler.net so probably won't start compressing.

      While it's true that Roman to Decimal, Saving Time, and now 99 Bottles have all been effectively post-mortemed, the game doesn't need to stop there, it just changes flavor: "Can the best solution be improved any more?" If I could find a 149 stroke solution as dmd did, or even a 150, I'd post it without thinking twice ;)

      And, as eyepopslikeamosquito pointed out, this is also a perfectly valid technique to improve your other scores. I've recently improved my 1000 Digits of Pi score from 102 -> 96, extending my lead to a full 20 strokes, and Oblong Number Spirals from 112 -> 104, breaking a long standing tie with Flagitious, which of course is always satisfying.

      Even solutions as short as 80 strokes can often be improved. Admittedly, attempting to pack u an 80 stroke solution might seem like an act of desperation, but if it gets you that ellusive 79 stroker, it's fair game. After all, this technique is available to all golfers, and I suspect that many have been using it for some time.

        I suspect that many have been using it for some time
        It would be very interesting to learn who has been using it, for how long, and how they stumbled upon it.

        I always get suspicious when someone holds a big lead in a high-scoring golf -- as mtve did at sha-256-hashing, leading 396 to 433. So I asked him about it directly. And he then (finally) remembered ... why yes he did use it in that game! You see, he'd completely forgotten about that when I contacted him several weeks earlier to help me research the Compression in golf series! Well, it was years ago. In fact, mtve knew about this technique since 2002 because he co-refereed TPR(0,6) with Ton Hospel and so was privy to the reason behind Ton's new "At least 55% of the code must consist of normal ASCII characters" rule.

        I also wonder about Ruby golfing phenomenon flagitious. After all, he led that same sha-256 golf in Ruby by 338 to 386 for years ... until "dmd" recently wrested the lead at 335 by using "pack u" compression. Now dmd is such a strong golfer that if he was using compression and flagitious was not, I would expect dmd to be at least 20 strokes ahead. Whoa, I just realised that I tied flagitious in Ruby at 173 strokes in the 99 bottles of beer game without using compression ... so if flagitious did use compression in that game, that would have to rate as my greatest Ruby golfing accomplishment. :)

Re: Compression in Golf: Part III
by Anonymous Monk on Mar 30, 2013 at 15:53 UTC

    This is my 151 stroke solution.

    My strategy was trying to compress your 160 without sacrificing strokes to pack's constraints (escaping G and T took 4):

    eval lc(j x 165^pack$<^C,'(M\xe2(\x9d\xe2\x86\x1b\x9b\x98\x81^xc\xe4\xa1\x9a\x12*y\x852b\x0f=\x88\xe6\x14I\x9e\x08\xf9\x9d,a\xa3\x99\x99eY\xf7\x8b\x04\xfa\x85\x10\xfa\x8e\x15\xd1*,C\xaah\xb6Y\xa87\xaa-\x81_\x10\xe9\xaa(I\x00\x00\xa1&\xa8\xa2d\x02W\xc4x1\xa5\x92L\xf3\x92\x00!\x97\x16\x98\x0cG\xa8\xba\xd5\x9f4Z\x9e\x16\xa7\x82>\xa6^\x15\x83\xea,C\xaa!\xf4\xeadQ\xcf\xa8qX>\xeb(')

    So, after discovering Perl string bitwise operators, I applied ToastyX's law of PHP golf to Perl.

    With the right timing x 165 can be replaced by x$$, saving two strokes.

    Regards

    dmd

      To help the casual reader appreciate the depths of depravity and deviousness of dmd's solution, consider this snippet:

      j x 165^pack$<^C,'packed-string'
      For this to work:
      1. Perl must interpret $< (aka $UID) as a string, not a number, otherwise the xor with "C" will not produce the desired "u" for pack's first parameter.
      2. The $< special variable must be greater than or equal to 1123 because 123 is the length of the compressed string. This special variable seems to take the value 65534 on the codegolf server, so the string produced by $<^C there is "u5534".
      3. dmd was undeterred by "u5534" being way out of range for pack. On the contrary, this turned out to be an opportunity because perl 5.8 has a nasty memory bug (fixed in later perl versions, as mentioned in Compression in Golf: Part I) so that going beyond "u63" essentially results in a random value for the length byte, depending on what happens to be in the memory being read beyond the end of the PL_uudmap[] array. That is, this solution works with the perl 5.8.8 running on the codegolf server but fails with later perl versions and may even fail with perl 5.8 running in a different environment.
      4. The random value dmd got in the first position as pack's length byte on the codegolf server happened to be "f" which he was able to exploit by xor'ing with "j" to produce form-feed (ord 12) as the leading character, so as not to muck up the program string to be eval'ed (as a leading "f" would have done). By the way, if you could somehow fluke a leading "s" or "@" for the length byte you may be able to shave another stroke because short solutions to the bottle golf typically start with "s" or "@".
      That is, like many winning codegolf solutions, this one hangs by a thread.

      With the right timing x 165 can be replaced by x$$, saving two strokes
      Now that's just sick. If you managed to submit your entry at just the right time when $$ took the value 165 on the codegolf server, I salute you! This reminds me of some of the antics described in the "Outsmarting the Robot Referee" section at The golf course looks great, my swing feels good, I like my chances (Part IV) where entries were submitted when the $^T time variable took just the right value on the codegolf server.

      Thanks for sharing your solution. I find it hilarious that when we were tied for the lead at 151 strokes, we both probably assumed we had similar solutions, when, in fact, they were completely different. :)

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlmeditation [id://1000179]
Approved by Athanasius
Front-paged by Athanasius
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others cooling their heels in the Monastery: (8)
As of 2014-08-27 12:38 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (238 votes), past polls