Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??

Continuing our 99 bottles of beer reduction from last time, I decided at this point to step back and reconsider some basic assumptions. In particular, though:

eval lc pack u,'source-string'
has served us well so far, can we do better?

Especially relevant to 99 bottles of beer are these two ideas:

  • Since we need to loop 99 times, why not exploit the "eval we have to do anyway" for that? That is, string multiply the eval string like so: eval lc pack(u,'source-string')x99.
  • In solutions that use v for space, we translate via y/v/\40/. If this translation were done outside the eval, it could be shortened by two strokes to y/v/ /; one stroke could similarly be saved when translating to a hard newline. Though doing that outside the eval loses our 3/4 "pack u" compression, the saving in translation makes it worth considering at least.

Multiplying the eval string by 99

All the beer bottle algorithms seen so far loop from zero up to 99, building a single large string inside the loop -- and printing it in one go at the end. This (unnatural) algorithm has proven to produce the shortest solutions because, by starting at zero rather than 99, we exploit perl's default (undef) variable initialization, and so avoid costly explicit initializations, such as a prohibitive six stroke penalty for a leading $n=99;. Moreover, we can shorten the plural inflection problem from, say, bottle."s"x!!$n to bottle.$&.

Sadly though, such an algorithm is not well-suited to:

eval lc pack(u,'source-string')x99
because it requires a terminating action outside the loop, namely to print the string built inside the loop. If you could find a short bottle golf algorithm that just did the same thing 99 times without requiring anything at the beginning or end, that could be a winner.

Unable to find such an algorithm, I resorted to printing the value returned by the eval like so:

print eval lc pack(u,'source-string')x99
Now, since eval returns the value of the last expression, we just have to make sure that its last evaluated expression returns a complete 99 bottles of beer string; for example, this 167-stroker:
@j=/s/?(take,one,down,an.d,pass,it,around):(go,to,the,store,an.d,buy,s +ome,more);@m=(@z=(++$n,bottle.$&,of,beer),on,the,wall);s/^/,$"@m.\n\n +@m,$"@z.\n\u@j/;/\n+/;$'.$`;
which can be tested like so:
print eval q!@j=/s/?(take,one,down,an.d,pass,it,around):(go,to,the,sto +re,an.d,buy,some,more);@m=(@z=(++$n,bottle.$&,of,beer),on,the,wall);s +/^/,$"@m.\n\n@m,$"@z.\n\u@j/;/\n+/;$'.$`;!x99
If you compare this code to our original algorithm, you will notice that we are spared an until loop, and the associated test for 99, because we know we are executed 99 times via the eval. That saves us 179 - 167 = 12 characters or so in the source string. Yet those twelve source characters are reduced to nine in the output string, courtesy of pack's 3/4 compression. And, as you can see below:
eval lc pack u,'' [17] print eval lc pack(u,'')x99 [27] #23456789012345678901234567
this new way of employing pack is ten strokes longer. That is, our nine stroke saving costs ten.

As proof of concept, running this program generates a working 157 stroke solution:

my $source = <<'PERSEVEROUS'; my@j=/s/?(take,one,down,$m,pass,it,around):(go,to,the,store,$ m=an.d,buy,some,more);@m=(@z=(++$n,bottle.$&,of,beer),on,the, gall^v16);s/^/,$"@m.\n\n@m,$"@z.\n\u@j/;/\n+/;;$'.$`; PERSEVEROUS my $out = unpack 'u', uc($source); open my $fh, '>', 'b.pl' or die "error: open b.pl: $!"; binmode $fh; print $fh "print eval lc pack(u,q&" . $out . "&)x99";
Note that we (unluckily) lost a stroke because a single quote was generated (near "(g" above), necessitating packaging the string inside q&..&, rather than '...'. That is, this solution is potentially a 156 stroker. So four strokes need to be whittled from the source string to get down to 153 and another four to get to 150. Since this looked unlikely, I reluctantly gave up on this interesting approach.

Translating outside the eval

The following 152 stroke source string:

@m=(@z=(++$n,bottle.$&,of,beer),on,the,wall);s/^/s@m,v@z._/;s/s/\utake +vonevdownvandvpassvitvaround,v@m.__/;"$'\ugovtovthevstorevandvbuyvsom +evmore,v@m.";
which can be tested via:
my $prog = <<'PERSEVEROUS'; @m=(@z=(++$n,bottle.$&,of,beer),on,the,wall);s/^/s@m,v@z._/;s/s/\utake +vonevdownvandvpassvitvaround,v@m.__/;"$'\ugovtovthevstorevandvbuyvsom +evmore,v@m."; PERSEVEROUS s//lc $prog x99/ee;y/_v/ /;print
saves a whopping 179 - 152 = 27 characters in the source string. Yet those 27 source characters are reduced to about 20 in the output string, after pack's 3/4 compression. And the cost of this form is a further 10 strokes, as indicated below:
eval lc pack u,'' [17] print eval lc pack(u,'')x99 [27] s//lc pack(u,'')x99/ee;y/v_/ N/;print [37] (N represents newline) #234567890123456789012345678901234567
That is, our 20 stroke saving comes at a cost of ... 20 strokes! So I suppose this approach has a chance. However, I found it very difficult to pour this program into any "pack u" shape. So I reluctantly gave up on this interesting approach too.

More radical still is this 148 stroker:

@m=(@z=(++$n,bottle.$&,of,beer),on,the,wall);s/^/s@m,`@z._/;s/s/]ake`o +ne`down`and`pass`it`around,`@m.__/;"$'^o`to`the`store`and`buy`some`mo +re,`@m.";
which can be tested via:
my $prog = <<'PERSEVEROUS'; @m=(@z=(++$n,bottle.$&,of,beer),on,the,wall);s/^/s@m,`@z._/;s/s/]ake`o +ne`down`and`pass`it`around,`@m.__/;"$'^o`to`the`store`and`buy`some`mo +re,`@m."; PERSEVEROUS s//lc $prog x99/ee;y/]-`/TG /;print
This time 179 - 148 = 31 characters are saved in the source string. And those 31 source characters are reduced to about 23 in the output string, after pack's 3/4 compression. Note that the cost of this form is about 23 strokes, as indicated below:
eval lc pack u,'' [17] s//lc pack(u,'')x99/ee;y/v_/ N/;print [37] s//lc pack(u,'')x99/ee;y/]-`/TGN /;print [40] #234567890123456789012345678901234567890
Once again though, I found pouring this program into any "pack u" shape problematic.

Back to the main game

After that interesting, if unsuccessful, diversion, I reverted back to the main game, namely:

eval lc pack u,'source-string'

I felt that the best chance of getting from 154 to 151 was to change "pack u54" to "pack u", a saving of two strokes, combined with a one-stroke saving via the tactical trick of finding an algorithm ending in $`, thus exploiting pack's use of backtick as the NULL byte. With that approach, I don't need to find a significantly shorter solution, just one that fits the default "pack u" shape like a glove.

Fun with split

I was able to reduce my shortest unformatted solution from 179 strokes to 176 by exploiting a deprecated feature of the Perl split function:

s/^/,$"@m.\n\n@m,$"@z.\n\u@_/,/s/until/99/*split@m=(@z=(++$n,bottle.$& +,of,beer),on,the,wall),/\n+/?take7one7down7and7pass7it7around:go7to7t +he7store7and7buy7some7more;print$'.$` # or s/^/,$"@m.\n\n@m,$"@z.\n\u@_/until/99/*split@m=(@z=(++$n,bottle."s"x/\ +n+/,of,beer),on,the,wall),$&?take7one7down7and7pass7it7around:go7to7t +he7store7and7buy7some7more;print$'.$`
namely that calling split in scalar context has the side-effect of setting @_. As an aside, note that side-effects are frequently very useful in golf. Though this solution only works with Perl 5.10 or earlier (this mis-feature was thankfully (for non-golfers) removed in Perl 5.12) that was ok for this game because codegolf competitions use perl 5.8.8.

This shorter raw solution didn't help very much because I was unable to effectively pour it into a "pack u" shape. Just like my earlier 154-stroke entry, I had more success by focusing on finding an algorithm to fit the required shape.

Constraints

The more constraints one imposes, the more one frees oneself of the chains that shackle the spirit... the arbitrariness of the constraint only serves to obtain precision of execution.

-- Igor Stravinsky, 1882-1971

Stravinsky's inspirational quote notwithstanding, I found the arbitrary constraint of fitting Perl code into three lines of precisely 61 characters in length, all starting with the letter "m", to be frustrating in the extreme. For example, I found many "nearly" solutions, such as:

# 1 2 3 4 5 6 #234567890123456789012345678901234567890123456789012345678901 mx;s//,$"@m.\n\n@m,$"@z.\n\u@_/until/^99.+/sm/split@m=(@z=(++$ m,bottle."s"x@-,of,beer),on,the,wall),@-?take7one7down7and.@ m.pass7it7around:go7to7the7store7and7buy7some7more;print$&.$`
which has line lengths of 62, 60, 61 when I need them to be 61, 61, 61. Aargh!?!! Other "nearly" solutions were:
# 1 2 3 4 5 6 #234567890123456789012345678901234567890123456789012345678901 mx;s//$"@m.\n\n@m,$"@z.\n\u@_,/until/^(?=99)/m/split@m=(@z=($ m+=1,bottle."s"x@+,of,beer),on,the,wall),@+?take7one7down7and.@ m.pass7it7around:go7to7the7store7and7buy7some7more;print$',$` mx;s/^/,$"@l.\n\n@l,$"@m.\n\u@_/,/s/until/^99.*/sm/split@l=(@ m=(++$n,bottle.$&,of,beer),on,the,wall),@+?take7one7down7and.@ l.pass7it7around:go7to7the7store7and7buy7some7more;print$&.$`
I was getting more and more frustrated ... and more and more annoyed at that ugly leading "mx;".

If only...

So many "nearly" solutions. If only one little thing was different, they'd work.

If only, if only ... if only me auntie had bollocks she'd be me uncle

-- David Brent, The Office Season 2, Episode 3

Eventually, unable to bear looking at that damned leading "mx;" any longer, I switched to seeking out solutions beginning with m/s/. Well, that was (and remains) the only half-way useful regex I can think of to start a solution with. After doing that, I finally found a perfectly fitting 152-stroke solution:

# 1 2 3 4 5 6 #234567890123456789012345678901234567890123456789012345678901 m/s/,@m=(@z=(++$n,bottle.$&,of,beer),on,the,wall),s/^/$"@m.\n m@m,$"@z.\n\u@j,/while@j=!s/m//?(go,to,the,store,an.d,buy,so. me,more):(take,one,down,an.d,pass,it,around),$n^99;print$'.$`
Yay! The remarkable and non-obvious tactical trick of inserting the "m" length byte into the string -- then later removing it (via s/m//), with the side-effect of usefully setting $' and $` -- I would never have found were it not for the constraints of having to start the first line with m/s/ and having to break it after exactly 61 bytes.

As shown last time, we need to generate a working 152-stroke entry via a little program, such as:

my $source = <<'PERSEVEROUS'; m/s/,@m=(@z=(++$n,bottle.$&,of,beer),on,the,wall),s/^/$"@m.\n m@m,$"@z.\n\u@j,/while@j=!s/m//?(go,to,the,store,an.d,buy,so. me,more):(take,one,down,an.d,pass,it,around),$n^99;print$'.$` PERSEVEROUS my $out = unpack 'u', uc($source); open my $fh, '>', 'b.pl' or die "error: open b.pl: $!"; binmode $fh; print $fh "eval lc pack u,'" . $out . "'";

Despite ending with the desired backtick, this solution cannot be reduced to 151 because it relies on the last line starting with the letter "m", the length byte for 61. To get to 151, we need to start the last line with the letter "l", the length byte for 60.

Reverting to the deprecated perl split semantics, I found a number of 151-stroke solutions whose last line begins with the letter "l":

# 1 2 3 4 5 6 #234567890123456789012345678901234567890123456789012345678901 m/s/,@m=(@z=(++$n,bottle.$&,of,beer),on,the,wall),s/^/$"@m.\n m@m,$"@z.\n\u@_,/until/99/*split+"l",s/m//?takeloneldownland. lpasslitlaround:goltolthelstorelandlbuylsomelmore;;print$'.$` m/s/,@m=(@z=($".++$n,bottle.$&,of,beer),on,the,wall),s/^/,@m. m@m,@z.\n\u@_/until/99/*split+"l",s/m./\n/?takeloneldownland. lpasslitlaround:goltolthelstorelandlbuylsomelmore;;print$'.$` m/s/,@m=(@z=(",",++$n,bottle.$&,of,beer),on,the,wall),s/^/@m. m@m@z.\n\u@_/until/99/*split+"l",s/m../\n/?takeloneldownland. lpasslitlaround:goltolthelstorelandlbuylsomelmore;;print$'.$`

After submitting one of these to tie for the lead I felt I could finally relax ... until the tenacious "dmd" struck back yet again, posting a 149-stroke solution!

Complexity

I hope I've been able to convey the extra level of complexity that compression adds to golf. As if the game of golf were not already hard enough. To illustrate that extra level of complexity, I quote leading Python golfer hallvabo again:

This reminds of the SHA-256 challenge on codegolf.com. Since Python's built-in compression wasn't available and my source code was over 500 bytes long, I figured I had to roll my own compression scheme to beat Mark Byers leading the Python section with 493 strokes. I started with restricting the source to 64 characters so I could use a homemade 6-bit character encoding (curiously, this only increased the source from 507 to 512 bytes! this was because I couldn't use ~, so tricks like ~- became unavailable). I then golfed the decompressor, getting it down to about 75 strokes. Finally, I recognized that this approach gives a whole set of new tricks to play with, since I could reuse variables from the decompression stage in the sha-256 stage! Of course, this requires the variables to have the correct value after the decompression stage... at this point my brain almost melted :)

That concludes this introductory series on the difficult topic of compression in golf. I hope you've enjoyed it. If you are looking for further challenges, we know those damned beer bottles can be further reduced to 149, perhaps lower. Though be warned, the complexity of this task may melt your brain. :)

References

Acknowledgement: I'd like to thank mtve and hallvabo for their help in preparing this series.


In reply to Compression in Golf: Part III by eyepopslikeamosquito

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (5)
As of 2024-03-28 23:54 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found