http://www.perlmonks.org?node_id=835038

(Disclaimer) I wrote this more as a journal entry as I was learning about recursive theory and using quines as a starting point, then realized it might be beneficial for anyone who's interested in learning how to start thinking about them and all the interesting stuff they entail. I'm pretty sure this belongs in Meditations, but if not I apologize.

Quines are programs that print out copies of themselves, character for character. Quines are made of 2 essential parts, encoding (data) and actual code:

  1. (data) An encoding of the actual code is made.
  2. (code) The executable part of the encoding is then executed as code using it as input (the code is a subset of the encoding).

In other words, the data contains the "encoding" of the executable code, which is then "fed into" the actual code. The quine combines the two parts to make identical the program itself and its printed output.

The "encoding" is of course just a string. If we assign variables it must show up in the string (we need not to - we can use literal or inline strings). Where we print that also shows up. The string says "this is what I'm about to do to myself" and then the code goes and does it. Since a quine's job is simply to print itself, this reduces to "this is what I'm about to print, using all or part of this exact string."

Here's an example in English:

Print out two copies of the following, the second one in quotes: "Print out two copies of the following, the second one in quotes."

Notice the statement itself is the code, and the encoding is the statement wrapped in quotes. Both the code and the encoding are one and the same, but that need not be the case (English is flexible - but hey, so is Perl!)

What follows is a straightforward derivation of a quine in Perl. It took many attempts and lots of dead-ends, but I hope I can translate the underlying ideas simply and clearly. I had a lot of help in the thought process behind constructing one here. I now see the identical quine and a few other similar ones here (maybe click that one later).

Remember: encoding + code; in this example I'll make an encoding, then print it (the order of which is unimportant - thanks rubasov). I start by creating a variable that stores the encoding, realizing that the encoding must also start with an assignment if the eventual output is to match:

$s = '$s = ...

At this point you can fall off a cliff by pursuing the temptation to keep recursing - so, don't! Instead just mark a placeholder for the current string that you can fill in later using the actual code (hint hint):

$s = '$s = X';

I closed off the string only to free myself into now thinking how the code should look, which I can then go back and paste into the encoding afterward.

So, how should the actual code look? The trick is deciding how you want Perl to fill in the X. Again, this example being straightforward I use printf, which by its very design does exactly what we want. Since we'll fill in the encoded string with the printf function during execution, we can use %s as the placeholder for the definition of $s:

$s = '$s = %s'; printf($s,...)

Whoops, forgot to paste in printf in the encoding:

$s = '$s = %s; printf($s,...)'; printf($s,...)

Yes, things are starting to flesh out now! Now, what should we put in %s? Why the encoded string of course:

$s = '$s = %s; printf($s,$s)'; printf($s,$s)

This is a basic sketch of any quine you'll come across. In fact, if you run it you'll see we're almost done:

$s = $s = %s; printf($s,$s); printf($s,$s)   # Output

All we're missing is the quotation marks! Again, this conundrum is solved by using placeholders:

$s = '$s = %c%s%c; printf($s,39,$s,39)'; printf($s,39,$s,39)

(Note that if you try to use a pair of literal \' you'll have to keep recursively accounting for the accumulating backslashes in the encoding - until you printf-format them - so you might as well keep it simple and format the darn quotation marks.)

And that's it, we have a quine! The key to all this (for me at least) was realizing that we actually make an encoding inside the encoding, which gets filled in during execution. In other words, the code knows how to translate the encoding into the program itself. Also, adding code is easy to accommodate in the encoding by merely pasting it in, so long as we don't fall off the "recursive cliff."

Here's a modified version of the same quine that compacts the encoding and statement of execution:

printf($s='printf($s=%c%s%c,39,$s,39)',39,$s,39)

Here's an adaptation of an elegant and popular example that "abuses" Perl's flexibility a bit more:

say << '' x 2 say << '' x 2

This looks very much like the English example, and in fact works almost exactly the same way. It uses the magic of Perl's << here-doc operator, which allows the statement of execution (top) to appear inline with the multiline encoding (bottom). As a side benefit no substitution is necessary and no quotes are needed, and we end up with the elegant code as presented.

Links

  • http://www.perlmonks.org/?node_id=661934
  • http://www.perlmonks.org/?node_id=62389
  • http://www.madore.org/~david/computers/quine.html
  • http://www.nyx.net/~gthompso/quine.htm

Corrections and comments are, of course, welcome.