Here's a vivisection of this japh, for those who have asked. It works by decoding Perl (or any ASCII text) from one of the two interwoven DNA strands, with every four nucleotides representing one ASCII character (real DNA uses three nucleotides per codon, but a full byte needs four, since 4^4 == 256). Because DNA is "mirrored" on each strand (T's pair with A's and G's with C's), the second strand carries no extra information, so we only read one of the two. First, we'll rewrite the code in a more readable format:
# "$ _" is really "$_", and change the qq to a double-quote
$_ =
"CG
T--A
A---T
A----T
C----G
T----A
A---T
and
so
on
";
@_{A => C => G => T => } = 0..3;
s|.*(\w).*(\w).*\n|$_{$-++ / 9 % 2 ? $2 : $1}|gex;
s|(.)(.)(.)(.)|chr (64*$1 + 16*$2 + 4*$3 + $4)|gex;
eval
Next, we'll make sense of the following line of code:
@_{A => C => G => T => } = 0..3;
# is really...
@_{'A', 'C', 'G', 'T'} = 0..3;
This is hash slice notation that sets the A, C, G, and T keys of the %_ hash to the numeric values 0, 1, 2, and 3 respectively. The => digraph operator quotes the bareword on its left, which lets us build the list of keys without any quote characters; that's also why there's a trailing => after the T.
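If you want to convince yourself that the two spellings really are the same, here's a small sketch. The hash names %digit_for and %plain are mine, not part of the japh, which uses %_ instead.

```perl
use strict;
use warnings;

# "=>" quotes the bareword on its left, so every letter needs one after
# it, including the trailing one after T.
my %digit_for;
@digit_for{ A => C => G => T => } = 0 .. 3;

# The same slice written with an explicit list of strings.
my %plain;
@plain{ 'A', 'C', 'G', 'T' } = 0 .. 3;

print join( ' ', map { "$_=$digit_for{$_}" } sort keys %digit_for ), "\n";
# prints: A=0 C=1 G=2 T=3
```

Note that this passes under strict, since none of the letters is ever seen as an unquoted bareword.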
The next line of code transforms the chromosome into a series of Base4 digits, by substituting the appropriate digit for each line:
s|
    .*      # greedily match
    (\w)    # match first letter, and store into $1
    .*      # greedily match
    (\w)    # match last letter, and store into $2
    .*\n    # eat up remainder of line
|
    # this expression maps the relevant character to its Base4 digit
    # from the %_ hash. The $- is used as a line counter (it defaults
    # to 0). When the DNA strands flip positions, this continues
    # decoding on the correct strand (see physi's comment for a visual
    # representation of this)
    $_{$-++ / 9 % 2 ? $2 : $1}
|gex;
After this substitution, $_ is one long string of Base4 digits, beginning 13001302122112321310... (those are the digits for "print"). All that's left to do is to transform each sequence of four Base4 digits into its ASCII character:
s|
    # store next four characters into $1,$2,$3, and $4
    (.)(.)(.)(.)
|
    # replace with a Base4-to-ASCII conversion of those characters
    chr (64*$1 + 16*$2 + 4*$3 + $4)
|gex;
After this, $_ contains our decoded code: print"Just Another Perl Hacker\n". The string is at last eval'd, and japhage occurs.
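To watch both steps work on something small, here's a rough sketch that decodes just the first four lines of the helix. It's a restatement, not the japh's own code: decode_helix is a hypothetical helper, it uses a lexical $line_no where the japh uses $-, it spells out the int() that the japh gets for free from %, and it uses a simpler pattern to pick out the two letters.

```perl
use strict;
use warnings;

my %digit_for = ( A => 0, C => 1, G => 2, T => 3 );

# Hypothetical helper: turn helix text back into the string it encodes.
sub decode_helix {
    my ($helix) = @_;
    my $line_no = 0;     # plays the role of $- in the japh
    my $digits  = '';
    for my $line ( split /\n/, $helix ) {
        # Pick out the first and last letter; skip lines without two.
        my ( $left, $right ) = $line =~ /(\w).*(\w)/ or next;
        # The strands swap sides every 9 lines.
        my $nuc = int( $line_no++ / 9 ) % 2 ? $right : $left;
        $digits .= $digit_for{$nuc};
    }
    # Every four Base4 digits make one character code.
    $digits =~ s/(.)(.)(.)(.)/chr( 64*$1 + 16*$2 + 4*$3 + $4 )/ge;
    return $digits;
}

my $helix = join "\n", 'CG', 'T--A', 'A---T', 'A----T';
print decode_helix($helix), "\n";
# C,T,A,A -> 1,3,0,0 -> 64*1 + 16*3 = 112 -> prints "p"
```

So the first four rungs of the helix are the "p" of "print".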
And, in case anyone's interested, here's the corresponding Perl-to-DNA encoder:
use strict;
my $BASE = 4;
my %NUC_PAIRS = (
    A => T =>
    C => G =>
    G => C =>
    T => A =>
);
my @DIGIT_TO_NUC = qw( A C G T );
# 9 lines per half-turn, matching the decoder's "$-++ / 9 % 2" strand flip
my $FMT_DNA = <<END;
01
0--1
0---1
0----1
0----1
0----1
0---1
0--1
01
10
1--0
1---0
1----0
1----0
1----0
1---0
1--0
10
END
my @FMT_DNA = split "\n", $FMT_DNA;
my $str = 'print"Just Another Perl Hacker\n"';
my @str_digits;
# convert each character into four Base4 digits, most significant first
for (split //, $str) {
    my $ord = ord($_);
    my @digits = (0) x 4;
    print "$ord:\t";
    my $i = 0;
    while ($ord) {
        $digits[4 - ++$i] = $ord % $BASE;
        $ord = int ($ord / $BASE);
    }
    print "@digits\n";
    push @str_digits, [@digits];
}
# print one helix line per digit, cycling through the line templates
my $i = 0;
for (@str_digits) {
    for (@$_) {
        my $fmt = $FMT_DNA[$i++ % @FMT_DNA];
        my $nuc0 = $DIGIT_TO_NUC[$_];
        my $nuc1 = $NUC_PAIRS{$nuc0};
        $fmt =~ s/0/$nuc0/;
        $fmt =~ s/1/$nuc1/;
        print "$fmt\n";
    }
}
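If you'd like to check that the encoder and decoder agree, here's a round-trip sketch. The sub names encode_helix and decode_helix are made up for this example, and the line templates are an assumption on my part: nine per half-turn, so that they line up with the decoder's "$-++ / 9 % 2" test. The decoder is the same pair of substitutions as in the japh, with a lexical counter standing in for $-.

```perl
use strict;
use warnings;

my @NUC  = qw( A C G T );
my %PAIR = ( A => 'T', C => 'G', G => 'C', T => 'A' );
my %DIGIT;
@DIGIT{@NUC} = 0 .. 3;

# Assumed templates: 0 marks the coding strand, 1 its complement.
# The second half-turn is the first one mirrored.
my @half = qw( 01 0--1 0---1 0----1 0----1 0----1 0---1 0--1 01 );
my @FMT  = ( @half, map { scalar reverse $_ } @half );

# Hypothetical wrapper around the encoder's two loops.
sub encode_helix {
    my ($text) = @_;
    my $i      = 0;
    my $helix  = '';
    for my $char ( split //, $text ) {
        my $ord = ord $char;
        # Four Base4 digits, most significant first.
        for my $shift ( 6, 4, 2, 0 ) {
            my $nuc  = $NUC[ ( $ord >> $shift ) & 3 ];
            my $pair = $PAIR{$nuc};
            my $line = $FMT[ $i++ % @FMT ];
            $line =~ s/0/$nuc/;
            $line =~ s/1/$pair/;
            $helix .= "$line\n";
        }
    }
    return $helix;
}

# The japh's two substitutions, with $n in place of $-.
sub decode_helix {
    my ($helix) = @_;
    my $n = 0;
    $helix =~ s|.*(\w).*(\w).*\n|$DIGIT{ int( $n++ / 9 ) % 2 ? $2 : $1 }|ge;
    $helix =~ s|(.)(.)(.)(.)|chr( 64*$1 + 16*$2 + 4*$3 + $4 )|ge;
    return $helix;
}

my $src  = 'print"Just Another Perl Hacker\n"';
my $back = decode_helix( encode_helix($src) );
print( ( $back eq $src ) ? "round trip ok\n" : "mismatch\n" );
```

With these templates, the first seven lines out of encode_helix are CG, T--A, A---T, A----T, C----G, T----A, A---T, which is the same opening as the helix at the top of this writeup.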
MeowChow
s aamecha.s a..a\u$&owag.print