Beefy Boxes and Bandwidth Generously Provided by pair Networks
Don't ask to ask, just ask
 
PerlMonks  

Comment on

( #3333=superdoc: print w/ replies, xml ) Need Help??
Narrative of how obfun! was created.
WARNING! Spoiler ahead. If you'd rather figure it out by yourself, read no more.

First stage. The original script

It all started with a coding exercise. I saw the hangman script in the Perl Power Tools package, and I wanted to code a better version. I especially disliked the print_noose() function, which is extremely naive. So I coded the program from scratch, just to demonstrate to myself that I could do it better, and the result is this one.
#!/usr/bin/perl -w use strict; my $file = shift; my $minlen = shift || 3; my @picture = ( ' +--+', ' |', ' |', ' |', ' |', '-----+'); my @cp =( # create picture [1,2,'O'], [2,2,'|'], [2,1,'/'], [2,3,'\\'], [3,1,'/'], [3,3,'\\']); my @wcp=( # winner create picture [2,1,'\\'], [2,2,'O'], [2,3,'/'], [3,2,'|'], [4,1,'/'], [4,3,'\\']); my ($errors, $word, $guess, $msg, $A) = (0,'','','','a'); my %letters = (); my %wordletters=(); my %garbled=map{$_,$A++} split //,'ifcseujmtaqybnzdwgvokxphlr'; { local $/=undef; my $words ; if (defined $file) { open FILE, "< $file" or die "file not found $file\n"; $words = <FILE>; close FILE; } else { $words = <DATA>; } srand(); $words =~ s/\n/ /g; my @ww = grep {length($_) >= $minlen} split / /, $words; my $count = scalar @ww; $word = lc($ww[rand ($count) -1]); $word = join '',map{$garbled{$_}}split //,$word unless $file; $guess = '-' x length($word); $count =0; for (split //,$word){push @{$wordletters{$_}},$count++}; } while (1) { # loops forever. Ends if win or lost draw($errors); if ($errors > 5) { print "YOU LOST ($word)\n"; exit } print $guess,"\n"; print 'used: <', keys %letters,">\n"; print "guess: "; my $x; chomp($x = <>); $x = lc($1) if $x=~/(\w)/; if ($x !~ /^\w$/) { $msg = "INVALID INPUT $x"; } elsif ($letters{$x}) { $msg = "ALREADY PLAYED: $x"; } elsif ($wordletters{$x}) { for (@{ $wordletters{$x}} ) { substr($guess, $_,1) = $x; } unless ($guess =~ /-/) { @cp=@wcp; $msg = "YOU WON! ($guess)"; draw(6); exit; } } else { $errors++ } $letters{$x}++; } sub draw { my $level = shift; my @pic = @picture; for my $j (1..$level) { substr($pic[$cp[$j-1][0]],$cp[$j-1][1],1) = $cp[$j-1][2]; } system ('clear') if $^O =~ /linux/; # add the following if you like it # system ('cls') if lc($^O) =~ /win/; print join $/,@pic,$/; if ($msg) { print "$msg$/"; $msg =""; } } # DATA contains the same words as in obfun.pl # from the 22nd line onwards __DATA__
This script can either take a word from its DATA (scrambled with a simple letter substitution), or from an external file. In addition to the original version features, it can print a different figure when you win.
So I said to myself "let's publish it to perlmonks.".
However, after a quick search, I found out that the Monastery has already its hangman code, plenty of them, actually, included golf ones.
What next? Well, let's obfuscate it!

Second stage. Shrinking

Therefore, I took my beautifully optimized code and started to shrink it, removing all unnecessary spaces, and encrypting the messages (A simple letter shifting).
@pc=qw(2+--+ 5| 5| 5| 5| -----+); s/(\d)/" " x $1/e for@pc; (@cp,@wcp)=((),()); $_="12O22|21/23\\31/33\\"; push@cp,[$1,$2,$3]while /(.)(.)(.)/g; $_="21\\22O23/32|41/43\\"; push@wcp,[$1,$2,$3]while /(.)(.)(.)/g; ($er,$wd,$gs,$msg,$A,%ls,%wl)=(0,"","","","a"); @xmg=("ZPV MPTU","vtfe:","hvftt: ","JOWBMJE JOQVU ", "BMSFBEZ QMBZFE:","ZPV XPO!"); %grbl=map{$_,$A++}split //,"ifcseujmtaqybnzdwgvokxphlr"; { local$/=undef; $_=<DATA> } srand(); s/\n/ /g; @ww=split / /,$_; $c=@ww; $wd=lc($ww[rand($c)-1]); $wd=join"",map{$grbl{$_}}split //,$wd; $gs="-" x length($wd); $c=0; push@{$wl{$_}},$c++for split //,$wd; @mg=map{tr/[b-z]/[a-y]/;tr/[B-Z]/[A-Y]/;$_}@xmg; while(1){ dw($er); if($er>5){print"$mg[0] ($wd)\n";exit} print$gs,"\n$mg[1] <",keys %ls,">\n$mg[2]"; chomp($x=<>); $x=lc($1)if$x=~/(\w)/; if($x!~/^\w$/){$msg="$mg[3] $x"} elsif($ls{$x}){$msg="$mg[4] $x"} elsif($wl{$x}) {for(@{$wl{$x}}){substr($gs,$_,1)=$x} unless($gs=~ /-/){@cp=@wcp;$msg="$mg[5] ($gs)"; dw(6);exit;}} else{$er++}$ls{$x}++} sub dw{ @p=@pc; for(1..$_[0]) {substr($p[$cp[$_-1][0]],$cp[$_-1][1],1)=$cp[$_-1][2]} system("clear")if $^O=~/linux/; print join $/,@p,$/; if($msg){print"$msg$/";$msg=""}} __DATA__

Third stage. Garbling

Even though this code is less clear than the previous one, it is easy to find out what it's doing.
So I wrote a quick garbling script, which creates a rather less clear version.
#!/usr/bin/perl -w use strict; my $count=0; my $text=''; while (<>) { last if /^__DATA__/; chomp; $text .= $_; } $text =~ s/\n/ /g; my @sorted = sort {$a->[1] cmp $b->[1]} map{[$count++,$_]} $text =~ /(.{20})/gs; my $index = join "", map {sprintf"%.2x",$_->[0]} @sorted; print substr($index,0,56),"\n"; print substr($index,56),"\n"; $count =0; for (@sorted) { print $_->[1]; if ($count++ > 1) { $count = 0; print "\n"; } } print "\n";
This scripts takes as its input (through STDIN) the previous one, with slightly less newlines than the one you see here (the final length must be divided by 20), and splits the code into chunks of 20 characters. Those chunks are stored into an array, where each element is an anonymous array of two items: the first one is one number, the same as the current substrcipt, and the second one is the code text .
By sorting this array alphabetically, the code is completely garbled. The keys to restore it are the first items, whose values are stored as a hex string (for compactness).
The result is what you can see in the first 21 lines of DATA in obfun.pl.

Reading and executing the garbled code

Now my task is to get the garbled code, restoring it to its normal status and executing it without giving too many clues of what I was doing.
A non-obfuscated version of what needs to be done is:
while (<DATA>){ chop; if($.< 3) { # read the index (first 2 lines) $ndx .= $_; } else { # or the code text $text .= $_; } last if $. > 20; # now DATA contains only the words # that we need for the game } # rebuilds the index @index = map {hex $_} $ndx =~ /(..)/g; # gets the text, in chunks of 20 chars @code = $text =~ /(.{20})/g; $count=0; @original = map {$$_[1]} # gets the second element of the # sorted array sort{$$a[0] <=> $$b[0]} # sorts by the index map{[$index[$count++],$_]} # creates the array @code; # from the garbled text $_ = join '', @original; # now $_ has the working code eval;
There is too much information here. Something must be made less obvious to read.

The final script tries to hide all numerical literals in the code, taking advantage of the default value of some Perl globals. The $= variable's default value is 60. From that one, thanks to the help of some other dark globals and the scalar of qw(Nothing to hide), I got the other numbers that I needed to perform the rest of the task. The eval is hidden into a bogus substitution, which is also masqueraded as a sort of subscript.
$c=$=;$=/=$-=@_=qw(Nothing to hide);($m=$-)++;$c+=$-*$m; while (<DATA>){ s#(sub)$#\1 #; chop && $. < $- ? $x : $t .= $_; $. > $= && last; } $!=$%;$-=$=;($%=$=-=$=)++; $.=~s[2][join $\, map {$$_[$%]} sort{$$a[$!] <=> $$b[$|]} map{[(map{unpack("c*",pack((pack"c",$c)."*",$_))} $x =~ /(..)/g)[$=++],$_]} $t =~ /(.{$-})/g]seeme; __DATA__ 0f1623020c2127042e12071f19202f290a1510333236353109081a25 1d000d24050611171c1b0314300b2637132a2d2b2c28340e22181e01 ","BMSFBEZ QMBZFE:" / /,$_;$c=@ww;$wd=l <",keys %ls,">\n$mg " " x $1/e for@pc;(@"","","a");@xmg=("ZP"$mg[0] ($wd)\n";exi "$mg[3] $x"}elsif($l"12O22|21/23\\31/33\";dw(6);exit;}}else{ "ifcseujmtaqybnzdwgv$_="21\\22O23/32|41/$_}@xmg;while(1){dw( $_}}split //,$wd;$gs$er);if($er>5){print$er++}$ls{$x}++}sub $x"}elsif($wl{$x}){f)/g;($er,$wd,$gs,$ms);s/\n/ /g;@ww=split ,"ZPV XPO!");%grbl=m,1)=$cp[$_-1][2]}sys-1][0]],$cp[$_-1][1] /,@p,$/;if($msg){pri/linux/;print join $0]){substr($p[$cp[$_ 2,$3]while /(.)(.)(.43\\";push@wcp,[$1,$="-" x length($wd);$ =lc($1)if$x=~/(\w)/;@mg=map{tr/[b-z]/[a-@pc=qw(2+--+ 5| 5| 5 V MPTU","vtfe:","hvf[2]";chomp($x=<>);$x\";push@cp,[$1,$2,$3 ]while /(.)(.)(.)/g;ap{$_,$A++}split //,c($ww[rand($c)-1]);$ c++for split //,$wd;c=0;push@{$wl{$_}},$cp,@wcp)=((),());$_= def;$_=<DATA>}srand(dw{@p=@pc;for(1..$_[g,$A,%ls,%wl)=(0,"", if($x!~/^\w$/){$msg=nt"$msg$/";$msg=""}}okxphlr";{local$/=un or(@{$wl{$x}}){substp;$msg="$mg[5] ($gs)r($gs,$_,1)=$x}unles s($gs=~ /-/){@cp=@wcs{$x}){$msg="$mg[4] tem("clear")if $^O=~ tt: ","JOWBMJE JOQVUt}print$gs,"\n$mg[1]wd=join"",map{$grbl{ y]/;tr/[B-Z]/[A-Y]/;| 5| -----+);s/(\d)/
The published code looks simple at first glance. There is a DATA section, and the code is reading from DATA, after all. So the reader feels like the code is under control. But nothing is revealed unless he/she manages to figure out what the heck is going on among those stream of join, map, sort, map, map, unpack, pack, pack and two regexes.
Just one moment! What about that pack/unpack business?
Without going into too much details, I give you a hint. The following three lines are equivalent.
# map{unpack("c*",pack((pack"c",$c)."*",$_))} # map {hex $_} # map {unpack("c", pack(____, $_))} # fill in the blanks!
The ending "seeme" could have been a simple "ee", but I wanted it to kind of match with the initial "Nothing to hide".

Then, mission accomplished. Code obfuscated. Have fun!

update Want more of this same stuff? Try Structured obfuscation

 _  _ _  _  
(_|| | |(_|><
 _|   

In reply to Yet another structured obfuscation by gmax

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • Outside of code tags, you may need to use entities for some characters:
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others exploiting the Monastery: (16)
    As of 2014-10-24 18:33 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?

      For retirement, I am banking on:










      Results (134 votes), past polls