Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer

Structured obfuscation

by gmax (Abbot)
on Jan 22, 2002 at 13:38 UTC ( #140626=obfuscated: print w/replies, xml ) Need Help??

Dear fellow Monks,
I would like to tell the inverse story of a recent obfuscation (Perl Monk's dream), starting from the idea of what to print and following the step-by-step process of garbling a honest (well, almost) piece of code.
I have always been fascinated by obfuscation, even though I know that I don't have a natural knack for it. When writing code, I almost automatically think in terms of either structured or OO programming. Therefore, thinking in obfuscated manner is a sort of act against nature for me. However, I wanted to produce an obfuscation, and I decided to do it in a scientific way, following a logic path to disrupt those rules that I try to observe on a daily basis.
This process is not something that I did in one day, but an idle pastime that I was enjoying from time to time between sessions of more demanding work.

There are two ways of disliking art. One is to dislike it. The other is to like it rationally.
-- Oscar Wilde

Warning! This is a (self-)SPOILER.
If you'd rather work out the details on your own, read no more.

We start with the shape we want to print out.
ss{}{}{}sss{}{}ssss{}{}{}sss{}ssss{} ssss{}ssss{}ss{}sss{}sss{}ss{}ssss{} ssss{}sss{}{}{}{}ss{}{}{}sss{}{}{}{} {}ss{}sss{}ssss{}ss{}sssssss{}ssss{} {}{}{}sss{}ssss{}ss{}sssssss{}ssss{}
Not difficult to transform it into a pattern.
Each space counts for 1, and so will each group of "{}".
Always starting with a space, the pattern looks like this one:
@pattern = ([2,3,3,2,4,3,3,1,4,1], # 2 spaces, 3 marks, 3 spaces, 2 marks, etc. [4,1,4,1,2,1,3,1,3,1,2,1,4,1], [4,1,3,4,2,3,3,4], [0,1,2,1,3,1,4,1,2,1,7,1,4,1], # 0 spaces, 1 mark, etc. [0,3,3,1,4,1,2,1,7,1,4,1] );
How to print it? Here is the most obvious, boring solution: a nested foreach, which does the job honestly and cleanly. Not much of obfuscation, here. To decide what to print, we need a variable indicating the current status (mark or space)
for $row (@pattern) { $mark=0; for $column (@$row) { if ($mark) { print "{}" x $column; $mark = 0; } else { print " " x $column; $mark = 1; } } print "\n"; }
It is a lot of wasted space, uninteresting code, too much predictable. We need some shrinking mechanism. How about making two maps out of the two for loops? Hmm, interesting, but we need to do something about this $mark=0 / $mark=1 business. Here we come up with the first trick. We store in a hash the values we want to print:
%pat = (" " => "{}", "{}" => " ");
The idea is that the key of one pair is the value of the other one.
Then, having
$output = " ";
we can make a loop without internal "if"
foreach (@numbers) { print $output x $_; $output = $pat{$output}; }
Looks promising:
%pat = (" " => "{}", "{}" => " "); $output = " "; for $row (@pattern) { for $column (@$row) { print $output x $column; $output = $pat{$output}; } print "\n"; }
Let's try a map:
%pat = (" " => "{}", "{}" => " "); $output = $pat{" "}; for $row (@pattern) { print map { $output = $pat{$output}; $output x $_} @$row; print "\n"; }
And now let's transform it into a nested map:
%pat = (" " => "{}", "{}" => " "); $output = $pat{" "}; print map { map @{[{ $output = $pat{$output}; $output x $_} @$_]}, "\n +" } @pattern;
It works, but we still have the problem that the pattern is too big and obvious. We should store it into a thinner structure. We can pack each anonymous array into a string, and represent it this way:
@pattern = qw(2332433141 41412131312141 41342334 01213141217141 033141217141); %pat = (" " => "{}", "{}" => " "); $output = $pat{" "}; print map { map @{[{ $output = $pat{$output}; $output x $_} split //]}, "\n" } @pattern;
It is still somehow predictable, but we can easily change numbers to letters, and use a tr/// within the map to restore the form. Also, making the variable names less telling, the candidate for the obfuscation is shaping up.
$p={" "=>"{}"}; $p->{$p->{" "}}=" "; $o=$p->{" "}; print map{@{[map{tr{abcdef}{012347};$o=$p->{$o};$o x $_}split//]},$/}q +w{cddceddbeb ebebcbdbdbcbeb ebdecdde abcbdbebcbfbeb addbebcbfbeb};
Not bad. Thin, efficient, darkish, not really obscure yet, but very much promising.


Could be more sophisticated, but it is effective. Some search and replace will make a string of gibberish out of our clean candidate. We rely on a tr/// operation to reinstate legitimate perl code, and a final eval will do the trick.
$_='8pj%z9zj>z%&z&u8ph>%8ph>%z9z&&jz9zu8oj8ph>%z9z&uprint9mkp%@%[mkp%t +r%kvcdef&%012347&u8oj8ph>%8o&u8o9x98_&splitgg]&,8g&qw%cddceddvev9evev +cvdvdvcvev9evdecdde9kvcvdvevcvfvev9kddvevcvfvev&u';y'hj%89z&kuvg'-={$ + "}a;b/';eval;
But, is it obfuscated? Well, difficult to read, yes, but the key is quite obvious. The transliteration will recreate the code and eval is going to execute it. We need a further level of concealing.


We want something that is truly deceptive, and that looks like something else. After some fruitless attempts, here comes the idea of making a string of hex codes out of the gibberish from the second level. So we need an intermediate script to perform this further change.
$hexstr=<<'HEX'; $_='8pj%z9zj>z%&z&u8ph>%8ph>%z9z&&jz9zu8oj8ph>%z9z&uprint9mkp%@%[mkp%t +r%kvcdef&%012347&u8oj8ph>%8o&u8o9x98_&splitgg]&,8g&qw%cddceddvev9evev +cvdvdvcvev9evdecdde9kvcvdvevcvfvev9kddvevcvfvev&u';y'hj%89z&kuvg'-={$ + "}a;b/';eval; HEX print q{eval pack "H*",'}, uc(unpack("H*", $hexstr)),"';";
with the following output. The uc operator ensures that all the hex code is uppercase. We'll base the fourth level upon this assertion.
eval pack"H*",'245F3D2738706A257A397A6A3E7A25267A26753870683E253870683 +E257A397A26266A7A397A75386F6A3870683E257A397A26757072696E74396D6B7025 +40255B6D6B70257472256B766364656626253031323334372675386F6A3870683E253 +86F2675386F397839385F2673706C697467675D262C38672671772563646463656464 +766576396576657663766476647663766576396576646563646465396B76637664766 +576637666766576396B64647665766376667665762675273B7927686A2538397A266B +757667272D3D7B2420227D613B622F273B6576616C3B200A';
Is it obfuscated now? Definitely. There is no way of telling what this program is going to do, without transforming back the hex codes into characters. But it is not interesting. Something more is needed. Yet the time for trasformation is over. Let's switch to deception (and fun).


The idea is to insert some wild text within the hex code, so that it looks like something else. What exactly depends on the skills and the mood of the viewer, but at first sight it may look like Perl code embedded inside a string, or words that can be magically turned into Perl instructions. Thus the "use strict;", "require 7.0.6;", "open GUTS" and "no CGI;" are just a smoke screen. Actually, only "require" and "no GI" are fake, since the numbers and the "C" belong to the hex string. The same we can say of all the capital letters A-F scattered through the test, making impressive statements ("Dozens", "Another", "Cool!", "Finally" and so on).
$_='use strict; my$diary=24/5. For 3 Dozens of %{{algorithms}} we got $2,738; require 7.0.6; Another 257 $Mil to %Anna. We &keep aside $397 for Anthony. SSN+=6A3E7A25267A26753870683E253870683E257;And open GUTS "> ", 397A26266A7A397A75. This one will break the back of my old 386. Finally, we understood the real meaning of 6A3870683E257A397A2675707: it was a combination of 2696E74396D6B702540255B6D6B and hopefully had nothing to share with that other @line my Uncle Lou found in a drawer (Something like 70257472256B766364656626253031323334372675386F6A3870) Numbers are &dancing around me. 683 (Even 25386F2675386 +Forcefully!) 397839385F2673706. no CGI; $_=\We seem to remember that 697467675D262 was given to $Tom. Cool! 3867267($1)7725636464636564647665763965 is a @sequence I will /^never*/ forget. (Related to[++]) 76657663766476647663766576396576646563646465396 Begins now and I know 76637664S7665766376667 is following. Is it going to [end]? Here comes one more. 66576396B64647665766376667665762675273B7927686A2538397A266. Big beast. 757667272 Do you really think it is over? Not yet. Wait(). 3D7B2420227D613B622 Finally, this should be the last one. Is it? -No? 273B6576616C3B200. Ah! Yes. '; tr/A-F0-9//cd; eval pack "H*", $_;
To execute this code, we need to filter off the non-Hex characters, including newlines, and eval the result. So the script could have been terminated this way.
However, an experienced Perl user could tell on the spot what was happening. One last effort to conceal these two simple statement and we are done.


To achieve the final goal, we replace the last two telling statements with the more committing line 'Now for something completely different:' introducing more darkness.
@a= map {$b=@{[unpack "c*",$_]};$b} qw(the dream becomes: nightmare); $a+=$_ for@a;($a*=3)-=2;($e,$d,$g,$f)=map {$a-=$_;$a}@a;$k=pack("c*", $d..$e).pack"c*",$f..$g; $h=pack"c",$d+34;$c=pack"c",$e+2;$i=pack"c", $d+35;$j=pack"c",$e+51;eval"$j.$k..$h$i";eval pack"$c*",$_;
The last four lines are busy creating a string containing 'ABCDEF0123456789' without giving it away. This latest trick, although confused looking, is the easiest part. Let's find the intervals between the ordinal values of '0', '9', 'A' and 'F' (3,5,8,9). Get them from the length of four words in an array (qw(the dream becomes: nightmare)), and use pack "c" to avoid the ord keyword. The base value is the result of summing up the array values and some ad hoc arithmetic operation. Then the letters are created from the base value from which we progressively subtract the array values. We'll skip the steps of this transformation that you can easily follow with a debugger. These details involve some acrobatic repeating uses of pack "c" to recreate the tr/// statement, further concealed as to almost look like a range (..) operator, by changing the slashes into dots. As a final touch, the "H*" becomes "$c*", thus getting visually camouflaged among the repeating pack "c*" and unpack "c*" scattered in the preceding code.
And finally we achieve the obfuscated result. Now your average Joseph P. Hacker needs more than a casual glance to understand what is going on. The comment "this id <not> poetry" is a further attempt to defeat the judgement of the casual reader, because what looks like Perl code is actually garbage and what looks like garbage is Perl code.
Hope you enjoyed it.

update Want more of this same stuff? Try Yet another structured obfuscation

 _  _ _  _  
(_|| | |(_|><

Replies are listed 'Best First'.
Re: Structured obfuscation
by Ryszard (Priest) on Jan 23, 2002 at 10:05 UTC
    This is cool.. ++
Re: Structured obfuscation
by Darkling (Beadle) on Jan 29, 2002 at 07:06 UTC
    I marvel at your imagination.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: obfuscated [id://140626]
Approved by root
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others having an uproarious good time at the Monastery: (5)
As of 2017-01-20 23:41 GMT
Find Nodes?
    Voting Booth?
    Do you watch meteor showers?

    Results (180 votes). Check out past polls.