Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much

Re: With apologies to J.R.R (explained)

by sevensven (Pilgrim)
on Nov 06, 2001 at 05:52 UTC ( #123478=note: print w/replies, xml ) Need Help??

in reply to With apologies to J.R.R

Roger on that obfu working. And it is a "precioussssss" obfu, if I may say so. Thanks for the fun and for making me remeber "my precioussss", maverick.

I dont know if it's good PM etiquete to discuss the functioning of an obfu, but that's the fun of it, so here it goes a first look into it Warning, spoiler ahead:

The main code is inside the first interpolation q. That code goes into the @d array, wich is reordered in the for at the bottom

for(my $i=0;$i<$#d/2;$i+=2) { my $t=$d[$i]; $d[$i]=$d[$#d-$i]; $d[$#d-$i]=$t }

This reorders all that code in @d, swapping every third line with the one that has the same index if you started counting from the end of the @d

So, doing the swap we get the code that will be run, wich I formated for your viewing pleasure :

my $h = [[['t','o'],[['d','i'],'n']],[[['s','r'],['l','h']],['e',[['m' +,['f','b']] ,[[['k',['w','p']],['g','u']],'a']]]]]; my $m = '_001011110^_001_1000.000001.100111110111010110^00010111101110 +0.111111010 1010b_001011110._001_1000^000001.11101001010110100.000101111011100d_00 +101111 0._001_1000^000001.111011100101010111111010.000101111011100.1111110101 +010b_1 11110110100^0101011^0001011110._0100111111001111100001111010001000^111 +011010 10110100.000101111011100b_0101011^0001011110^_1010111110110100^0011110 +10._10 011100100111000010110100^1111001010111101001110^0001011110._11110011_1 +001.10 1001011101000d^....^^...^...^^..--._0001011110^_101000110010100^001111 +010.00 01011110._1001010101111110101000bb1111001011010101010c.100000110010001 +1111'; $m =~s!\s!!g; my %h=('^',' ','.',' ','b',"\n",'d',"\n",'c',',','e',',','-','-',); my $u; sub d { my($t,$d)=@_; if(ref($t)) { my $c=shift @{$d}; while($c !~ /^\d$/x && length $c) { ($c eq'_') ? $u=1 : print $h{$c}; $c = shift@{$d}; } return d($t->[$c],$d); } else { if ($u) { $u=0; print uc($t); } else { print $t; } return @{$d}; } } my @d = split//,$m; while(scalar @d) { @d = d($h,\@d); } print"\n";

It now looks much better, but the thing is, now it's realy getting harder.

Maverick used the sub d, with an array called d passed by reference and that gets changed in there, adding a new level to this obfuing (am I coining a term here ? ;^).The recursive call to sub d inside almost made me croak ;-)

OK, the main idea is that the binary string in $m (later splited into an array with each of its characters in @d) will be used in sub d to choose wich of the sub-arrays and elements of $h will be used.

Zeros make the array get shifted, ones means that that particular sub array will be used. The shifting occurs in the recursive call to d.

Each time a element of the array is reached (i.e. a character), as it's not a reference to an array, it will get printed (in upper case depending on a flag set by the _ in the control stream (er, $m), with some more caracters provided by %h.

This was a very nice use of recursion, passing arrays by ref, acessing arrays inside arrays of an array passed by reference, interpolation, eval, etc.

Until today, I wondered why people did obfuscation. It seamed just a way to use the worst of perl, the most arcane syntax and realy, just a bad thing to throw at people who are trying to grasp the elementary stuff in perl.

Well, i can now see *two* good motives for doing it,

  • one : you've got to learn (or know how to use) some of the hard but usefull things in perl and sometimes it makes to you run to documentation
  • two : it's Major fun
  • three : your ability to do math gets improved ;^)

    Thanks again to maverick and to all obfuers in general :-)

  • Replies are listed 'Best First'.
    Re: Re: With apologies to J.R.R (explained)
    by maverick (Curate) on Nov 06, 2001 at 08:48 UTC
      sevensven has it right on the nose. The only missing part is what the encoding algorithm is. It's called 'Huffman Compression' or a 'Huffman Tree'...named after some guy...named...Huffman...that's it. Anyways, here's an explanation of the algorithm (for those that are interested.

      /me digs out notes...

      The basic point of the algorithm is to compress the data by substituting the shortest string of bits for the most commonly occurring byte.

      So, step one is to make a frequency chart of the data (this isn't the output BTW for those that haven't ran it yet).

      J u s t a n o t h e r p e r l h a c k e r ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ 1 1 1 2 2 1 1 2 3 3 1 1 1 1
      sort low to high by occurrence
      1 (j) 1 (u) 1 (s) 1 (n) 1 (o) 1 (p) 1 (l) 1 (c) 1 (k) 2 (t) 2 (a) 2 (h) 3 (e) 3 (r)
      combine the two smallest 'occurrences', add their frequencies, and the resort the list.
      1 (s) 1 (n) 1 (o) 1 (p) 1 (l) 1 (c) 1 (k) j 2< u 2 (t) 2 (a) 2 (h) 3 (e) 3 (r)
      repeat until you end up with a tree like so
      j < / u 0 / \ / t 0 Isn't ASCII art wonderful? / \ o / \ < / \ / p / 0 / \ s / < / n 21 \ k \ / \ 0 \ / \ l \ 0 < \ / \ c \ / e 0 \ r \ / 0 \ \ a < h
      Where 21 is (hopefully) the total number of bytes of input. I say hopefully, because, yes, I did built both trees by hand.

      Ok, so now that we have our tree (tilt your head 90 degrees to the left), you'll notice that the most common bytes (r and e) have the lowest depth in the tree. A byte's encoding is found by walking the tree depth first to the leaf of the byte...keeping track of the direction either left/down (0) or right/up (1). So, the encoding for 'e' is '010' and 'o' is '1011'. Get it?

      Thus the sequence 'perl' (4 bytes) could be encoded as 101001000101101, or 15 bits.

      Decoding requires that you have the tree that the stream is encoded against, in this case $h in the code. Then you descend the tree one bit at a time, until you hit a leaf node and then you start over at the top.

      1010 = p 010 = e 010 = r 01101 = l
      I alluded to sevensven (while were we talking in the CB) that we were both using a program right now that has a Huffman decompressor in it. Any guesses? The web browser. Unless you use lynx, odds are you can display .jpg files...and they're compressed Huffman tree.


      /me leaves the lectern

      perl -l -e "eval pack('h*','072796e6470272f2c5f2c5166756279636b672');"

        woulden't 0 be right up and 1 be left up on encoding
    (jeffa) 2Re: With apologies to J.R.R (explained)
    by jeffa (Bishop) on Nov 06, 2001 at 08:16 UTC

    Log In?

    What's my password?
    Create A New User
    Node Status?
    node history
    Node Type: note [id://123478]
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others lurking in the Monastery: (5)
    As of 2020-10-24 01:19 GMT
    Find Nodes?
      Voting Booth?
      My favourite web site is:

      Results (242 votes). Check out past polls.