Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
Inspired by BooK's quite excellent obfuscated Huffman Encoder (434 chars on a w32 box), and many hours of my time playing with the algorithm I decided I would make an attempt at my own, and try to shave a characters off of the solution. After quite a bit of effort (this obfus stuff is hard.. ++ to those of you who make it look so easy) getting bitten by binmode, discovering which global variables could be used in exotic ways I wish to present my solutions: (All of the solutions expect input and ouput over STDIN and STDOUT)

My first solution is the equivelent to BooK's original encoder, ie it only encodes the message, without a header containing the information necessary to decode the message. It is 316 characters long (on a w32 box).

map{binmode$_}\*STDIN,\*STDOUT;sub r{my($n,$s,$c)=@_;$c=$$n[0];return$ +1{$c}=$s if!ref($c);map r($$c[$_],$s.$_),0,1;}sub h{$/=\1;$0{$_}++,$,.=$_ while +<>;@_=map [$_,$0{$_}],keys%0;while(@_>1){@_=sort{$$b[1]<=>$$a[1]}@_;$l=pop;$r=po +p;push@_ ,[[$l,$r],$$l[1]+$$r[1]];}r pop;$,=~s/./$1{$&}/gs;pack"SB*",$.,$,;}pri +nt h;
My next solution is the same thing but this time with the header information. I call it, it is 337 bytes long.
$"=$_;map{binmode$_}\*STDIN,\*STDOUT;sub r{my($n,$s,$t,$c)=@_;$c=$$n[0 +];return $,++,$_{$c}=$s,push@0,pack"aCB16",$c,$t,$s if!ref($c);map r($$c[$_],$s +.$_,$t+1 ),0,1}sub h{$/=\1;$0{$_}++,$;.=$_ while<>;@_=map[$_,$0{$_}],keys%0;whi +le(@_>1) {@_=sort{$$b[1]<=>$$a[1]}@_;$l=pop;$r=pop;push@_,[[$l,$r],$$l[1]+$$r[1 +]]}r pop ;$;=~s/./$_{$&}/gs;chr($,)."@0".pack("SB*",$.,$;)}print h
The next is the corresponding decoder, as might be guessed I call it, and it is 259 bytes long. The output of the encoder can be piped into this script and the result will be the original file. such as by
perl < | perl
(usage on *nix machines might be little more graceful..)
$/=$_;map{binmode$_}\*STDIN,\*STDOUT;while($"=<>){map{$/=substr$",0,4, +'';($,, $;)=unpack"aC",$/;$_{(unpack"aCB$;",$/)[2]}=$,}1..ord substr$",0,1,''; +($,,$") =unpack"SB*",$";map{$;=1;$;++ until exists$_{substr$",0,$;};$\.=$_{sub +str$",0 ,$;,""}}1..$,;print""}
And for my last trick, I humbly present the combination of the two. This version I call and is 622 bytes long. Like the others it uses STDIN and STDOUT but also takes an argument to determine if it decodes or encodes (no arguments at all).
perl < | perl -d
$"=$/=$_;map{binmode$_}\*STDIN,\*STDOUT;sub r{my($n,$s,$t,$c)=@_;$c=$$ +n[0]; return$,++,$_{$c}=$s,push@0,pack"aCB16",$c,$t,$s if!ref($c);map r($$c[ +$_], $s.$_,$t+1),0,1}sub h{$/=\1;$0{$_}++,$;.=$_ while<>;@_=map[$_,$0{$_}], +keys %0;while(@_>1){@_=sort{$$b[1]<=>$$a[1]}@_;$l=pop;$r=pop;push@_,[[$l,$r +], $$l[1]+$$r[1]]}r pop;$;=~s/./$_{$&}/gs;chr($,)."@0".pack("SB*",$.,$;)} +if( pop@ARGV){while($"=<>){map{$/=substr$",0,4,'';($,,$;)=unpack"aC",$/;$_ +{( unpack"aCB$;",$/)[2]}=$,}1..ord substr$",0,1,'';($,,$")=unpack"SB*",$" +;map{ $;=1;$;++ until exists$_{substr$",0,$;};$\.=$_{substr$",0,$;,""}}1..$, +;print ""}}else{print h}
Many thanks to the Obfu crew and especially BooK for the inspiration and the walk through of his implementation, especially in regard to the use of funky global variables.

BTW, the code is set up for 80 columns, which means if you dont want it split you need to see your wrap code length to be 81 (go figure) in the User Settings to see all of them properly.

Oh and apologies to any who think this is a waste of time as its already been done, for me it was a _great_ exercise (I think that there are imprtant lessons to be learned by doing an obfu), and I'm damn proud of it. (never even looked at BooK's Huffman Decoder!)


You are not ready to use symrefs unless you already know why they are bad. -- tadmc (CLPM)
UPDATE:Minor text changes

In reply to New Huffman Obfus by demerphq
in thread Huffman encoder by BooK

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others cooling their heels in the Monastery: (4)
    As of 2020-10-20 11:36 GMT
    Find Nodes?
      Voting Booth?
      My favourite web site is:

      Results (210 votes). Check out past polls.