
Re: Re: Compressing/Obfuscating a Javascript file

by Incognito (Pilgrim)
on Oct 11, 2001 at 03:32 UTC (#118125)


in reply to Re: Re: Re: Compressing/Obfuscating a Javascript file
in thread Compressing/Obfuscating a Javascript file

Holy crap! You are good... :)

I'm going to study this stuff to see if I can pick up some better skills and help others as well... If it's worth anything, I've found another scenario for the chunk() subroutine that causes it to break... If strings have escaped characters, then things go all whacked out:

    function test () {
        var a = "James (aka Tachyon) is a \"Perl Saint\" !!!";
        var b = "aaaa'bbbb" + 'cccc"dddd' + "eeee\"ffff" + "gggg\'hhhh";
    }

I tried modifying chunk() so that it remembers the previous character (and whether it was an escape character), so I could treat the current character as just a plain character (and not the end quote)...

    sub chunk {
        my ($strOutput) = @_;
        my (@chunks);
        my ($chunk) = 0;
        my ($found_quote) = '';
        my ($preceded_by_escape) = 0;
        for (split //, $strOutput) {
            # look for opening quote
            if ( /'|"/ and ! $found_quote and ! $preceded_by_escape ) {
                $found_quote = $_;
                $chunk++;
                $chunks[$chunk] = $_;
                $preceded_by_escape = (/\\/) ? 1 : 0;
                next;
            }
            # look for corresponding closing quote
            if ( $found_quote and /$found_quote/ and ! $preceded_by_escape ) {
                $found_quote = '';
                $chunks[$chunk] .= $_;
                $chunk++;
                $preceded_by_escape = (/\\/) ? 1 : 0;
                next;
            }
            # no quotes so just add to current chunk
            $chunks[$chunk] .= $_;
            $preceded_by_escape = (/\\/) ? 1 : 0;
        }
        # strip whitespace from unquoted chunks
        for (@chunks) {
            next if m/^(?:"|')/;    # leave quoted strings alone
            s/^[ \t]+|[ \t]+$//g;
        }
        return @chunks;
    }

This seems to work okay... I thought it didn't, but I believe it was bad data... let me know if this helps... (I'd like to do something productive for you today)! :)
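
One edge case that may be worth adding to the test data: a single "was the previous character a backslash" flag treats any quote that follows a backslash as escaped, so a JavaScript string ending in an escaped backslash, such as "C:\\", would never be closed. The sketch below shows one possible way around this. It assumes a hypothetical helper named quoted_spans (not part of the script in this thread) that only collects the quoted strings, and it clears the flag after each escaped character so that a backslash which is itself escaped does not escape what follows.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the quoted strings in $src.
sub quoted_spans {
    my ($src) = @_;
    my @spans;
    my $quote   = '';   # the open quote character, or '' when outside a string
    my $escaped = 0;    # true when the previous char was an unescaped backslash
    my $current = '';   # the quoted string being collected
    for my $char (split //, $src) {
        if ($quote) {
            $current .= $char;
            if ($escaped) {
                $escaped = 0;            # this char was escaped; consume the flag
            }
            elsif ($char eq '\\') {
                $escaped = 1;            # this backslash escapes the next char
            }
            elsif ($char eq $quote) {    # a genuine closing quote
                push @spans, $current;
                ($quote, $current) = ('', '');
            }
        }
        elsif ($char eq '"' or $char eq "'") {
            ($quote, $current) = ($char, $char);   # opening quote
        }
    }
    return @spans;
}

# A single-quoted heredoc keeps every backslash literal.
my $js = <<'END_JS';
var p = "C:\\" + 'it\'s';
END_JS
print "$_\n" for quoted_spans($js);
```

This sketch ignores backslashes outside strings (for example inside a regex literal), so it is an illustration of the flag handling rather than a drop-in replacement for chunk().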

Replies are listed 'Best First'.
Re: Re: Re: Compressing/Obfuscating a Javascript file
by tachyon (Chancellor) on Oct 11, 2001 at 05:40 UTC

    Ah, the full horror of it is coming back to me. Patches on patches... All I do (which you can see in the new chunk() sub, and which should also be in the strings() sub) is remember the last char in $last. If this is a \ and I find the char I am looking for (in this case the closing ) of the replace), I ignore it, as this is an escaped char. This gives you:

        print "$_\n" for chunk('Now replace(foo\(bar\)) "String \"escaped\" like so" more');

        # chop a function up into RE and non RE bits so we can chunkify
        # it into strings and non string sections
        sub chunk {
            my $func = shift;
            my @lotsa_chunks;
            my @array = split /(?=\breplace\s*\()/, $func;
            for my $bit (@array) {
                if ($bit =~ /^replace/) {
                    # do careful quote parse on RE chunk
                    my $last = '';
                    my $re = '';
                    for (split //, $bit) {
                        unless ( $_ eq ')' and $last ne "\\" ) {
                            $re .= $_;
                            $last = $_;
                            next;
                        }
                        $re .= $_;                           # add closing bracket
                        push @lotsa_chunks, $re;             # push complete RE into a chunk
                        $bit =~ s/\Q$re\E//;                 # hack RE off
                        push @lotsa_chunks, strings($bit);   # chunk the remainder
                        last;                                # remainder is handled, so stop scanning
                    }
                }
                else {
                    push @lotsa_chunks, strings($bit);
                }
            }
            return @lotsa_chunks
        }

        # this sub splits a function into quoted and unquoted chunks
        sub strings {
            my $func = shift;
            my @chunks;
            my $chunk = 0;
            my $found_quote = '';
            my $last = '';
            for (split //, $func) {
                # look for opening quote
                if ( /'|"/ and ! $found_quote and $last ne "\\" ) {
                    $found_quote = $_;
                    $chunk++;
                    $chunks[$chunk] = $_;
                    next;
                }
                # look for corresponding closing quote
                if ( $found_quote and /$found_quote/ and $last ne "\\" ) {
                    $found_quote = '';
                    $chunks[$chunk] .= $_;
                    $chunk++;
                    next;
                }
                # no quotes so just add to current chunk
                $chunks[$chunk] .= $_;
            }
            continue {
                $last = $_;
            }
            # strip whitespace from unquoted chunks
            for (@chunks) {
                next if m/^(?:"|')/;    # leave quoted strings alone
                s/^[ \t]+|[ \t]+$//g;
                s/^\s*$//;
            }
            return @chunks;
        }

    This reads a little easier to me. Do you see how we have to set $last = $_ on every pass through the loop? There are three places where a pass can end: before each of the two nexts, and at the end of the loop body. We could paste the assignment in three times, but this is what continue blocks are for. The continue block runs on every iteration, whether we leave the body with a next or fall off the end of it, so $last is always set correctly. It also makes the code much more maintainable: if we add, say, another condition to the loop, there is NO chance of forgetting a $last = $_ before a next.
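
To see the continue block behaviour in isolation, here is a minimal sketch. The sub name previous_chars and its output format are made up for this illustration and do not appear in the thread's script; the point is only that the continue block still runs when the loop body exits through next.

```perl
use strict;
use warnings;

# Hypothetical helper: records what $last held when each character was examined.
sub previous_chars {
    my ($text) = @_;
    my $last = '';
    my @seen;
    for (split //, $text) {
        if ($_ eq ' ') {
            push @seen, "space after '$last'";
            next;                  # skips the rest of the body, not the continue block
        }
        push @seen, "'$_' after '$last'";
    }
    continue {
        $last = $_;                # runs on every iteration, with or without next
    }
    return @seen;
}

print "$_\n" for previous_chars('a b');
```

If the continue block were skipped by next, the final character would still see 'a' as its predecessor instead of the space.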

    As you seem determined to continue (poor foolish soul that you are), apply this patch to the code above. Start saving versions. Put snippets that cause breakages into a working JS test file. That way you test your new code on this little file full of gnarly code. If it still runs OK, fine; if not, compare your last working version to your latest code. Also switch the compression (newline removal) off and compare your JS files (working and broken). There is a script I wrote at Colour coded diff that will do this for you and highlight the exact differences. You will need to download and install Algorithm::Diff. I am assuming you are on Windows. If you are on *nix you already have a diff, but mine is prettier!
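
The workflow described here (keep every snippet that ever broke the script and rerun them all after each change) could be sketched as a tiny regression harness. Everything in it is an assumption made for illustration: split_quoted is a regex-based stand-in, not the chunk() or strings() subs from this thread, run_cases is a made-up name, and the cases are shortened versions of the snippets quoted earlier in the thread. Passing in a reference to the real sub would test that instead.

```perl
use strict;
use warnings;

# Stand-in splitter (an assumption, not the thread's code): a quoted string
# is a quote, then any run of ordinary or backslash-escaped characters,
# then the same quote. The capturing group makes split return the strings too.
sub split_quoted {
    my ($src) = @_;
    return grep { length }
        split /("(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*')/, $src;
}

# Each case: a gnarly JS snippet, and the quoted chunks we expect to get back.
my @cases = (
    [ q{var a = "a \"Perl Saint\" !";}, [ q{"a \"Perl Saint\" !"} ] ],
    [ q{'cccc"dddd' + "gggg\'hhhh"},    [ q{'cccc"dddd'}, q{"gggg\'hhhh"} ] ],
);

# Runs every case through the given splitter and returns the number of failures.
sub run_cases {
    my ($splitter, @tests) = @_;
    my $failures = 0;
    for my $test (@tests) {
        my ($snippet, $expected) = @$test;
        my @quoted = grep { /^["']/ } $splitter->($snippet);
        next if "@quoted" eq "@$expected";
        $failures++;
        print "BROKEN: $snippet\n";
    }
    return $failures;
}

my $failures = run_cases(\&split_quoted, @cases);
print $failures ? "$failures case(s) broken\n" : "all cases OK\n";
```

Each time a new breakage turns up, its snippet goes into @cases before the fix is written, so a later patch cannot quietly bring the old bug back.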

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
