 
PerlMonks  

<HTML><HEAD> <BODY>

I've written a Perl script that walks a directory of JavaScript library files (plain text files of pure JavaScript, mostly functions with a few global variables at the top) and strips comments and unnecessary whitespace. It relies on some pretty complex (to me) regular expressions: they first remove the comments, then hide all the string literals, compress the whitespace, restore the strings, and so on.
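For readers following along, here is a minimal sketch of the hide/compress/restore steps described above. It is not the actual script: the name `compress_js` and the placeholder format are hypothetical, and it assumes comments are already gone, statements end in semicolons, and the source contains no regex literals.

```perl
use strict;
use warnings;

# Hypothetical helper: hide string literals, squeeze whitespace, restore strings.
# Assumes comments are already stripped and every statement ends with ";".
sub compress_js {
    my ($js) = @_;
    my @strings;

    # 1. Hide each quoted string behind a NUL-delimited index so that
    #    whitespace inside string literals is never touched.
    $js =~ s{("(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*')}
            { push @strings, $1; "\0" . $#strings . "\0" }ge;

    # 2. Collapse whitespace runs, then drop spaces around safe punctuation.
    $js =~ s/\s+/ /g;
    $js =~ s/ ?([{}();,=]) ?/$1/g;
    $js =~ s/^\s+|\s+$//g;

    # 3. Put the original strings back.
    $js =~ s/\0(\d+)\0/$strings[$1]/g;

    return $js;
}

print compress_js(qq{var a = "x  y" ;\n\n  if ( a ) { b = 'c d'; }\n}), "\n";
# prints: var a="x  y";if(a){b='c d';}
```

Operators such as `+` and `-` are deliberately left out of step 2, because removing the space in `a + +b` would change its meaning.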

The next (and final) step in this process is the obfuscation. I want to go through each function in the document and, for each parameter in the function header, replace the parameter name with a single letter (or whatever) to make the function smaller (and, I suppose, unreadable, but that's secondary).
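To make the goal concrete, here is a rough sketch of the renaming step for a single function whose full text is already in hand. The name `shorten_params` is hypothetical, and the sketch assumes string literals are hidden, there are at most 26 parameters, and the chosen letters do not clash with names already used in the body.

```perl
use strict;
use warnings;

# Hypothetical helper: rename each parameter of ONE function to a single letter.
# Assumes strings are hidden and the short names are not already in use.
sub shorten_params {
    my ($func) = @_;

    # Pull the parameter list out of the function header.
    my ($list) = $func =~ /^\s*function\s+[\w\$]+\s*\(([^)]*)\)/
        or return $func;
    $list =~ s/^\s+|\s+$//g;
    my @params = grep { length } split /\s*,\s*/, $list;
    return $func unless @params;

    # Map the first parameter to "a", the second to "b", and so on.
    my %short;
    my $letter = 'a';
    $short{$_} = $letter++ for @params;

    # Replace whole identifiers only, and leave property names
    # such as obj.strName alone.
    my $names = join '|', map { quotemeta }
                sort { length $b <=> length $a } @params;
    $func =~ s/(?<![\w\$.])($names)(?![\w\$])/$short{$1}/g;

    return $func;
}

print shorten_params(
    'function f(strName, intValue) { var x = obj.strName; return strName + intValue; }'
), "\n";
# prints: function f(a, b) { var x = obj.strName; return a + b; }
```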

I want to do this because there are over 400K of JS library files, which makes for a very large download on the initial hit to the web page over a 56K modem (what's a modem, you say?). Shortening each function's parameter names should cut the file size significantly.

I'm really looking for a regular expression that can go through the file (stored as one big string) and return each function as an element of an array. Alternatively, I pass in the name of the function I want and the regex returns that entire function to me as a string.

A typical JavaScript function looks like this:

function myFunctionName (strName, intValue, strOtherString, blnResult, strAnotherString, strEtc) {
    var intMyLocalVariable = 0;
    var strAnotherVariable = "blah";

    if (blnResult) {
        strAnotherVariable = "yakk";
    } else {
        strAnotherString = "yikes";
    }

    document.write(strName);
    return (intValue);
}

Anyway, the problem is that I can't just look for the first non-escaped "}" character, since if statements and other "{}" blocks are completely valid inside a function. What is this mystical regex that I'm looking for? It would take me days and days to figure one out, and I know someone out there is definitely smarter/more experienced than I am at this sort of stuff...
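One way around the nesting problem, sketched here under assumptions rather than offered as the definitive answer, is to let a regex find each function header and then count brace depth until the opening brace is balanced. The name `extract_functions` is hypothetical, and the sketch assumes comments are removed and string literals hidden, so every brace left in the text is structural.

```perl
use strict;
use warnings;

# Hypothetical helper: return every top-level "function name(...) {...}"
# as a string. Assumes comments are gone and string literals are hidden.
sub extract_functions {
    my ($js) = @_;
    my @functions;

    # Find a header up to and including its opening brace.
    while ($js =~ /\bfunction\s+[\w\$]+\s*\([^)]*\)\s*\{/g) {
        my $start = $-[0];    # offset of the "function" keyword
        my $depth = 1;        # the header's opening brace is already consumed

        # Continue from the current position, one brace at a time,
        # until the depth returns to zero.
        while ($depth > 0 and $js =~ /([{}])/g) {
            $depth += $1 eq '{' ? 1 : -1;
        }
        last if $depth > 0;   # unbalanced input: stop scanning

        push @functions, substr($js, $start, pos($js) - $start);
    }
    return \@functions;
}

my $sample = 'var g=1;function a(x){if(x){return 1;}else{return 2;}}'
           . 'function b(){return 0;}';
print "$_\n" for @{ extract_functions($sample) };
```

Fetching one function by name is then a matter of filtering the returned list on its header, for example with `grep { /^function\s+b\b/ }`.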

Here's a regex that I wrote to grab just the function headers in a file...

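sub GetJSFunctionHeaders {
    my ($strOutput) = @_;
    my @subroutines;

    # Match "function name ( params )". The parameter list is any run of
    # characters other than a backslash or ")", or a backslash-escaped character.
    while ($strOutput =~ m/(function\s*\S+\s*\(\s*(?:[^\\\)]|\\.)*\s*\))\s*/ig) {
        push @subroutines, $1;
    }

    return \@subroutines;
}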

This works great, but I want all the data IN the function, since that's what I'm going to go through to obfuscate the function parameters. Does anyone out there know the magical regex? If so, that would be sooooooo cool. I would worship you every night for the next week if you so desire :)
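For what it's worth, on Perl 5.10 or later a single pattern can match balanced braces by recursing into a named group. The sketch below is one possible shape for such a regex, not a tested drop-in: `$function_re` and `get_functions` are hypothetical names, and it again assumes comments are stripped and strings hidden so that all braces pair up.

```perl
use strict;
use warnings;
use 5.010;    # named captures and recursive patterns need Perl 5.10+

# $1 is the whole function; the named group "body" matches a balanced
# {...} block by calling itself with (?&body) for each nested block.
my $function_re = qr/
    ( \bfunction \s+ [\w\$]+ \s* \( [^)]* \) \s*          # header
      (?<body> \{ (?: [^{}]++ | (?&body) )* \} )           # balanced body
    )
/x;

# Hypothetical helper: collect every top-level function as a string.
sub get_functions {
    my ($js) = @_;
    my @functions;
    while ($js =~ /$function_re/g) {
        push @functions, $1;
    }
    return \@functions;
}

my $sample = 'var g=1;function a(x){if(x){return 1;}else{return 2;}}'
           . 'function b(){return 0;}';
print "$_\n" for @{ get_functions($sample) };
```

Because the match resumes after each closing brace, functions nested inside another function are returned as part of their parent rather than as separate entries.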

</BODY></HTML>

In reply to Compressing/Obfuscating a Javascript file by Incognito
