PerlMonks  

How do I match/extract a function from a JavaScript file?

by Incognito (Pilgrim)
on Oct 05, 2001 at 23:29 UTC (#117090)
Incognito has asked for the wisdom of the Perl Monks concerning the following question:

When parsing through a properly formatted JavaScript file (containing just functions), we are interested in returning the entire string of a specific function, say foo() in this snippet:
    foo (a, b, c) {
        if (c) {
            a++;
        } else {
            b--;
        }
        print "hi";
    }

    fum (d, e, f, g, h) {
        if (d) {
            e++;
        } else {
            f = g + h;
        }
        print "ho";
    }
We want to match the entire first function. More generally, we just want to be able to write a regex that matches any valid function (it is okay if it only handles functions that are not too complex).

Replies are listed 'Best First'.
Re: How do I match/extract a function from a JavaScript file?
by boo_radley (Parson) on Oct 06, 2001 at 01:16 UTC
    be sure to show your work.
    :P
    eh? wot? oh, alright. You probably want to look at Text::Balanced for any serious answer, otherwise perlre can assist you.
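As a rough illustration of the Text::Balanced suggestion, here is a minimal sketch. The helper name `extract_function_balanced` and the sample JavaScript are made up for this example, and it assumes the source uses the `function` keyword. It finds the header with a plain regex, then lets `extract_bracketed` consume the parameter list and the body.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper (not from the thread): return the source text of the
# JavaScript function called $name, or undef if it cannot be found.
sub extract_function_balanced {
    my ($name, $source) = @_;

    # Locate "function NAME" followed (after optional whitespace) by "(".
    return undef
        unless $source =~ /(\bfunction\s+\Q$name\E\s*)(?=\()/;
    my $header = $1;
    my $rest   = substr($source, $+[0]);

    # Pull off the balanced parameter list.
    my ($params, $after_params) = extract_bracketed($rest, '()');
    return undef unless defined $params && length $params;

    # Pull off the balanced body. Listing ' and " in the delimiter string
    # asks Text::Balanced to skip over quoted strings while counting braces.
    # The third return value is the whitespace skipped before the body.
    my ($body, undef, $gap) = extract_bracketed($after_params, q/{}'"/);
    return undef unless defined $body && length $body;

    return $header . $params . $gap . $body;
}

# Made-up sample input; note the "}" inside a string literal.
my $js = <<'END_JS';
function foo (a, b, c) { if (c) { a++; } else { b--; } print("hi }"); }
function fum (d, e) { print("ho"); }
END_JS

my $found = extract_function_balanced('foo', $js);
print defined $found ? "Matched :\n$found\n" : "No match\n";
```

Text::Balanced knows nothing about JavaScript comments, so a brace or an apostrophe inside a `//` or `/* */` comment can still confuse this sketch. Stripping comments first, or using a real parser, would be safer for anything serious.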
Re: How do I match/extract a function from a JavaScript file?
by ForgotPasswordAgain (Deacon) on Sep 30, 2007 at 17:53 UTC

    You need to escape comments and strings, too. Try this one:

        bum () {
            // :}
            if (d) {
                e++;
            } else {
                f = g + h;
            }
            print "ho \" :\}";
            /* if (hi) { */
            if (ho) {
                print 'hee \\\' :}';
            }
        }
      ok..here we go with this ..
          #!/usr/bin/perl
          use strict;
          use re 'eval';

          my $Jscript = join(" ", <DATA>);

          # Extract function foo from Jscript
          my $content = _extract("foo", $Jscript);

          sub _extract {
              my ($funct, $Jscript) = @_;

              # set by the (?{...}) blocks below to record which opener matched;
              # declared here so the code compiles under "use strict"
              our $start;

              # regexes to match all but starting positions of strings and comments
              my $double_quote_str = qr!(\\.|[^"\\])*"!;
              my $single_quote_str = qr!(\\.|[^'\\])*'!;
              my $comment1         = qr![^\n\r]*!;
              my $comment2         = qr!.*?\*/!s;

              # now match complete strings or comments..whatever matches first will 'eat' others
              my $escape_strings = qr!
                  (
                      '(?{$start='sing_quot'})|
                      "(?{$start='dbl_quot'})|
                      //(?{$start='comment1'})|
                      /\*(?{$start='comment2'})
                  )
                  (?(?{$start eq 'sing_quot'})$single_quote_str)
                  (?(?{$start eq 'dbl_quot'})$double_quote_str)
                  (?(?{$start eq 'comment1'})$comment1)
                  (?(?{$start eq 'comment2'})$comment2)
              !x;

              my ($round, $curly);

              # match balanced nested round braces ..escape strings and comments
              $round = qr! \( ( $escape_strings | [^()] | (??{$round}) )* \) !x;

              # match balanced nested curly braces ..escape strings and comments
              $curly = qr! \{ ( $escape_strings | [^{}] | (??{$curly}) )* \} !x;

              if ($Jscript =~ m!(\bfunction\s+$funct\s*$round\s*$curly)!) {
                  print "Matched :\n", $1;
              }
          }

          __DATA__
          //.. function foo goes here .......
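The approach above relies on `(?{...})` and `(??{...})` code blocks. On Perl 5.10 or later, the same idea can probably be written with named recursive subpatterns instead, which avoids `use re 'eval'`. The following is only a sketch under that assumption. The helper name `extract_function_recursive` and the sample input are invented for illustration, and the `function` keyword is assumed to be present.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the source text of the
# JavaScript function called $name, or undef. Assumes Perl 5.10 or later.
sub extract_function_recursive {
    my ($name, $source) = @_;

    # skip  : one complete string literal or comment, consumed as a unit
    # round : balanced (...) that ignores parens inside strings/comments
    # curly : balanced {...} that ignores braces inside strings/comments
    my $re = qr!
        ( \b function \s+ \Q$name\E \s* (?&round) \s* (?&curly) )

        (?(DEFINE)
            (?<skip>
                " (?: \\. | [^"\\] )* "
              | ' (?: \\. | [^'\\] )* '
              | // [^\n]*
              | /\* .*? \*/
            )
            (?<round> \( (?> (?&skip) | [^()'"/]+ | / | (?&round) )* \) )
            (?<curly> \{ (?> (?&skip) | [^{}'"/]+ | / | (?&curly) )* \} )
        )
    !xs;

    return $source =~ $re ? $1 : undef;
}

# Made-up input modelled on the "bum" test case, with the keyword added.
my $js = <<'END_JS';
function bum () {
    // :}
    if (d) { e++; } else { f = g + h; }
    print("ho \" :\}");
    /* if (hi) { */
    if (ho) { print('hee \\\' :}'); }
}
function after () { return 1; }
END_JS

my $found = extract_function_recursive('bum', $js);
print defined $found ? "Matched :\n$found\n" : "No match\n";
```

The atomic groups `(?> ... )` keep the regex from backtracking into a comment or string it has already skipped. JavaScript regex literals such as `/}/` are not handled, so this remains a "not too complex" solution rather than a parser.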
Re: How do I match/extract a function from a JavaScript file?
by Shakya (Initiate) on Sep 30, 2007 at 17:35 UTC
    The following definitely works in "not so complex" scenarios:
        #!/usr/bin/perl
        use strict;

        my $Jscript = join(" ", <DATA>);

        # Extract function foo from Jscript
        my $content = _extract("foo", $Jscript);

        sub _extract {
            my ($funct, $Jscript) = @_;
            my ($round, $curly);
            $round = qr!\(([^()]*|(??{$round}))*\)!;
            $curly = qr!\{([^{}]*|(??{$curly}))*\}!;
            if ($Jscript =~ m!($funct\s*$round\s*$curly)!ms) {
                print "Matched :\n", $1;
            }
        }

        __DATA__
        foo (a, b, c) {
            if (c) {
                a++;
            } else {
                b--;
            }
            print "hi";
        }

        fum (d, e, f, g, h) {
            if (d) {
                e++;
            } else {
                f = g + h;
            }
            print "ho";
        }
    Output is :
        Matched :
        foo (a, b, c) {
            if (c) {
                a++;
            } else {
                b--;
            }
            print "hi";
        }
    For more explanation of matching nested parentheses, see Friedl's Mastering Regular Expressions, page 330.
          sub _extract {
              my ($funct, $Jscript) = @_;
              my ($round, $curly);
              $round = qr!\(([^()]*|(??{$round}))*\)!;
              $curly = qr!\{([^{}]*|(??{$curly}))*\}!;
              if ($Jscript =~ m!($funct\s*$round\s*$curly)!ms) {
                  print "Matched :\n", $1;
              }
          }
      err, I can't see where you've matched the function keyword or the function operator anywhere in your code. That would seem to be a prerequisite to parsing out functions from ECMAScript source.

          foo (a, b, c) {
              if (c) {
                  a++;
              } else {
                  b--;
              }
              print "hi";
          }

          fum (d, e, f, g, h) {
              if (d) {
                  e++;
              } else {
                  f = g + h;
              }
              print "ho";
          }
      That's not valid ECMAScript.

      -David.
