PerlMonks  

How do I match/extract a function from a JavaScript file?

by Incognito (Pilgrim)
on Oct 05, 2001 at 23:29 UTC ( [id://117090]=perlquestion )

Incognito has asked for the wisdom of the Perl Monks concerning the following question:

When parsing through a properly formatted JavaScript file (containing just functions), we are interested in returning the entire string of a specific function, say foo() in this snippet:
    foo (a, b, c) {
        if (c) { a++; } else { b--; }
        print "hi";
    }
    fum (d, e, f, g, h) {
        if (d) { e++; } else { f = g + h; }
        print "ho";
    }
We want to match the entire first function. More precisely, we just want to be able to write a regex that matches any valid function (it is okay if it only handles functions that are not too complex).

Replies are listed 'Best First'.
Re: How do I match/extract a function from a JavaScript file?
by boo_radley (Parson) on Oct 06, 2001 at 01:16 UTC
    be sure to show your work.
    :P
    eh? wot? oh, alright. You probably want to look at Text::Balanced for any serious answer, otherwise perlre can assist you.
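For anyone wondering what the Text::Balanced route might look like, here is a minimal sketch rather than a definitive answer. It assumes the function name is followed directly by its parameter list, and the helper name extract_function is made up for illustration. extract_bracketed is given quote characters in its delimiter string so that braces inside string literals are ignored; JavaScript comments and regex literals are not handled.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper (not from the thread): return the source text of the
# named function, or undef if it cannot be found or is unbalanced.
sub extract_function {
    my ($name, $src) = @_;

    # Locate the name (with an optional "function" keyword in front),
    # requiring an opening parenthesis right after it.
    $src =~ /(?:\bfunction\s+)?\b\Q$name\E\s*(?=\()/g or return;
    my $start = $-[0];
    my $rest  = substr $src, pos($src);

    # Parameter list: a balanced (...) at the start of $rest.
    my ($params, $after_params) = extract_bracketed($rest, '()');
    return unless defined $params;

    # Body: a balanced {...}; the quote characters in the delimiter string
    # tell Text::Balanced to skip braces inside "..." and '...'.
    my ($body, $after_body) = extract_bracketed($after_params, q({}"'));
    return unless defined $body;

    # Everything from the start of the match up to the end of the body.
    my $end = length($src) - length($after_body);
    return substr $src, $start, $end - $start;
}

my $js = 'foo (a, b, c) { if (c) { a++; } else { b--; } print "hi"; } '
       . 'fum (d, e, f, g, h) { if (d) { e++; } else { f = g + h; } print "ho"; }';

my $found = extract_function('foo', $js);
print defined $found ? "Matched :\n$found\n" : "No match\n";
```

Note that the name search would also stop at a call such as foo(1) that appears before the definition, so this is only suitable for files that contain nothing but function definitions, as in the original question.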
Re: How do I match/extract a function from a JavaScript file?
by ForgotPasswordAgain (Priest) on Sep 30, 2007 at 17:53 UTC

    You need to escape comments and strings, too. Try this one:

    bum () {
        // :}
        if (d) { e++; } else { f = g + h; }
        print "ho \" :\}";
        /* if (hi) { */
        if (ho) { print 'hee \\\' :}'; }
    }
      ok..here we go with this ..
      #!/usr/bin/perl
      use strict;
      use re 'eval';

      # set inside the (?{...}) blocks below; declared so the script compiles under strict
      our $start;

      my $Jscript = join (" ",<DATA>);

      # Extract function foo from Jscript
      my $content = _extract("foo",$Jscript);

      sub _extract {
          my ($funct,$Jscript) = @_;

          # regexes to match all but starting positions of strings and comments
          my $double_quote_str = qr!(\\.|[^"\\])*"!;
          my $single_quote_str = qr!(\\.|[^'\\])*'!;
          my $comment1 = qr![^\n\r]*!;
          my $comment2 = qr!.*?\*/!s;

          # now match complete strings or comments..whatever matches first will 'eat' others
          my $escape_strings = qr!
              (
                  '(?{$start='sing_quot'})|
                  "(?{$start='dbl_quot'})|
                  //(?{$start='comment1'})|
                  /\*(?{$start='comment2'})
              )
              (?(?{$start eq 'sing_quot'})$single_quote_str)
              (?(?{$start eq 'dbl_quot'})$double_quote_str)
              (?(?{$start eq 'comment1'})$comment1)
              (?(?{$start eq 'comment2'})$comment2)
          !x;

          my ($round,$curly);

          # match balanced nested round braces ..escape strings and comments
          $round = qr!
              \((
                  $escape_strings |
                  [^()]|(??{$round})
              )*
              \)
          !x;

          # match balanced nested curly braces ..escape strings and comments
          # (the outer braces are escaped; newer perls complain about a bare "{" in a pattern)
          $curly = qr!
              \{ (
                  $escape_strings |
                  [^{}]|(??{$curly})
              )*
              \}
          !x;

          if ($Jscript =~ m!(\bfunction\s+$funct\s*$round\s*$curly)!) {
              print "Matched :\n",$1;
              return $1;
          }
      }

      __DATA__
      //.. function foo goes here .......
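On Perl 5.10 or later, the same idea (balanced parentheses and braces, with strings and comments skipped) can be sketched with named recursive subpatterns instead of (?{...}) and (??{...}), so no use re 'eval' is needed. This is only an illustration, not a JavaScript parser: the helper name extract_js_function is made up here, the function keyword is assumed to be present, and regex literals such as /"/ in the JavaScript source would still confuse it.

```perl
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the text of
# "function NAME (...) {...}" from $js, or undef if nothing matches.
sub extract_js_function {
    my ($name, $js) = @_;

    my $re = qr!
        ( \bfunction \s+ \Q$name\E \s* (?&parens) \s* (?&braces) )

        (?(DEFINE)
            # anything whose contents must not be scanned for brackets
            (?<skip>
                " (?: [^"\\]++ | \\. )*+ "     # double-quoted string
              | ' (?: [^'\\]++ | \\. )*+ '     # single-quoted string
              | // [^\n]*+                     # line comment
              | /\* .*? \*/                    # block comment
            )
            # balanced, possibly nested (...) and {...}
            (?<parens> \( (?: (?&skip) | (?&parens) | [^()] )*+ \) )
            (?<braces> \{ (?: (?&skip) | (?&braces) | [^{}] )*+ \} )
        )
    !xs;

    return $js =~ $re ? $1 : undef;
}

my $js = <<'JS';
function bum () {
  // :}
  if (d) { e++; } else { f = g + h; }
  print "ho \" :\}";
  /* if (hi) { */
  if (ho) { print 'hee \\\' :}'; }
}
function fum (d) { return d; }
JS

my $found = extract_js_function('bum', $js);
say defined $found ? "Matched :\n$found" : "No match";
```

The skip alternative is tried before the plain character classes, which is what keeps a brace inside a string or comment from being counted.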
Re: How do I match/extract a function from a JavaScript file?
by Shakya (Initiate) on Sep 30, 2007 at 17:35 UTC
    Following definitely works in "not so complex" scenarios
    #!/usr/bin/perl
    use strict;

    my $Jscript = join (" ",<DATA>);

    # Extract function foo from Jscript
    my $content = _extract("foo",$Jscript);

    sub _extract {
        my ($funct,$Jscript) = @_;
        my ($round,$curly);
        $round = qr!\(([^()]*|(??{$round}))*\)!;
        $curly = qr!\{([^{}]*|(??{$curly}))*\}!;
        if ($Jscript =~ m!($funct\s*$round\s*$curly)!ms) {
            print "Matched :\n",$1;
        }
    }

    __DATA__
    foo (a, b, c) {
        if (c) { a++; } else { b--; }
        print "hi";
    }
    fum (d, e, f, g, h) {
        if (d) { e++; } else { f = g + h; }
        print "ho";
    }
    Output is :
    Matched :
    foo (a, b, c) {
        if (c) { a++; } else { b--; }
        print "hi";
    }
    For more explanation of matching nested parentheses, you can refer to Friedl's Mastering Regular Expressions, page 330.
      sub _extract {
          my ($funct,$Jscript) = @_;
          my ($round,$curly);
          $round = qr!\(([^()]*|(??{$round}))*\)!;
          $curly = qr!\{([^{}]*|(??{$curly}))*\}!;
          if ($Jscript =~ m!($funct\s*$round\s*$curly)!ms) {
              print "Matched :\n",$1;
          }
      }
      err, I can't see where you've matched the function keyword or the function operator anywhere in your code. That would seem to be a prerequisite to parsing out functions from ECMAScript source.

      foo (a, b, c) {
          if (c) { a++; } else { b--; }
          print "hi";
      }
      fum (d, e, f, g, h) {
          if (d) { e++; } else { f = g + h; }
          print "ho";
      }
      That's not valid ECMAScript.

      -David.
