PerlMonks  

Re^2: Recognizing Perl in text

by Anonyrnous Monk (Hermit)
on Jan 06, 2011 at 13:47 UTC ( [id://880823]=note )


in reply to Re: Recognizing Perl in text
in thread Recognizing Perl in text

I think one problem with the syntax check idea is: what exactly would you feed to perl -c?

If you do it line by line, and have a code snippet like this:

    for my $foo (@foo) {
        for my $bar (@$foo) {
            push @{ $self->{results} },
              {
                baz => foo( $bar->{baz},
                            $bar->{quux}[1] )
              };
        }
    }

not a single line (on its own) would pass a syntax check, while taken as a whole, the snippet is perfectly valid Perl code.
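To make the line-by-line problem concrete, here is a minimal sketch that feeds each line of the snippet, and then the whole snippet, to perl -c. The helper name compiles_ok is made up for illustration, and the checked text is deliberately compiled without strict. Note that perl -c still runs BEGIN blocks and use statements, so this is only reasonable on text you trust. The per-line verdicts may vary a little between perl versions, so none are claimed here.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: true if `perl -c` accepts $code as a standalone file.
sub compiles_ok {
    my ($code) = @_;
    my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
    print {$fh} $code;
    close $fh or die "close failed: $!";
    my $perl = $^X;                          # the perl binary running this script
    my $out  = qx{"$perl" -c "$path" 2>&1};  # swallow "syntax OK" / error text
    return $? == 0;                          # exit status 0 means it compiled
}

my $snippet = <<'END';
for my $foo (@foo) {
    for my $bar (@$foo) {
        push @{ $self->{results} },
          {
            baz => foo( $bar->{baz},
                        $bar->{quux}[1] )
          };
    }
}
END

# Check every line on its own, then the snippet as a whole.
for my $line (split /\n/, $snippet) {
    printf "%-5s %s\n", (compiles_ok("$line\n") ? 'ok' : 'FAIL'), $line;
}
printf "%-5s %s\n", (compiles_ok($snippet) ? 'ok' : 'FAIL'), '(whole snippet)';
```

The point of the comparison is the last line of output: the snippet as a whole compiles, while lines such as the opening for or a lone closing brace cannot.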

Of course, you could try to work around that problem by passing multiline snippets to the syntax check, but then the number of possible combinations is going to explode rather soon, even for moderate file sizes... So you'd at least need some additional heuristic to identify likely beginnings of code sections, or some such, in order to make this approach feasible in practice.
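As a rough back-of-the-envelope figure, and assuming only contiguous runs of lines are ever tried, a file of n lines has n(n+1)/2 candidate chunks. The function name below is invented for this sketch.

```perl
use strict;
use warnings;

# Number of contiguous line ranges (start..end) in a file of $n lines.
sub contiguous_chunks {
    my ($n) = @_;
    return $n * ($n + 1) / 2;
}

# One perl -c run per chunk adds up quickly.
printf "%5d lines -> %d chunks\n", $_, contiguous_chunks($_) for 10, 100, 1000;
```

For 1000 lines that is already 500500 separate syntax checks, which is why some way of narrowing down the candidates is needed.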

Replies are listed 'Best First'.
Re^3: Recognizing Perl in text
by LanX (Saint) on Jan 06, 2011 at 15:21 UTC
    With a clever strategy it's possible to significantly limit the number of possible chunks to check!

    Simply start by checking the most indented line and successively add the surrounding lines.

        for my $foo (@foo) {                        # 8 fails
            for my $bar (@$foo) {                   # 6 fails
                push @{ $self->{results} },         # 5 works
                  {                                 # 3 fails
                    baz => foo( $bar->{baz},        # 2 works
                                $bar->{quux}[1] )   # 1 fails
                  };                                # 4 works
            }                                       # 7 works
        }                                           # 9 works

    This way, the overhead for identifying n lines of code is (statistically) at most linear!
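One possible reading of this strategy, as a sketch only: start at the most indented line, then repeatedly add whichever neighbouring line is more indented (ties go upwards), running one syntax check per step. The function names, the tie-breaking rule and the spaces-only indentation are assumptions of this example, not something the post specifies. It also assumes the candidate block has already been cut out of the surrounding text.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: true if `perl -c` accepts $code as a standalone file.
sub compiles_ok {
    my ($code) = @_;
    my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
    print {$fh} $code;
    close $fh or die "close failed: $!";
    my $perl = $^X;
    my $out  = qx{"$perl" -c "$path" 2>&1};
    return $? == 0;
}

# Width of the leading whitespace (assumes spaces, not tabs).
sub indent_of {
    my ($line) = @_;
    my ($ws) = $line =~ /^( *)/;
    return length $ws;
}

# Grow a window outwards from the most indented line, one line per step.
# Returns a hashref: first/last index of the largest window that compiled
# (undef if none did) and the number of checks performed.
sub grow_from_deepest {
    my ($check, @lines) = @_;
    my %result = (first => undef, last => undef, checks => 0);
    return \%result unless @lines;

    my $start = 0;
    for my $i (1 .. $#lines) {
        $start = $i if indent_of($lines[$i]) > indent_of($lines[$start]);
    }

    my ($lo, $hi) = ($start, $start);
    while (1) {
        $result{checks}++;
        @result{qw(first last)} = ($lo, $hi)
            if $check->(join '', map { "$_\n" } @lines[$lo .. $hi]);
        last if $lo == 0 && $hi == $#lines;
        # -1 marks "no neighbour on this side".
        my $up   = $lo > 0       ? indent_of($lines[$lo - 1]) : -1;
        my $down = $hi < $#lines ? indent_of($lines[$hi + 1]) : -1;
        if ($up >= $down) { $lo-- } else { $hi++ }
    }
    return \%result;
}

my @lines = split /\n/, <<'END';
for my $foo (@foo) {
    for my $bar (@$foo) {
        push @{ $self->{results} },
          {
            baz => foo( $bar->{baz},
                        $bar->{quux}[1] )
          };
    }
}
END

my $r = grow_from_deepest(\&compiles_ok, @lines);
print "largest compiling window: lines $r->{first}..$r->{last} (0-based), ",
      "$r->{checks} checks\n";
```

Because the window grows by exactly one line per step, a block of n lines costs n syntax checks, which matches the linear bound claimed above. The checker is passed in as a code reference so that a cheaper pre-filter can be swapped in without touching the growth logic.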

    UPDATE: And it's still possible to rely on the existence of trailing semicolons or braces before running a syntax check.

    Cheers Rolf

Re^3: Recognizing Perl in text
by LanX (Saint) on Jan 06, 2011 at 14:48 UTC
    Sure, but that's why I added the update about the indentation convention.

    Do you know of any man pages with Perl code that don't originate from POD? I don't...

    And I agree with Marshall who recommended scanning for trailing /;\s*$/ or /;\s*#.*$/ for a pretty good weighting heuristic.
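A minimal sketch of how those two patterns could be turned into a weight: the fraction of non-blank lines that end in a semicolon, optionally followed by a comment. The function name and the idea of using a plain ratio are assumptions of this example. Any threshold would have to be tuned on real text.

```perl
use strict;
use warnings;

# Fraction of non-blank lines ending in ';' (optionally followed by a comment).
sub semicolon_score {
    my (@lines) = @_;
    my @nonblank = grep { /\S/ } @lines;
    return 0 unless @nonblank;
    my $hits = grep { /;\s*$/ || /;\s*#.*$/ } @nonblank;
    return $hits / @nonblank;
}

my @prose = (
    'This page describes how the tool is used.',
    'See the section below for details.',
);
my @code = (
    'my $x = 1;',
    'print $x;   # show it',
    'foo()',
);

printf "prose: %.2f\n", semicolon_score(@prose);
printf "code:  %.2f\n", semicolon_score(@code);
```

A score like this is cheap enough to run on every candidate block first, so that perl -c is only spent on blocks that already look like code.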

    Cheers Rolf
