PerlMonks  

Re^2: Slow evolution of Perl = Perl is a closed Word (use)

by tye (Sage)
on Sep 01, 2007 at 05:42 UTC ( [id://636477]=note )


in reply to Re: Slow evolution of Perl = Perl is a closed Word
in thread Slow evolution of Perl = Perl is a closed Word

To have a good IDE for Perl 5 is difficult, since Perl 5 has a very complex syntax, especially because Perl 5 depends on runtime states to be really parsed.
I do not believe the latter statement to be true. Can you cite credible sources for this assertion?

Of course it is true. The whole reason BEGIN blocks were created was so that useing a module could force the module's run time to happen before the remainder of the useing code's compile time so that the run-time states could impact how the remaining Perl code is parsed.

In the general case, reliably parsing Perl code that uses a module requires one to run the Perl code (and perhaps even XS code) of the used module. This is also true for BEGIN blocks not implied by use statements.
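
A minimal sketch of the point above, not taken from any real module: `My::Pragma` is a hypothetical module defined inline so the example is self-contained, and `whatever` is an invented name. Its `import` is ordinary code that runs while the using file is still being compiled, and what it installs decides how the identical text `whatever +3` is parsed afterwards. A real module could just as well base that decision on the environment, the clock, or XS code.

```perl
use strict;
use warnings;

# My::Pragma is hypothetical; it is defined inline instead of in its own file.
BEGIN {
    package My::Pragma;

    # import() is plain run-time code, executed during the caller's compile.
    sub import {
        my ($class, $style) = @_;
        my $caller = caller;
        no strict 'refs';
        *{"${caller}::whatever"} =
            $style eq 'nullary'
            ? sub () { 10 }            # empty prototype: takes no arguments
            : sub    { "args=@_" };    # no prototype: takes a list
    }

    $INC{'My/Pragma.pm'} = __FILE__;   # mark the module as already loaded
}

my ($as_nullary, $as_list);

{
    package Demo::Nullary;
    # Same as: BEGIN { require My::Pragma; My::Pragma->import('nullary') }
    use My::Pragma 'nullary';
    $as_nullary = whatever +3;         # parsed as whatever() + 3
}

{
    package Demo::List;
    use My::Pragma 'list';
    $as_list = whatever +3;            # parsed as whatever(+3)
}

print "nullary: $as_nullary\n";        # nullary: 13
print "list:    $as_list\n";           # list:    args=3
```

A tool that only reads the two `use` lines cannot tell which parse applies without knowing what `import` does with its argument.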

And the PPI documentation covers quite a bit of this. So if you use PPI for implementing your IDE, then your IDE will simply not be able to handle some Perl code. For many people, the Perl code that they deal with is rarely such that PPI actually fails on it (how fully and correctly PPI parses it is surely more variable, though I haven't seen good data on this). In fact, the author of PPI reports that almost everything on CPAN (excluding some things) can be parsed as well as PPI tries to.

But trying to pretend that the parsing problem doesn't in fact exist is just silly, IMHO.

- tye        

Re^3: Slow evolution of Perl = Perl is a closed Word (use)
by erroneousBollock (Curate) on Sep 01, 2007 at 05:57 UTC

    Of course it is true. The whole reason BEGIN blocks were created was so that useing a module could force the module's run time to happen before the remainder of the useing code's compile time so that the run-time states could impact how the remaining Perl code is parsed.

    In the general case, reliably parsing Perl code that uses a module requires one to run the Perl code (and perhaps even XS code) of the used module. This is also true for BEGIN blocks not implied by use statements.

    Ok, that's a reasonable explanation. Do you know whether there are any examples of this that aren't source-filter-related issues?

    I believe that modules which change code semantics aren't really relevant to this issue, only those which change how code is parsed for the purposes of simple static analysis.

    And the PPI documentation covers quite a bit of this. So if you use PPI for implementing your IDE, then your IDE will simply not be able to handle some Perl code. For many people, the Perl code that they deal with is rarely such that PPI actually fails on it (how fully and correctly PPI parses it is surely more variable, though I haven't seen good data on this). In fact, the author of PPI reports that almost everything on CPAN (excluding some things) can be parsed as well as PPI tries to.
    Well that's good. With the size of CPAN, that's a pretty good indicator that a PPI-based approach would be very effective for use in an IDE.

    But trying to pretend that the parsing problem doesn't in fact exist is just silly, IMHO.
    You're right, of course.

    I wonder if anything can be done (perhaps in the area of machine-readable POD) to make it easier for IDE implementors to deal with such issues.

    -David

      Do you know whether there are any examples of this that aren't source-filter-related issues?
      Of course there are.
      use constant FOO => 1234;
      Whether or not this line is parsed/run first will change how the next line is interpreted:
      print FOO;

      This will either be treated as printing a constant, or printing $_ to a filehandle FOO.
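
A small sketch that makes the two readings observable in one script. The package names `WithConstant` and `WithHandle` and the capture variables are made up for the illustration; the same source text is compiled with a string `eval` in each package.

```perl
use strict;
use warnings;

my $source = 'print FOO;';             # identical text in both cases
my ($to_stdout, $to_handle) = ('', '');

{
    package WithConstant;
    use constant FOO => 1234;          # FOO is a known sub at compile time

    # Capture the default output handle in a string.
    open my $capture, '>', \$to_stdout or die $!;
    my $previous = select $capture;
    eval $source;                      # compiled as print(FOO())
    die $@ if $@;
    select $previous;
    close $capture;
}

{
    package WithHandle;
    # No sub named FOO here, so FOO is taken as a filehandle.
    open FOO, '>', \$to_handle or die $!;
    local $_ = 'text from $_';
    eval $source;                      # compiled as print FOO $_
    die $@ if $@;
    close FOO;
}

print "default handle received: '$to_stdout'\n";   # '1234'
print "FOO handle received:     '$to_handle'\n";   # 'text from $_'
```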

      And functions with prototypes:

      foo +3
      will be parsed differently if foo is declared first as
      sub foo ();
      or with
      sub foo;

      It means foo()+3 in the former and foo(3) in the latter case.

      (It'll do the same if the sub with body is placed in the source code before the call, without the separate declarations, or in a module you use.)
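
The prototype case can be checked the same way. This is only a sketch: the package names `EmptyProto` and `NoProto` and the sub bodies are invented for the illustration, and the subs are defined with bodies before the call instead of with forward declarations.

```perl
use strict;
use warnings;

my $call = 'foo +3';                   # identical text in both cases
my ($with_empty_proto, $without_proto);

{
    package EmptyProto;
    sub foo () { 10 }                  # empty prototype: no arguments
    $with_empty_proto = eval $call;    # parsed as foo() + 3
    die $@ if $@;
}

{
    package NoProto;
    sub foo { "args=@_" }              # no prototype: list operator
    $without_proto = eval $call;       # parsed as foo(+3)
    die $@ if $@;
}

print "empty prototype: $with_empty_proto\n";   # 13
print "no prototype:    $without_proto\n";      # args=3
```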

      And no source filter in sight.

        Both those mechanisms are very predictable in their effect on source code. In those cases, a pre-parse of the source code could yield information which could be easily used by the static analyzer.
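
A rough sketch of what such a pre-parse might look like. The helper `prescan_declarations` is hypothetical and regex-based, not a real parser: it only sees the single-name form of `use constant` and prototypes written literally in the same file, so constants declared through a hash reference, subs installed by modules or `BEGIN` blocks, and anything generated at run time are all missed. A signature would also be mistaken for a prototype.

```perl
use strict;
use warnings;

# Naive pre-pass: collect names whose declarations change how later
# barewords are parsed. Hypothetical helper, heuristic only.
sub prescan_declarations {
    my ($source) = @_;
    my %known;

    # use constant NAME => ...;   (single-constant form only)
    while ($source =~ /^\s*use\s+constant\s+(\w+)\s*=>/mg) {
        $known{$1} = 'constant';
    }

    # sub NAME (PROTO)   with or without a body
    while ($source =~ /^\s*sub\s+(\w+)\s*\(([^)]*)\)/mg) {
        $known{$1} = "prototype($2)";
    }

    return \%known;
}

my $sample = join "\n",
    'use constant FOO => 1234;',
    'sub foo ();',
    'sub bar ($$) { $_[0] + $_[1] }',
    'print FOO;',
    'foo +3;';

my $known = prescan_declarations($sample);
printf "%-4s %s\n", $_, $known->{$_} for sort keys %$known;
```

An analyzer could consult a table like this before deciding whether `print FOO;` names a constant or a filehandle, and fall back to a warning when a bareword is not in it.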

        Source filters are more difficult because they can have arbitrary effects on the parse of the source code. Possibly some new construct would be necessary to (at least) document the effect of a source filter on the AST.

        As tye mentioned, BEGIN {} blocks (and equivalent constructs) are much more problematic, as they are truly computed at runtime. Perl is arguably much more context-sensitive than languages like C# or Java, but with a few simple heuristics a very simple static analysis could be performed.

        I also don't see why it's all or nothing... the IDE could get it 99% right and could deal with parser/analyzer warnings in a flexible policy-based (or case-by-case) manner (perhaps with configuration from the user).

        I'm not saying it would be easy... just possible. Interest would be the limiting factor.

        -David

      Of course there are other issues besides source filters. If you read the PPI documentation (really, go ahead, do it now, just the first part, not the really gory details; I'll wait right here in front of the comma until you come back), it covers this fairly well though briefly, even linking to "our own" On Parsing Perl.

      Yes, even the non-source-filter issues can impact correctly parsing to the point of doing static analysis correctly.

      With the size of CPAN, that's a pretty good indicator that a PPI-based approach would be very effective for use in an IDE.

      Um. Maybe. PPI doesn't fail when trying to parse most of CPAN (that is, it doesn't notice that it has parsed something incorrectly to the point that it throws up its hands and declares that it can't go on). I have yet to see any claims as to how accurate or even how complete PPI's parsing of all of CPAN actually is.

      I wouldn't be surprised to find that many people only rarely run into problems with PPI's parsing of the code that they usually deal with. In fact, if someone is burdened[1] by the use of syntax highlighting when editing their Perl code, then their coding style will be somewhat skewed such that they avoid even reasonable constructs if they happen to confuse their syntax highlighter. So I wouldn't be surprised if they also avoid some things that would trip up PPI. If their syntax highlighter actually uses PPI (such as if they were to use a PPI-based IDE for editing their Perl code), then surely their style will be skewed toward things that PPI gets right.

      [1] Is that the right word? I can't find my thesaurus.

      My impression is that PPI, in practice, is good enough often enough that a PPI-based IDE could be successful if used on a new project (due to the feedback loop I noted above). The effectiveness of a PPI-based IDE when applied to an existing code base is certainly less certain, IMHO.

      - tye        

        W.r.t static analysis of perl, there would then seem to be a lot of things that would need to be sorted out (or at least heuristics provided for) to make it useful for the purposes of an IDE.

        In fact, if someone is burdened[1] by the use of syntax highlighting when editing their Perl code, then their coding style will be somewhat skewed such that they avoid even reasonable constructs if they happen to confuse their syntax highlighter.
        Heh :-) I do use syntax-highlighting wherever possible, if only because it allows me (in my perception) to skim my code faster.

        I mostly use Vim (I've been trying to use Emacs of late)... it has lots of problems with its very simplistic highlighter. I don't avoid the "problematic" constructs; I've just formed a habit of inserting comments to "correct" those mistakes ;)

        My impression is that PPI, in practice, is good enough often enough that a PPI-based IDE could be successful if used on a new project (due to the feedback loop I noted above). The effectiveness of a PPI-based IDE when applied to an existing code base is certainly less certain, IMHO.
        Well, it has to start somewhere (assuming that making perl "more available" to beginners/infidels is a worthwhile goal).

        I agree that the feedback loop would eventually lead to a parser that works reasonably for arbitrary perl5 code.

        -David.
