PerlMonks  

Block-structured language parsing using a Perl module?

by BrowserUk (Pope)
on Aug 14, 2012 at 15:52 UTC (#987389)
BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

Does anyone know of a publicly available example of one of the many CPAN parsing modules being used to parse a full-featured, block-structured language?

I'm looking for real world examples rather than toy, how-to demos.

Thanks.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

Re: Block-structured language parsing using a Perl module?
by sundialsvc4 (Abbot) on Aug 14, 2012 at 17:29 UTC

    There are several heavy-duty parsers, including Parse::RecDescent, Parse::Yapp and others.

    All of these are heavy-duty production-grade systems.   I once used them to chomp through Tivoli Workload Scheduler files, Korn shell scripts, SQL files and SAS programs to extract a data-flow model for an entire system into an SQLite database.   Perl dun me proud that day ... it could chomp through an entire run in about eleven minutes, having processed tens of thousands of files.

      Do you never actually read the question?

      I know what modules exist -- I am perfectly capable of searching cpan -- that isn't what I asked.

      And unless you are prepared to make the source code of your magical endeavors available, they are nothing more than an (unbelievable) idle boast. Entirely useless as an answer to my request.

      Please stop responding to my questions. Your responses are never useful to me and simply annoy me.



Re: Block-structured language parsing using a Perl module?
by Illuminatus (Curate) on Aug 14, 2012 at 18:08 UTC
    I can't really help you directly, but I would contact the author of Parse::Eyapp, if you haven't already. I know it's pretty new, but he must have had a reason to expand Yapp, and his changes look pretty extensive.

    fnord

Re: Block-structured language parsing using a Perl module?
by sundialsvc4 (Abbot) on Aug 14, 2012 at 19:46 UTC

    If you asked if any one of the CPAN modules could be used to parse a full-featured block-structured language, then the answer is Yes, and here are two that could do it.   As for Perl being the basis for a complete language compiler/interpreter or just parser, I am not aware of any.   Generally when you are looking at a parser you are in fact looking at the whole shebang:   compiler, interpreter, runtime.   There are numerous complete compiler and interpreter construction toolkits already available which will get you much closer, much faster, to that complete goal, and to be platform-independent as a freebie.

    And, parenthetically, what I said is no idle boast, and the code that did it is proprietary now.   I would love to share it if I could, because to be quite honest I didn’t think Perl could do it.   Meanwhile:     You have “long skills” in certain pursuits, which I have quickly learned to respect and not to question, and I have long skills in others.   This happens to be one of them.

    Now, let’s just stay on-topic here.   These parsers are very robust and they will give you excellent performance although not on par with Bison.   They can furthermore be programmed to winnow-through source code (as I did) through their exception-handling and error-recovery features, which are very robust.   Their tight integration with Perl (being in one case “pure Perl”) is a tremendous boost.   But I would not choose to build a complete language system in it.

      Literally nothing you said above relates in any way to the question I asked.

      I asked for references to publicly available USES of CPAN parser modules. Nothing else!

      I'm not interested in your extremely dubious "expertise". I'm not interested in your opinion.

      So please. Just stop.

Re: Block-structured language parsing using a Perl module?
by Anonymous Monk on Aug 15, 2012 at 03:05 UTC

      The problem with those is they are

      • either: hand-crafted parsers constructed to parse a specific language (HTML).

        These are no use because I'm looking for a parser constructor module.

      • or: examples, of using the parser constructor module to construct a parser to parse some more or less complicated language, written by the author of the module that does the construction.

        It is unsurprising that the author of a given module is motivated enough, and reasonably adept at using his own module, to persist in getting something moderately complicated to work.

        But can anyone else?

      If I could find an example of a parser module being used a) in a real-world project; b) of reasonable complexity; c) by someone other than its author; it would give some level of confidence that the module stands up to a) being learned; b) being debugged; c) being maintained in a timely fashion when bugs discovered through real-world usage are reported.

      Of the 3 modules I've experimented with, they:

      • had awful apis -- large, complicated, verbose -- with lousy documentation, often as not couched in so much academic/theoretical terminology as to be almost unintelligible.

        I want to use a parser; not learn about the theory behind them.

      • gave almost useless error diagnostics when defining the grammar; and even worse diagnostics when given non-compliant source to parse.
      • were so ridiculously slow in operation that they are almost useless for real-world usage.
      • produced parse trees so complicated you need to write another parser to process them.

      I can see I am going to end up writing my own; but given the richness of the modules on CPAN, I hoped that there was one amongst them that might stand up to RW usage.
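
For a sense of scale, a hand-written recursive-descent parser for a toy block-structured language can be sketched as below. This is only an illustration under stated assumptions: the toy grammar (`NAME ;` statements and `NAME { ... }` blocks), the function names, and the tuple-shaped parse tree are all invented for this example, and it is written in Python rather than against any of the CPAN modules discussed in the thread. It aims at the points raised above: a small API, a flat and readable parse tree, and error messages that carry a source offset.

```python
import re

# Toy grammar (hypothetical, for illustration only):
#   program := item*
#   item    := NAME ';'              -> ("stmt", name)
#            | NAME '{' item* '}'    -> ("block", name, [items])
TOKEN = re.compile(r"(?P<name>[A-Za-z_]\w*)|(?P<punct>[{};])")


def tokenize(src):
    """Return a list of (kind, text, offset) tuples ending with an 'eof' token."""
    tokens = []
    pos = 0
    while pos < len(src):
        if src[pos].isspace():
            pos += 1
            continue
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {src[pos]!r} at offset {pos}")
        tokens.append((m.lastgroup, m.group(), pos))
        pos = m.end()
    tokens.append(("eof", "", len(src)))
    return tokens


def parse_items(tokens, i):
    """Parse item*; stop at the first non-NAME token and leave it for the caller."""
    items = []
    while tokens[i][0] == "name":
        item, i = parse_item(tokens, i)
        items.append(item)
    return items, i


def parse_item(tokens, i):
    """Parse one statement or one (possibly nested) block starting at a NAME."""
    name = tokens[i][1]
    _, text, offset = tokens[i + 1]
    if text == ";":
        return ("stmt", name), i + 2
    if text == "{":
        body, i = parse_items(tokens, i + 2)  # recursion handles nesting
        _, text, offset = tokens[i]
        if text != "}":
            raise SyntaxError(
                f"expected '}}' to close block {name!r}, "
                f"got {text or 'end of input'!r} at offset {offset}"
            )
        return ("block", name, body), i + 1
    raise SyntaxError(f"expected ';' or '{{' after {name!r} at offset {offset}")


def parse(src):
    """Parse a whole program and return a list of nested tuples."""
    tokens = tokenize(src)
    items, i = parse_items(tokens, 0)
    kind, text, offset = tokens[i]
    if kind != "eof":
        raise SyntaxError(f"unexpected {text!r} at offset {offset}")
    return items


if __name__ == "__main__":
    print(parse("main { init; loop { step; } done; }"))
```

A real language needs expressions, operator precedence and error recovery on top of this, which is where most of the effort goes; the sketch only shows the basic shape of the block handling.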



        :) I know this probably doesn't qualify either (and you probably saw it), but GraphViz2::Marpa is not by the Marpa author :) though it is also accompanied by a how-to article

        FWIW, Marpa guy does give some praise for his error diagnostics on his blog :)

        I want to use a parser; not learn about the theory behind them.

        Part of the problem of lousy documentation is that you have to know enough theory to know both what type of parser you can use on a grammar and if your grammar is even parsable. Because semi-structured text can vary so much in the structure and meaning, the best any general purpose grammar engine can do is push back on you a little bit to figure out whether your language is a regular language, whether you need lookahead and how much, and how you handle things like recursion, if at all.
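
The "is it a regular language?" question can be made concrete with nested braces. The sketch below is an illustration only: the names `ONE_LEVEL` and `balanced` are made up for this example, and it uses Python's `re`, which has no recursion construct. A pattern without recursion can only hard-code a fixed nesting depth, whereas a counter (a degenerate stack) or a recursive parser copes with any depth. Perl's own regex engine does offer recursive extensions such as `(?R)`, so this limit applies to classically regular patterns, not to everything Perl calls a regex.

```python
import re

# Matches a brace group with at most ONE level of nesting inside it.
# Supporting depth 3, 4, ... means nesting the pattern by hand each time.
ONE_LEVEL = re.compile(r"\{(?:[^{}]|\{[^{}]*\})*\}")


def balanced(text):
    """Check brace balance at arbitrary depth using a simple depth counter."""
    depth = 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:  # a '}' with no matching '{'
                return False
    return depth == 0


if __name__ == "__main__":
    shallow = "{ a { b } }"
    deep = "{ a { b { c } } }"
    print(bool(ONE_LEVEL.fullmatch(shallow)), bool(ONE_LEVEL.fullmatch(deep)))
    print(balanced(shallow), balanced(deep))
```

Block-structured languages nest without a fixed limit, which is why a grammar engine has to ask about recursion and lookahead up front.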

        Also a lot of the theoretical work comes from the world of linguistics, which is messy on its own.

        I agree about lousy APIs though.

        I can't speak about the performance of Regexp::Grammars, but if I were doing something like this, I'd start there for ease of use. I'd use Marpa for speed and completeness.

Re: Block-structured language parsing using a Perl module?
by thargas (Chaplain) on Aug 16, 2012 at 17:43 UTC

      Thanks for the link. I've pulled the pdf and will give it a read over the next few days.

      Though I do so with a great deal of skepticism. P::RD (used to?) commits every one of my cardinal sins:

      1. horrible API;
      2. lousy documentation;
      3. useless diagnostics;
      4. glacial performance;

      Maybe Regexp::Grammars does better, but on a cursory inspection, I do not hold out much hope :(



        I don't know a lot about it. I did have to deal with it once in a program which took commands in an SQL-like syntax, and we had it pre-compile the grammar and save it instead of compiling it on each load, which did make a difference IIRC. It was a while ago.
Re: Block-structured language parsing using a Perl module?
by tobyink (Abbot) on Aug 17, 2012 at 07:54 UTC

      Thank you tobyink. Those are both fine examples of the information I was looking for.

      (It's a shame that the metacpan site sends my browser (Opera) off into la-la land, but that's not your problem :)

      For me, the most telling files are the "compiled" grammars: OwlFn & CSS.

      I realise these are generated files, but damn are they ever resource hungry. It is no wonder that P::RD is so slow. Dog-forbid that either of you authors ever has to go plugging around inside there in order to solve a problem.

      Have you ever had occasion to measure the performance of your parser? (Do OwlFn source files ever get big enough that it is a concern?)



Node Type: perlquestion [id://987389]
Approved by Old_Gray_Bear
Front-paged by Corion