
Re: PRD parser problem: How to deal with mutiple lines

by philcrow (Priest)
on Jun 13, 2008 at 13:31 UTC

in reply to PRD parser problem: How to deal with mutiple lines

This is about choices. From the examples, there are three possible section contents: a single-line description, a description with a brace block, and blank. That leads to three productions (including the blank one, which is not an <error>).

You could start that like:

    section_content: description |

Then define a description:

    description: 'DESCRIPTION' '=' statement
               | 'DESCRIPTION' '=' '{' statement(s) '}'

Finally a statement:

    statement: ...

The key is to think: what are the choices for a valid description or statement or whatever? Each choice is an alternative. Each alternative is made of pieces which themselves might have choices.

p.s. Normal conventions of grammars have us use upper case on the left side of a rule only if we are defining a token. Other left sides, which are built from other things, are usually lower case.


The Gantry Web Framework Book is now available.

Re^2: PRD parser problem: How to deal with mutiple lines
by Hanken (Acolyte) on Jun 16, 2008 at 08:47 UTC
    Hi, Philcrow, thanks for your reply! I have now changed my grammar as follows:
    List: 'SECTION_START' SECTION_NAME SECTION_CONTENT 'SECTION_END'
        | <error>
    SECTION_NAME: /\w+/ {print "section name is $item[1]\n";}
        | <error>
    SECTION_CONTENT: Description
        |
    Description: 'DESCRIPTION' '=' Statement
        | 'DESCRIPTION' '=' '{' Statement(s?) '}' ';'
    Statement: /.+\n/
    But I still cannot parse the multi-line contents inside the description braces.
    section name is BANK001
    section name is BANK002
    section name is BANK003
    Invalid List: Was expecting 'SECTION END' but found "UK BANK; " instead
    How can I solve this?
      I think your statement rule needs a trailing semi-colon. If you have further problems you should trace the execution. Do this by adding these statements to the program before constructing the parser:
      $::RD_TRACE = 1;
      $::RD_HINT = 1;
      The first one is the actual tracer. Be warned that it will generate a lot of output. Reading it will help you understand what the parser is doing and will probably lead you to the error.
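
Pulling the thread's suggestions together, here is a minimal sketch of how the alternatives and the trailing semicolon might fit into one grammar. It is not the original poster's code. The sample input, the lower-case rule names, the `};` that closes a brace block, and the assumption that a statement contains no `;`, `{` or `}` are all guesses made for illustration, since the real data file is not shown in the thread. It needs Parse::RecDescent from CPAN.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Uncomment to trace the parse (very verbose), as suggested above.
# $::RD_TRACE = 1;
# $::RD_HINT  = 1;

my $grammar = <<'GRAMMAR';
list: section(s) /\z/
    { $return = $item[1] }

section: 'SECTION_START' section_name section_content 'SECTION_END'
    { $return = { name => $item[2], content => $item[3] } }

section_name: /\w+/

# Two choices: a description, or nothing at all (a blank section).
section_content: description
               | { $return = [] }

# Two choices: a brace block of statements, or a single statement.
description: 'DESCRIPTION' '=' '{' statement(s?) '}' ';'
    { $return = $item[4] }
           | 'DESCRIPTION' '=' statement
    { $return = [ $item[3] ] }

# Assumption: a statement is any text up to a terminating semicolon.
statement: /[^;{}]+/ ';'
    { $return = $item[1] }
GRAMMAR

# Hypothetical input covering the three cases discussed in the thread.
my $text = <<'INPUT';
SECTION_START BANK001
DESCRIPTION = UK BANK;
SECTION_END
SECTION_START BANK002
DESCRIPTION = {
    UK BANK;
    US BANK;
};
SECTION_END
SECTION_START BANK003
SECTION_END
INPUT

my $parser   = Parse::RecDescent->new($grammar) or die "Bad grammar\n";
my $sections = $parser->list($text)             or die "Parse failed\n";

for my $section (@$sections) {
    my $content = join(' | ', @{ $section->{content} }) || '(blank)';
    print "$section->{name}: $content\n";
}
```

Because the statement rule stops at the semicolon instead of at the end of the line, the closing brace is never swallowed by a statement, and the default whitespace skipping takes care of the line breaks inside the block.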

