PerlMonks  

Parse::RecDescent - specifying dynamic subrule repetition

by leahymr (Initiate)
on Feb 18, 2013 at 21:28 UTC (#1019403=perlquestion)
leahymr has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I beg your wisdom...

I am trying to parse a string from which I need to consume a dynamically determined number of characters. Can P::RD be persuaded to do that?

Here is a simple example string:
s:5:"abcde"; -- the '5' is the number of characters in the string

Here's the hard part:
s:18:"s:10:"abcdefghij";"; -- the string can store other strings, so I can't just search for ";

I know that subrules can have repetition counts, but I can't get something like
String: 's: ' Count '"' Cdata($item{Count}) '";' to work

Here's the simplified grammar. Is there a way I can consume the number of characters specified by $item{Count}?

    my $grammar = q{
        CompText: Command(s)

        Command: String | Integer

        Array: 'a:' <commit> Count ':{' Command(s) '}'
            { $return = "a:$item{Count}:{" . join( '', @{$item[-2]}) . "}" }

        String: 's:' Count ':"' Array '";'
            { $return = "s:$item{Count}:\"" . join( '', @{$item{Array}}) . "\";" }
          | 's:' Count ':"' Cdata_String '";'
            { $return = "s:$item{Count}:\"" . join( '', @{$item[-2]}) . "\";" }

        Cdata_String: Cdata(s?)

        Count: /\d+/

        Cdata: /./
    };

Thank you,
--Michael

Re: Parse::RecDescent - specifying dynamic subrule repetition
by sundialsvc4 (Monsignor) on Feb 19, 2013 at 00:20 UTC

    At first glance, this does not look like a good application of this tool. The basic structure of the string is s:integer:string, and to my way of thinking that is not the sort of input that is well suited to, or particularly needs, a parser. In what context do strings of this nature appear in your input data? Please tell us more about the surrounding context of your problem.

Re: Parse::RecDescent - specifying dynamic subrule repetition
by 7stud (Deacon) on Feb 19, 2013 at 01:26 UTC
    Here is a simple example string:
    s:5:"abcde" -- the '5' is the number of characters in the string

    Here's the hard part:
    s:18:"s:10:"abcdefghij";"; -- the string can store other strings, so I can't just search for ";

    You mean like you can in the first string? Great examples!

Re: Parse::RecDescent - specifying dynamic subrule repetition
by Yary (Scribe) on Feb 19, 2013 at 04:14 UTC
    Like the other people answering, I'm also wondering "how about reaching for unpack, or substr?" Is this part of a larger grammar?
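
For the plain-Perl route, a minimal sketch might look like the following. It assumes the input is exactly `s:N:"`, then N characters, then a closing `";`. The helper name `read_string` is made up for illustration and does not come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: read one s:N:"..."; record from the front of $input.
# Returns (payload, remainder), or an empty list if the input does not match.
sub read_string {
    my ($input) = @_;

    # The header carries the payload length.
    my ($count) = $input =~ /\A s:(\d+):" /x
        or return;
    my $start = $+[0];    # offset just past the opening quote

    # There must be room for $count characters plus the closing '";'.
    return if length($input) < $start + $count + 2;

    # Take exactly $count characters, whatever they contain.
    my $payload = substr($input, $start, $count);
    return unless substr($input, $start + $count, 2) eq '";';

    return ($payload, substr($input, $start + $count + 2));
}

my ($payload, $rest) = read_string('s:18:"s:10:"abcdefghij";";');
print "$payload\n";    # the inner record, copied without interpretation
```

Because the payload is taken by length, an embedded `";` inside it never ends the string early.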

    Perhaps you want to parse string-within-a-string into a nested data structure, which would justify a parser. If that's the case, the String rule should have a recursive alternate. If you just want the outer string, copying literally any inner string without interpretation, then you need a dynamic rule that uses $item{Count} to eat exactly that many characters.
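
One way such a dynamic rule could be sketched in Parse::RecDescent is to pass the count to a subrule as an argument and let an action remove that many characters from `$text`, the unparsed remainder that is visible inside actions. The rule name `Payload` is an assumption made for illustration, and the grammar is cut down to the single `String` rule rather than the full grammar from the question.

```perl
use strict;
use warnings;
use Parse::RecDescent;

my $grammar = q{
    String: 's:' Count ':"' Payload[ $item{Count} ] '";'
        { $return = $item{Payload} }

    Count: /\d+/

    # Not a pattern match: the action removes exactly $arg[0] characters
    # from the front of the unparsed text and yields them as the rule value.
    # Yielding undef makes the rule fail when too little text is left.
    Payload:
        { length($text) >= $arg[0] ? substr($text, 0, $arg[0], '') : undef }
};

my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar!\n";

my $payload = $parser->String('s:18:"s:10:"abcdefghij";";');
print defined $payload ? "$payload\n" : "no match\n";
```

The inner string is copied literally here. To interpret it as well, the payload could be handed back to the parser in a second pass.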

Re: Parse::RecDescent - specifying dynamic subrule repetition
by 7stud (Deacon) on Feb 19, 2013 at 04:20 UTC
    I know that subrules can have repetition counts, but I can't get something like
    
    String: 's: ' Count '"' Cdata($item{Count}) '";' to work
    

    I'm a mere beginner with P::RD, but previously I have found the lack of backreferences to be very frustrating. Rule names are not double quote-ish things, so variables are not interpolated into rule names. But literal strings or regexes in your rules are double quote-ish things, so variables are interpolated into them. You can use that fact to create a backreference:

    use strict;
    use warnings;
    use 5.012;

    use Parse::RecDescent;

    $::RD_ERRORS = 1;       #Parser dies when it encounters an error
    $::RD_WARN   = 1;       #Enable warnings - warn on unused rules &c.
    $::RD_HINT   = 1;       #Give out hints to help fix problems.
    #$::RD_TRACE = 1;       #Trace parsers' behaviour

    my $text = <<'END_OF_TEXT';
    5:abcdefghi
    END_OF_TEXT

    my $grammar = <<'END_OF_GRAMMAR';

        {
            use 5.012;
            use Data::Dumper;
            my $count_match;    #****DECLARE A VARIABLE****
        }

        line: section(s /:/)

        section:
            count { $count_match = $item{count} }    #**SET THE VARIABLE**
            | word {say Dumper(\@item)}

        count: m{ \d+ }xms

        word: m| .{$count_match} |xms    #***INTERPOLATE THE VARIABLE***

    END_OF_GRAMMAR

    my $parser = Parse::RecDescent->new($grammar)
        or die "Bad grammar!\n";

    defined $parser->line($text)
        or die "Can't match text";

    --output:--
    $VAR1 = [
              'section',
              'abcde'
            ];

    However, your grammar needs to employ recursion in order to match your more complicated strings, i.e. patterns within a pattern. And recursion is very hard. The only way I know how to do recursion in a programming language whose syntax is not specifically designed for recursion, i.e. Perl, is by trial and error (thousands of times) until I get it right.
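
Because the count says where each payload ends, the recursion for this particular format may be smaller than it first appears. The following is a plain-Perl sketch, not a P::RD grammar. It assumes a payload counts as nested only when it is itself one complete `s:N:"...";` record, and the function name `unwrap` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical decoder: peel off s:N:"..."; wrappers recursively.
# Returns the innermost payload and the number of wrappers removed.
sub unwrap {
    my ($input, $depth) = @_;
    $depth //= 0;

    # Header, then exactly N characters (N taken from the header), then '";'.
    if ( $input =~ /\A s:(\d+):" /x ) {
        my ($count, $start) = ($1, $+[0]);
        if (   length($input) == $start + $count + 2
            && substr($input, $start + $count) eq '";' )
        {
            # The payload may itself be a record, so recurse on it.
            return unwrap( substr($input, $start, $count), $depth + 1 );
        }
    }

    return ($input, $depth);    # base case: not a well-formed record
}

my ($inner, $depth) = unwrap('s:18:"s:10:"abcdefghij";";');
print "$inner (depth $depth)\n";
```

The base case is any input that is not a complete record, which is what stops the recursion at the raw character data.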

    If you are just beginning with P::RD, here are some tips I wrote up: http://www.perlmonks.org/index.pl?node_id=1015944
