PerlMonks  

Why won't this basic Parse::RecDescent example work?

by 7stud (Deacon)
on Jan 28, 2013 at 03:35 UTC (#1015622=perlquestion)
7stud has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

Why the <deleted> does this code show an empty hash:

    use strict;
    use warnings;
    use 5.012;

    use Parse::RecDescent;

    $::RD_ERRORS = 1;   #Parser dies when it encounters an error
    $::RD_WARN   = 1;   #Enable warnings - warn on unused rules &c.
    $::RD_HINT   = 1;   #Give out hints to help fix problems.

    my %HASH;

    my $grammar = <<'END_OF_GRAMMAR';

        startrule : dir
            { $main::HASH{dir} = $item{dir} }

        dir : 'hello'

    END_OF_GRAMMAR

    my $parser = Parse::RecDescent->new($grammar);
    $parser->startrule("hello");

    use Data::Dumper;
    say Dumper(\%HASH);

    --output:--
    $VAR1 = {};

Re: Why won't this basic Parse::RecDescent example work?
by 7stud (Deacon) on Jan 28, 2013 at 03:43 UTC

    Never mind. Solution:

    our %HASH;

    However, stay tuned--I'm just getting started.

Re: Why won't this basic Parse::RecDescent example work?
by 7stud (Deacon) on Jan 28, 2013 at 04:11 UTC
    Next up:
        use strict;
        use warnings;
        use 5.012;

        use Parse::RecDescent;

        $::RD_ERRORS = 1;   #Parser dies when it encounters an error
        $::RD_WARN   = 1;   #Enable warnings - warn on unused rules &c.
        $::RD_HINT   = 1;   #Give out hints to help fix problems.

        our %HASH;

        my $grammar = <<'END_OF_GRAMMAR';

            startrule : from_clause

            from_clause : 'from' dir(s)
                {
                    print "-->$item{dir}<--\n";
                    $main::HASH{dirs} = $item{dir}
                }

            dir : m{/}

        END_OF_GRAMMAR

        my $parser = Parse::RecDescent->new($grammar);
        $parser->startrule("from ./");

        use Data::Dumper;
        say Dumper(\%HASH);

        --output:--
        $VAR1 = {};

    Expected output:

        -->./<--
        $VAR1 = {
                  "dirs" => "./"
                };

      I’m not familiar with Parse::RecDescent, but by reference to the docs plus a bit of trial-and-error I got this to work by adjusting the regex and assigning $1 to a local variable (see the section “Start-up Actions” in Parse::RecDescent):

          #! perl

          use strict;
          use warnings;
          use 5.012;

          use Data::Dumper;
          use Parse::RecDescent;

          $::RD_ERRORS = 1;   # Parser dies when it encounters an error
          $::RD_WARN   = 1;   # Enable warnings - warn on unused rules &c.
          $::RD_HINT   = 1;   # Give out hints to help fix problems.
          #$::RD_TRACE = 1;   # if defined, also trace parsers' behaviour

          our %HASH;

          my $grammar = <<'END_OF_GRAMMAR';

              {
                  my $directory;
              }

              startrule:   from_clause

              from_clause: 'from' dir(s)
                           {
                               print "-->$directory<--\n";
                               $main::HASH{dir} = $directory;
                           }

              dir:         m{ ^ ( .*? / .* ) $ }x
                           {
                               $directory = $1;
                           }

          END_OF_GRAMMAR

          my $parser = Parse::RecDescent->new($grammar) or die "Bad grammar!\n";

          defined $parser->startrule("from ./foo") or print "Bad text!\n";

          say Dumper(\%HASH);

      Output:

          18:38 >perl 505_SoPW.pl
          -->./foo<--
          $VAR1 = {
                    'dir' => './foo'
                  };

          18:51 >

      Note that setting $::RD_TRACE = 1; is useful for understanding what the parser is doing.

      Hope that helps,

      Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

        I’m not familiar with Parse::RecDescent,

        Then thanks for being brave enough to take a look!

        by reference to the docs plus a bit of trial-and-error I got this to work by adjusting the regex and assigning $1 to a local variable.

        Nice going! After reading the "Start up Action" section in the docs, I reread the "Action" section, and I noticed this statement:

        The results of named subrules are stored in the hash under each subrule's name (including the repetition specifier, if any)

        So the key I was using in the %item hash, 'dir', was wrong. The key should be 'dir(s)'. So now I can get this output:

            use strict;
            use warnings;
            use 5.012;

            use Parse::RecDescent;

            $::RD_ERRORS = 1;   #Parser dies when it encounters an error
            $::RD_WARN   = 1;   #Enable warnings - warn on unused rules &c.
            $::RD_HINT   = 1;   #Give out hints to help fix problems.

            our %HASH;

            my $grammar = <<'END_OF_GRAMMAR';

                startrule : from_clause

                from_clause : 'from' dir(s)
                    {
                        print "-->@{$item{'dir(s)'}}<-- \n";
                        $main::HASH{dirs} = @{$item{'dir(s)'}};
                    }

                dir : 'hello'

            END_OF_GRAMMAR

            my $parser = Parse::RecDescent->new($grammar);
            $parser->startrule("from hello world");

            use Data::Dumper;
            say Dumper(\%HASH);

            --output:--
            -->hello<--
            $VAR1 = {
                      'dirs' => 1
                    };

        Partial success! Note the dereference of $item{'dir(s)'}. Now, what is that '1'? The return value from print()? But print() isn't the last statement of the action. If I change the dir rule to:

        dir : 'hello' | 'world'

        I get this output:

            -->hello world<--
            $VAR1 = {
                      'dirs' => 2
                    };

        Is 2 the count of the words matched? What is going on? I am using the exact same array in each of these lines:

            print "-->@{$item{'dir(s)'}}<-- \n";
            $main::HASH{dirs} = @{$item{'dir(s)'}};

        ...yet I am getting different results: 'hello world' v. 2! How is that possible? Argghh, of course! Perl doesn't promise the same result for the same expression everywhere, because perl determines the result by the context in which the expression appears. In my case, interpolating the array into the double-quoted string joins its elements with spaces, while "$main::... =" provides scalar context for the array, and an array in scalar context yields its length.
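The context effect can be reproduced without the parser. This is a minimal sketch in plain Perl; `$dirs` is a hypothetical stand-in for the array reference that `$item{'dir(s)'}` holds, not something Parse::RecDescent provides under that name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the array reference stored in $item{'dir(s)'}.
my $dirs = [ 'hello', 'world' ];

my $interpolated = "@{$dirs}";   # interpolation joins the elements with a space
my $count        = @{$dirs};     # scalar assignment yields the element count
my ($first)      = @{$dirs};     # list assignment yields the first element
my $ref_copy     = $dirs;        # copying the reference keeps the whole array

print "interpolated: $interpolated\n";
print "count:        $count\n";
print "first:        $first\n";
print "via ref:      @{$ref_copy}\n";
```

Storing the reference itself, rather than the dereferenced array, is what keeps all the matches available later.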

        So now I can get the expected output:

            use strict;
            use warnings;
            use 5.012;

            use Parse::RecDescent;

            $::RD_ERRORS = 1;   #Parser dies when it encounters an error
            $::RD_WARN   = 1;   #Enable warnings - warn on unused rules &c.
            $::RD_HINT   = 1;   #Give out hints to help fix problems.

            our %HASH;

            my $grammar = <<'END_OF_GRAMMAR';

                startrule : from_clause

                from_clause : 'from' dir(s)
                    {
                        print "-->@{$item{'dir(s)'}}<-- \n";
                        $main::HASH{dirs} = $item{'dir(s)'};
                    }

                dir : 'hello' | 'world'

            END_OF_GRAMMAR

            my $parser = Parse::RecDescent->new($grammar);
            $parser->startrule("from hello world hello");

            use Data::Dumper;
            say Dumper(\%HASH);

            --output:--
            -->hello world<--
            $VAR1 = {
                      'dirs' => [
                                  'hello',
                                  'world'
                                ]
                    };

        Next up, the regex problem. This doesn't work:

            my $grammar = <<'END_OF_GRAMMAR';

                #Start up action(executed in parser namespace):
                {
                    use 5.012;  #So I can use say()
                }

                startrule : from_clause

                from_clause : 'from' dir(s)
                    {
                        say "-->$item[0]<--";
                        say "-->@{$item[-1]}<--";
                    }

                dir : m{/}

            END_OF_GRAMMAR

            my $parser = Parse::RecDescent->new($grammar);
            $parser->startrule("from ./hello");

            --output:--
            (blank)

        Note that I tried using the @item array this time. The first item in @item is the rule name, "from_clause", and the next items should be the matches for the subrules, so $item[2], or equivalently $item[-1], should be the matches for dir(s). But because I am not even seeing the arrows in my print statement, that means the parser isn't finding a match for my rule.

        I also notice there are weird rules the parser follows for comments. This does not cause an error:

            my $grammar = <<'END_OF_GRAMMAR';

                #Start up action(executed in parser namespace):
                {
                    use 5.012;  #So I can use say()
                }
                ...

        ...but this, with the action compressed into one line, does cause an error:

            my $grammar = <<'END_OF_GRAMMAR';

                #Start up action(executed in parser namespace):
                { use 5.012;  #So I can use say() }
                ...

            --output:--
            Unknown starting rule (Parse::RecDescent::namespace000001::startrule) called at 3.pl line 76.

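        Back to the regex problem. It seems that Parse::RecDescent only tries a regex at the current position in the text (after skipping any leading whitespace), as if a ^ had been added to the beginning of the pattern; it does not search ahead for a match. So m{/} fails against "./hello" because the remaining text starts with '.', not '/'. In other words, the regex you specify has to match from the very start of the text you are interested in examining.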

            my $grammar = <<'END_OF_GRAMMAR';

                #Start up action(executed in parser namespace):
                {
                    use 5.012;  #So I can use say()
                }

                startrule : from_clause

                from_clause : 'from' dir(s)
                    {
                        say "-->$_<--" for @{ $item{'dir(s)'} };
                    }

                dir : m{\S* / \S*}xms

            END_OF_GRAMMAR

            my $parser = Parse::RecDescent->new($grammar);
            $parser->startrule("from ./hello hello/world");

            --output:--
            -->./hello<--
            -->hello/world<--
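The difference between the failing and the working dir pattern can be approximated in plain Perl. This is only a sketch under an assumption: that the parser skips leading whitespace and then tries the pattern at the start of the remaining text, roughly like `\A(?:pattern)`. The helper `match_at_start` is hypothetical and is not part of Parse::RecDescent.

```perl
use strict;
use warnings;

# Assumption: the parser skips leading whitespace and then tries the
# pattern only at the start of the remaining text, roughly \A(?:pattern).
sub match_at_start {
    my ($pattern, $text) = @_;
    $text =~ s/\A\s*//;                            # skip leading whitespace
    return $text =~ m/\A($pattern)/ ? $1 : undef;  # anchored attempt only
}

my $text = "./hello hello/world";

my $bare = match_at_start(qr{/}, $text);             # '.' comes first, so no match
my $full = match_at_start(qr{\S* / \S*}xms, $text);  # matches the first path

print "m{/}:          ", (defined $bare ? $bare : "no match"), "\n";
print "m{\\S* / \\S*}x: ", (defined $full ? $full : "no match"), "\n";
```

Under that assumption the bare m{/} reports no match, while the longer pattern returns ./hello and leaves the rest of the text for the next dir.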

        And reworking my original example:

            use strict;
            use warnings;
            use 5.012;

            use Parse::RecDescent;

            $::RD_ERRORS = 1;   #Parser dies when it encounters an error
            $::RD_WARN   = 1;   #Enable warnings - warn on unused rules &c.
            $::RD_HINT   = 1;   #Give out hints to help fix problems.

            our %HASH;

            my $grammar = <<'END_OF_GRAMMAR';

                #Start up action(executed in parser namespace):
                {
                    use 5.012;  #So I can use say()
                }

                startrule : from_clause

                from_clause : 'from' dir(s)
                    {
                        say "-->$_<--" for @{ $item{'dir(s)'} };
                        $main::HASH{target_dirs} = $item{'dir(s)'};
                    }

                dir : m{\S* / \S*}xms

            END_OF_GRAMMAR

            my $parser = Parse::RecDescent->new($grammar);
            $parser->startrule("from ./hello hello/world");

            use Data::Dumper;
            say Dumper(\%HASH);

            --output:--
            -->./hello<--
            -->hello/world<--
            $VAR1 = {
                      'target_dirs' => [
                                         './hello',
                                         'hello/world'
                                       ]
                    };

        Success! Thanks.

Re: Why won't this basic Parse::RecDescent example work?
by 7stud (Deacon) on Jan 28, 2013 at 04:23 UTC
    Another variation that doesn't produce the expected output:
        our %HASH;

        my $grammar = <<'END_OF_GRAMMAR';

            startrule : from_clause

            from_clause : 'from' dir(s)
                {
                    print "-->$item{dir}<--\n";
                }

            dir : "hello" | "world"

        END_OF_GRAMMAR

        my $parser = Parse::RecDescent->new($grammar);
        $parser->startrule("from hello world");

        use Data::Dumper;
        say Dumper(\%HASH);

        --output:--
        --><--
        $VAR1 = {};
    Expected output:
        -->SOMETHING<--
        $VAR1 = {
                  "dirs" => "SOMETHING"
                };
Re: Why won't this basic Parse::RecDescent example work?
by 7stud (Deacon) on Jan 30, 2013 at 00:33 UTC

    Some tips (from a beginner) for using Parse::RecDescent:

    But first a simple example. Suppose you want to parse lines of text that look like this:

        employee Joe 10
    

    Here is another example of such a line:

        employee Cathy 14
    

    A line of text consists of the literal 'employee' followed by a name and an id. To parse the text, first you define a rule such as employee_info:

        my $grammar = <<'END_OF_GRAMMAR';

            employee_info: 'employee' name id

            name: m{ \S+ }xms
            id:   m{ \d+ }xms

        END_OF_GRAMMAR

    Then you parse the text:

        my $text = "employee Joe 10";

        my $parser = Parse::RecDescent->new($grammar)
            or die "Bad grammar!\n";

        defined $parser->employee_info($text)
            or die "Text doesn't match";

    But if you run that in a fully fleshed out program (which you'll see soon enough), it will produce a big fat nothing for output. Yet because neither of the die() messages was displayed, you know that your grammar had no errors and that the text matched.

    To actually produce some output, you need to add an Action. An Action is executed when the parser finds a match for the rule. Here is what an Action looks like:

        my $grammar = <<'END_OF_GRAMMAR';

            employee_info: 'employee' name id
                           { print "$_\n" for @item; }   #Action

            name: m{ \S+ }xms
            id:   m{ \d+ }xms

        END_OF_GRAMMAR

    The array @item is provided by Parse::RecDescent, and it contains the text that matches the rule. Here is a complete sample program using the employee_info rule and its associated action followed by the output:

        use strict;
        use warnings;
        use 5.012;

        use Parse::RecDescent;

        $::RD_ERRORS = 1;   #Parser dies when it encounters an error
        $::RD_WARN   = 1;   #Enable warnings - warn on unused rules &c.
        $::RD_HINT   = 1;   #Give out hints to help fix problems.
        #$::RD_TRACE = 1;   #Trace parsers' behaviour

        my $grammar = <<'END_OF_GRAMMAR';

            employee_info: 'employee' name id
                           { print "$_\n" for @item; }   #Action

            name: m{ \S+ }xms
            id:   m{ \d+ }xms

        END_OF_GRAMMAR

        my $text = "employee Joe 10";

        my $parser = Parse::RecDescent->new($grammar)
            or die "Bad grammar!\n";

        defined $parser->employee_info($text)
            or die "Text doesn't match";

        --output:--
        employee_info
        employee
        Joe
        10

    Note that @item contains the rule name at index position 0.

    The way the parser works is that, before trying each term of the rule, it skips any leading whitespace in the remaining text and then tries to match the term at that position. So here 'employee', name and id are matched in turn against 'employee', 'Joe' and '10'.

    1) The matches are in @item and %item.

    When a rule matches, the @item array contains the rule name at index position 0, with successive index positions containing the text that matched each term in the rule. Similarly, %item is a hash where the keys are the term names and the values are the matched text. However, term names can get complex(see tip #7), so often it is easier to use @item to get the matches, e.g. $item[-1].

    However things aren't always so straightforward. Suppose you want to parse text like this:

        { hello }
    

    So you come up with this grammar:

        myrule: brace_clause

        brace_clause: '{' word '}'

        word: m{ [a-z]+ }xms

    To see the matches for the brace_clause rule, you might add an action like this:

        myrule: brace_clause

        brace_clause: '{' word '}'
                      { print "$_\n" for @item; }

        word: m{ [a-z]+ }xms

    That would produce this output:

        brace_clause    #the rule name
        {               #the match for '{'
        hello           #the match for word
        }               #the match for '}'

    Okay, no surprises there. But what if you move the action so that it is under myrule, like this:

        myrule: brace_clause
                { print "$_\n" for @item; }

        brace_clause: '{' word '}'

        word: m{ [a-z]+ }xms

    What do you expect the output to be now? Maybe this:

        myrule
        {
        hello
        }

    The actual output is:

        myrule
        }

    What the? It turns out that when a rule is used as a subrule, the subrule only produces what matched its last term, which in this case is a literal '}'. Yes, that effect will make you tear your hair out at some point.

    In order to send along the entirety of the matched text to another rule, you'll need to retrieve all the matches from @item and join() them into a string:

        myrule: brace_clause
                { print "$_\n" for @item; }

        brace_clause: '{' word '}'
                      { join ' ', @item[1..3] }

        word: m{ [a-z]+ }xms

        --output:--
        myrule
        { hello }

    Just remember that one of the elements in @item will often be a reference to an array of matches, so joining all the matches together may take several lines of code. As always, use Data::Dumper to display the structure of @item so that you can figure out how to retrieve all the matches.

    2) Add a Start-up action to your grammar to use Data::Dumper.

    The parser executes in a different namespace than your program, so the parser can't see any use statements at the top of your program. As a result, if you need to use a module inside several actions you can put a use statement in a Start-up action:

        my $grammar = <<'END_OF_GRAMMAR';

            #Start up action(executed in parser namespace):
            {
                use 5.012;      #enable say()
                use Data::Dumper;
            }

            …
            …

        END_OF_GRAMMAR

    With that Start-up action, you can call the functions defined in Data::Dumper to display @item in any action. Using Data::Dumper will allow you to see exactly what form the matches are in (a string? an array of strings? an array of arrays of strings?). I suggest Dumping @item as the first line in an action and not writing any additional code in the action until you examine the output:

        my $grammar = <<'END_OF_GRAMMAR';

            #Start up action(executed in parser namespace):
            {
                use 5.012;      #enable say()
                use Data::Dumper;
            }

            employee_info: 'employee' name id
                           { say Dumper(\@item); }   #Action

            name: m{ \S+ }xms
            id:   m{ \d+ }xms

        END_OF_GRAMMAR

        my $text = "employee Joe 10";

        my $parser = Parse::RecDescent->new($grammar)
            or die "Bad grammar!\n";

        defined $parser->employee_info($text)
            or die "Text doesn't match";

        --output:--
        $VAR1 = [
                  'employee_info',
                  'employee',
                  'Joe',
                  '10'
                ];

    Once you see the exact layout of the matches in @item, it is much easier to figure out the correct syntax for retrieving the information you want.

    3) Actions change the matched text.

    The return value of an Action is the value of the last expression executed inside the action. Furthermore, the return value of the action becomes the substitute for the text that actually matched the rule. That effect rears its ugly head when one rule incorporates another rule:

        use strict;
        use warnings;
        use 5.012;

        use Parse::RecDescent;

        $::RD_ERRORS = 1;   #Parser dies when it encounters an error
        $::RD_WARN   = 1;   #Enable warnings - warn on unused rules &c.
        $::RD_HINT   = 1;   #Give out hints to help fix problems.
        #$::RD_TRACE = 1;   #Trace parsers' behaviour

        my $grammar = <<'END_OF_GRAMMAR';

            {
                use Data::Dumper;
                use 5.012;      #enable say()
            }

            another_rule: 'new' employee_info
                          { say Dumper(\@item); }

            employee_info: 'employee' name id
                           {
                               say Dumper(\@item);
                               say '-' x 20;
                               'hello world';
                           }

            name: m{ \S+ }xms
            id:   m{ \d+ }xms

        END_OF_GRAMMAR

        my $text = "new employee Joe 10";

        my $parser = Parse::RecDescent->new($grammar)
            or die "Bad grammar!\n";

        defined $parser->another_rule($text)
            or die "Text doesn't match";

        --output:--
        $VAR1 = [
                  'employee_info',
                  'employee',
                  'Joe',
                  '10'
                ];

        --------------------
        $VAR1 = [
                  'another_rule',
                  'new',
                  'hello world'
                ];

    See how the matched text inside the employee_info rule's action was 'employee', 'Joe', '10', but in another_rule, which contains the term employee_info, the matched text for employee_info has changed to 'hello world'?

    A common problem is seeing 1 displayed as the match for part of a rule. You need to remember that print() and say() return 1, so if either of those statements is the last thing executed in an action, the action will return 1 as the matched text to another rule. When you are tearing your hair out, come back to this tip and re-read it, then write down what the actions in your grammar return and stare at the values for a while; then see if those values appear anywhere in your output.
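The same thing can be seen in plain Perl, since a block's value is the value of its last expression. In this sketch the two anonymous subs are hypothetical stand-ins for actions; they are not how Parse::RecDescent actually runs an action, only an illustration of the return-value rule.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for two actions. As with any Perl block, the
# value handed back is the value of the last expression evaluated.
my $ends_with_print = sub { print "matched: @_\n" };        # last expression is print()
my $ends_with_value = sub { print "matched: @_\n"; "@_" };  # last expression is a string

my $r1 = $ends_with_print->('Joe', '10');
my $r2 = $ends_with_value->('Joe', '10');

print "first action returned:  $r1\n";
print "second action returned: $r2\n";
```

The first sub hands back print()'s success value of 1, the second hands back the joined text, which is why ending an action with an explicit value matters.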

    It's also possible to insert an action in the middle of a rule--rather than at the end. For instance, you can do this:

        employee_info: 'employee' name
                       { say $item[1]; }
                       id

        name: m{ \S+ }xms
        id:   m{ \d+ }xms

    But there is a side effect of doing that: the return value of the action will be inserted into @item just after whatever matched the subrule. That can cause problems if you try to do something like this:

        employee_info: 'employee' name
                       { say $item[1]; }
                       id
                       { say $item[3] }   #print match for id

    The match for id is not at position 3 in @item--the match is at position 4 because the first action inserted something in @item. Once again, if you get strange errors when trying to retrieve or print out matches, you should use Data::Dumper to display @item to see exactly where a match is located in @item.

    4) Some comments cause errors.

    The comment at the end of the line in this action is benign:

        my $grammar = <<'END_OF_GRAMMAR';

            #Start up action(executed in parser namespace):
            {
                use 5.012;  #So I can use say()
            }
            ...

    ...but compressing that action into one line:

        my $grammar = <<'END_OF_GRAMMAR';

            #Start up action(executed in parser namespace):
            { use 5.012;  #So I can use say() }
            ...

    ...causes this error:

        Unknown starting rule (Parse::RecDescent::namespace000001::startrule) called at 3.pl line 76.

    Presumably the comment runs to the end of the line and swallows the closing brace, so the action is never terminated. As a result, I recommend that you not use trailing comments.

    5) Why do I keep getting the errors:

    1. "Unknown starting rule (Parse::RecDescent::namespace000001::non_rule) called"
    2. "Text doesn't match"

    a) Don't forget to change the line:
    defined $parser->another_rule($text) or die "Text doesn't match";

    ...to reflect the new rule name when you start adding or changing rule names. Or, it could be a comment causing the error (see tip #4).

    b) Similarly, make sure you update your $text string when testing a new rule.

    6) Parsing delimited lists.

    If you have text like this:

        hello,world,goodbye,mars
    

    You can parse it with this rule:

        word_list: word(s /,/)

        word: m{ [^,]+ }xms

    Here's how that works. A subrule such as word(s) will match one or more words, so if you have this grammar:

     
        word_list:  word(s)
                    { 
                      say Dumper(\@item);
                    }
    
        word: m{ \S+ }xms
    

    ...it will match text like this:

      hello
      hello world
      hello world goodbye mars

    In addition, the syntax word(s) allows you to specify a regex as the separator between the words, e.g. word(s /,/). So if you have grammar like this:

        word_list: word(s /,/)
                   {
                     say Dumper(\@item);
                   }

        word: m{ [^,]+ }xms

    …then word(s /,/) will match text like this:

    
       hello
       hello,world
       hello,world,goodbye,mars

    Here is some sample Data::Dumper output:

        $VAR1 = [
                  'word_list',
                  [
                    'hello',
                    'world',
                    'goodbye',
                    'mars'
                  ]
                ];

    You have to be careful about how you define the word rule because word(s) can also match a single word, and the parser will greedily eat up as much text as it can trying to match a single word. As a result, note how the regex for word changed when parsing the comma separated list.
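The greediness is easy to check with the two word patterns on their own. This is a plain-Perl sketch, not a run of the parser; the `\A` anchor and the `/g` loop are my own scaffolding to mimic matching one word at a time.

```perl
use strict;
use warnings;

my $text = "hello,world,goodbye,mars";

# \S+ stops only at whitespace, so it swallows the commas as well.
my ($greedy) = $text =~ m{ \A (\S+) }xms;

# [^,]+ stops at the first comma, leaving the separator to be matched next.
my ($word) = $text =~ m{ \A ([^,]+) }xms;

# Collecting every comma-free run gives the same list the grammar produced.
my @words = $text =~ m{ ([^,]+) }xmsg;

print "\\S+ matched:   $greedy\n";
print "[^,]+ matched: $word\n";
print "all words:     @words\n";
```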

    A delimited list can also look like this:

        apple or strawberry or cherry
    

    Instead of being delimited by a comma, the words are delimited by 'or'. That delimited list is even easier to parse:

        my $grammar = <<'END_OF_GRAMMAR';

            {
                use Data::Dumper;
                use 5.012;      #enable say()
            }

            delimited_or_list: word(s /or/)
                               {
                                 say Dumper(\@item);
                               }

            word: m{ \S+ }xms

        END_OF_GRAMMAR

        my $text = "apple or strawberry or cherry";

        my $parser = Parse::RecDescent->new($grammar)
            or die "Bad grammar!\n";

        defined $parser->delimited_or_list($text)
            or die "Text doesn't match";

        --output:--
        $VAR1 = [
                  'delimited_or_list',
                  [
                    'apple',
                    'strawberry',
                    'cherry'
                  ]
                ];

    Unlike in the previous example, this time you don't have to worry about the delimiter when constructing the regex for the word rule. I'm not quite clear on the details, but I think it is because the parser skips whitespace before each term and \S+ stops at the first whitespace character. Here every word and every 'or' is surrounded by spaces, so \S+ matches just 'apple' and the separator /or/ then matches the 'or' that follows. In the previous example there is no whitespace at all, so \S+ would swallow the whole string "hello,world,goodbye,mars", commas included.

    7) The keys in %item might not be what you think.

    If you have the following rules:

        some_rule_name: 'hello' word(s /,/)

        word: m{ [^,]+ }xms

    …then the key in the %item hash for the text that matched word(s /,/) is the unwieldy key 'word(s /,/)'--not the key 'word'. So you could grab the match by writing:

    $item{'word(s /,/)'}

    ...but that is difficult to type and it looks like hell, so I find it easier and clearer to use the @item array instead and write:

    $item[-1]

    8) Parsing quoted strings.

    If you have text like this:
        commands-> 'go to' 'stop' 
        commands-> 'next' 'back up' 
    

    …and you want to get the text inside the quotes, you can use this grammar:

        cmd_choices: 'commands->' quoted_string(s)    
                     { say Dumper(\@item); }
                      
        quoted_string: <perl_quotelike>
    

    The <perl_quotelike> thing is a predefined directive which handles parsing interior quotes inside the text (it actually matches any Perl "quote-like operator"; see the Parse::RecDescent docs). Remember that actions return the value of the last expression executed in the action, and the action's return value is considered to be the text that matched the rule. In the same way, whatever <perl_quotelike> returns is considered the matching text for each quoted_string in the cmd_choices rule. Here is some sample Data::Dumper output:

        $VAR1 = [
                  'cmd_choices',
                  'commands->',
                  [
                    [
                      '',
                      '\'',
                      'go to',
                      '\'',
                      '',
                      '',
                      '',
                      ''
                    ],
                    [
                      '',
                      '\'',
                      'stop',
                      '\'',
                      '',
                      '',
                      '',
                      ''
                    ]
                  ]
                ];

    Whoa. What is that mess? The <perl_quotelike> action returns an array of arrays, where each sub array is an 8 element array containing information about one of the quoted strings that matched, which is mostly blank for a string quoted with " or '. At index position 2 in the sub arrays is the text that was inside the quotes, and at index positions 1 and 3 are the actual quote marks that were found. If you want the text that was inside the quote marks, you just have to figure out the right syntax in order to grab the values at index position 2 in each array (explanation after the code):

        cmd_choices: 'commands->' quoted_string(s)
                     {
                       my @results = map { $_->[2] } @{$item[-1]};
                       say for @results;
                     }

        quoted_string: <perl_quotelike>

        --output:--
        go to
        stop

    The last item in @item, $item[-1], is whatever matched the last subrule. The last subrule in cmd_choices is quoted_string(s), and from the Data::Dumper output you can see that $item[-1] is a reference to an array of arrays. If you dereference $item[-1], @{$item[-1]}, you get an array where the items, $_, are references to arrays. Each array reference is a reference to an array where index position 2, $_->[2], contains the text inside the quotes.
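The map expression can be tried on its own against data shaped like the Data::Dumper output above. In this sketch `$last_item` is a hypothetical stand-in for `$item[-1]`, filled in by hand rather than by the parser.

```perl
use strict;
use warnings;

# Hand-built stand-in for $item[-1]: a reference to an array of
# 8-element arrays, shaped like the Data::Dumper output shown earlier.
my $last_item = [
    [ '', q{'}, 'go to', q{'}, '', '', '', '' ],
    [ '', q{'}, 'stop',  q{'}, '', '', '', '' ],
];

# Dereference the outer array, then pick index 2 out of each inner array.
my @results = map { $_->[2] } @{$last_item};

print "$_\n" for @results;
```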

    9) Saving the matched data as the parser moves along the text.

    I read a tutorial that saves matches (or anything else) like this:

        use strict;
        use warnings;
        use 5.012;

        use Parse::RecDescent;

        $::RD_ERRORS = 1;   #Parser dies when it encounters an error
        $::RD_WARN   = 1;   #Enable warnings - warn on unused rules &c.
        $::RD_HINT   = 1;   #Give out hints to help fix problems.
        #$::RD_TRACE = 1;   #Trace parsers' behaviour

        our %RESULTS;   #Need to declare the variable, but my variables
                        #won't be seen in another namespace. So declare
                        #a global variable with our().

        my $grammar = <<'END_OF_GRAMMAR';

            #Start up action(executed in parser namespace):
            {
                use 5.012;      #enable say()
                use Data::Dumper;
            }

            some_rule: name id
                       {
                         $main::RESULTS{names} = $item{name};
                         $main::RESULTS{phone_numbers} = $item{id};
                       }

            name: m{ \S+ }xms
            id:   m{ \d+ }xms

        END_OF_GRAMMAR

        my $text = "Joe 10";

        my $parser = Parse::RecDescent->new($grammar)
            or die "Bad grammar!\n";

        defined $parser->some_rule($text)
            or die "Text doesn't match";

        use Data::Dumper;
        say Dumper(\%RESULTS);

        #Do something with %RESULTS

        --output:--
        $VAR1 = {
                  'names' => 'Joe',
                  'phone_numbers' => '10'
                };

    10) Backreferences.

    Suppose you have some text like this:

    {{ hello }}
    

    But the text can have a variable number of opening braces, say n opening braces, followed by 'hello', followed by n closing braces. The problem is that in order to match the closing braces, you need to know how many opening braces matched. Because you are able to interpolate variables into literal strings or regular expressions in your rules, you can construct a backreference like this:

        my $text = <<'END_OF_TEXT';
        {{ hello }}
        END_OF_TEXT

        my $grammar = <<'END_OF_GRAMMAR';

            {
                use 5.012;
                use Data::Dumper;
            }

            #Declare some my() variables for use within the rule:
            brace_block: <rulevar: ($lbraces, $rbraces)>

            brace_block: lbrace(1..)
                         {
                           $lbraces = join '', @{$item[1]};
                           $rbraces = '}' x length $lbraces;
                         }
                         'hello' "$rbraces"
                         {
                           say "$lbraces $item[3] $rbraces";
                         }

            lbrace: / [{] /xms

        END_OF_GRAMMAR

    Or, perhaps this is cleaner:

        my $text = <<'END_OF_TEXT';
        {{ hello }}
        END_OF_TEXT

        my $grammar = <<'END_OF_GRAMMAR';

            {
                use 5.012;
                use Data::Dumper;

                my $lbrace_count;       #**DECLARE VARIABLE**
            }

            brace_block: lbrace(1..)
                         {
                           $lbrace_count = @{$item[1]};     #**SET VARIABLE**
                         }
                         'hello' rbraces
                         {
                           say "rbraces matched: $item{rbraces}";
                         }

            lbrace: / [{] /xms
            rbraces: / [}]{$lbrace_count} /xms      #**INTERPOLATE VARIABLE**

        END_OF_GRAMMAR
