Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

Parse::RecDescent - some questions

by ribasushi (Pilgrim)
on Jan 13, 2007 at 03:31 UTC ( [id://594496]=perlquestion: print w/replies, xml ) Need Help??

ribasushi has asked for the wisdom of the Perl Monks concerning the following question:

Greetings honorable monks,
After fiddling for some time with P::RD I sort of got the hang of it. I have a fully functional grammar, and it gives me what I want. However I want to enhance it a bit, and in general I would take any comments/suggestions on how to write a proper grammar. The examples are rather long so I am including them at the end
What I need to know:
  • Currently the grammar is an all-or-nothing parser. What do I need to add to be able to give errors of the sort "invalid ip specified in subnet description at line xxx". I tried to fiddle with <error> but it doesn't seem to get triggered when I expect it to. A good reading or a couple of examples based on my code would be great.
  • Currently I am weeding out trailing comments, before going to P::RD $cfg = join ("\n", map { $_ =~ s/\# .* $//ox; $_ } (<$fh>) ) Can this be done from within P::RD without making the grammar insanely complex?
  • How can I prevent certain directives from appearing more than once? E.g. there can be only one keyfile or server, or only one default per subnet.
  • Anything else that is very wrong in the way I approach this.


Thank you in advance for your great help.
Peter

My data:

ttl 20d tsig-key "/etc/bind/rndc.key" subnet 192.168.1.0 netmask 255.255.255.0 dns-update domain net1.example.com option routers 192.168.1.1 option netbios-name-servers 192.168.1.1 option netbios-node-type 2 ttl 20m max-ttl 40m 192.168.1.4 pc4 pc-4 192.168.1.5 ab:dd:ee:ff:ff:ee pc-5.subnet2 ttl 3d default <host>-<addr>.unreg. bind-server 127.0.0.1 subnet 192.168.2.0 netmask 255.255.255.0 #empty subnet subnet 192.168.3.0 netmask 255.255.255.0 ttl 3h default <addr>


My grammar:
my $gr = (<<'__END_OF_GRAMMAR'); { use warnings; use strict; use Data::Dumper; sub get_duration { my %quantify = ( s => 1, m => 60, h => 3600, d => 86400, w => 604800, ); return $_[0] * $quantify{$_[1] || 's'}; } } host_str : /[a-z][0-9a-z\-]*/ domain_str : /(?: [a-z][0-9a-z\-]* \. )* [a-z][0-9a-z\-]* + /x mac : /(?: [0-9a-f]{2} : ){5} [0-9a-f]{2} /ix ip : / (?: (?: 25[0-5] | 2[0-4][0-9] | [01]?[0-9]?[0-9] )\. ){3} (?: 25[0-5] | 2[0-4][0-9] | [01]?[0-9]?[0-9] ) /x filename : /"[^"\n\t\r]+"/ { substr ($item[1], 1, -1) } | /[^"\s]+/ ttl : 'ttl' duration { { ttl => $item[2] } } max_ttl : 'max-ttl' duration { { max_ttl => $item[2] } + } duration : /(\d+)([smhdw]?)/ { $return = get_duration +($1, $2) } parse : directive(s) /\z/ {$item[1]} directive : server | keyfile | ttl | max_ttl | subnet(s) server : 'bind-server' ip { { bind_server => $item[2] } } keyfile : 'tsig-key' filename { { keyfile => $item[2] } } subnet : 'subnet' ip 'netmask' ip subnet_directive(s? +) { { subnet => [ @item[2,4], @{$item[5]} ] +} } subnet_directive: dns_update | domain | default | option | ttl | max_ttl | mapping dns_update : 'dns-update' { {dns_update => 1} } domain : 'domain' domain_str { {domain => $item[2]} } default : 'default' /[^\s;]+/ { {default => $item[2]} } option : 'option' /[^\n;]+/ #everything to the en +d of the line { {option => $item[2]} } mapping : ip identifier domain_str { $return = { $item[1] => { @{$item[2]}, d +omain_name => $item[3] } } } identifier : mac { ['mac', lc $item[1]] } | host_str { ['hostname', $item[1]] } __END_OF_GRAMMAR


What I get (Dumper of $prd->parse):
$VAR1 = [ { 'ttl' => 1728000 }, { 'keyfile' => '/etc/bind/rndc.key' }, [ { 'subnet' => [ '192.168.1.0', '255.255.255.0', { 'dns_update' => 1 }, { 'domain' => 'net1.example.com' }, { 'option' => 'routers 192.168.1.1' }, { 'option' => 'netbios-name-servers 192.16 +8.1.1' }, { 'option' => 'netbios-node-type 2' }, { 'ttl' => 1200 }, { 'max_ttl' => 2400 }, { '192.168.1.4' => { 'domain_name' => 'pc- +4', 'hostname' => 'pc4' } }, { '192.168.1.5' => { 'domain_name' => 'pc- +5.subnet2', 'mac' => 'ab:dd:ee:ff +:ff:ee' } }, { 'ttl' => 259200 }, { 'default' => '<host>-<addr>.unreg.' } ] } ], { 'bind_server' => '127.0.0.1' }, [ { 'subnet' => [ '192.168.2.0', '255.255.255.0' ] }, { 'subnet' => [ '192.168.3.0', '255.255.255.0', { 'ttl' => 10800 }, { 'default' => '<addr>' } ] } ] ];
Update: Thank you both, you've been rather helpful. It became a monster, but it works extremely well, and seems quite efficient compared to what it can do. I am including what came out of my parser below for google's reference :)
{ use warnings; use strict; use Digest::MD5 qw/md5_hex/; use Data::Dumper; my $line; my %quantify = ( s => 1, m => 60, h => 3600, d => 86400, w => 604800, ); sub get_duration { $line = shift; my $dur = $_[0] * $quantify{$_[1] || 's'}; if ($dur < 60) { die "Lease durations under one minute are not perm +itted (line $line)\n"; } return $dur; } sub ip_32 { return sprintf ('%08b%08b%08b%08b', split (/\./, shift +)); } sub ip_ascii { my @result; foreach my $ip (@_) { push @result, join ('.', unpack ('C4', pack ('B*', + $ip) ) ); } return @result; } my $parsed_result = {}; my $current = {}; sub add_kv { $line = shift; my ($key, $value, $option, $to) = @_; my $target; if (! defined $to) { $target = $parsed_result; } elsif ($to eq 'subnet') { $target = $parsed_result->{subnets}[-1]; } elsif ($to eq 'soa') { $target = $parsed_result->{soa} ||= {}; } else { die "Unknown add_kv (... to) option '$to'\n"; } if (exists $target->{$key}) { die "You can only have one occurence of '$option' +per section (line $line)\n"; } $target->{$key} = $value; return 1; } sub add_op { $line = shift; my $option = shift; my $digest = md5_hex ($option); if (exists $current->{options}{$digest}) { die "Option '$option' specified twice in the same +subnet (line $line)\n"; } push @{$parsed_result->{subnets}[-1]{options}}, $optio +n; $current->{options}{$digest} = undef; return 1; } sub add_subnet { $line = shift; my ($ip, $netmask) = @_; if ($netmask !~ /^1*0*$/) { die "Illegal netmask specified for a subnet at lin +e $line\n"; } my $network = "$ip" & "$netmask"; if ($network != $ip) { warn sprintf ("Warning: you should specify a netwo +rk number (%s) in a subnet declaration, not a random ip (%s). Fixing\ +n", ip_ascii ($network, $ip), ); } if (my $subs = $parsed_result->{subnets}) { if (grep { $_->{network} eq $network and $_->{netm +ask} eq $netmask} (@{$subs}) ) { die sprintf ("A second declaration of subnet ' +%s' netmask '%s' encountered on line %d\n", ip_ascii ($network, $netmask), $line, ); } } push @{$parsed_result->{subnets}}, { network => $netwo +rk, netmask => $netmask }; $current = {}; return 1; } sub adjust { $current->{$_[0]} = $_[1]; return 1; } sub check_map_prereq { my ($sub, $dn, @ips) = @_; if ($sub->{dns_update} xor (length ($dn || '') ) ) { die "Domain name for a mapping must be specified a +fter a 'dns-update' option (line $line)\n"; } if ($sub->{dns_update}) { if (not $parsed_result->{bind_server}) { die sprintf ("Can not perform requested dns up +dates for subnet %s/%s without a 'bind-server' specification.\n", ip_ascii ($sub->{network}, $sub->{netmask} +), ); } if (@ips == 2 and $dn !~ /<addr>/) { die "Range domain specifications must contain +an <addr> macro, in order to make the domain name unique (line $line) +\n"; } unless (grep { index ($dn, $_) >= 0 } (keys %{$par +sed_result->{soa}} ) ) { die "No parent SOA found for the FQDN '$dn'. P +lease add a proper 'soa' option in the main config section (line $lin +e).\n"; } } foreach my $ip (@ips) { unless ( ("$ip" & "$sub->{netmask}") eq $sub->{net +work} ) { die sprintf ("%s not within current subnet %s/ +%s on line %d\n", ip_ascii ($ip, $sub->{network}, $sub->{net +mask}), $line, ); } my $ptr = join ('.', (reverse unpack ('C4', pack ( +'B*', $ip) )), 'in-addr.arpa.'); unless (grep { index ($ptr, $_) >= 0 } (keys %{$pa +rsed_result->{soa}} ) ) { die "No parent SOA found for the Reverse DNS F +QDN '$ptr'. Please add a proper 'soa' to the main config section (lin +e $line).\n"; } } unless ($current->{ttl}) { die "A 'ttl' option must be specified prior to an +ip mapping or range declaration (line $line)\n"; } $current->{max_ttl} ||= $current->{ttl} * 2; $current->{min_ttl} ||= 60; if ($current->{ttl} > $current->{max_ttl}) { $current->{max_ttl} = $current->{ttl} * 2; warn "'max-ttl' is lower than 'ttl' for mapping at + line $line. Bumping up to $current->{max_ttl} seconds.\n"; } if ($current->{ttl} < $current->{min_ttl}) { $current->{min_ttl} = $current->{ttl} / 2; warn "'min-ttl' is higher than 'ttl' for mapping a +t line $line. Bumping down to $current->{min_ttl} seconds.\n"; } } sub make_fqdn { my ($domain, $dn) = @_; unless ( (substr $dn, -1) eq '.') { unless ($domain) { die "Non-fully qualified domain name at line $ +line not permitted without a prior 'domain' option for the subnet.\n" +; } $dn = join ('.', $dn, $domain); } return $dn; } sub add_range { $line = shift; my ($from, $to, $dn, $op_dn) = @_; my $subnet = $parsed_result->{subnets}[-1]; if ($subnet->{range}) { die sprintf ("More than one range statement encoun +tered for subnet %s/%s (line %d)\n", ip_ascii ($subnet->{network}, $subnet->{netmas +k}), $line, ); } if ($from > $to) { die sprintf ("Invalid range '%s' - '%s' at line %d +\n", ip_ascii ($from, $to), $line, ); } $dn = make_fqdn ($current->{domain}, $dn) if (defined +$dn); $op_dn = substr ( (make_fqdn ($current->{domain}, $op_ +dn), 0, -1) ) if (defined $op_dn); check_map_prereq ($subnet, $dn, $from, $to); $subnet->{range} = { range => [$from, $to], $dn ? ( dn => $dn ) : (), $op_dn ? ( op_dn => $op_dn ) : (), map { $_, $current->{$_} } qw/ttl max_ttl min_ttl/ +, }; return 1; } sub add_mapping { $line = shift; my ($ip, $id_type, $id, $dn, $op_dn) = @_; my $subnet = $parsed_result->{subnets}[-1]; if (exists $subnet->{maps}{$ip}) { die sprintf ("Mapping for ip '%s' specified twice +in subnet %s/%s (line %d)\n", ip_ascii ($ip, $subnet->{network}, $subnet->{n +etmask}), $line, ); } if (exists $current->{ids}{$id}) { die sprintf ("Id '%s' used for more than one mappi +ng in subnet %s/%s (line %d)\n", $id, ip_ascii ($subnet->{network}, $subnet->{netmas +k}), $line, ); } $dn = make_fqdn ($current->{domain}, $dn) if (defined +$dn); $op_dn = substr ( (make_fqdn ($current->{domain}, $op_ +dn), 0, -1) ) if (defined $op_dn); check_map_prereq ($subnet, $dn, $ip); $subnet->{maps}{$ip} = { id_type => $id_type, id => $id, defined $dn ? ( dn => $dn) : (), defined $op_dn ? ( op_dn => $op_dn) : (), map { $_, $current->{$_} } qw/ttl max_ttl min_ttl/ +, }; $current->{ids}{$id} = undef; return 1; } sub line_error { die sprintf ("Unable to parse line %d:\n'%s'\n", $_[0], (split (/\n/, $_[1], 2))[0], ); } } nl : / (?: \# [^\n]* )? \n /x # newline with + or without preceeding comment { $return = 1 } parse : line(s) { $return = $parsed_result } line : /\z/ | nl | bind_server | key | soa | subnet | { line_error ($thisline, $text) } bind_server : 'bind-server' ip nl { add_kv ($prevline, 'bind_server', $item[ +2], 'bind-server') } key : 'tsig-key' name_str nl { add_kv ( $prevline, 'keyfile', $item[2], 'tsig-key', ) } soa : 'soa' domain_str nl { add_kv ( $prevline, (substr $item[2], -1) eq '.' ? $item[2 +] : $item[2] . '.', undef, 'soa', 'soa', ) } subnet : 'subnet' ip 'netmask' ip { add_subnet ($thisline, $item[2], $item[4 +]) } sub_line(s) sub_line : nl | sub_option | /\z/ sub_option : dns_update | domain | option | ttl | max_ttl | min_ttl | range | mapping dns_update : 'dns-update' nl { add_kv ($prevline, 'dns_update', 1, 'dns +-update', 'subnet') } domain : 'domain' domain_str nl { adjust ('domain', (substr $item[2], -1) +eq '.' ? $item[2] : $item[2] . '.' ) } option : 'option' /[ \t\w\-"\.\,]+/ nl { add_op ( $prevline, $item[2] ) } ttl : 'ttl' duration nl { adjust ('ttl', $item[2] ) } max_ttl : 'max-ttl' duration nl { adjust ('max_ttl', $item[2] ) } min_ttl : 'min-ttl' duration nl { adjust ('min_ttl', $item[2] ) } range : 'range' ip ip domain_str(0..2) nl { add_range ($prevline, @item[2,3], @{$ite +m[4]} ) } mapping : ip identifier domain_str(0..2) nl { add_mapping ( $prevline, $item[1], $item[2][0], $item[2][1], @{$item[3]}, ) } host_str : /[a-z][0-9a-z\-]*/ domain_str : /(?: (?: [0-9a-z\-]+ | <addr> | <host> )+ \. +? )+ /x mac : /(?: [0-9a-f]{2} : ){5} [0-9a-f]{2} /ix identifier : mac { ['hardware', lc $item[1]] } | host_str { ['option host-name', $item[1]] } ip : / (?: (?: 25[0-5] | 2[0-4][0-9] | [01]?[0-9]?[0-9] )\. ){3} (?: 25[0-5] | 2[0-4][0-9] | [01]?[0-9]?[0-9] ) /x { $return = ip_32 ($item[1]); 1; } name_str : /"[^"\n\t\r]+"/ { substr ($item[1], 1, -1) } | /[^"\s]+/ duration : /(\d+)([smhdw]?)/ { $return = get_duration +($thisline, $1, $2) }

Replies are listed 'Best First'.
Re: Parse::RecDescent - some questions
by Beechbone (Friar) on Jan 13, 2007 at 17:15 UTC
    As it seems you known what you're doing, I'll keep it short:

    For the error directive to be helpful, you should commit between the keyword and the data, so the parser cannot backtrack out of it. This will force the location of the error to a position that can produce helpful error messages.

    For comments it would help to handle whitespace in your grammer and disable automated whitespace handling. So you could define a nl as "ws* comment? nlchar" and comment as "'#' anychar_but_nlchar*". If you need inline comments, something like ws="(space|tab|comment2) ws?", comment2="'/*' anychar_but_nlchar* '*/'" would be the way to go.

    "subnet_directive(s?)" allows all directives unlimited times. That is easy, and I would choose that approach, too, and check after the parsing for the number of directives. If you want to do it in the grammar, you need to recompose "subnet_directive", e.g. subnet_directive="repeatable_directive(s?) server? repeatable_directive(s?) keyfile? repeatable_directive(s?) default? repeatable_directive(s?) | repeatable_directive(s?) keyfile? repeatable_directive(s?) server? repeatable_directive(s?) default? repeatable_directive(s?) | repeatable_directive(s?) default? repeatable_directive(s?) keyfile? repeatable_directive(s?) server? repeatable_directive(s?) | repeatable_directive(s?) default? repeatable_directive(s?) server? repeatable_directive(s?) keyfile? repeatable_directive(s?)". Or, and that's a little bit easier, you modify the attached Perl code to reject being run twice---analyze the existing data structure if the value is already there...


    Search, Ask, Know
Re: Parse::RecDescent - some questions
by philcrow (Priest) on Jan 15, 2007 at 14:01 UTC
    You've probably already tried using
    <error: Your message here>
    If you did and your <error> is triggering, but you don't like the message it makes, don't be afraid to use an action like
    { die "with your message here at line $thisline"; }
    instead of calling error. There are even P::RD variables to help you form the message, like $thisline (line number in input), $text (remaining unparsed text). See the Actions subsection of the pod for a complete list of these vars.

    In addition to <commit> mentioned by a previous poster, also look up <reject> and <resynch>.

    Phil

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://594496]
Approved by randyk
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others admiring the Monastery: (4)
As of 2024-04-20 00:37 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found