PerlMonks  

question when learning P::RD?

by xiaoyafeng (Deacon)
on Aug 28, 2016 at 08:11 UTC

xiaoyafeng has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

Long time no see! Recently I have had to parse a file and export it to another format. The original file looks like this:

Y 034309201607258 1      #Y means header
Q 02751VACANT / CLOSE    #comment
G ....                   # inner head
...
...
...
T RCRC 0810010
T RDRD 0810010
A 22OP
A 13O
A 12O
P 3472CHSK00010014       #P means inner trailer
Z 034309201607258        #Z means trailer

I spent this weekend learning P::RD and think I understand it a little, but when I tried running some snippets, they did not succeed as I had imagined:

use Parse::RecDescent;
use IO::All;

my $text = io("xxx.DAT")->slurp;

# Create and compile the source file
$parser1 = Parse::RecDescent->new(q(
    startrule : HeadRule /.+/ms TrailerRule
    HeadRule : /^Y.+$/m
    TrailerRule: /^Z.+$/m
));

$parser2 = Parse::RecDescent->new(q(
    startrule : HeadRule
    HeadRule : /^Y.+^Z.+$/ms
));

# Test it
print "Valid data\n" if $parser1->startrule($text); #no!
print "Valid data\n" if $parser2->startrule($text); #yes

Why does parser1 fail? Also, to help me learn P::RD, could anyone tell me whether any modules on CPAN that parse files (XML, HTML, etc.) make use of P::RD?

Thanks




I am trying to improve my English skills, if you see a mistake please feel free to reply or /msg me a correction

Replies are listed 'Best First'.
Re: question when learning P::RD?
by duelafn (Parson) on Aug 28, 2016 at 17:26 UTC

    If you add $::RD_TRACE = 1; to your script, P::RD will print out a trace of the parse which can help with the debugging. In this case, it shows that the /.+/ms pattern gobbles up "Z 034309201607258" (P::RD can't backtrack into the pattern, it will only backtrack whole rules). Your middle rule should leave the "Z" line at the end. For instance:

    #!/usr/bin/perl
    use strict; use warnings; use 5.014;
    use Parse::RecDescent;
    # $::RD_TRACE = 1;

    my $text = <<'CONTENT';
    Y 034309201607258 1
    Q 02751VACANT / CLOSE
    G ....
    ...
    ...
    T RCRC 0810010
    T RDRD 0810010
    A 22OP
    A 13O
    A 12O
    P 3472CHSK00010014
    Z 034309201607258
    CONTENT

    my $parser1 = Parse::RecDescent->new(<<'GRAMMAR');
    startrule : HeadRule OtherRule(s) TrailerRule
    HeadRule : /^Y.+$/m
    TrailerRule: /^Z.+$/m
    OtherRule: /^(?![YZ]).*$/m
    GRAMMAR

    print "Valid data\n" if defined($parser1->startrule($text));

    Also, your parser2 shows why you should almost always end your top-level rule with a /\Z/ pattern so that parsing fails if P::RD doesn't consume the whole string.
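
    To illustrate the point about ending the top-level rule with /\Z/, here is a minimal sketch. It reuses the rules from the example above; the variable names ($loose, $strict, $clean, $dirty) and the extra trailing line are made up for illustration. The results are tested with defined() because a successful /\Z/ match yields an empty string, which is false but defined.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Rules shared by both grammars (same patterns as in the example above).
my $rules = q{
    HeadRule    : /^Y.+$/m
    TrailerRule : /^Z.+$/m
    OtherRule   : /^(?![YZ]).*$/m
};

# Without an end-of-input check, text left over after the trailer is ignored.
my $loose = Parse::RecDescent->new(
    q{ startrule : HeadRule OtherRule(s) TrailerRule } . $rules);

# With /\Z/ the parse fails unless the whole string has been consumed.
my $strict = Parse::RecDescent->new(
    q{ startrule : HeadRule OtherRule(s) TrailerRule /\Z/ } . $rules);

# Hypothetical inputs: one well-formed, one with junk after the Z line.
my $clean = "Y 034309201607258 1\nQ 02751VACANT / CLOSE\nZ 034309201607258\n";
my $dirty = $clean . "unexpected trailing line\n";

my %result = (
    loose_clean  => defined($loose->startrule($clean))  ? 1 : 0,
    loose_dirty  => defined($loose->startrule($dirty))  ? 1 : 0,
    strict_clean => defined($strict->startrule($clean)) ? 1 : 0,
    strict_dirty => defined($strict->startrule($dirty)) ? 1 : 0,
);

print "$_: ", ($result{$_} ? "accepted" : "rejected"), "\n"
    for sort keys %result;
```

    Only the grammar ending in /\Z/ should reject the input with the trailing junk; the other grammar stops after the Z line and reports success.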

    Good Day,
        Dean

Re: question when learning P::RD?
by Laurent_R (Canon) on Aug 28, 2016 at 10:03 UTC
    Hi xiaoyafeng,

    I am not sure you really need a parser for such a simple format. Simple regexes on individual lines should probably do the work.
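
    A line-by-line approach could look like the following minimal sketch. It assumes every record is a single line starting with one uppercase type letter followed by whitespace; parse_records is a hypothetical helper, and the sample lines are taken from the question with the elided parts left out.

```perl
use strict;
use warnings;

# Sample records modeled on the question; the elided lines are left out.
my @lines = (
    'Y 034309201607258 1',
    'Q 02751VACANT / CLOSE',
    'T RCRC 0810010',
    'T RDRD 0810010',
    'A 22OP',
    'P 3472CHSK00010014',
    'Z 034309201607258',
);

# Hypothetical helper: turn each line into { type => ..., body => ... }
# and check that the file starts with a Y record and ends with a Z record.
sub parse_records {
    my @records;
    for my $line (@_) {
        next unless $line =~ /\S/;    # skip blank lines
        $line =~ /^([A-Z])\s+(.*?)\s*$/
            or die "Unrecognized line: $line\n";
        push @records, { type => $1, body => $2 };
    }
    die "Missing Y header\n"  unless @records && $records[0]{type} eq 'Y';
    die "Missing Z trailer\n" unless $records[-1]{type} eq 'Z';
    return \@records;
}

my $records = parse_records(@lines);
printf "%s => %s\n", $_->{type}, $_->{body} for @$records;
```

    Nested record types would need extra bookkeeping on top of this, which is where a grammar-based parser may start to pay off.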

    I haven't used Parse::RecDescent recently, so this may make no difference, but you could try removing the spaces before the colons (":" characters) in your rule definitions.

      Laurent_R, thanks for your reply. The format of the original file is not as simple as I showed: it is about 10M in size and has various record types nested in it (like ^M xxxx, ^D xxxx, etc.). Besides, it's a good chance for me to learn about parsing, so I want to give it a try. ;)





Re: question when learning P::RD?
by ikegami (Patriarch) on Aug 29, 2016 at 15:08 UTC

    /.+/ms matches the rest of the file, so there's nothing for TrailerRule to match.
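
    The effect can be reproduced with plain Perl regexes. This is only a sketch that imitates sequential token matching, not how Parse::RecDescent is implemented; match_tokens is a hypothetical helper, and the lazy middle pattern is just one possible fix.

```perl
use strict;
use warnings;

my $text = "Y 034309201607258 1\nQ 02751VACANT / CLOSE\nZ 034309201607258\n";

# Try each pattern once at the current position. A pattern that has
# already matched is never re-entered to give characters back.
sub match_tokens {
    my ($input, @patterns) = @_;
    my $rest = $input;
    for my $re (@patterns) {
        $rest =~ s/\A\s*//;                        # skip leading whitespace
        return 0 unless $rest =~ s/\A(?:$re)//;    # consume the token or fail
    }
    return 1;
}

# The greedy middle pattern swallows the Z line, so the trailer cannot match.
my $greedy = match_tokens($text, qr/^Y.+$/m, qr/.+/ms, qr/^Z.+$/m);

# A lazy middle pattern that stops before a line starting with Z leaves the trailer in place.
my $fixed = match_tokens($text, qr/^Y.+$/m, qr/.+?(?=^Z)/ms, qr/^Z.+$/m);

print "greedy middle: ", ($greedy ? "match" : "no match"), "\n";
print "lazy middle:   ", ($fixed  ? "match" : "no match"), "\n";
```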

Node Type: perlquestion [id://1170611]
Approved by Athanasius