I have data like this:
my $testData = <<'_EOGTESTA_';
RECORD
#input_id 1210758171x001_0013
#output_id
#input_type PTC
#output_type PTC
#addkey
#source_id 01
#filename TTFILE01-0001-20080101000000
F ptc_record_length 00B6
F ptc_record_type
B firstBlock
F ptc_charging_end_time 20080604093721
F ptc_called_msrn_ton FF
.
F ptc_term_mcz_duration 060000
F ptc_term_mcz_change_direction
.
_EOGTESTA_
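To make the intended line types explicit, here is a minimal line-by-line sketch that does not use RecDescent at all. It is only an illustration under assumptions: the sub name `parse_record` and the layout of the returned hash are invented for this example, and it assumes that `RECORD` opens a record, `#name value` is a meta field, `F name value` is a field, `B name` opens a block, and a lone `.` is a terminator. Values may be empty, as in `#output_id` above. It does not try to work out which `.` closes a block and which closes the record.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): classify each
# line of one record by its leading marker and collect the pieces.
sub parse_record {
    my ($text) = @_;
    my %record = ( meta => {}, fields => [], blocks => [], ends => 0 );
    for my $line ( split /\n/, $text ) {
        if ( $line =~ /^RECORD\s*$/ ) {
            next;                                   # start-of-record marker
        }
        elsif ( $line =~ /^#(\S+)[ \t]*(.*)$/ ) {
            $record{meta}{$1} = $2;                 # "#name value", value may be empty
        }
        elsif ( $line =~ /^F[ \t]+(\S+)[ \t]*(.*)$/ ) {
            push @{ $record{fields} }, [ $1, $2 ];  # "F name value", value may be empty
        }
        elsif ( $line =~ /^B[ \t]+(\S+)/ ) {
            push @{ $record{blocks} }, $1;          # "B blockName"
        }
        elsif ( $line =~ /^\.\s*$/ ) {
            $record{ends}++;                        # "." terminator line
        }
        else {
            die "Unrecognised line: $line\n";
        }
    }
    return \%record;
}

# A shortened copy of the sample data above.
my $sample = <<'_EOSAMPLE_';
RECORD
#input_id 1210758171x001_0013
#output_id
F ptc_record_length 00B6
F ptc_record_type
B firstBlock
F ptc_charging_end_time 20080604093721
.
_EOSAMPLE_

my $rec = parse_record($sample);
printf "%d meta, %d fields, %d blocks, %d terminators\n",
    scalar keys %{ $rec->{meta} },
    scalar @{ $rec->{fields} },
    scalar @{ $rec->{blocks} },
    $rec->{ends};
```

The point of the sketch is only to show which line shapes the grammar has to distinguish. The RecDescent grammar is meant to express the same distinctions as rules.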
I tried the following code using Parse::RecDescent, but the grammar is evidently wrong somewhere:
#!/usr/bin/perl -w
use strict;
use Parse::RecDescent;
use Data::Dumper;
# Enable warnings within the Parse::RecDescent module.
$::RD_ERRORS = 1; # Make sure the parser dies when it encounters an error
$::RD_WARN = 1; # Enable warnings. This will warn on unused rules &c.
$::RD_HINT = 1; # Give out hints to help fix problems.
my $grammar = <<'_EOGRAMMAR_';
RECORDSTART : /RECORD/
{ print "\n[*] RECORDSTART"; }
RECORDEND : /^\.$/
{ print '\n[*] RECORDEND'; }
fieldName : /[^ \t]+/
{ print "\n[*] fieldName\n" }
metaName : /[^ \t]+/
{ print "\n[*] metaName\n" }
# metaFieldValue: /^\.$/
# fieldValue: /^\.$/
# blockName : /^\.$/
# metaFieldValue: /.*$/
# fieldValue: /.*$/
# blockName : /.*$/
metaFieldValue: /.*/
{ print "\n[*] metaFieldValue\n" }
fieldValue: /.*/
{ print "\n[*] fieldValue\n" }
blockName : /.*/
{ print "\n[*] blockName\n" }
metaField : #/#/ metaName /[ \t]+/ metaFieldValue
/#/ metaName metaFieldValue
{ print "\n[*] Got metafield named $metaName" . $item{ metaName } . ' with value ' . $item{ metaFieldValue } . "\n" }
field : /^F[ \t]+/ fieldName /[ \t]+/ fieldValue
{ print '\n[*] Got field named ' . $item{ fieldName } . ' with value ' . $item{ fieldValue } . '\n' }
block : /^B/ blockName
{ print '\n[*] Got block named ' . $item{ blockName } . ' with value ' . ':-P' . '\n' }
recordBody : field(s)
{ print '\n[*] field(s)\n' }
|
block(s)
{ print '\n[*] block(s)\n' }
|
metaField(s)
{ print '\n[*] metaField(s)\n' }
#startOfRecord: RECORDSTART recordBody(s /$/) RECORDEND
startOfRecord: RECORDSTART recordBody RECORDEND
| <error>
_EOGRAMMAR_
#$skeletonPattern = "#input_type[ \t]*";
#my $metaFieldPattern = qr/[ \t]*#([^ \t]+)[ \t]+(.*)/o; # "#input_type SCDR+", "#filename processed_01_20080616001403.cdr", etc
#my $normalFieldPattern = qr/([ \t]*)([0-9]*)F[ \t]+([^ \t]+)[ \t]+([^ \t\r\n]+)(.*)/; # "1F S_Diagnostic1 62" OR " F S_Diagnostic1 62" OR " F S_Diagnostic1 62" are synonymous, etc
print $testData, "\n\n";
my $parser = Parse::RecDescent->new($grammar);
$parser->startOfRecord($testData) or die "Bad input!\n";
But all I get are unhelpful error messages like:
Variable "$errorprefix" is not available at C:/cpanfly/var/megalib/Parse/RecDescent.pm line 2906.
Use of uninitialized value $errorprefix in formline at C:/cpanfly/var/megalib/Parse/RecDescent.pm line 2850.
I am using ActivePerl 5.10 and the latest RecDescent from CPAN.
Any ideas on how to get "trace" messages while the parsing is in progress, or at least something more helpful than the messages above?
Update: I have changed the grammar so that it at least runs, but I will come back to the original grammar later.