<?xml version="1.0" encoding="windows-1252"?>
<node id="185077" title="Re: Re: Re: Re: Parse::RecDescent Grammar Fun" created="2002-07-24 20:52:53" updated="2005-06-06 08:52:08">
<type id="11">
note</type>
<author id="45391">
ichimunki</author>
<data>
<field name="doctext">
Hahaha! We're going to beat this grammar into submission yet. :)
&lt;p&gt;
Unfortunately we can't brute-force it; think of the labor and testing involved in adding new tags. I think the best I can do here is to collect punctuation as single-character chunks (storing them in a temp var) and then, when I get to a token, insert that temp var back into the tree. I'd post code, but rather than printing discrete, sensible morphemes, I really just need the morphemes (productions in P::RD-speak) concatenated into a string. For that purpose, whether it emits punctuation one character at a time or in chunks won't matter.
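&lt;p&gt;
For illustration only, the buffering idea might look roughly like the sketch below. It is written in Python rather than as a P::RD grammar so that it stands alone, and it is not the actual grammar: the names collect, tree, and pending are hypothetical, and a "token" is assumed here to be any run of word characters, with everything else treated as single-character punctuation.

```python
import re


def collect(text):
    """Return a flat list of chunks: buffered punctuation runs and word tokens."""
    tree = []      # stand-in for the parse tree (assumption: a flat list)
    pending = []   # the temp var holding single-character punctuation matches
    # Each match is either a whole word or exactly one non-word character.
    for piece in re.findall(r"\w+|\W", text):
        if re.match(r"\w", piece):
            # Reached a token: put the buffered punctuation back first.
            if pending:
                tree.append("".join(pending))
                pending = []
            tree.append(piece)
        else:
            pending.append(piece)
    # Flush whatever punctuation trails the last token.
    if pending:
        tree.append("".join(pending))
    return tree


chunks = collect("Hi, there!!")
print(chunks)
print("".join(chunks))
```

Joining the chunks gives back the input string whether the punctuation was stored one character at a time or as a run, which is why the chunking doesn't matter when all that's needed is the concatenation.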
&lt;p&gt;
Either way, this Parse::RecDescent module is the best thing since HTML::TokeParser&amp;#91;::Simple&amp;#93;, imho.</field>
<field name="root_node">
184966</field>
<field name="parent_node">
185058</field>
</data>
</node>
