<?xml version="1.0" encoding="windows-1252"?>
<node id="947136" title="Re^7: putting text into array word by word" created="2012-01-10 05:35:51" updated="2012-01-10 05:35:51">
<type id="11">
note</type>
<author id="258724">
Not_a_Number</author>
<data>
<field name="doctext">
&lt;blockquote&gt;&lt;c&gt;$this_word =~ s/[[:punct:]]//g;&lt;/c&gt;&lt;/blockquote&gt;
&lt;p&gt;The only problem with that approach is that it removes internal punctuation (i.e. apostrophes) as well, so that &lt;i&gt;I'll&lt;/i&gt; becomes &lt;i&gt;ill&lt;/i&gt;, &lt;i&gt;she'd&lt;/i&gt; becomes &lt;i&gt;shed&lt;/i&gt;, etc. ('Why was Virginia Woolf so obsessed with sheds?' I hear someone ask.)&lt;/p&gt;
&lt;p&gt;I'd use this instead:&lt;/p&gt;
&lt;c&gt;$this_word =~ s/^[[:punct:]]+//; # Remove leading punct.
$this_word =~ s/[[:punct:]]+$//; # Remove trailing punct.
&lt;/c&gt;
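&lt;p&gt;Here is a rough sketch of those two substitutions applied to a whole line, assuming the words come from splitting on whitespace. The helper name &lt;c&gt;trim_punct&lt;/c&gt; and the sample sentence are made up for illustration and are not from the thread:&lt;/p&gt;
&lt;c&gt;use strict;
use warnings;

# trim_punct is a hypothetical helper name, used here only for illustration.
sub trim_punct {
    my ($word) = @_;
    $word =~ s/^[[:punct:]]+//;    # Remove leading punct.
    $word =~ s/[[:punct:]]+$//;    # Remove trailing punct.
    return $word;
}

# Sample text is invented; the lone "-" becomes an empty string after trimming.
my $text  = q{"I'll wait," she'd said - twice.};
my @words = grep { length } map { trim_punct($_) } split ' ', $text;

print join('|', @words), "\n";    # prints: I'll|wait|she'd|said|twice
&lt;/c&gt;
&lt;p&gt;The &lt;c&gt;grep { length }&lt;/c&gt; is there because a token made up entirely of punctuation ends up empty once both ends are trimmed, and you probably don't want empty strings in your array.&lt;/p&gt;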
&lt;p&gt;Update: added Virginia Woolf sentence.&lt;/p&gt;</field>
<field name="root_node">
947041</field>
<field name="parent_node">
947091</field>
</data>
</node>
