<?xml version="1.0" encoding="windows-1252"?>
<node id="947068" title="Re^5: putting text into array word by word" created="2012-01-09 15:11:06" updated="2012-01-09 15:11:06">
<type id="11">
note</type>
<author id="258724">
Not_a_Number</author>
<data>
<field name="doctext">
&lt;p&gt;OK, now that's solved, let's look at the definition of 'word' (yes, things are going to get hairy...). Take this sentence, for example:&lt;/p&gt;
&lt;blockquote&gt;"No, he said."&lt;/blockquote&gt;
&lt;p&gt;The 'words' that your current code would extract are:&lt;/p&gt;
&lt;c&gt;"No,
he
said."
&lt;/c&gt;
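&lt;p&gt;To make that concrete, here is a minimal sketch. It assumes your code boils down to a plain whitespace split; the variable names are made up for illustration, so adjust them to whatever you actually have:&lt;/p&gt;

```perl
use strict;
use warnings;

# Hypothetical input: the example sentence, quotes and punctuation included.
my $sentence = '"No, he said."';

# Assumption: 'words' are whatever sits between runs of whitespace.
# split ' ' is the special case that splits on any whitespace run
# and ignores leading whitespace.
my @words = split ' ', $sentence;

# Print one token per line.
print "$_\n" for @words;
```

&lt;p&gt;Splitting on whitespace never looks inside a token, so the quotation marks, the comma and the full stop all stay attached to their neighbouring words.&lt;/p&gt;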
&lt;p&gt;Is that really what you want? Or would you prefer:&lt;/p&gt;
&lt;c&gt;No # or, better, 'no'
he
said&lt;/c&gt;
&lt;p&gt;?&lt;/p&gt;</field>
<field name="root_node">
947041</field>
<field name="parent_node">
947059</field>
</data>
</node>
