<?xml version="1.0" encoding="windows-1252"?>
<node id="928901" title="Re: Words in Words" created="2011-09-30 15:29:42" updated="2011-09-30 15:29:42">
<type id="11">
note</type>
<author id="352046">
ww</author>
<data>
<field name="doctext">
&lt;p&gt;Re Line 23, you &lt;i&gt;&lt;b&gt;MAY&lt;/b&gt;&lt;/i&gt; speed things up a little by reading &lt;c&gt;$wordfile&lt;/c&gt; directly into a hash. Searching Google for &lt;font color="navy"&gt;&lt;tt&gt;site:PerlMonks.org +Perl +'file to hash'&lt;/tt&gt;&lt;/font&gt; turns up some relevant prior discussions here in the Monastery. Strike the site specification and you'll find some other sources (albeit of unknown reliability).&lt;/p&gt;
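
&lt;p&gt;A minimal sketch of the file-to-hash idea, offered as an illustration rather than a drop-in fix. The sample words and the &lt;c&gt;%words&lt;/c&gt; name are my assumptions, and an in-memory filehandle stands in for your real file so the snippet runs on its own:&lt;/p&gt;

&lt;c&gt;
use strict;
use warnings;

# Assumption: one word per line. $sample stands in for the contents of
# $wordfile; in the real script, open $wordfile itself instead of \$sample.
my $sample = "apple\nplea\nApe\n\n";

my %words;
open my $fh, '&lt;', \$sample or die "Can't open word list: $!";
while ( my $word = &lt;$fh&gt; ) {
    chomp $word;
    next unless length $word;    # skip blank lines
    $words{ lc $word } = 1;      # lowercased key; the value is just a flag
}
close $fh;

# Each membership test is now a single hash lookup.
for my $candidate (qw(ape pear)) {
    print "$candidate: ", ( exists $words{$candidate} ? "found" : "not found" ), "\n";
}
&lt;/c&gt;

&lt;p&gt;Whether this actually helps depends on how Line 23 uses the list, so benchmark before and after.&lt;/p&gt;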

&lt;p&gt;Even with a wordlist of the size you're using, that runtime seems far on the high side. Is the box you're running this on reasonably current (fast?), and what version of Perl are you using?&lt;/p&gt;

&lt;p&gt;Counting the first para above, you now have three reasonable alternatives (&lt;b&gt;update: and one very reasonable question about the precision of your spec&lt;/b&gt; &amp;lt;/update&amp;gt;)  ... so the rest of this node will focus on some nits.&lt;/p&gt;

&lt;p&gt;Line 4 seems to reflect a view that &lt;c&gt;$datapath&lt;/c&gt; and &lt;c&gt;$wordfile&lt;/c&gt; are constants. You do use them as constants in the conventional sense of the word, but they are ordinary variables, so they aren't constants in the sense of allowing the compiler to optimize.&lt;/p&gt;
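
&lt;p&gt;If you want the compiler to see real constants, the core &lt;c&gt;constant&lt;/c&gt; pragma is one option. A rough sketch follows; the path values are placeholders of mine, not taken from your script:&lt;/p&gt;

&lt;c&gt;
use strict;
use warnings;

# Placeholder values (assumptions); substitute whatever Line 4 really assigns.
use constant DATAPATH =&gt; '/path/to/data';
use constant WORDFILE =&gt; DATAPATH . '/words.txt';

# Scalar constants are inlinable subs, so perl can fold them at compile time.
print "Word file: ", WORDFILE, "\n";
&lt;/c&gt;

&lt;p&gt;One trade-off: these constants don't interpolate inside double-quoted strings, which is why the &lt;c&gt;print&lt;/c&gt; above passes a list.&lt;/p&gt;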

&lt;p&gt;Lines 9 and 10 declare global variables... not, IMO, a major problem here, but in more complex programs it's wise to declare variables in such a way as to minimize their scope (for more on this, try &lt;font color="navy"&gt;&lt;tt&gt;perldoc -q scope&lt;/tt&gt;&lt;/font&gt; at your CLI).&lt;/p&gt;
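
&lt;p&gt;To illustrate what minimizing scope can look like (a sketch with made-up variable names, since I'm not reproducing your Lines 9 and 10 here):&lt;/p&gt;

&lt;c&gt;
use strict;
use warnings;

my @words = qw(apple plea ape);    # hypothetical data

my $longest = '';                  # needed after the loop, so declared outside it
for my $word (@words) {
    my $len = length $word;        # $len exists only inside this loop body
    $longest = $word if $len &gt; length $longest;
}

# $word and $len are out of scope here; under strict, referring to them
# would be a compile-time error rather than a silent bug.
print "Longest: $longest\n";
&lt;/c&gt;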

&lt;p&gt;Your comment at Line 26 reflects what is actually (IMO, usually &amp;#91;&lt;i&gt;...additional qualifiers may be required&lt;/i&gt;&amp;#93;) a better (if not "best") practice. The more complex you make your regex, the more opportunities you'll have to obscure a logic problem and/or to create something that sets the regex engine thrashing.&lt;/p&gt;</field>
<field name="root_node">
928877</field>
<field name="parent_node">
928877</field>
</data>
</node>
