<?xml version="1.0" encoding="windows-1252"?>
<node id="1015107" title="Re^9: selecting columns from a tab-separated-values file" created="2013-01-24 04:01:00" updated="2013-01-24 04:01:00">
<type id="11">
note</type>
<author id="171588">
BrowserUk</author>
<data>
<field name="doctext">
&lt;blockquote&gt;&lt;i&gt;Rerunning with 5.16 yielded a runtime of 60 seconds.&lt;/i&gt;&lt;/blockquote&gt;

&lt;p&gt;Conclusion: with 384GB of RAM, your (relatively) tiny 10e6-line test file is being read from the system file cache, which effectively disguises the disk IO costs.

&lt;p&gt;If your 80GB file fits in cache and will always be there when you need to do this, you can ignore the effects of disk.

&lt;p&gt;Otherwise ... you need to re-run all your tests using the real file, flushing the cache before each test.
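
&lt;p&gt;For what it's worth, here is a minimal sketch of one way to get a cold-cache timing for a single file without root. It is in Python rather than Perl only because posix_fadvise is in its standard library. It assumes Linux (or another Unix that provides posix_fadvise); the function names evict_from_cache and timed_read are made up for illustration, and the eviction request is advisory, so the kernel may not drop every page.

```python
import os
import time


def evict_from_cache(path):
    """Ask the kernel to drop cached pages of `path` (advisory; Unix only).

    Assumption: the file has already been written out to disk, since
    dirty pages are not discarded by this call.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # Offset 0 with length 0 means "the whole file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)


def timed_read(path, chunk_size=1024 * 1024):
    """Read the file sequentially; return (bytes_read, elapsed_seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start
```

&lt;p&gt;Call evict_from_cache on the real input file before each timed run, then compare against a second run without the eviction to see how much the cache was hiding. The system-wide alternative on Linux is to run sync and then write 3 to /proc/sys/vm/drop_caches as root.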

&lt;div class="pmsig"&gt;&lt;div class="pmsig-171588"&gt;
&lt;hr /&gt;
&lt;font size=1 &gt;
&lt;div&gt;With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'&lt;/div&gt;
&lt;div&gt;Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.&lt;/div&gt;
&lt;div&gt;"Science is about questioning the status quo. Questioning authority". &lt;/div&gt;
&lt;div&gt;In the absence of evidence, opinion is indistinguishable from prejudice.
&lt;/div&gt;
&lt;/font&gt;

&lt;/div&gt;&lt;/div&gt;</field>
<field name="root_node">
1014517</field>
<field name="parent_node">
1015056</field>
</data>
</node>
