<?xml version="1.0" encoding="windows-1252"?>
<node id="1014519" title="Re: selecting columns from a tab-separated-values file" created="2013-01-21 17:59:44" updated="2013-01-21 17:59:44">
<type id="11">
note</type>
<author id="885521">
hippo</author>
<data>
<field name="doctext">
&lt;p&gt;If by "1B" you mean 10^9, and if your fields have a mean length of 9 chars, then including tabs you have roughly 500GB in one file, correct? I'm not surprised it is very slow. How long does it take just to cat the file? How much slower than that is your script?&lt;/p&gt;
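
&lt;p&gt;To get that baseline number, something like the rough Python sketch below might do. It is not taken from your script: the function name read_throughput and the 1 MiB chunk size are my own assumptions. It reads the file sequentially and throws the data away, so the time you get is roughly what the disk alone costs you.&lt;/p&gt;

```python
import time

def read_throughput(path, chunk_size=1024 * 1024):
    """Read the file sequentially, discard the data, return (bytes, seconds).

    Hypothetical helper for illustration; chunk_size is an arbitrary choice.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    return total, time.perf_counter() - start
```

&lt;p&gt;Divide the byte count by the seconds to get a throughput figure, then compare it with the time your script needs for the same file. If the two are close, the disk is the bottleneck and no amount of code tuning will help much. Be aware that a small test file may be served from the OS cache on a second run, which makes the disk look faster than it is.&lt;/p&gt;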

&lt;p&gt;My best advice is to buy the fastest disk you can afford, and maybe think about preprocessing the file.&lt;/p&gt;</field>
<field name="root_node">
1014517</field>
<field name="parent_node">
1014517</field>
</data>
</node>
