<?xml version="1.0" encoding="windows-1252"?>
<node id="1014822" title="Re^4: selecting columns from a tab-separated-values file" created="2013-01-22 21:25:01" updated="2013-01-22 21:25:01">
<type id="11">
note</type>
<author id="750012">
ibm1620</author>
<data>
<field name="doctext">
&lt;p&gt;I haven't tried the two-process solution yet, but I intend to. One concern is that the "pipeline" between the two processes would fill up fairly early on, after which they would have to operate in lock-step.&lt;/p&gt;

&lt;p&gt;Another thought: since the process appears to me to be CPU-bound, it might be worth forking several children and distributing the work across them. Each child would have to write to its own output file, which admittedly would increase the risk of disk-head thrashing, but I think it's worth a try.&lt;/p&gt;</field>
<field name="root_node">
1014517</field>
<field name="parent_node">
1014812</field>
</data>
</node>
