I've not yet tried the two-process solution, but I intend to. One concern I have is that the pipe between the two processes has a finite buffer, so it seems likely to fill up fairly early on. From that point the writer would block until the reader drained some data, and the two processes would have to operate in lock-step.
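To make the "pipe gets full" concern concrete, here is a minimal sketch that measures how many bytes a fresh pipe accepts before a write would block. It assumes a Unix-like system and Python 3.5 or later. The function name `pipe_capacity` and the 4096-byte chunk size are illustrative choices, not anything from the thread, and the number it reports depends on the operating system.

```python
import os

def pipe_capacity(chunk_size=4096):
    """Return how many bytes fit in a fresh pipe before a write would block.

    Hypothetical helper for illustration; the result is OS-dependent.
    """
    read_fd, write_fd = os.pipe()
    # Non-blocking mode makes a full pipe raise BlockingIOError
    # instead of suspending the writer.
    os.set_blocking(write_fd, False)
    chunk = b"x" * chunk_size
    total = 0
    try:
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                # Nobody is reading, so the buffer is now full.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)

if __name__ == "__main__":
    print("pipe accepted", pipe_capacity(), "bytes before blocking")
```

Once that many bytes are in flight, a blocking writer simply waits for the reader, so the faster of the two processes is throttled to the pace of the slower one.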
Another thought I've had is that, since the process appears to me to be CPU-bound, it might be worth forking several children and distributing the work across them. Each child would have to write to a separate output file, which admittedly would increase the possibility of disk head thrashing, but I think it's worth a try.
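A rough sketch of the fork-several-children idea follows, assuming a Unix-like system. The names `select_columns` and `fork_workers`, the round-robin assignment of lines to children, and the `prefix.N` output naming are all assumptions made for illustration. Each child opens the input itself and skips the lines that belong to its siblings, which keeps the sketch short but means every child reads the whole file.

```python
import os

def select_columns(line, columns):
    """Keep only the requested zero-based columns of one TSV line."""
    fields = line.rstrip("\n").split("\t")
    return "\t".join(fields[i] for i in columns) + "\n"

def fork_workers(in_path, out_prefix, columns, n_workers=4):
    """Fork n_workers children; child k handles lines k, k+n, k+2n, ...

    Each child writes to its own file, out_prefix.k. Returns True if
    every child exited cleanly.
    """
    pids = []
    for worker in range(n_workers):
        pid = os.fork()
        if pid == 0:
            # Child process: do this worker's share, then exit
            # without returning into the parent's code.
            status = 1
            try:
                out_path = f"{out_prefix}.{worker}"
                with open(in_path) as src, open(out_path, "w") as dst:
                    for lineno, line in enumerate(src):
                        if lineno % n_workers == worker:
                            dst.write(select_columns(line, columns))
                status = 0
            finally:
                os._exit(status)
        pids.append(pid)

    # Parent process: wait for every child and collect its status.
    ok = True
    for pid in pids:
        _, wait_status = os.waitpid(pid, 0)
        if wait_status != 0:
            ok = False
    return ok

if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    fork_workers("input.tsv", "output.tsv.part", columns=[0, 2])
```

With round-robin assignment the original line order is spread across the output files, so they would have to be interleaved again if order matters. Giving each child a contiguous chunk of the file would avoid that, at the cost of first finding line boundaries near each chunk offset.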