in reply to to thread or fork or ?
Unless the work you do is A LOT more expensive than a simple frequency count, or your data set is extremely large and your disks are very fast, you're best off using a single process. With threads, synchronization is a lot of work, so accessing, say, a shared hash can be orders of magnitude slower than accessing a non-shared one. With processes, you'd end up having to serialize the end result and somehow pipe it back to the master process, which is also very expensive.
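For reference, the single-process version is about as small as it gets. This is only a sketch, written in Python for illustration since the thread has no code of its own. The name `count_words_single` and the lowercase-and-split-on-whitespace normalization are my assumptions, not something from the original question.

```python
import io
from collections import Counter


def count_words_single(stream):
    """Count word frequencies from a text stream in a single pass."""
    counts = Counter()
    for line in stream:
        # Assumed normalization: lowercase, split on whitespace.
        counts.update(line.lower().split())
    return counts


if __name__ == "__main__":
    sample = io.StringIO("to thread or fork\nor not to thread\n")
    print(count_words_single(sample).most_common(3))
```

Nothing is shared and nothing has to be serialized here, which is why this version is hard to beat for a plain frequency count.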
If you're sure you want this (maybe just as a learning experience), I'd suggest giving each process/thread a fixed-size chunk of the input stream, to minimize shared data. Say, read a couple of megabytes plus one more line into a single string: the big read keeps you at maximum read speed, and the extra line means you never split your work in the middle of a word. Then start a thread to process that string (split into words, optionally normalize, count) into a local hash, and have the thread put that hash on a queue. The master thread reads the queue, checking for results from worker threads in between reading blocks.
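A rough sketch of that scheme, in Python for illustration (in Perl the same shape would presumably use `threads` and a queue module, which I haven't shown). All names here (`read_chunks`, `count_words`, `word_frequencies`, `CHUNK_SIZE`) are hypothetical, the 2 MB chunk size is just one reading of "a couple of megabytes", and it starts one thread per chunk without any cap. Note too that CPython threads won't actually run the counting in parallel; the point is the structure, not the speed.

```python
import io
import queue
import threading
from collections import Counter

CHUNK_SIZE = 2 * 1024 * 1024  # assumed value for "a couple of megabytes"


def read_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield blocks of about chunk_size characters, each extended to a line end."""
    while True:
        block = stream.read(chunk_size)
        if not block:
            return
        # The extra line keeps a word from being split across two chunks.
        yield block + stream.readline()


def count_words(text, results):
    """Worker: count into a local Counter, then hand it over in one piece."""
    local = Counter(text.lower().split())  # lowercasing is the optional normalization
    results.put(local)


def word_frequencies(stream, chunk_size=CHUNK_SIZE):
    """Master: read chunks, start workers, merge their local counts."""
    results = queue.Queue()
    totals = Counter()
    workers = []
    merged = 0
    for chunk in read_chunks(stream, chunk_size):
        worker = threading.Thread(target=count_words, args=(chunk, results))
        worker.start()
        workers.append(worker)
        # In between blocks, merge whatever results are already waiting.
        while True:
            try:
                totals.update(results.get_nowait())
                merged += 1
            except queue.Empty:
                break
    # Input is exhausted: wait for the stragglers and merge the rest.
    for worker in workers:
        worker.join()
    while merged < len(workers):
        totals.update(results.get())
        merged += 1
    return totals


if __name__ == "__main__":
    text = io.StringIO("the cat\nthe dog\nThe end\n")
    print(word_frequencies(text, chunk_size=5))
```

The only thing the threads share is the queue, and each worker touches it exactly once, so the locking cost is paid per chunk rather than per word.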