As appealing as such a notion might be at first blush, in my experience the results turn out to be disappointing. Basically, this sort of algorithm is I/O-bound: its execution time is limited by the speed of the disk drive and its associated drivers, and by nothing else. The CPU is loafing. If you now inject multiple processes or threads into the mix, you can actually make things worse, because the disk drive is now faced with a much less predictable workload ... odds are that it will be moving the read/write head back and forth far more often than if it were servicing requests from only one worker. A CPU that thinks in nanoseconds is now waiting for multiple milliseconds: one or more Ferraris, all stuck in traffic right next to a Yugo that might well be moving along faster than they are.
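One quick way to convince yourself that the CPU really is loafing is to compare wall-clock time against CPU time for a single pass over the data; if the CPU figure is a small fraction of the elapsed time, the disk is the bottleneck. A rough sketch (the file argument and the line-counting loop are just stand-ins for whatever your real workload does):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Time::HiRes qw(time);

    my $file = shift @ARGV or die "usage: $0 <bigfile>\n";

    my $wall_start = time();
    my ($user_start, $sys_start) = times();

    # Stand-in workload: one sequential pass over the file.
    open my $fh, '<', $file or die "open $file: $!";
    my $lines = 0;
    $lines++ while <$fh>;
    close $fh;

    my $wall = time() - $wall_start;
    my ($user_end, $sys_end) = times();
    my $cpu  = ($user_end - $user_start) + ($sys_end - $sys_start);

    printf "%d lines; wall %.2fs, CPU %.2fs (CPU busy %.0f%% of the time)\n",
        $lines, $wall, $cpu, $wall ? 100 * $cpu / $wall : 0;

Bear in mind that a warm OS cache will flatter the disk, so try it against a file bigger than RAM (or after a fresh boot) to see the real picture.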
The fact that you are now updating a single shared data structure is another pinch point. Although Perl can of course do this reliably, the workers are now obliged to wait not only for the disk drive but also for one another.
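To make that concrete, here is a minimal threads / threads::shared sketch; the shared %tally hash and the bucket keys are invented purely for illustration. Every worker must take the same lock before each update, so on top of waiting for the disk, they queue up behind one another:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use threads;
    use threads::shared;

    # One shared structure that every worker funnels results into.
    my %tally :shared;

    sub worker {
        for my $n (1 .. 10_000) {
            my $key = "bucket" . ($n % 10);
            {
                lock(%tally);    # all threads serialize right here
                $tally{$key}++;
            }
        }
    }

    my @threads = map { threads->create(\&worker) } 1 .. 4;
    $_->join for @threads;

    printf "%s => %d\n", $_, $tally{$_} for sort keys %tally;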
The best approach, discussed here very recently, is to leverage the operating system’s built-in file buffering as aggressively as possible, so that data is read in from the disk “in great big gulps,” and to devise the entire algorithm to minimize the need to move the read/write mechanism to some other cylinder on the platter. “When I/O is necessary, make it count.”
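Here is a hedged sketch of what those “great big gulps” might look like, assuming newline-delimited records; the 8 MB chunk size and the process_line stub are my own placeholders:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $file       = shift @ARGV or die "usage: $0 <bigfile>\n";
    my $chunk_size = 8 * 1024 * 1024;    # illustrative choice, tune to taste

    open my $fh, '<:raw', $file or die "open $file: $!";

    my $carry = '';    # partial record left over from the previous gulp
    while (sysread($fh, my $buf, $chunk_size)) {
        $buf = $carry . $buf;
        # Peel off any trailing partial line and save it for next time.
        $buf =~ s/([^\n]*)\z//;
        $carry = $1;
        process_line($_) for split /\n/, $buf;
    }
    process_line($carry) if length $carry;
    close $fh;

    sub process_line {
        my ($line) = @_;
        # ... whatever per-record work the job actually needs ...
    }

Because each sysread pulls one big buffer and the file is consumed strictly front to back, the access pattern stays sequential, which is exactly what the OS read-ahead machinery is optimized for.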
Just my two cents ...