| PerlMonks |
Re: To Fork or Not to Fork (bottle necks)
by tye (Sage) on Jul 01, 2014 at 04:17 UTC ( [id://1091794]=note )

Forking certainly can make processing a large number of large files go much faster. We have a system that does exactly that, and forking allows things to run many times faster. But then, our processing of files is mostly CPU-bound, as we are transcoding the files, and this runs on a system with 32 cores exactly because of this.

We just finished benchmarks on a revamp of this and it is about 4x faster than it used to be (despite it previously forking more workers than there are CPU cores). The old process worked pretty much exactly like Parallel::ForkManager. The new strategy pre-forks the same number of workers and just continuously feeds filenames to them over a simple pipe.

There are several advantages to the new approach.

The children are forked off of the parent process before it has built up the list of files to be processed, which will often be a huge list, so there are far fewer copy-on-write pages to eventually be copied.

The children live a very long time now, so there is less overhead from fork()ing (once per worker instead of once per file).

The above two features also mean that it makes sense for the children to be the ones to talk to the database, which is probably the biggest "win". It also significantly simplified the code.

If your processing of files is mostly I/O-bound, then doing a bunch of them in parallel could actually be slower than doing them in serial. Though, I would expect that your processing of one file isn't perfectly I/O-bound, and having at least two running will provide some speed-up, as often one can use the CPU while the other is waiting for I/O. Once you have enough processes that you are maxing out the available throughput of either your CPU cores or your I/O subsystem, adding more processes will just add overhead.

- tye
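
For anyone who wants to try the pre-fork idea, here is a minimal sketch of one way it could look. It is not the code from the system described above. The name run_pool, the one-pipe-per-worker layout, the round-robin feeding, and the file names in the usage lines are all assumptions made for illustration. Error handling is minimal, and file names containing newlines would break the line-based protocol.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper (not from the original post): pre-fork $n workers,
# then feed them items one line at a time over per-worker pipes.
#   $next_item : code ref returning the next item, or undef when done
#   $process   : code ref run in a child for each item it receives
# Returns the number of workers that exited with a non-zero status.
sub run_pool {
    my ($n, $next_item, $process) = @_;

    my @workers;
    for (1 .. $n) {
        pipe(my $reader, my $writer) or die "pipe: $!";
        my $pid = fork();
        die "fork: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: drop our copy of the write end, plus the write ends
            # inherited from earlier workers, so EOF is seen promptly.
            close $writer;
            close $_->{fh} for @workers;
            while (my $item = <$reader>) {
                chomp $item;
                $process->($item);
            }
            exit 0;
        }

        # Parent: keep only the write end.
        close $reader;
        $writer->autoflush(1);
        push @workers, { pid => $pid, fh => $writer };
    }

    # The work list is only produced after the forks, so the children
    # never inherit a copy of it. Round-robin feeding is an assumption.
    my $i = 0;
    while (defined(my $item = $next_item->())) {
        my $w = $workers[ $i++ % @workers ];
        print { $w->{fh} } "$item\n" or die "write to worker: $!";
    }

    # Closing the pipes gives each worker EOF; then reap them.
    close $_->{fh} for @workers;
    my $failed = 0;
    for my $w (@workers) {
        waitpid($w->{pid}, 0);
        $failed++ if $?;
    }
    return $failed;
}

# Usage with made-up file names; a real handler would do the transcoding
# (and any database work) inside the child.
my @todo = map { "input_$_.dat" } 1 .. 6;
run_pool(2, sub { shift @todo }, sub { print "worker $$ got $_[0]\n" });
```

Round-robin over blocking pipes is the simplest thing that works, but one slow worker can stall the feeder once its pipe buffer fills. A feeder that hands the next file name to whichever worker is idle would balance load better, at the cost of more code.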