in reply to Re: Parallel::ForkManager and CPU usage? in thread Parallel::ForkManager and CPU usage?
- Whether the processing will be CPU- or IO-bound depends a lot on the processing itself. If it is complex enough, the IO will be negligible.
- If the processes "spend most of their time waiting for disk-IO" and all those images sit on the same disk, then starting a lot of processes that all compete for that one disk is not the best thing to do. Disks nowadays have caches and clever firmware doing read-aheads and other tricks to minimize how much the heads have to move, but with enough processes reading big enough images you can easily render all that caching ineffective and spend the time waiting for the heads to seek to the next bit of one of the files. The fact that the tasks are IO-bound does not necessarily mean you should start many of them.
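With Parallel::ForkManager, keeping the process count small is just a matter of what you pass to the constructor. A minimal sketch, assuming a cap of 3 workers; the file names and process_image() are placeholders, and the counter via run_on_finish only shows that every child got reaped:

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# Assumption: for a disk-bound workload a small pool often beats
# one process per CPU; tune the cap for your disk, not your cores.
my $pm = Parallel::ForkManager->new(3);

my @images = ('a.jpg', 'b.jpg', 'c.jpg', 'd.jpg');   # placeholder names
my $done   = 0;
$pm->run_on_finish(sub { $done++ });   # runs in the parent per reaped child

for my $image (@images) {
    $pm->start and next;     # parent: move on to the next image
    process_image($image);   # child: hypothetical per-image work
    $pm->finish;             # child exits here
}
$pm->wait_all_children;      # also fires any remaining callbacks

sub process_image { my ($file) = @_; return }   # stub for illustration
```

Start with 2-3 and measure; if the disk really is the bottleneck, raising the number past that usually just makes the heads thrash.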
- If the processing takes long enough, then starting and destroying a new process for each and every image may not matter much, but it might still help to start eight processes and keep them alive instead. The easiest solution would be to split the list into eight parts at the start and launch one script to process each batch. With thousands of images of fairly random sizes, the batches should all finish at around the same time, give or take a few images.
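Splitting the list up front can be as simple as dealing the filenames out round-robin. A sketch, with a made-up batch count and file names; each resulting batch would then be handed to one long-lived worker script:

```perl
use strict;
use warnings;

# Deal a list of files into $n roughly equal batches, round-robin.
# With thousands of randomly sized images per batch, the batches
# should take about the same total time.
sub split_batches {
    my ($n, @files) = @_;
    my @batches = map { [] } 1 .. $n;
    my $i = 0;
    push @{ $batches[ $i++ % $n ] }, $_ for @files;
    return @batches;
}

my @images  = map { "img$_.png" } 1 .. 10;   # placeholder names
my @batches = split_batches(3, @images);
# $batches[0] holds img1, img4, img7, img10; the others get three each.
```

Round-robin is a deliberately dumb scheduler: it relies on the law of large numbers rather than on knowing the image sizes, which is exactly the "give or take a few images" argument above.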
Jenda
Enoch was right!
Enjoy the last years of Rome.