To achieve this with the current architecture I'm limited to about 12-15 concurrent processes
That limit seems too low for the task you want to accomplish, especially if you have a good internet connection. Have you actually tried raising it to 30 or even 50? Forking is not that expensive on modern Unix/Linux systems that support copy-on-write (COW).
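One way to try a higher limit is to make it a single variable in the fork loop and measure. This is only a rough sketch, not your actual code: $max_procs, @jobs and process_job() are made-up names standing in for whatever your program really does.

```perl
use strict;
use warnings;

# Hypothetical settings: raise $max_procs (30, 50, ...) and measure throughput.
my $max_procs = 30;
my @jobs      = (1 .. 60);    # stand-ins for the real work items

# Hypothetical placeholder for the real per-item task; returns true on success.
sub process_job {
    my ($job) = @_;
    return 1;
}

my $running = 0;
my $failed  = 0;

for my $job (@jobs) {
    # At the limit: block until one child finishes before forking again.
    if ($running >= $max_procs) {
        my $done = wait();
        $failed++ if $done != -1 && $? != 0;
        $running--;
    }

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: do one job and report success or failure via the exit status.
        my $ok = process_job($job);
        exit($ok ? 0 : 1);
    }
    $running++;
}

# Reap the children that are still running; wait() returns -1 when none remain.
while ((my $done = wait()) != -1) {
    $failed++ if $? != 0;
}

print "failed jobs: $failed\n";
```

With the limit in one place it is easy to rerun with 15, 30 and 50 and compare wall-clock times.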
Update: actually, much of the overhead generated by the forked processes may come from perl cleaning everything up at exit. On Unix this cleanup is mostly useless, and you can skip it by calling
exec $ok ? '/bin/true' : '/bin/false';
instead of
exit($ok) to finish the child processes. Just remember to close any files you have written to first.
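Put together, a child could look roughly like this. It is a minimal sketch under some assumptions: the output path is made up for the example, and /bin/true and /bin/false are assumed to exist at those paths (they do on typical Linux systems, but may live elsewhere on other Unixes).

```perl
use strict;
use warnings;

# Hypothetical output file, only for this example.
my $out = "/tmp/worker-demo-$$.txt";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: do the work and write the result.
    open(my $fh, '>', $out) or die "cannot open $out: $!";
    print {$fh} "done\n";

    # Close explicitly before exec; close() returns false if the write failed.
    my $ok = close($fh);

    # Replace the perl process with a tiny program, so perl's own
    # end-of-process cleanup never runs. The exit status comes from
    # /bin/true (0) or /bin/false (non-zero).
    exec($ok ? '/bin/true' : '/bin/false') or die "exec failed: $!";
}

# Parent: collect the child's exit status as usual.
waitpid($pid, 0);
my $status = $? >> 8;
print "child exit status: $status\n";

unlink $out;
```

The parent cannot tell the difference: it still sees 0 for success and non-zero for failure through $?.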