PerlMonks
Storable problem of data sharing in multiprocess
by hhheng (Initiate)
on Oct 03, 2014 at 08:49 UTC ( id://1102710 )
hhheng has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to develop a script that grabs URLs from a website. Since it is a very big site, I need to fork many processes and use Storable to share the data among them. The parent process fetches the main page for some URLs and puts them into a hash and an array. The hash holds the URLs, while the array holds the same URLs but is used for iteration: each process shifts one URL at a time, and an empty array marks the end of the iteration. Each child process then fetches the linked pages and puts new URLs into the hash and array. The design is that the child processes work on a shared hash and array, but in my script each child actually gets its own copy of the hash and array from the parent process. See script below:

Please see the test result at this link: http://www.aobu.net/cgi-bin/test_gseSM.pl. You can see that each child process is doing the same work, without sharing %urls and @unique_urls between them.
Testing the code on a small website, I found that each forked child process gets its own %urls and @unique_urls, copied from the parent process at the point of the fork. My aim is that each child process writes to a shared %urls, shifts URLs from and pushes URLs onto a shared @unique_urls, and sees the modifications the other child processes have made to %urls and @unique_urls.
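The behaviour described above is fork's normal copy semantics: each child gets an independent copy of the parent's variables, so a child's writes never reach the parent or its siblings. A minimal, hypothetical demo (not part of the original script, using a made-up URL) of that copying:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# After fork, each process has its OWN copy of %urls -- a child's
# writes are invisible to the parent and to sibling processes.
my %urls = ( 'http://example.com/' => 0 );

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ( $pid == 0 ) {
    # Child: marks the URL as fetched in ITS copy of the hash.
    $urls{'http://example.com/'} = 1;
    exit 0;
}

waitpid( $pid, 0 );

# Parent still sees the original value: fork copied the data.
print "parent sees: $urls{'http://example.com/'}\n";   # prints 0
```

This is why every child in the test output appears to start from the same %urls and @unique_urls: they all inherited a snapshot taken at fork time, and nothing propagates changes back.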
I don't want to use other modules such as IPC::Shareable, Parallel::ForkManager, etc. to achieve this; I just want to use fork and the Storable module. Can anybody tell me what's wrong with my script?
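With only fork and Storable, one common approach is to keep the shared state in a Storable file on disk and guard every read-modify-write cycle with flock, so that exactly one process updates the state at a time. A hedged sketch under that assumption (file names and URLs here are illustrative, not from the original script):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl    qw(:flock);
use Storable qw(store retrieve);

# Shared state lives in a Storable file; a separate lock file
# serialises the read-modify-write cycles of the workers.
my $state_file = 'urls.stor';
my $lock_file  = 'urls.lock';

store( { 'http://example.com/' => 0 }, $state_file );

sub with_lock {
    my ($code) = @_;
    open my $lk, '>', $lock_file or die "open lock: $!";
    flock( $lk, LOCK_EX ) or die "flock: $!";
    my $urls = retrieve($state_file);    # load the shared state
    $code->($urls);                      # mutate it under the lock
    store( $urls, $state_file );         # write it back
    close $lk;                           # releases the lock
}

my @kids;
for my $n ( 1 .. 3 ) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {                   # each child adds one URL
        with_lock( sub { $_[0]->{"http://example.com/page$n"} = 1 } );
        exit 0;
    }
    push @kids, $pid;
}
waitpid( $_, 0 ) for @kids;

my $final = retrieve($state_file);
print scalar( keys %$final ), " urls\n";   # prints "4 urls"
unlink $state_file, $lock_file;
```

The key design point is that retrieve and store alone are not enough: two children that each retrieve, modify, and store without a lock will silently overwrite each other's updates, which is why the flock wraps the whole cycle rather than just the file writes.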
Back to Seekers of Perl Wisdom