PerlMonks  

[solved] Parallel::ForkManager memory mixup?

by ground0 (Novice)
on Mar 28, 2013 at 18:36 UTC ( [id://1026015]=perlquestion )

ground0 has asked for the wisdom of the Perl Monks concerning the following question:

Here I have nested forks, with objects created before the for loops. When $html = $mech->content; and $html .= $mech->content; run, the concurrent $mech objects seem to get mixed up. The result is the wrong document elements in my temporary files.

This is an upgrade to my existing production code, which runs only one Firefox instance and does not use multiple Firefox profiles or MozRepl TCP/IP ports.

    foreach $country_key (keys %$countries) {
        $fork_countries->start and next;
        ...
        foreach $case_no (keys %$case) {
            $fork_cases->start and next;
            # do DBIx stuff to get random $ff_profile / avoid race condition
            $ff_port = 42420 + $ff_profile;
            &myMain($ff_profile, $ff_port);
            $fork_cases->finish;
        }
        $fork_cases->wait_all_children;
        $fork_countries->finish;
    }
    $fork_countries->wait_all_children;

    sub myMain {
        # read the profile and port passed in by the caller
        my ($ff_profile, $ff_port) = @_;
        $mech = WWW::Mechanize::Firefox->new(
            launch    => ['firefox', '-P', $ff_profile, '-no-remote'],
            repl      => "localhost:$ff_port",
            bufsize   => 10_000_000,
            tab       => 'current',
            autoclose => 1,
        );
        $mech->get($url);
        $url = $new_url;
        ### KLUDGED HERE ###
        #
        # $temp = $mech->uri;
        # (undef, $uri) = split(/(\?.*)/, $temp);
        # $url .= $uri;
        #
        ###
        $html = $mech->content;
        $mech->get($url);    # or eval, click(), etc.
        $html .= $mech->content;
        # open temp filehandle and print $html
        # (temp filenames padded with String::Random)
        $mech->get(temp_filename);
        $png = $mech->content_as_png;
        # add png to PDF with PDF::API2
        # write PDF file
    }

According to what I can find on Parallel::ForkManager, each forked $mech and $html should be separate. However, when the $url is the same in each fork, either the $mech object or the $html scalar gets mixed up.

Then, when I call some other method, say $mech->eval(), followed by $html .= $mech->content;, the $html I want to print to the temp file is mixed with content from what I thought should be totally separate Firefox profiles and HTTP sessions.
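The expectation stated above, that a scalar modified in one forked child stays private to that child, can be checked in isolation. This is a minimal sketch, not the poster's code: it uses core `fork` directly instead of Parallel::ForkManager (which is built on `fork`) so it runs without CPAN modules, no browser is involved, and `run_isolated_children` plus the `parent;` / `child-N;` strings are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use POSIX ();

# Hypothetical helper: fork $count children, let each append to its own
# copy of $html, and report what every child wrote to its own file.
sub run_isolated_children {
    my ($count) = @_;
    my $dir  = tempdir(CLEANUP => 1);
    my $html = 'parent;';    # set before forking, so every child starts with a copy
    my @pids;

    for my $id (1 .. $count) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child process: this append changes only the child's copy.
            $html .= "child-$id;";
            open my $fh, '>', "$dir/$id.html" or POSIX::_exit(1);
            print {$fh} $html;
            close $fh or POSIX::_exit(1);
            POSIX::_exit(0);    # leave without running the parent's END blocks
        }
        push @pids, $pid;
    }
    waitpid($_, 0) for @pids;

    # Parent: collect what each child saw.
    my %seen;
    for my $id (1 .. $count) {
        open my $fh, '<', "$dir/$id.html" or die "Cannot read child $id output: $!";
        local $/;               # slurp the whole file
        $seen{$id} = <$fh>;
        close $fh;
    }
    return { results => \%seen, parent => $html };
}

my $report = run_isolated_children(3);
print "child $_ wrote: $report->{results}{$_}\n" for sort keys %{ $report->{results} };
print "parent still has: $report->{parent}\n";
```

If each child's file contains only its own marker, in-process memory is not being shared, and any mixing would more plausibly come from something shared outside the process, such as a browser instance, a port, or a temp file name.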

Replies are listed 'Best First'.
Re: Parallel::ForkManager memory mixup?
by Corion (Patriarch) on Mar 28, 2013 at 22:55 UTC
    If that is really your code, and $ff_port is the same for all Firefox instances, you will only be using one instance, and I would expect such mixups.

      No, this was just pseudocode to set up an example. I have pre-created 64 Firefox profiles, each with MozRepl running on a different port. $ff_profile is picked in a DBI call, and $ff_port is 42420 plus that number, which matches the profile's MozRepl setting.

      I've watched it run, and it visually appears to run perfectly. I've limited the number of forks to 25 (5x5), and WWW::Mechanize::Firefox appears to execute every method properly. The exception is that when a set of forks goes to work on cases with the same $url, the $html for each fork is a jumble of the $html of the other forks. This results in a temporary HTML file with a mix of case details, which ends up in the proper PDF.

      I updated the pseudocode to reflect that I am indeed using multiple unique instances/profiles of Firefox.
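One way to be sure that no two forks ever end up with the same profile and port is to make the claim step atomic. The sketch below is an assumption-laden stand-in for the DBI call described above, not the poster's method: it picks the first free profile rather than a random one, uses lock files created with `O_EXCL` instead of a database, and assumes profiles are numbered from 1. `claim_profile`, `release_profile`, and the lock file names are hypothetical; only the `42420 + $ff_profile` port rule comes from the post.

```perl
use strict;
use warnings;
use Fcntl qw(O_CREAT O_EXCL O_WRONLY);
use File::Temp qw(tempdir);

my $BASE_PORT = 42420;    # from the post: $ff_port = 42420 + $ff_profile

# Hypothetical helper: claim the first free profile number in 1..$max.
# O_EXCL makes sysopen fail if the lock file already exists, so two
# processes cannot both claim the same profile.
sub claim_profile {
    my ($lock_dir, $max) = @_;
    for my $profile (1 .. $max) {
        my $lock = "$lock_dir/profile-$profile.lock";
        if (sysopen(my $fh, $lock, O_CREAT | O_EXCL | O_WRONLY)) {
            print {$fh} "$$\n";    # record the owner PID for debugging
            close $fh;
            return ($profile, $BASE_PORT + $profile);
        }
    }
    return;    # empty list: every profile is in use
}

# Hypothetical helper: give a profile back once the child is done with it.
sub release_profile {
    my ($lock_dir, $profile) = @_;
    return unlink "$lock_dir/profile-$profile.lock";
}

my $lock_dir = tempdir(CLEANUP => 1);
my ($ff_profile, $ff_port) = claim_profile($lock_dir, 64);
print "claimed profile $ff_profile on port $ff_port\n";
release_profile($lock_dir, $ff_profile);
```

Logging the claimed profile, port, and PID for each case would also show after the fact whether two children ever overlapped.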

      Tonight I'm going to run a test where I write the temp HTML file immediately after each $html = $mech->content; and see if that has any effect.
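A minimal sketch of that kind of test, assuming nothing about the real script beyond the variables named above: every fetch is written to its own file, tagged with the PID, case number, and step, so that a jumbled result can be traced back to the process and step that produced it. `dump_snapshot` and the file naming scheme are hypothetical. Putting the PID in the name also rules out two children writing to the same temp file, which is one possible source of mixing when names are generated randomly.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: write one snapshot per fetch, named by case, PID and step.
sub dump_snapshot {
    my ($dir, $step, $case_no, $html) = @_;
    my $path = sprintf '%s/case-%s-pid-%d-step-%02d.html',
        $dir, $case_no, $$, $step;
    open my $fh, '>:encoding(UTF-8)', $path
        or die "Cannot write $path: $!";
    # Leading comment records which process and step produced this content.
    printf {$fh} "<!-- pid=%d case=%s step=%d -->\n", $$, $case_no, $step;
    print {$fh} $html;
    close $fh or die "Cannot close $path: $!";
    return $path;
}

# Usage with placeholder content; in the real script this would be
# called right after each $mech->content.
my $snap_dir = tempdir(CLEANUP => 1);
my $first  = dump_snapshot($snap_dir, 1, 'demo', '<p>first fetch</p>');
my $second = dump_snapshot($snap_dir, 2, 'demo', '<p>second fetch</p>');
print "wrote $first\nwrote $second\n";
```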

        Didn't affect anything, meh. :(

        If anyone has a recommended test URL with a few different elements to grab I can tune up the pseudo-code above into something runnable that replicates the issue (assuming this is interesting enough hah).
