Please Explain the Parallel::ForkManager Idiom my $pid = $pm->start and next;

Jim has asked for the wisdom of the Perl Monks concerning the following question:


Re: Please Explain the Parallel::ForkManager Idiom my $pid = $pm->start and next;
by hdb (Monsignor) on Feb 04, 2014 at 07:47 UTC

When $pm->start returns, two processes exist: the parent and the child. In the parent process, $pm->start returns the child's process ID, which is a non-zero value (i.e. true), so the and operator goes on to execute next. In the child process, $pm->start returns 0 (i.e. false), so and short-circuits: when its first argument is already false, and returns false without evaluating the second argument. This way, next is not executed in the child, and the remaining body of the loop runs instead.

So, you are correct, next is skipped (in the child) and not skipped (in the parent).
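A minimal sketch of the short-circuit described above, using plain fork() as a stand-in for $pm->start so it runs without the module. The helper start_child, the job names, and the exit codes are hypothetical and only there for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $pm->start: plain fork() follows the same
# convention (child's PID in the parent, 0 in the child).
sub start_child {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    return $pid;
}

my @names = qw(Fred Jim Lily);    # hypothetical job names

for my $i (0 .. $#names) {
    # Parent: non-zero PID is true, so `and` goes on to run `next`.
    # Child:  0 is false, so `and` short-circuits and `next` is skipped.
    my $pid = start_child() and next;

    # Only a child gets here. It exits with a code that identifies it,
    # much as the child would call $pm->finish in the real module.
    exit 10 + $i;
}

# Only the parent gets here: reap every child and collect the exit codes.
my @codes;
while (wait() != -1) {
    push @codes, $? >> 8;
}
@codes = sort { $a <=> $b } @codes;
print "children exited with: @codes\n";
```

The parent runs the loop header three times and never touches the exit line; each child runs the exit line exactly once and never sees the rest of the loop.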


Re^2: Please Explain the Parallel::ForkManager Idiom my $pid = $pm->start and next;

by Jim (Curate) on Feb 04, 2014 at 08:02 UTC

Thanks. I still don't get it. I'm having a more profound problem than just not understanding the idiom. I don't understand the ubiquitous expressions I read over and over again in the documentation and in PerlMonks tutorials and posts: "…in the parent process…", "…in the child process…", etc. These make no sense to me. And running the example scripts and observing their behavior isn't helping, but instead making it worse because the behavior is utterly counterintuitive to me. The particular example code I'm running from the module's perldoc page has sleep()s in it that I know the code is reaching, but are never actually happening. There's never a pause in the execution of the program. It blows right past the sleep()s.


Re^3: Please Explain the Parallel::ForkManager Idiom my $pid = $pm->start and next;

by karlgoethebier (Abbot) on Feb 04, 2014 at 09:17 UTC

Mh, imho the example with the callbacks is very nice and instructive. Did you run it and watch top during execution?

Regards, Karl

«The Crux of the Biscuit is the Apostrophe»


Re^4: Please Explain the Parallel::ForkManager Idiom my $pid = $pm->start and next;

by Jim (Curate) on Feb 04, 2014 at 17:10 UTC

Re^5: Please Explain the Parallel::ForkManager Idiom my $pid = $pm->start and next;

by karlgoethebier (Abbot) on Feb 04, 2014 at 18:24 UTC


Re: Please Explain the Parallel::ForkManager Idiom my $pid = $pm->start and next;
by Anonymous Monk on Feb 04, 2014 at 08:07 UTC

Mr. Peabody Explains fork()


Re^2: Please Explain the Parallel::ForkManager Idiom my $pid = $pm->start and next;

by Jim (Curate) on Feb 04, 2014 at 16:56 UTC

Gosh, I remember reading bart's excellent tutorial Mr. Peabody Explains fork() years ago when fork() was relatively new to Perl on Windows. Obviously, I wasn't able to wrap my head around it then, and I continue to struggle with it today. But I'm more determined now than I was then to understand how it works.

At least one light bulb has gone on above my head after reading this and seeing its accompanying graphic in blue and green text:

What fork() does is extraordinary! It takes the existing process and clones it. If you're familiar with Star Trek, this is like a bad transporter accident. An exact copy is made of each process, and each process is almost unaware of the other.

The code is executed instruction-by-instruction until the fork() call is reached. At that point, the process is cloned and now there are two identical processes running the instruction right after the fork().

In fact, they're so identical that only one bit of information distinguishes between them: the return value of fork(). In the child process fork() appears to have returned 0. In the parent process, fork() appears to have returned a non-zero integer. This is how parent and child tell themselves apart.

The important difference here is the explanation that "the code is executed instruction-by-instruction until the fork() is reached." This is the first time I've seen it plainly stated that fork() alters the ordinary sequential execution of statements in a Perl program. This helps me begin to understand what's going on in the mystifying example program I'm studying.

I likened fork() in Perl to job control in the Unix shell, but I'm now realizing they're not the same thing at all. A shell loop that launches commands in the background with & is not the same as a fork() in Perl.

Jim


Re^3: Please Explain the Parallel::ForkManager Idiom my $pid = $pm->start and next;

by Anonymous Monk on Feb 05, 2014 at 00:20 UTC

Here is a forking program expressed as a tree (this is parent)

code before fork
fork call
code after fork

The child process execution is

fork call
code after fork

When the fork call is reached, the parent process creates a child process (clone), and returns a pid for the new child process

The child process doesn't clone/create a new child process (only parent does that), and fork returns zero

After that the two processes are identical, and each of them continues execution from the point of fork ... they both execute code after fork

The child doesn't start from the beginning to execute the code before fork (it's not a new process, it's a clone of the parent)

When you run the program here is the execution order as a tree

before fork
fork() ....
- - parent clones child, and gets not-zero (pid of child)
  - code after fork is executed
- - child gets zero (only parent forks/clones itself)
  - code after fork is executed

Or the same in table form :) The parent process is started and it runs

code before fork                       |
fork call (I parent clone child here)  | I child gets 0 and I run in parallel
code after fork                        | code after fork
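The execution order above can be tried out with a small sketch like the following. The pipe is only an assumption made for illustration, so that the child can report back what it saw; the messages are made up.

```perl
use strict;
use warnings;

$| = 1;    # unbuffer STDOUT so pending output is not copied into the child

# code before fork: only the original process runs this
print "[$$] code before fork\n";

pipe(my $reader, my $writer) or die "pipe failed: $!";

# fork call: the parent gets the child's PID, the child gets 0
my $pid = fork();
die "fork failed: $!" unless defined $pid;

# code after fork: both processes run this line
my $role = $pid ? 'parent' : 'child';
print "[$$] code after fork, I am the $role\n";

if ($pid == 0) {
    # Child: report the value fork() returned here, then stop.
    close $reader;
    print {$writer} "child saw fork return $pid\n";
    close $writer;
    exit 0;
}

# Parent: read the child's report and reap the child.
close $writer;
my $from_child = <$reader>;
close $reader;
waitpid($pid, 0);
print "[$$] parent saw fork return $pid; $from_child";
```

The "before fork" line appears once, the "after fork" line appears twice with two different process IDs, and the order of the two "after fork" lines can vary from run to run.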


Re^3: Please Explain the Parallel::ForkManager Idiom my $pid = $pm->start and next;

by soonix (Canon) on Feb 06, 2014 at 10:45 UTC

plainly stated that fork() alters the ordinary sequential execution of statements

No. It doesn't state that. It emphasizes that both before and after the fork() there is no difference except for fork's return value.

The only alteration is: after the fork(), there are two identical copies of those instructions. If there were no if around the fork() (and you did not save the return value to a variable to evaluate it afterwards), you would not be able to distinguish between them. The "ordinary sequential execution" is not altered.


Re: Please Explain the Parallel::ForkManager Idiom my $pid = $pm->start and next;
by sundialsvc4 (Abbot) on Feb 04, 2014 at 15:10 UTC

It can be confusing, sure. Think of it this way: when the parent process start()s a child process, there are, from that moment forward, two processes executing more-or-less parallel with one another. (I say “more-or-less” because the relative timing of the two can’t be exactly predicted ...) Nevertheless, both are executing the same Perl code, at the same point, but with exactly one very important difference:

In the parent, the process-ID of the new child process is returned as the value of the call to start().
In the child, the returned value is zero.
And that’s how the two of them each know which one they are.

The “fork in the road” (heh...) therefore happens with and next. This is shorthand, exploiting Perl’s use of “short-circuit” expression evaluation. The parent immediately continues with the foreach loop (or ends it); the child proceeds. (Here’s the short-circuit: since 0 and anything is known to be false, the right side of the and clause (that is to say, the next statement) is never evaluated in the child.)

Ordinarily, the console-output of both processes will now be intermingled on the screen, in no particular (exactly predictable) order, and yes, you will see that the child does sleep() properly, as any process would do. If you still don’t observe that, please post a snippet of your code (remember to use <code> tags ...) so that we can point out the error of your ways. It is a bit tricky ...
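As a rough sketch of why the sleep() calls can seem to be skipped: the children really do sleep, but the parent's loop never does. This uses plain fork() rather than Parallel::ForkManager, and the timing variables are hypothetical names chosen for the illustration.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

my $start = time;
my @pids;

for my $n (1 .. 3) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid) {
        # Parent: remember the child and go straight to the next iteration.
        push @pids, $pid;
        next;
    }

    # Child: this sleep pauses only the child, never the parent.
    sleep 1;
    exit 0;
}

# Parent only: how long it took to launch all the children.
my $loop_elapsed = time - $start;

# Parent only: now wait for every child to finish its sleep.
waitpid($_, 0) for @pids;
my $total_elapsed = time - $start;

printf "loop finished after %.2fs, all children done after %.2fs\n",
    $loop_elapsed, $total_elapsed;
```

The loop itself finishes almost immediately; the one-second pause only shows up once the parent waits for its children, and because the three children sleep in parallel the total is about one second rather than three.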

Re^2: Please Explain the Parallel::ForkManager Idiom my $pid = $pm->start and next;

by Jim (Curate) on Feb 04, 2014 at 20:20 UTC

It's not terribly important to this discussion that I mention this, but the and next bit is not part of what's confusing me. I understand this part of the idiom.

However, instead of doing this…

CHILD:
foreach my $child ( 0 .. $#names ) {
    my $pid = $pm->start($names[$child]) and next;

    # Child process...
}

…I would typically prefer to do this…

CHILD:
for my $child ( 0 .. $#names ) {
    my $pid = $pm->start($names[$child]);
    
    next CHILD if $pid != 0; # Parent process

    # Child process...
}

I think this is more in line with PBP, but I could be mistaken.

Jim


Re^3: Please Explain the Parallel::ForkManager Idiom my $pid = $pm->start and next;

by runrig (Abbot) on Feb 04, 2014 at 23:35 UTC

for my $name (qw(foo bar baz)) {
  $pm->start($name) and next;
  ...
}


Re^4: Please Explain the Parallel::ForkManager Idiom my $pid = $pm->start and next;

by Jim (Curate) on Feb 05, 2014 at 01:46 UTC

Re^5: Please Explain the Parallel::ForkManager Idiom my $pid = $pm->start and next;

by runrig (Abbot) on Feb 05, 2014 at 19:18 UTC