Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options

Comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
Firstly: perl 5's threads are in a pretty grim shape - they're slow, they're quirky, and lots of stuff isn't as easy as it should be. I would recommend against using them if you can find another alternative.

Second: since you asked for an explanation of how fork is efficient, here's how it works:

Processors have real and protected addressing modes. Real mode is what people would normally expect - every pointer points to an address in memory. Protected mode is what is actually used 99% of the time in practice. Under protected mode, fetching a memory address is an indirect operation - the address is translated from the virtual address to a real one by the MMU (the memory management unit).

This is sensitive to the current process, and the way the notion of the current process is defined varies from processor to processor.

Memory is handled by the MMU in chunks called pages, which are typically 4kb long. These are the smallest unit the MMU will take care of in it's virtual/real addressing scheme.

When a process forks (under true fork, not vfork which I will shortly discuss) all of it's memory is set to be read only, and the process itself is duplicated, returning different values to each process. All the At this point almost no data is updated - the pages themselves are marked read only, but this is quick and trivial, and the process handle in the kernel is duplicated, and this is not a large structure, in comparison to the amount of memory a process can actually consume.

Whenever the processor tries to store a value in a read-only memory page the MMU raises a fault, which the kernel has to handle. The kernel will ask the MMU to copy the page, to a new location in virtual memory (all of the physical memory and the swap space), but makes the mapping appear, to the process, to be at the same memory location.

When the memory page has been copied, it is safe to actually write to it. By keeping a reference count of the number of processes sharing each page you can also unlock read-only pages when they are accessible by only one process.

vfork is a version of fork that is suited for fork and exec sequences... When you call 'vfork' nothing is copied - instead the child process has complete control of the parent's resources (file descriptors, memory, etc), until it calls exec to load another process image into memory.

When the child process finally executes the new program the parent process is resumed.

The reason the parent is suspended while the child is running is that since the addressing space is shared, so is the stack. Since the stack (together with the instruction pointer) represents the state of the process, including the return value from 'vfork' (and any called function for that matter), this value cannot be both 0 for the child and the child's process ID for the parent at the same time.

Slightly off topic but nevertheless interesting is the page swap system of an OS with virtual memory - when a page is resolved by the MMU, and it isn't in physical memory, a page fault is sent to the kernel. The kernel is then responsible for loading the memory page from disk, by swapping it with a physical memory page. When the data has been loaded to physical memory the access to the memory can finish.

Some notes:

  • Modern, complex MMUs can probably do copy-on-write for the kernel without raising a fault
  • vfork used to be a replacement for fork when copying the whole process was unnecessary but still costly. Since the process is just going to exec a new process image, there is no point in copying everything and then throwing it away. plain old fork() should be almost as efficient, more responsive, and probably safer too.
  • vfork might have other semantics on a platform that isn't MacOS X, i'm too lazy to check

Update: Ven'Tatsu cought a wordo confusing protected mode with real mode.

zz zZ Z Z #!perl

In reply to Re: Threads and fork and CLONE, oh my! by nothingmuch
in thread Threads and fork and CLONE, oh my! by xdg

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and all is quiet...

    How do I use this? | Other CB clients
    Other Users?
    Others contemplating the Monastery: (7)
    As of 2017-02-25 22:11 GMT
    Find Nodes?
      Voting Booth?
      Before electricity was invented, what was the Electric Eel called?

      Results (369 votes). Check out past polls.