latest on ithreads vs forks?

by Anonymous Monk
on May 26, 2004 at 21:33 UTC (#356736=perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Good day, monks. I've seen some of the old discussion threads about the virtues/short-comings of perl's implementation of ithreads, and was wondering if changes released since most of these discussions (late '03) have converted any of the fork devotees.

To summarize what I got out of earlier talks, my impression was that perl's non-COW address space copying with ithreads made them actually "heavier" than the optimized fork() routines on most *nix systems. Data sharing was more convenient, but otherwise ithreads seemed to be frowned upon. Liz's forks.pm package, which implements the threads.pm interface using forks and sockets, seemed the more popular approach.

Now, admittedly, looking at the release notes, it seems like all we've really gotten is memory leak fixes for ithreads, not performance help. Nevertheless, when I run the same code with "use threads" vs "use forks", I get better perf with threads, regardless of any sharing. I'm still working on getting reliable cpu time numbers (it's tough to track child processes), but wall clock time on the same system is consistently faster for threads, and threads get 1/4 the page faults (more or less).

Does this seem consistent with what others would expect? Thanks.

Re: latest on ithreads vs forks?
by perrin (Chancellor) on May 26, 2004 at 21:48 UTC
    You have to remember that Liz's "forks.pm" module, while very cool, is just one possible implementation of sharing data across forked processes. Depending on what you are doing, it may be possible to do something much more efficient. The forks.pm module is constrained by the fact that it is trying to match the threads API and to be portable across many OSes. Those are not important considerations for most in-house apps.
      True, but even if I fire off threads/processes that just print a message and do no sharing, ithreads were faster. That suggests to me that the overhead of forks is greater than that of threads.
        Are you on Windows? Forking is mostly faked on Windows. Are you loading any modules before using threads? Threads cause all loaded code to be copied into each spawned interpreter. Did you try a simple $pid = fork() approach? There may be overhead in forks.pm due to the emulation of the threads API.
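The "simple $pid = fork()" comparison suggested above could look roughly like the sketch below. It is not code from the thread: the `run_children` name is made up for illustration, and it assumes a Unix-like system where fork() is native. Each child only prints a message and exits, with no forks.pm layer and no sharing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical plain-fork counterpart of the thread test: no forks.pm,
# no sharing, each child just prints a message and exits.
sub run_children {
    my ($num_children) = @_;
    my @pids;
    for my $n (0 .. $num_children - 1) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # child process
            printf "Child %d started (pid %d)\n", $n, $$;
            exit 0;
        }
        push @pids, $pid;    # parent remembers the child
    }
    # wait for every child, counting how many exited cleanly
    my $ok = 0;
    for my $pid (@pids) {
        waitpid($pid, 0);
        $ok++ if $? == 0;
    }
    return $ok;
}

my $num_children = shift(@ARGV) || 10;
my $finished = run_children($num_children);
print "Parent done, $finished children exited cleanly\n";
```

Timing this against the threads.pm and forks.pm runs would help separate the cost of fork() itself from the cost of emulating the threads API.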
Re: latest on ithreads vs forks?
by Joost (Canon) on May 26, 2004 at 22:10 UTC
      Sure thing. I'm running on a P4 with Linux 2.4.9. Here's a simple test I tried, using perl 5.8.4:

      # which threading model do we want?
      use threads;
      #use forks;

      $num_threads = shift(@ARGV) || 10;

      foreach (0..$num_threads-1) {
          # create a new thread
          push(@threads,threads->new(\&thread_sub));
      }

      foreach $thread (@threads) {
          # wait for threads to finish
          $thread->join();
      }

      printf "Orig thread done\n";

      sub thread_sub {
          printf "Thread %d started\n",threads->tid();
      }

      When I run time on a series of runs with threads.pm/forks.pm, I can't tell how much cpu time forks.pm really uses because I only get the cpu time of the parent (working on that), but the average page faults look like this:

      threads.pm:
        Major (requiring I/O) page faults:       387.00
        Minor (reclaiming a frame) page faults: 1151.20
      forks.pm:
        Major (requiring I/O) page faults:       470.00
        Minor (reclaiming a frame) page faults: 4591.20

      Wall clock time, for what that's worth (these are all run on the same machine), is consistently about 4x longer for forks than for threads.
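One possible way around the "only the parent's cpu time" problem is Perl's built-in `times`, which also reports the user and system time of child processes that have been waited for. The sketch below is an illustration with a made-up helper name (`child_cpu_after_fork`) and a plain fork(); whether it accounts for everything forks.pm spawns has not been checked here.

```perl
use strict;
use warnings;

# Burn some CPU in a forked child, then read it back in the parent.
sub child_cpu_after_fork {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        my $x = 0;
        $x += sqrt($_) for 1 .. 2_000_000;   # busy work in the child
        exit 0;
    }
    waitpid($pid, 0);   # child times are only accounted once it is reaped
    # times() returns: parent user, parent system, children user, children system
    my ($user, $system, $cuser, $csystem) = times();
    return ($user, $system, $cuser, $csystem);
}

my ($u, $s, $cu, $cs) = child_cpu_after_fork();
printf "parent: usr=%.2fs sys=%.2fs   children: usr=%.2fs sys=%.2fs\n",
    $u, $s, $cu, $cs;
```

The child figures cover only children that have terminated and been reaped, so they should be read after the last join or wait.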

        Can you try loading up some heavy modules like DBI, HTML::Parser, LWP, and IO::Handle, all before spawning threads, and see what happens? Wall time seems like a reasonable metric to me. Also, you might want to compare memory used.
      Latency in the inter-thread communication in forks is much higher than with threads, as it uses a TCP/IP socket and Storable for it.

      Thread startup, especially if you have a lot of modules loaded, should be faster with forks, as there is no copying of data structures when a thread is started with forks (it just does a fork()). However, every thread with forks has a bit of latency at startup, because of setting up the inter-thread communications socket.

      Hope this explains it a bit.

      Liz
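To make the latency point concrete, here is a minimal sketch of passing one data structure between processes by serializing it with Storable and writing it to a socket. It is not forks.pm's actual code: it uses a local `socketpair` as a stand-in for the TCP/IP socket mentioned above, and the `fetch_from_child` helper and the length-prefix framing are assumptions made for the example.

```perl
use strict;
use warnings;
use Socket;
use Storable qw(freeze thaw);

# Send one Perl data structure from a forked child to its parent:
# serialize with Storable, prefix with a 4-byte length, write to a socket.
sub fetch_from_child {
    my ($data) = @_;
    socketpair(my $child_end, my $parent_end, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair failed: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # child: freeze the structure and send it
        close $parent_end;
        binmode $child_end;
        my $frozen = freeze($data);
        print {$child_end} pack('N', length $frozen), $frozen;
        close $child_end;
        exit 0;
    }

    # parent: read the length header, then the payload, then thaw it
    close $child_end;
    binmode $parent_end;
    read($parent_end, my $header, 4) == 4 or die "short read on length";
    my $len = unpack('N', $header);
    read($parent_end, my $frozen, $len) == $len or die "short read on payload";
    close $parent_end;
    waitpid($pid, 0);
    return thaw($frozen);
}

my $copy = fetch_from_child({ counter => 42, list => [1, 2, 3] });
print "parent received counter=$copy->{counter}\n";
```

Every exchange pays for a freeze, a write, a read, and a thaw, which is the kind of per-access overhead that an in-process shared variable avoids.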

Re: latest on ithreads vs forks?
by BrowserUk (Pope) on May 27, 2004 at 00:34 UTC

    Update: In case it is unclear, this post refers ONLY to the situation when using ithreads.

    (Though it's also worth remembering that under Win32, fork is emulated using ithreads under the covers.)

    It's worth noting that unless you intend to use a module in multiple threads, there is little point in loading that module prior to spawning your threads. It only consumes extra memory.

    If you do need to use a module in multiple threads, remember that you cannot call methods across threads, or share objects. That is to say, if you create an object in one thread, and share it with another thread and then try to invoke methods upon the object in the second thread, it won't do what you want it to.

    It is almost always better to require the modules needed by a thread, from within that thread, once it has started.
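A minimal sketch of requiring a module from within the thread that needs it, assuming a perl built with ithreads. Digest::MD5 is only a stand-in for whatever module the worker needs, and the `worker` sub is made up for the example.

```perl
use strict;
use warnings;
use threads;

# The worker loads what it needs itself; the parent never loads Digest::MD5,
# so there is no copy of it to clone into other threads.
sub worker {
    my ($text) = @_;
    require Digest::MD5;              # loaded only in this thread's interpreter
    return Digest::MD5::md5_hex($text);
}

my $thr = threads->create(\&worker, 'hello');
my ($digest) = $thr->join();          # list assignment works for either return context
print "digest from worker: $digest\n";
print "Digest::MD5 loaded in parent? ",
    (exists $INC{'Digest/MD5.pm'} ? "yes" : "no"), "\n";
```

Because `require` runs at run time inside the thread, the module is compiled once per thread that asks for it, rather than being cloned into every thread spawned after a top-level `use`.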


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
      Actually, loading a module before fork will more likely save you memory. Forking uses a "copy on write" semantic. That is, the child process shares its memory pages with its parents until one or the other tries to write into that page. Only at that point does it make a copy of that page for the process that is trying to write.

      So if you take an application that has a fairly large number of modules (and the perl interpreter itself) sitting in pages that aren't going to get overwritten, compared to a fairly small number of pages that are getting actively written to, then preloading is going to save you memory (if a fair portion of the modules will be used in more than one of the forked processes). Not to mention the fact that it will also save you time.

      Anyway, here's an example. I've got a web server here with three processes: one is a controller process, and two are worker processes. In the first example, I preload a large number of the perl modules before forking. In the second example, the modules are loaded after forking.

        PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
      16701 me         0   0 35712  34M 31832 S       0  0.0  1.6   0:02 httpd
      18302 me         0   0 41400  40M 33100 S       0  0.0  1.8   0:04 httpd
      32301 me         0   0 41932  40M 33776 S       0  0.0  1.9   0:02 httpd
      
      
        PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
       2906 me         0   0  4608 4608  3052 S       0  0.0  0.2   0:00 httpd
       2907 me         0   0 35084  34M  5844 S       0  0.0  1.5   0:10 httpd
       2908 me         0   0 38804  37M  5880 S       0  0.0  1.7   0:11 httpd
      
      Paying careful attention to the SHARE column, you see that the total memory actually used is about 50 megs with preloading and about 65 megs without. You also see two seconds of processing time in the controller (parent) process when preloading that aren't there when not preloading. You can think of those two seconds as "shared" seconds, too, in much the same way as the shared memory (because two seconds happens to be about how long it takes to load the large corpus of perl modules involved here).
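The totals quoted above can be approximated from the two listings. The sketch below is a back-of-the-envelope estimate, not the poster's own calculation: it assumes SIZE and SHARE are in KB, treats SIZE minus SHARE as each process's private memory, and counts the shared pages once using the largest SHARE value. The `estimate_total_kb` name is made up for the example.

```perl
use strict;
use warnings;

# [SIZE, SHARE] pairs copied from the two top listings, assumed to be in KB.
my @preload    = ([35712, 31832], [41400, 33100], [41932, 33776]);
my @no_preload = ([ 4608,  3052], [35084,  5844], [38804,  5880]);

# Rough estimate: each process's private memory is SIZE - SHARE, and the
# shared pages are assumed to be one common set, counted once (largest SHARE).
sub estimate_total_kb {
    my @procs = @_;
    my ($private, $shared) = (0, 0);
    for my $p (@procs) {
        my ($size, $share) = @$p;
        $private += $size - $share;
        $shared = $share if $share > $shared;
    }
    return $private + $shared;
}

printf "preloaded:     %.1f MB\n", estimate_total_kb(@preload) / 1024;
printf "not preloaded: %.1f MB\n", estimate_total_kb(@no_preload) / 1024;
```

Under those assumptions the estimate comes to roughly 53 MB with preloading and 68 MB without, in line with the approximate figures given in the post.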
      ------------ :Wq Not an editor command: Wq

        What you discussed here is not exactly what BrowserUk mentioned. As he is one of those who has contributed a lot here on the threading topic, by default I trust that when he said thread, he really meant thread, not fork or child processes.

        I like the content of your node, no doubt about that, but I think it is worth clarifying that you didn't strictly stay on the same subject as his.

Re: latest on ithreads vs forks?
by pg (Canon) on May 27, 2004 at 01:44 UTC

    I am following up on threads, and threads::shared all the time. There is basically no profound improvement in this area, although there might be bug fixes, like memory leaks. So to give a direct answer to your question, if any defect discussed earlier on this forum discourages you, then don't use threads at this time.

    This does not mean that I agree that fork is better than threading, but rather that it is not wise to invest in Perl threads at this stage.

Re: latest on ithreads vs forks?
by Anonymous Monk on May 28, 2004 at 16:18 UTC
    Got some interesting (to me) data on this comparison. After finding a way to track memory and cpu usage for child processes, here's how threads vs forks stacks up:

    threads.pm:
      WC: 13.76s  Usr: 0.39s  Sys: 0.07s  CSv/f: 0/0  IOops: 0/0  Sigs: 0  Swaps: 0  PF: 2500/1265   Msg: 0/0  Mem: 65499
    forks.pm:
      WC: 18.30s  Usr: 0.33s  Sys: 0.22s  CSv/f: 0/0  IOops: 0/0  Sigs: 0  Swaps: 0  PF: 14427/2524  Msg: 0/0  Mem: 5454

    These come from running the exact same tests ~20 times, varying only the use of threads.pm vs forks.pm. The code for this test is below; it does a little bit of data sharing (since my intended usage model will as well).

    Interesting to note that forks.pm does use far less memory, but gets far more page faults, and also uses more cpu time. Guess it depends which is more valuable to you. As stated by others above, forks.pm isn't necessarily the ideal implementation for using forks, but rather a convenient one because it uses the threads.pm interface.

    Here's the code I used. For both threads and forks, I ran 9 times with 10 threads, 6 with 20, 3 with 50, and 3 with 100.

    # which threading model do we want?
    use threads; use threads::shared;
    #use forks; use forks::shared;

    $num_threads = shift(@ARGV) || 10;
    $nonshared_var = 1;
    my $shared_var : shared = 1;

    foreach (0..$num_threads-1) {
        # create a new thread
        push(@threads,threads->new(\&thread_sub));
    }

    foreach $thread (@threads) {
        # wait for threads to finish
        $thread->join();
    }

    printf "Orig thread done\n";

    sub thread_sub {
        my($count) = 20*rand() + 1;
        my($random);
        printf "Thread %d started\n",threads->tid();
        # do some stuff
        while ($count-- > 0) {
            $random = (5*rand()) | (time() & 0x3);
            $nonshared_var += int($random);
            lock($shared_var);
            $shared_var += int($random);
            sleep($random);
            printf(" Thread %d loop %d: random=%d, nonshared_var=%d, ".
                   "shared_var=%d\n",threads->tid(),$count,$random,$nonshared_var,
                   $shared_var);
            threads->yield();
        }
        printf("Thread %d DONE!\n",threads->tid());
    }
      If you are interested in testing the memory usage of different ways of using Perl ithreads, there's of course Benchmark::Thread::Size by yours truly.

      Liz
