Perl threading stability?

by guice (Scribe)
on Jul 25, 2005 at 16:04 UTC ( #477876=perlquestion )
guice has asked for the wisdom of the Perl Monks concerning the following question:

Shortly I'm going to be recompiling the Perl binary that we currently use on a number of Solaris systems. It's Perl 5.8.3 and was not compiled with threading due to some rumored stability issues with threads.

I'm curious what the monk community thinks of Perl's threading stability. Is it worth getting into now?

I'm currently in charge of an application that relies on forking for speed, but it cannot maintain shared state due to the nature of forking. I've been thinking of moving to Perl threads, but I fear possible stability problems (one co-worker said she had heard that threaded Perl causes issues with other, non-thread-related things).

What are your thoughts and experiences with Perl threading?

-- philip
We put the 'K' in kwality!

Re: Perl threading stability?
by tphyahoo (Vicar) on Jul 25, 2005 at 17:21 UTC
    I still haven't really grokked threads, partly because TIMTOWTDI, partly because I haven't been *forced* to yet.

    At "What is the fastest way to download a bunch of web pages?", various wise monks shared their wisdom about differing ways to accomplish things with "asynchronous" Perl: threads, ithreads, fork, POE. That discussion may be illuminating if you are looking for various community "takes" on threads.

Re: Perl threading stability?
by dave_the_m (Parson) on Jul 25, 2005 at 17:27 UTC
    I am not aware of any problems that a threaded build of perl causes for non-threaded applications, apart from a minor drop in performance.
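
Since the question turns on whether a given binary is a threaded build, here is a minimal sketch of how one might check from inside a script. It relies only on the core Config module; the output wording is illustrative, not anything from the thread.

```perl
use strict;
use warnings;
use Config;

# $Config{useithreads} is 'define' on a perl built with ithreads
# and undefined (or empty) on a non-threaded build.
my $threaded = $Config{useithreads} ? 1 : 0;

printf "perl %vd (%s): %s\n",
    $^V, $Config{archname},
    $threaded ? "built with ithreads" : "built without ithreads";
```

The same information is available from the shell with `perl -V:useithreads`.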

    As for writing threaded apps, bear in mind the following:

    • Use the latest possible perl, i.e. 5.8.7; user-level threads only appeared in 5.8.0, and there have been many fixes in the minor releases that followed.
    • Perl ithreads are generally less efficient than forking; fork() uses the OS's memory-management facilities to do copy-on-write of the process's pages in hardware, while perl's threads->new() goes through the entire interpreter's data structures and copies each individual variable etc (only code is shared, not data).
    • Shared data in perl's ithreads is very expensive, both in terms of speed and memory usage. A shared scalar variable usually has a copy in every thread that uses it, plus a separate shared copy. When you assign to a scalar, the thread's copy is updated, then that data is also copied to the hidden shared copy. When you read from a shared scalar, the data from the hidden shared copy is copied to the thread's one. Thus if you have N threads all accessing a long string variable, your memory usage will tend towards (N+1) times the length of the string.
    • Shared arrays and hashes use a mechanism similar to tying, which means that they are slow, but not too memory-inefficient.
    • Try to avoid having complex, deeply-nested shared data structures.
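
As a rough illustration of the points above, the following sketch keeps shared data down to one small scalar and does the real work in per-thread variables. It assumes a perl built with ithreads; the sub name `sum_range` and the ranges are made up for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# The only shared variable: one small scalar. Everything else below
# is a private, per-thread copy.
my $total :shared = 0;

sub sum_range {
    my ($from, $to) = @_;
    my $sum = 0;                  # private to this thread, cheap to update
    $sum += $_ for $from .. $to;

    lock($total);                 # released when this sub returns
    $total += $sum;               # one write to the shared copy per thread
    return $sum;
}

# Four workers, each summing a block of 100 integers (1..400 overall).
my @workers = map { threads->create(\&sum_range, $_ * 100 + 1, ($_ + 1) * 100) } 0 .. 3;
my @partial = map { $_->join } @workers;

print "partial sums: @partial\n";
print "total: $total\n";
```

Touching the shared scalar once per thread, rather than once per loop iteration, is the design choice that keeps the copying cost described above small.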

    Dave.

      Now this part surprises me:
      Shared data in perl's ithreads is very expensive, both in terms of speed and memory usage. A shared scalar variable usually has a copy in every thread that uses it, plus a separate shared copy. When you assign to a scalar, the thread's copy is updated, then that data is also copied to the hidden shared copy. When you read from a shared scalar, the data from the hidden shared copy is copied to the thread's one. Thus if you have N threads all accessing a long string variable, your memory usage will tend towards (N+1) times the length of the string.

      I would have thought Perl would use references for everything, not complete copies, making things much less memory- and resource-intensive...

      Try to avoid having complex, deeply-nested shared data structures

      Doh! One of the ideas was to use an XML::Simple object to store server data, with each thread updating that object based on possible changes ... looks like I might have to stick with fork() with a slight restructure in how I would modify that main hash.
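
One possible shape for that restructure, sketched under assumptions: the forked children never touch the main hash, they only report results through a pipe, and the parent alone applies the updates. The host names and the "work" done per child are placeholders.

```perl
use strict;
use warnings;

my %status;                           # the "main hash", modified by the parent only
my @hosts = qw(alpha beta gamma);     # hypothetical server names

pipe(my $reader, my $writer) or die "pipe failed: $!";

my @pids;
for my $host (@hosts) {
    my $pid = fork();
    defined $pid or die "fork failed: $!";
    if ($pid == 0) {
        # Child: do the work, report one tab-separated line, then exit.
        close $reader;
        my $load = length $host;      # stand-in for real data collection
        print {$writer} "$host\t$load\n";
        close $writer;
        exit 0;
    }
    push @pids, $pid;
}
close $writer;                        # parent must close its write end to see EOF

# Parent: apply every reported update to the hash it owns.
while (my $line = <$reader>) {
    chomp $line;
    my ($host, $load) = split /\t/, $line;
    $status{$host} = $load;
}
close $reader;
waitpid($_, 0) for @pids;

print "$_ => $status{$_}\n" for sort keys %status;
```

Short lines written to a pipe in one go are not interleaved with each other, which is why a single shared pipe is enough here; larger payloads would want one pipe per child or a serialiser.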

      -- philip
      We put the 'K' in kwality!

      I am not aware of any problems that a threaded build of perl causes for non-threaded applications, apart from a minor drop in performance.

      Not sure if you'd know, but I thought I'd ask: do you know how minor? How much does it affect currently non-threaded scripts?

      One thing I do have to watch out for is that this version will be put out onto approximately 350 servers. At this time, only my data collection scripts will be using it. The collection script is not to be threaded. The threading is more for the "server" side script, which gathers all the data collection dumps and loads them into the database.

      -- philip
      We put the 'K' in kwality!

        I knocked up this random meaningless benchmark script:
        my $t; for my $i (1..10_000_000) { if ($i % 3) { $t += $i; } }
        and ran it a few times on a threaded and non-threaded perl build (a recent bleedperl) and got average timings of 4.46s and 4.99s, so about 10% slower.
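
For anyone wanting to repeat the comparison on their own builds, here is a small sketch that times the same loop with the core Time::HiRes module and reports which perl ran it. The number of runs is an assumption, and the final calculation assumes the larger of the two quoted figures belongs to the threaded build.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# The loop from the post, wrapped in a sub so it can be timed repeatedly.
sub busy_loop {
    my $t = 0;
    for my $i (1 .. 10_000_000) {
        if ($i % 3) { $t += $i; }
    }
    return $t;
}

my $runs = 3;                         # assumed number of repetitions
my @secs;
for (1 .. $runs) {
    my $start = time;
    busy_loop();
    push @secs, time - $start;
}

my $avg = 0;
$avg += $_ for @secs;
$avg /= @secs;
printf "%s: average %.2fs over %d runs\n", $^X, $avg, $runs;

# Relative difference between the two timings quoted above.
my ($fast, $slow) = (4.46, 4.99);
printf "slowdown: %.1f%%\n", 100 * ($slow - $fast) / $fast;
```

Run it once with each binary and compare the averages; the last line works out the quoted figures at a little under 12%.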

        Dave.

Re: Perl threading stability?
by perlhaq (Scribe) on Jul 25, 2005 at 17:33 UTC
    The biggest problem with Perl threads is that they gobble up tons of memory. You can minimize that by spawning your threads early on, before many modules are loaded, and then loading stuff at runtime with require/import instead of at compile-time with "use". But that means you also have to avoid spawning new threads after all those modules have been loaded and any large data structures have been initialized, because all that stuff will get copied over into any newly-created threads.
    I found threads useful only in Win32, because that platform lacks a native fork() system call and doesn't support many standard IPC mechanisms (or provides very poor support for them). So short of using pure multiplexing everywhere (which is hard, and sometimes counter-productive), threads can be very helpful. But again, you have to be careful about /when/ they get spawned, or else your process will keep growing in size...
    In any case, don't expect Perl threads to give you a performance advantage over fork(). On SMP machines, separate processes may well run concurrently, but not Perl threads. I'd even be surprised if they were spawned faster than today's nice, fast copy-on-write fork() implementations.
Re: Perl threading stability?
by BrowserUk (Pope) on Jul 25, 2005 at 17:46 UTC

    You may find this subthread informative.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.

      Thanks, it was.

      I can see now that Perl5's threading isn't ready for the mainstream. Having to copy everything is just far too much of an overhead. The idea that Perl updates the "copies" of variables whenever a thread updates a variable is far from efficient.

      I don't buy Perl not being able to use references for its data since it's done in real threading applications just fine. It looks to me that threading in Perl was added just as a "hack job" to share data across "fork()"s (I do know a thread is not really a fork()).

      My idea wants the ability to share data across threads while maintaining somewhat of a speed benefit from multiple children. However, due to the complexity of Perl's own threads, the performance hit seems to negate the benefits of using threading in the first place.

      -- philip
      We put the 'K' in kwality!

        If you want maximum speed use shared memory. See perldoc IPC::SysV
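
A minimal sketch of what that can look like with fork(), assuming a system with System V shared memory available: the parent creates a private segment, a forked child writes into it, and the parent reads it back. The segment size and message are invented for the example; real code would also need a semaphore (see IPC::Semaphore) to coordinate concurrent writers.

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRUSR S_IWUSR);

my $size = 64;                        # assumed segment size in bytes
my $id = shmget(IPC_PRIVATE, $size, IPC_CREAT | S_IRUSR | S_IWUSR);
defined $id or die "shmget failed: $!";

my $pid = fork();
defined $pid or die "fork failed: $!";
if ($pid == 0) {
    # Child: write a short message into the segment (padded with NULs).
    shmwrite($id, "records loaded: 42", 0, $size) or die "shmwrite failed: $!";
    exit 0;
}
waitpid($pid, 0);                     # crude synchronisation: wait for the child

my $buf;
shmread($id, $buf, 0, $size) or die "shmread failed: $!";
$buf =~ s/\0+\z//;                    # strip the NUL padding
print "parent read: $buf\n";

shmctl($id, IPC_RMID, 0) or die "shmctl failed: $!";   # remove the segment
```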

        I'm not really a human, but I play one on earth. flash japh

        I don't buy Perl not being able to use references for its data ...

        You obviously do not appreciate the problems involved.

        ... since it's done in real threading applications just fine.

        What do you call "real threading"? Are you thinking of Kernel threads in C or User threads in say Java?

        Each of these forms of threading has its own set of demands and limitations. In addition, choosing to use either language to benefit from that flavour of threading imposes the disadvantages that language imposes upon the rest of your application.

        My idea wants the ability to share data across threads while maintaining somewhat of a speed benefit from multiple children.

        Please describe your application in some detail. Ithreads can be used to accommodate many uses, but it does require that you understand both their advantages and limitations.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
Re: Perl threading stability?
by cider (Acolyte) on Jul 26, 2005 at 13:34 UTC
    Threading became stable as of 5.8.5

    Thread implementations before this would die
    mysteriously if a program was doing a large number of
    memory operations over a period of 24 hours or greater.

    As of 5.8.5 perl was capable of handling threading for
    approximately 24-72 hours (1-3 days), but that's honestly
    kind of a vague expectation.

    I actually recommend that you design any long running
    thread implementation as a state engine capable of
    resuming where you left off, and when you left off.

    Perl 5.8.7 seems to handle things a lot more stably;
    however, please heed my advice regarding the state engine.
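
One way to read the "state engine" advice, sketched with the core Storable module: checkpoint progress to disk after every unit of work, and on startup resume from the checkpoint if one exists. The file location, the item count and the "work" itself are placeholders, not anything from the post.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempdir);

my $file      = tempdir(CLEANUP => 1) . "/state.sto";   # hypothetical checkpoint file
my $last_item = 10;                                      # hypothetical amount of work

# Write the checkpoint to a temp file first, then rename it into place,
# so a crash mid-write never leaves a truncated state file behind.
sub save_state {
    my ($state, $file) = @_;
    store($state, "$file.tmp");
    rename "$file.tmp", $file or die "rename failed: $!";
}

# Process at most $budget items, resuming from the checkpoint if present.
sub process_items {
    my ($file, $budget) = @_;
    my $state = -e $file ? retrieve($file) : { next_item => 1, results => [] };
    while ($budget-- > 0 && $state->{next_item} <= $last_item) {
        push @{ $state->{results} }, $state->{next_item} ** 2;   # stand-in for real work
        $state->{next_item}++;
        save_state($state, $file);
    }
    return $state;
}

process_items($file, 4);                  # simulates a run that stops after 4 items
my $state = process_items($file, 100);    # a restart picks up at item 5

print "next item: $state->{next_item}\n";
print "results: @{ $state->{results} }\n";
```

If the process dies for any reason, a supervisor can simply start it again and it carries on from the last saved item.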
Re: Perl threading stability?
by SimonClinch (Chaplain) on Jul 26, 2005 at 13:58 UTC
    It might be worth re-evaluating your approach to fork before grasping for threads modules. Especially if using Solaris, the IPC set of modules, notably IPC::SysV and IPC::Semaphore may be the most pertinent low-overhead support for fork.

    Moreover, there is ample guidance on using the low-level system facilities that the above IPC modules give you access to in the Programming Perl book (ISBN 0596000278).

    One world, one people

      Thanks. The main reason for a jump into threads, vs fork(), was purely data sharing. I can run the script just fine using a bunch of fork() calls, and Parallel::ForkManager does make things easier.

      The problem I kept running into is data sharing: the ability to maintain variable state throughout each thread. But it would appear this is the biggest hog/limitation within Perl threading.

      -- philip
      We put the 'K' in kwality!

Node Type: perlquestion [id://477876]
Approved by marto
Front-paged by Arunbear