PerlMonks  

Status and usefulness of ithreads in 5.8.0

by BazB (Priest)
on Jul 14, 2002 at 21:57 UTC ( #181655=perlquestion )
BazB has asked for the wisdom of the Perl Monks concerning the following question:

Barring the limitations listed in perl-5.8.0-RC3's perldoc threads, threads::shared and perlthrtut, what is the status of Perl's current threading implementation?

As of 5.6.1 threading was experimental, but I cannot see any such warnings in 5.8.0. Are threads now considered production quality, or just not quite so experimental as to be labelled as such?

I've read some of the earlier articles discussing fork() and Threads (now superseded by threads), but does Perl's current thread implementation offer the benefits of threading over just using fork() and spawning full child processes?

From a design and resource usage point of view, could someone please give some conditions where threading is considered preferable to fork()'ing?
I use fork a fair bit in order to process very large chunks of data in parallel, but I'd be interested to see if threading this sort of thing has any benefits.

Cheers.

BazB

Re: Status and usefulness of ithreads in 5.8.0
by samtregar (Abbot) on Jul 14, 2002 at 23:13 UTC
    My impression is that the core threading code is fairly stable. However, many popular Perl modules are not yet thread safe - DBI is one that I know of. Most pure-Perl modules will be thread safe out of the box but most XS modules will not be, particularly if they access non-thread-safe C libraries underneath.

    As far as fork()-vs-threads goes, the main difference is in how the processes communicate. With fork() you can communicate using pipes but it is far from convenient. Threads can share() variables and communicate quickly and easily. Also, under some operating systems threads can be created with lower overhead than fork().
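
To make the pipe-based approach concrete, here is a minimal sketch of a parent that forks one child per chunk of data and collects each result over a pipe. It assumes a Unix-like OS with a real fork(); the function name sum_in_children and the sample chunks are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: sum each chunk in its own child process and
# add up the partial sums that come back through the pipes.
sub sum_in_children {
    my @chunks = @_;
    my @readers;

    for my $chunk (@chunks) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: works on its own copy of the data.
            close $reader;
            my $sum = 0;
            $sum += $_ for @$chunk;
            print {$writer} "$sum\n";
            close $writer;          # flushes the result into the pipe
            POSIX::_exit(0);        # leave without running the parent's END blocks
        }

        # Parent: keep only the read end.
        close $writer;
        push @readers, [$pid, $reader];
    }

    my $total = 0;
    for my $entry (@readers) {
        my ($pid, $reader) = @$entry;
        my $line = <$reader>;
        close $reader;
        waitpid($pid, 0);
        die "child $pid returned nothing" unless defined $line;
        chomp $line;
        $total += $line;
    }
    return $total;
}

my $total = sum_in_children([1 .. 10], [11 .. 20], [21 .. 30]);
print "total = $total\n";   # total = 465
```

Every result has to be serialized into the pipe and parsed again by the parent, which is the inconvenience mentioned above; in exchange, no child can disturb another child's data.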

    -sam

Re: Status and usefulness of ithreads in 5.8.0
by Abigail-II (Bishop) on Jul 15, 2002 at 09:32 UTC
    • "Thread" is a buzzword, fork isn't. Of course, fork has clocked over 30 years of mileage, while threads are the new kid on the block.
    • Threads have been invented and reinvented. So we have kernel threads, green threads, POSIX threads, interpreter threads, etc. It's just like Java, it never works the same twice. fork is just fork, you just pull out your Stevens or other book from your bookshelf, and it just works reliably, with a clear and simple interface.
    • Inferior OSes don't have fork.
    • Threads mangle each other's variables by default, and you need to explicitly guard yourself against that happening. (Note that just *looking* at a variable can mean Perl changes the variable under the hood - it's not a safe operation.) fork is safe for your data, you need to explicitly share data (using shared memory for instance).
    • Compared to a select() loop (as provided by for instance POE), threads give you nondeterministic behaviour, while a select loop gives you deterministic behaviour.
    • Do you know which thread is going to handle a sent signal? On any signal capable OS? I know when I fork().

      What exactly are you criticizing? Windows? (What makes an "inferior OS"?) Java? No, threading is not just a buzzword, and it frankly doesn't matter that threads have gone through many changes through the years. The history of threads demonstrates that they are not the "new kid on the block". So they've evolved over the years; this is not a useful critique. Threads, at least in their contemporary manifestations as in Win32, Java, and ithreads as of Perl 5.6, allow lightweight and highly manageable distribution of resources -- especially when it comes to interpreted languages due to the overhead of the interpreter or virtual machine -- whereas forked processes are more memory intensive and more difficult to work with. The difference between interprocess and intraprocess communication is a big one.

      It's just like Java, it never works the same twice

      Really? Core Java has been stable since at least 1997. If you want to criticize Java's stability, please look first at Java extensions like Swing, J2EE, or the implementation of (the recently added) anonymous inner classes. And by the way, threads were built into the language from its very inception, in fact one of the revolutionary aspects of Java was the fact that Gosling/Sun realized the usefulness of threads early on and built that capacity into the language for the 1.0 release. Abigail-II, if you're going to slander another programming language, at least make it believable.

      Threads mangle each other's variables by default, and you need to explicitly guard yourself against that happening.

      The benefit of threading as opposed to forking processes is shared memory space, and that's why Java and threading packages in C++ and Win32 support code that is thread safe. The bottom line is that it's a trade for efficiency in memory usage in exchange for the extra effort it takes to type "synchronized" in your method declarations.

      I for one am glad that thread support in Perl is progressing, and that it will be fully realized in Perl 6.

      Threads mangle each other's variables by default

      Not according to perldoc threads:

      It is very important to note that variables are not shared between threads, all variables are per default thread local. To use shared variables one must use threads::shared.

      I wrote my first multi-threaded app under OS/2 v1.0 in early 1987 whilst working at IBM. The OS (at that time called CP/DOS) was still so early in its development that the only IO possible was Beep(frequency,duration). Proving that multiple threads were running meant having each thread beep at a different pitch and/or tempo. WinNT is a successor to/derivative of the same core mechanisms as were developed for OS/2.

      They have come a long way in the last 14 years. The 'problems' of synchronisation and re-entrancy are well understood and easily handled.

      There are many good reasons to knock Win32. Braindead CLP, bloat, a lack of due care and attention to security issues and fixes. All that source under the lock & key of one of the most predatory companies around.

      There seems little benefit in constructing specious arguments against the threading, critical section and semaphore mechanisms. Those are some of the bits they actually got right.

      Threads mangle each other's variables by default, and you need to explicitly guard yourself against that happening. (Note that just *looking* at a variable can mean Perl changes the variable under the hood - it's not a safe operation.) fork is safe for your data, you need to explicitly share data (using shared memory for instance).

      Not true. Variables under ithreads (what threads.pm implements) need explicit sharing. (You do this by saying my $variable : shared.) The difference is that Perl still manages your variables, so you don't have the problems inherent in using normal Unix IPC (such as it not meshing well with Perl).
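
The explicit-sharing rule can be illustrated with a short sketch. It assumes a perl built with ithreads support (5.8.0 or later); the variable names, the bump routine and the thread count are made up for the example.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $private = 0;            # not shared: every thread gets its own copy
my $counter : shared = 0;   # explicitly shared: one value seen by all threads

# Hypothetical worker routine run by each thread.
sub bump {
    $private++;             # changes only this thread's private copy
    for (1 .. 1000) {
        lock($counter);     # lock is released at the end of this block
        $counter++;
    }
}

my @workers = map { threads->create(\&bump) } 1 .. 4;
$_->join for @workers;

print "private = $private\n";   # private = 0 (main thread's copy is untouched)
print "counter = $counter\n";   # counter = 4000
```

Only the variable marked shared is visible across threads, and even then updates should be wrapped in lock() so that concurrent increments are not lost.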

      Do you know which thread is going to handle a sent signal? On any signal capable OS? I know when I fork().

      I'm not sure about this, but I think Perl 5.8's new safe signals have a way to deal with this.

      =cut
      --Brent Dax
      There is no sig.

Re: Status and usefulness of ithreads in 5.8.0
by kal (Hermit) on Jul 15, 2002 at 21:42 UTC

    In a way, it's useful to look at the various OS implementations of threads. Take, for example, Linux - not POSIX-compatible threads, but near enough. But, they're implemented pretty much as fork()s - that's where the non-POSIX-compatibility comes from (or, at least, the most part).

    So, what you're actually doing is comparing the management of fork()s. fork(), especially when copy-on-write (COW), is extremely lightweight, the main cost being the task switch and the increased complexity in scheduling.

    Threading is, therefore, unlikely to gain much in terms of "raw speed". What it might gain, though, is increased readability of code - it's very easy to see the thread of execution of a master thread, whereas following the various parents of fork()s is not always obvious. And, of course, on non-Unix derived OSes (Windows) threads may actually be quite a win, presuming that they are implemented fairly natively. My personal feeling, though, would be to go with what makes the code work out best - if it's easier to understand with threads (esp. working with ex-Java/C++ programmers), that might be the best route.

Node Type: perlquestion [id://181655]
Approved by BrowserUk
Front-paged by Aristotle