mulli has asked for the wisdom of the Perl Monks concerning the following question:
I am reading up on using threads and in the past I know there have been issues with stability, speed, etc. So I need to get a few things cleared up.
1. The most recent documentation for threads.pm states that variables are by default thread local? I have also read that everything gets copied over to a new thread. Which is it?
2. I am reading that sharing data between threads is slow. Is this true? One of the main purposes for my application will be to share data between worker threads.
Re: Perl thread confustion
by BrowserUk (Patriarch) on Feb 15, 2013 at 03:54 UTC
If you would describe your application, the shared data and usage patterns, we may be able to offer tips on the best way to program it.
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
by 7stud (Deacon) on Feb 15, 2013 at 07:00 UTC
"All lexical (my) variables are local to the thread they are declared in. Unless they are: Implicitly cloned by being made closures."
Okay, here is a closure:
But in the following thread, can the sub see $x because it closes over $x, or can the sub see $x because:
"All lexical (my) variables are local to the thread they are declared in. Unless they are: Implicitly cloned because they exist in the spawning thread prior to a 'child' thread being spawned."
If perl copies all the data to a thread, why doesn't the following code also output 20:
by BrowserUk (Patriarch) on Feb 15, 2013 at 07:28 UTC
"But in the following thread can the sub see $x because it closes over $x, or can the sub see $x because:"
It can see it because the sub closes over it. But it would have been copied to the new thread anyway, even if the sub didn't close over it, because it existed when the thread was spawned. The copy just isn't useful within the thread: if it isn't closed over, nothing can see it (nor, therefore, use it). Hence my comment "I have no idea why this happens. In my opinion it should not."
"If perl copies all the data to a thread, why doesn't the following code also output 20:"
Enable strict or warnings and perl will tell you why. (And note: I didn't say "all the data"; I said "exist in the spawning thread prior to a 'child' thread being spawned." It is a subtle, but very important difference.)
But if you doubt my assertion that non-closed-over variables created after the thread sub is declared, but before the thread is spawned, are also cloned, run this and monitor the memory usage using the task manager or your OS equivalent:
What you'll see is something like this: the array is created, and memory usage jumps to ~900MB and levels out for 10 seconds before the thread is spawned. It then jumps to ~1.9GB, even though the thread can never make any use of the copy, because the array is not lexically visible to it. It makes no sense whatsoever, but try getting anyone to change it.
But, as I said above, the good news is that it is easy to avoid: spawn your threads before you populate the data structures used by your main thread code.
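The "spawn first, populate later" advice can be sketched as follows. This is only an illustration under assumptions, not code from the thread: it uses a package global (`@data`, a made-up name) rather than a lexical, purely so that each thread can report what its own clone contains, and a `Thread::Queue` (`$go`, also hypothetical) to hold the early thread back until the main thread has filled its array.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

our @data;                        # package global, still empty here
my $go = Thread::Queue->new;      # only used to hold the early thread back

# Spawned BEFORE @data is populated: its clone of @data is empty.
my $early = threads->create(sub {
    $go->dequeue;                 # wait until main has filled its own @data
    return scalar @data;          # size of this thread's private copy
});

@data = (1 .. 1000);              # populate in the main thread only
$go->enqueue('go');

# Spawned AFTER @data is populated: it receives a full copy.
my $late = threads->create(sub { return scalar @data });

my $early_count = $early->join;
my $late_count  = $late->join;
print "early thread saw $early_count elements, late thread saw $late_count\n";
```

The early thread never pays for the 1000 elements because they did not exist when it was cloned; the late thread carries its own full copy whether it needs it or not.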
by 7stud (Deacon) on Feb 15, 2013 at 19:20 UTC
Re: Perl thread confustion
by 7stud (Deacon) on Feb 15, 2013 at 06:40 UTC
"1. The most recent documentation for threads.pm states that variables are by default thread local? I have also read that everything gets copied over to a new thread. Which is it?"
Well, first everything is copied over to the new thread; then, because everything is a copy, any changes to the copied variables don't affect the values of those variables in other threads, i.e. everything is thread local.
by BrowserUk (Patriarch) on Feb 15, 2013 at 07:38 UTC
"any changes to the copied variables don't affect the values of those variables in other threads, i.e. everything is thread local."
Unless the cloned variables are closed over or globals, they cannot be changed.
by 7stud (Deacon) on Feb 15, 2013 at 19:29 UTC
"any changes to the copied variables don't affect the values of those variables in other threads, i.e. everything is thread local."
"Unless the cloned variables are closed over ...."
You seem to be carving out an exception for something like this:
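A minimal sketch of such a case, not the listing from the original post; the variable names and the values 10 and 20 are assumed purely for illustration:

```perl
use strict;
use warnings;
use threads;

my $x = 10;

# The thread sub closes over $x, so the thread's own copy of $x is visible to it.
my $thr = threads->create(sub {
    $x = 20;                      # changes only the thread's private copy
    return $x;
});

my $in_thread = $thr->join;
print "in thread: $in_thread, in main: $x\n";   # in thread: 20, in main: 10
```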
But even though the thread closes over $x, it cannot change the $x in main. So, it appears to me that the closed over variable is also thread local.
by BrowserUk (Patriarch) on Feb 15, 2013 at 20:28 UTC
Re: Perl thread confustion
by sundialsvc4 (Abbot) on Feb 15, 2013 at 13:21 UTC
Data sharing is, as BrowserUk said, easy, and that’s what matters most in a good threaded design: your programs won’t bugger up their storage pool and crash. If you need to share data between threads, try to minimize the amount of code that actually contends for shared variables, within reason.
Threads often communicate with one another by means of thread-safe queues: work-to-do lists and work-completed lists. This is a simple way not only to reduce contention, but also to let the various threads work at their own naturally varying speeds.
If a particular set of shared variables is frequently and contentiously shared by everyone, it becomes a “hot spot” in any design, regardless of the language used: the threads tend to run in lockstep with one another and spend too much time waiting on locks, which is not what you want to see. (Maybe the threads could instead include updated values in the messages they return to the work-completed queue.) Obviously, design is a nest of competing trade-offs.
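The work-to-do / work-completed arrangement could look roughly like this with the core `Thread::Queue` module. It is a sketch under assumptions, not anyone's production design: the queue names (`$work`, `$done`), the worker count, and the "square the number" job are all made up for illustration, and an `undef` per worker is used as the stop signal.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $work      = Thread::Queue->new;   # work-to-do list
my $done      = Thread::Queue->new;   # work-completed list
my $n_workers = 3;                    # arbitrary number for the sketch

my @workers;
for (1 .. $n_workers) {
    push @workers, threads->create(sub {
        # dequeue() blocks until an item is available; undef means "stop".
        while (defined(my $job = $work->dequeue)) {
            $done->enqueue($job * $job);   # the result travels back as a message
        }
    });
}

$work->enqueue(1 .. 10);                   # hand out the jobs
$work->enqueue((undef) x $n_workers);      # one stop marker per worker
$_->join for @workers;

# All workers have finished, so a non-blocking drain collects everything.
my @results;
while (defined(my $r = $done->dequeue_nb)) {
    push @results, $r;
}
@results = sort { $a <=> $b } @results;
print "@results\n";
```

The only shared state here is the two queues, so no worker ever holds a lock on application data, and each one pulls the next job whenever it happens to be ready.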