Do you know where your variables are? PerlMonks

### Score: Perl 1, Ruby 0

on Feb 21, 2006 at 04:03 UTC

I should have known better, but I experimented with threading in Ruby on Win32. I started with the condition variables example in the Ruby Programming book. A different version of the book, but online, is available here: Ruby Programming

To my detriment, I quickly skimmed through the introduction for the chapter on Threads and Processes. I then quickly created a nifty script that has a controller thread and a worker thread. The controller thread puts tasks on a shared queue.

I was able to whip something together that quickly suited my needs, or so I thought.
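For readers who want to picture the controller/worker arrangement described above, here is a minimal sketch, assuming Ruby's standard `Queue` class. The task names and the `:stop` marker are made up for illustration; the original script is not shown in the post, so this is not a reconstruction of it.

```ruby
require "thread" # Queue lives here on older Rubies; harmless on newer ones

# Hypothetical task list; the real script would queue svn commands instead.
tasks   = ["update", "cleanup", "status"]
queue   = Queue.new
results = []

# Worker thread: blocks on the queue until a task (or the stop marker) arrives.
worker = Thread.new do
  while (task = queue.pop) != :stop
    results << "ran #{task}" # placeholder for the real work
  end
end

# Controller: puts tasks on the shared queue, then asks the worker to finish.
tasks.each { |t| queue << t }
queue << :stop
worker.join

puts results
```

The queue does the locking, so neither thread needs an explicit mutex or condition variable for the hand-off.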

This morning I was thinking, "Wow, this Ruby could be something." Well, I did some more testing and tweaking of my code, and then finally gave the worker thread some real work to do. I am using the worker thread to run Subversion commands such as:
svn update \path\to\tree
svn cleanup \path\to\tree
svn etc ...
I was slightly surprised when the worker thread running the svn update command caused thread starvation and hung all threads. If only I had read the chapter introduction closely.

I know this is OT for Perl, but I think it illuminates how Perl is a real world language, while Ruby is still somewhat of a pipe dream. By the way, if you know how to work around thread starvation with Ruby and Windows, please let me know. I'd like to get a full preview of Ruby vs. Perl.

I'm also trying Ruby on Rails. I know Perl is as slick as Ruby, but I wish the Perl community made tools easier to install and use. I don't know how many times I have missed opportunities to use Perl in the workplace because somebody couldn't get Perl and whatever else installed on their box.

Also, if you have any nifty examples of thread pools and/or Subversion Perl utilities, please send them my way. Thanks.

Re: Score: Perl 1, Ruby 0
by brian_d_foy (Abbot) on Feb 21, 2006 at 05:52 UTC

Once upon a time, Perl wasn't all that great with threads either. Given any language and a single comparison, we could come to the same conclusion. Every language would be a pipe dream.

However, plenty of people are using Ruby for real work. It exists and you can program with it right now. Indeed, you have. It's not a pipe dream.

If we advocate Perl by putting down other languages, we just look like ignorant jerks. Let's promote Perl by its merits, not by our failed attempts to use a language which is new to us.

--
brian d foy <brian@stonehenge.com>
Subscribe to The Perl Review
Re: Score: Perl 1, Ruby 0
by BrowserUk (Pope) on Feb 21, 2006 at 07:33 UTC

As pointed out, Ruby's threads aren't kernel threads. The scheduling is done by Ruby itself which means that individual opcodes aren't preempted. As backticks are a single opcode, regardless of what you put in them, the opcode won't return, and therefore can't be preempted until the command has finished.

However, you can avoid that by using IO.popen and reading the command's output one line at a time:

t1 = Thread.new {
    output = [];
    cmd = IO.popen( "dir /s u:\\", 'r' );
    while cmd.gets
        output.push $_
    end
    Thread.current["output"] = output
}
...
t1.join
got = t1["output"]
got.each{|line| puts line}

That said, you don't appear to be using the output from the backticks, so you probably shouldn't be using them anyway.

As with Perl's threads, you gotta learn to use'em right :)

The nice thing about Ruby's threads is that they are very light. When playing with them a while ago I modified the simple example from the Ruby book and started 100,000 of them concurrently and it only required 130 MB. With Perl's threads you'll be limited to 120* concurrent under Win32 (a Perl implementation limit, not Win32), and they will consume 40 MB for even the simplest of subs.

*Update: The limit used to be the same as the pseudo-processes limit, which is still 64, but the threads limit (as of 5.8.6) has been raised to 120. This was discovered via the use of the following snippet:

use threads;
@t = map{
    threads->create( sub{
        print threads->self->tid;
        sleep 60;
        print threads->self->tid;
    } ) or die "threads->create failed $^E"
} 1 .. 1000;

Which produces

C:\test>maxthreads.pl
1
2
3
...
116
118
117
threads->create failed Not enough storage is available to process this

Update2: Having given up trying to unwind the multitude of #define equivalences that surround PERL_GET_CONTEXT

PERL_SET_CONTEXT(aTHX)
PERL_SET_CONTEXT((aTHX = PL_sharedsv_space))
PERL_SET_CONTEXT((aTHX = caller_perl))
PERL_SET_CONTEXT(interp)
PERL_SET_CONTEXT(aTHX)
PERL_SET_CONTEXT(aTHX)
PERL_SET_THX(t)               PERL_SET_CONTEXT(t)
PERL_SET_CONTEXT
PERL_SET_INTERP(i)
Perl_set_context((void*)t)
(PL_current_context = t)
PERL_SET_THX(t)                PERL_SET_CONTEXT(t)
PERL_SET_CONTEXT(proto_perl);
(and the associated mess that is the whole aTHX pTHX pTHX_ thing), I took a different tack: asking the operating system which TLS indexes are being used. The upshot is that whatever storage is being run out of, it isn't TLS indexes.

From what I can work out, the storage in question is Thread Local Storage, which is a limited resource of 1088 32-bit words per process. The current limit of 120 threads suggests that perl is using 9 or 10 32-bit words of TLS per thread. I think I once suggested that this data could be allocated from the heap, and just a single pointer to it stored in TLS. One extra level of indirection would raise the limit to the OS maximum. There may be considerations relating to the architecture of Perl that make this suggestion impractical.
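The "9 or 10 words per thread" estimate is simple division on the two figures quoted above. A quick check of that arithmetic, using only the numbers from the post (nothing here is measured, and Update2 reports that TLS indexes turned out not to be the exhausted resource):

```ruby
# Figures taken from the post above; nothing here is measured.
tls_words_per_process = 1088
observed_thread_limit = 120

words_per_thread = tls_words_per_process.to_f / observed_thread_limit
puts format("%.2f words per thread", words_per_thread)

# Thread counts that 9 or 10 words per thread would allow (integer division).
limit_at_9  = tls_words_per_process / 9
limit_at_10 = tls_words_per_process / 10
puts "9 words: #{limit_at_9} threads, 10 words: #{limit_at_10} threads"
```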

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Thnx for the great info. I went stomping around the 5.8.7 source, and located the TlsAlloc() call (in the ALLOC_THREAD_KEY macro), but can't find more than a couple instances of ALLOC_THREAD_KEY being called...probably requires a hard-core debug effort to track down why/where so many are needed. (Unless any of the p5p'ers out there might shed some light ?)

I also googled about to see if I could find a registry key to tweak Win32's limit, but wo/ luck, it appears to be a hardcoded value.

Which is a bit puzzling, given Win32's preference for threading vs. forking. Considering that dual core CPU laptops are now available, and both AMD and Intel have announced quadcores for next year, I'd hope hardcoding this value might need re-examination, if they intend to play in the high end server market. (And the constant's name seems backwards: TLS_MINIMUM_AVAILABLE actually means the maximum available). Guess I'll have to wait and see what Vista brings.

I guess one can point fingers of shame at both Perl and Win32. Fortunately, 120 threads is sufficient for my needs on Win32 at present.

I went stomping around the 5.8.7 source, and located the TlsAlloc() call (in the ALLOC_THREAD_KEY macro), but can't find more than a couple instances of ALLOC_THREAD_KEY being called...probably requires a hard-core debug effort to track down why/where so many are needed.

I've been down that route before and become very lost in the myriad definitions and redefinitions of everything that surrounds the whole PERL_GET_CONTEXT/Perl_get_context/aTHX stuff. I just tried again and got royally stuffed trying to unwind the macros. Compilers are good at doing that--people (at least this person) ain't.

Upshot: Took a different tack and queried information about the TLS indexes from the OS from within the thread itself, and whatever storage is being run out of, it isn't TLS!

it appears to be a hardcoded value. Which is a bit puzzling, given Win32's preference for threading vs. forking.

The limit represents 1088 stateful threads per process, but threads do not have to be, and frequently are not stateful. At the C level, you can easily run 2000+ threads per process if they are not stateful. And remember this is concurrent. You can create fast-lived, do & die threads by the bucket load if that is what the design calls for.

Also, many of the things you might consider spawning a thread for, like waiting for IO, don't need a separate thread, as you can use async IO. You supply a callback on the read or write and let the OS call you back when it completes. I've seen a server application written this way that could handle very high numbers of concurrent connections and only used 2 threads.
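To give a flavour of servicing several IO sources without a thread per source, here is a rough sketch in Ruby. Note the swap: it uses readiness polling via `IO.select` rather than the Win32 callback-style async IO described above, and the pipes (with their labels "a", "b", "c") are hypothetical stand-ins for real connections.

```ruby
# Three pipes stand in for network connections (hypothetical labels).
pipes = { "a" => IO.pipe, "b" => IO.pipe, "c" => IO.pipe }
names = {}
pipes.each { |name, (reader, _)| names[reader] = name }

# Simulate the peers sending one line each and closing their end.
pipes.each { |name, (_, writer)| writer.puts("hello from #{name}"); writer.close }

received = {}
open_readers = names.keys

# One thread services every reader: wait until any is ready, then read it.
until open_readers.empty?
  ready, = IO.select(open_readers)
  ready.each do |reader|
    line = reader.gets
    if line.nil? # EOF: this "connection" is finished
      reader.close
      open_readers.delete(reader)
    else
      received[names[reader]] = line.chomp
    end
  end
end

puts received.sort.map { |name, line| "#{name}: #{line}" }
```

The point is only the shape: one loop, many handles, no thread per handle.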

I agree with you that 120 threads is more than enough from Perl, given the inherent weight of iThreads. For most purposes, I'd advocate using a mere handful of long running threads rather than zillions of short lived. It just makes best use of the resources available.
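A handful of long-running threads fed from a queue might look something like the sketch below. It is in Ruby rather than Perl ithreads, and the pool size, job count and squaring "work" are arbitrary placeholders chosen for illustration.

```ruby
require "thread"

pool_size = 4 # a handful of long-running workers (arbitrary choice)
jobs    = Queue.new
results = Queue.new

workers = Array.new(pool_size) do
  Thread.new do
    # Each worker loops until it pulls a :done marker off the queue.
    while (n = jobs.pop) != :done
      results << [n, n * n] # stand-in for real work
    end
  end
end

100.times { |n| jobs << n }       # many tasks, few threads
pool_size.times { jobs << :done } # one stop marker per worker
workers.each(&:join)

squares = {}
squares.store(*results.pop) until results.empty?
puts squares.size
```

The same four threads handle all 100 jobs, so the thread-creation cost is paid once per worker rather than once per task.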

t1 = Thread.new {
    output = [];
    cmd = IO.popen( "dir /s u:\\", 'r' );
    while cmd.gets
        output.push $_
    end
}
...
t1.join
got = t1["output"]
got.each{|line| puts line}
This code doesn't give the intended results. I'm still limited as to how many threads I can kick off.
This code doesn't give the intended results.

Whose (and what) intended purpose? It certainly served my intended purpose--that of discovering the limit of concurrent threads runnable using the Perl executable when built with the default Win32 configuration.

I'm still limited as to how many threads I can kick off.
1. Why are you doing this? Is your purpose to simply run lots of threads, or have you a Perl application that you believe would benefit from running >120 concurrent threads?

If the former, the snippet was not intended to, and could not, bypass the inherent limitations of the Perl executable.

If the latter, in most cases you should probably think again about your design. If you really believe that you have an application that does benefit from more than 120 concurrent iThreads, then see point 2.

2. It is possible to modify a (copy of) a win32 Perl executable to allow more than 120 threads to run concurrently. See my post at Use more threads. for the how.

That said, that post only addresses the how, not the why. To usefully make use of more than 120 concurrent iThreads requires much more than just changing the limit. It also requires that you adopt some very specific coding practices to avoid or work around other limitations inherent in the iThreads architecture.

These practices are neither intuitive, nor what would be readily recognised as standard Perl working practice. They are described, piecemeal, across a whole bunch of posts (by me and others, notably zentara, jdhedden & renodino) here at PM, but to my knowledge there is no one place here or elsewhere that brings all the details together in a comprehensive reference. In part, because I don't think that anyone has really done enough work in multi-cpu environments to have yet tied down what best working practice should be.

If you still believe you have an application that would benefit from running large numbers of concurrent ithreads, if you were to post a description of that problem, you may well get further advice on implementing it, along with advice on the possible alternatives to ithreads for achieving your goal.

Re: Score: Perl 1, Ruby 0
by renodino (Curate) on Feb 21, 2006 at 05:07 UTC
Last time I checked, Ruby threads weren't "real" threads, i.e., it implements its own thread scheduler etc. So it may not even be using Win32 threading. While I have no idea if that was the cause of your thread starvation, it's certainly the sort of thing that would make me think twice about using Ruby threads for serious threaded apps (esp. I/O related, or anything else which needs to dive into the kernel).
Re: Score: Perl 1, Ruby 0
by ghenry (Vicar) on Feb 21, 2006 at 11:12 UTC

If you are trying RoR, you must try Catalyst. It really is a joy to work with.

Also, see the rant in the discussion forums of joelonsoftware: Rails' Ridiculous Restrictions, a Rant.

Interesting read, but it's still early for Rails.

Gavin.

Walking the road to enlightenment... I found a penguin and a camel on the way.....
Re: Score: Perl 1, Ruby 0
by Arunbear (Parson) on Feb 21, 2006 at 15:05 UTC
I know this is OT for Perl, but I think it illuminates how Perl is a real world language, while Ruby is still somewhat of a pipe dream.
It illuminates nothing, because you haven't mentioned how the Perl version of your Ruby script performed.
Re: Score: Perl 1, Ruby 0
by trammell (Priest) on Feb 21, 2006 at 17:16 UTC
I decided to stretch my brain this semester by doing all the programming and scratchwork for the course I'm taking in Ruby.

My main disappointments in Ruby so far have been the lack of autovivification in multi-dimensional arrays (doing dynamic programming without it is a PITA) and the available documentation not meeting my expectations.
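For what it's worth, something close to Perl-style autovivification can be had in Ruby by giving a Hash a default block. A small sketch follows; note it uses nested hashes rather than arrays, and the variable names are made up for illustration.

```ruby
# A hash whose missing keys spring into existence as nested hashes of the same kind.
autoviv = Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }

autoviv[2][3][5] = 42 # intermediate levels are created on demand
autoviv[2][4][0] = 7

# For a fixed-depth DP table with a numeric default, nest the defaults explicitly.
table = Hash.new { |h, i| h[i] = Hash.new(0) }
table[1][1] += 1

p autoviv
p table[1][1]
```

One difference from Perl worth knowing: with the first form, merely reading a missing key also creates it.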

I've subscribed to the "Ruby" tag in del.icio.us, but the S/N is swamped by Rails hype. Ruby does make me appreciate the "Huffman coding" of Perl. Haven't bothered with threads so far as they're not necessary to the problem domain.

My main disappointments in Ruby so far have been the lack of autovivification in multi-dimensional arrays (doing dynamic programming without it is a PITA)

Curiously not something I've ever had problems with. I guess I've wrapped stuff up in an object before I need multiple levels of indirection so the autovivification stuff never hits me.

and the available documentation not meeting my expectations.

Buy the Pickaxe book. It's well worth the money.

Ruby does make me appreciate the "Huffman coding" of Perl.

I think that this depends on what you're doing. I'm finding my Ruby tends to be more concise than my Perl 5.

(Happily using Ruby in the real world :-)

and the available documentation not meeting my expectations.
Buy the Pickaxe book. It's well worth the money.

I spent some time using Ruby last year. Bought the Pickaxe too. Ruby is definitely a nice language, but its problem is not just that it's lacking in docs, but that it's lacking in commitment to docs. The Pickaxe is a pretty good book, granted. But I got the impression that the core Ruby people just aren't all that committed to making sure they have great docs. It's just not a priority for them, and that's fine -- that's their choice. And it's understood that they've got a lot on their plate.

After spending a few months with Ruby, then coming back to Perl, one of the first things I immediately noticed was, "Wow, I almost forgot how darn *good* the Perl docs are." (the other thing I said to myself was, "Wow, I almost forgot how *vast* the CPAN is" -- but that's beside the point, since Ruby is still young).

Re: Score: Perl 1, Ruby 0
by Anonymous Monk on Feb 21, 2006 at 16:47 UTC
Writing applications to use threads (or concurrency of any stripe) is largely a waste of time unless you really need the performance gains.

Otherwise, the headaches just aren't worth it (TM). There isn't a formal model for debugging concurrent applications; multi-threaded and multi-process systems are a pain to debug, understand, and maintain.

I wouldn't damn a programming language on the basis of obscure features like threading; how about real world work instead?

From my perspective, which goes back to before I ever used Perl or iThreads, almost the last reason for using threads/concurrency is performance.

The best reason is simplicity. It is just so much easier to code anything that takes a long time as a simple, linear subroutine, and then kick it off into a thread and let your main program get on with whatever else needs to be done.

The alternatives, like finite state machines are hard to code, and totally unportable even to the same OS on a faster processor. You take your slow, linear subroutine and break it into iddy-biddy chunks, carefully sized so that each one only takes as much time as you have available between doing the other things that need to be done, (like interacting with the user).

Move it to another, slower processor, (or just run another heavy process on the same machine), and your user interface is slow as molasses.

Move it to a faster machine and you either refactor all your stateful chunks into fewer, bigger chunks, or you waste half your cycles 'task switching' before it is necessary.

With threads, you write self-contained, linear code, stick 'em in a thread and the scheduler takes care of everything else. The only time threads are a pain to debug is when people try to write them as a closely coupled state machine. Which is simply the wrong approach and easy to avoid.
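The "write it linear, then kick it into a thread" idea can be sketched in a few lines of Ruby. The `slow_report` routine and its timings are invented purely for illustration; the sleep stands in for whatever the slow work would be.

```ruby
# A plain, linear routine; the sleep stands in for something slow.
slow_report = lambda do |n|
  total = 0
  1.upto(n) do |i|
    total += i
    sleep 0.001
  end
  total
end

background = Thread.new { slow_report.call(100) }

# The main program carries on with other things in the meantime.
ticks = 0
5.times do
  ticks += 1
  sleep 0.01
end

total = background.value # joins the thread and returns the block's result
puts "total=#{total} ticks=#{ticks}"
```

Nothing in the slow routine had to be chopped into time-sliced chunks; the scheduler interleaves it with the main loop.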


Multitasking doesn't require threads, even on win32 AIUI (I appreciate it might not be as easy as on a real OS, but sucks to be you). Having another task in the same address space as you, but doing something unrelated is just a huge pain for no gain.

--
James Antill
Threading is an obscure feature? What is this real world work you speak of? Or perhaps you do a different type of real world work than I. Undoubtedly so, threading is *important*. Meanwhile, I concur that Ruby is very cool ... and spiffy threading support may come. Heck, I'm still hoping for Ruby-on-Parrot :)

Node Type: perlmeditation [id://531603]
Approved by GrandFather