PerlMonks

Use more threads.

by BrowserUk (Saint)
on Feb 27, 2006 at 03:57 UTC (#532956=perlmeditation)

Want the ability to create more concurrent threads in Perl?

For most purposes, the current limit of 120 concurrent threads is sufficient, but for some applications where individual threads can lie essentially dormant for extended periods, it has always seemed an arbitrarily low limit.

It turns out that the culprit is a single line in the Win32 makefiles, namely

$(LINK32) -subsystem:console -out:$@ -stack:0x1000000 $(LINK_FLAGS) \

This, in conjunction with the use of 0 for the stacksize parameter on the CreateThread call, means that each thread created reserves a whopping 16 MB of virtual stack space. Although this reservation will rarely if ever actually get allocated, the reservations add up and eventually prevent another thread being spawned: 120 * 16 MB = 1.875 GB, which puts you within spitting distance of the 2 GB per-process virtual memory limit. Combined with the other memory reservations and allocations made by perl.exe itself, this means that you cannot spawn another thread until something goes away to reduce the process's total memory reservation.
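
The arithmetic can be sketched in a few lines of Perl. This is only an illustration: max_threads is a hypothetical helper, and it treats the whole 2 GB as available for thread stacks, which is optimistic because perl.exe, its DLLs and the heap occupy address space too.

```perl
use strict;
use warnings;

# Hypothetical helper: upper bound on the number of thread stacks that fit
# into a given amount of address space. It ignores everything else living
# in the process (code, heap, DLLs), so real limits are lower.
sub max_threads {
    my ($address_space, $stack_reserve) = @_;
    return int($address_space / $stack_reserve);
}

my $two_gb = 2 * 1024**3;

# Compare the stock 16 MB reserve with a 1 MB reserve.
for my $reserve (0x1000000, 0x0100000) {
    printf "reserve 0x%07x (%2d MB): at most %4d threads\n",
        $reserve, $reserve / 1024**2, max_threads($two_gb, $reserve);
}
```

With the stock 16 MB reserve the bound works out to 128 stacks, which fits the observed ceiling of about 120 threads once the rest of the process is accounted for.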

There are two immediate ways around this:

  1. If you build your own perl, then reducing the value in the makefile to, say, 0x0100000 (more on that later) will allow you to create well over 1000 threads.

    In extremis, I've succeeded in reducing this value to the point where I've had over 3000 concurrent active threads running in just over 1 GB of RAM, but they were not doing much at all.

  2. You can use the MS VC++ tool editbin (editbin /stack:0x00100000 \yourperl\bin\perl.exe) to achieve the same effect on binary distributions.

    For my purposes, following the lead of AS's wperl.exe, I made a copy of perl.exe called tperl.exe and applied the modifications to that. I then made an association between a .plt suffix and tperl.exe, much as I have between .plw and wperl.exe. Now I can use .pl and perl.exe for normal apps (thereby reducing any risk associated with the change), .plt for heavily threaded apps and .plw for GUI apps. I guess a .pltw might be on the cards also.

Ramifications of the change

Reducing the stack reservation may sound like a dangerous practice, but it is only a reservation.

In use, the system seems to happily expand the stack for any individual thread well beyond this limit, provided virtual memory is available to accommodate it. The value specified only comes into play if other parts of the process consume virtual memory (stack or heap) to the point where what remains of the 2 GB would fall below the reservation.

By specifying a large reservation, you are guaranteeing that should your thread need to expand its stack to the reserved size, it will be able to do so. However, this comes at the cost of preventing other parts of the process from increasing their use of virtual memory, including heap, just in case your thread needs that space.

So by reducing the stack reservation, you run the risk that if other parts of your process have expanded their use of VM to the point where your thread can no longer expand its stack, your process will terminate with a stack overflow or similar. However, if the other parts of your process require that much VM, and you had retained the larger stack reservation, then the process would have been terminated with 'Out of memory' anyway.

So far as I can tell, and there seems to be little real documentation on the subject that I can find, there is little risk associated with the reduced reservation.

It's also worth pointing out that, as I found in my attempts to persuade perl to consume stack and as confirmed by a man who knows, one of Perl's design features is that it does not make a great deal of use of the C stack for most of its operational needs.

In my limited testing, you generally have to be doing something pretty extreme to force Perl to consume anything more than very modest amounts of stack. In most cases, it only happens if you have runaway recursion (at the C level) that would consume all available space until it crashed anyway.

The exceptions are:

  1. Complex, backtracking regex on very large strings, which should probably be replaced with better regexes anyway.
  2. Sorting very large datasets, though I found it hard to create the situation where I didn't run out of heap well before I ran out of stack. Maybe if you used the older quicksort algorithm instead of the default mergesort this would be more of a problem, but there doesn't seem to be any good reason for doing so.
  3. Recursive XS or Inline C code. Even then, if you are doing anything useful, as opposed to recursing for its own sake as with something like a C implementation of Ackermann's function, then you're more likely to run out of heap for your data before you run out of stack to process it.

If you use binary builds and don't have access to editbin.exe

The value in the executable that needs to be binary edited is in a well-known and easily located place and is fairly trivial to change. Autrijus' Win32::Exe module should be easily tweaked to add this value to its repertoire of modifiable values. I'll come up with a patch if there is any demand for it, and if Autrijus doesn't beat me to it.
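
As an illustration of where that value lives: in the PE file format, the linker's -stack setting is recorded in the SizeOfStackReserve field of the optional header. The read-only sketch below locates it. pe_stack_sizes is a hypothetical helper name, it is not part of Win32::Exe, and it makes no attempt to write the value back; the demonstration runs on a synthetic header rather than a real executable.

```perl
use strict;
use warnings;

# Return (reserve, commit) stack sizes recorded in a PE image.
# $image holds the raw bytes of the file (the leading headers are enough).
sub pe_stack_sizes {
    my ($image) = @_;

    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    die "too short for a DOS header\n" if length($image) < 0x40;
    my $pe = unpack 'V', substr($image, 0x3C, 4);
    die "truncated image\n" if length($image) < $pe + 24 + 88;
    die "no PE signature\n" unless substr($image, $pe, 4) eq "PE\0\0";

    # The optional header follows the 4-byte signature and 20-byte COFF header.
    my $opt   = $pe + 24;
    my $magic = unpack 'v', substr($image, $opt, 2);

    # SizeOfStackReserve starts at offset 72 of the optional header in both
    # formats; the fields are 32-bit in PE32 and 64-bit in PE32+.
    if ($magic == 0x10b) {
        return unpack 'V2', substr($image, $opt + 72, 8);
    }
    if ($magic == 0x20b) {
        my ($rlo, $rhi, $clo, $chi) = unpack 'V4', substr($image, $opt + 72, 16);
        return ($rlo + $rhi * 2**32, $clo + $chi * 2**32);
    }
    die sprintf "unknown optional header magic 0x%x\n", $magic;
}

# Demonstration on a synthetic, header-only image (not a real executable).
my $fake = "\0" x 0x200;
substr($fake, 0x3C, 4)           = pack 'V', 0x80;      # offset of signature
substr($fake, 0x80, 4)           = "PE\0\0";            # PE signature
substr($fake, 0x80 + 24, 2)      = pack 'v', 0x10b;     # PE32 magic
substr($fake, 0x80 + 24 + 72, 8) = pack 'V2', 0x1000000, 0x1000;

printf "reserve=0x%x commit=0x%x\n", pe_stack_sizes($fake);
```

For a real file, the same function could be fed the bytes read from perl.exe opened in raw (binary) mode.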

Other OSs

renodino has done some testing with home-built versions of Perl for Linux and has achieved similar kinds of increases in the number of simultaneous threads that can be achieved. The downside is that he has been unable to find a binary edit utility for the Linux platform. He's also done some testing on that platform on apps using DBI and Tk and has seen no detrimental effects from the change. I'll leave him to describe what testing he has done and other Linux-related information if he chooses/anyone is interested.

A better solution

In the long term, a better solution would be for threads::create() to accept an extra (named?) parameter that allowed the Perl programmer to specify the stack reservation on a per-thread basis. That would allow the choice of what size is applicable to be made on a thread-by-thread basis and remove the (slight) possibility that lowering it for the Perl executable could cause large, non-threaded apps to have problems. renodino has some ideas on this, and maybe the p5p guys will consider the option if their combined wisdom doesn't find too many holes in the idea.
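
A minimal sketch of what such a per-thread setting can look like, assuming a threads release from CPAN that accepts a stack_size option in create() (the threads.pm discussed in this post does not). The 256 KB value is an arbitrary illustration, not a recommendation, and the Config check is only there so the sketch degrades gracefully on a perl built without ithreads.

```perl
use strict;
use warnings;
use Config;

# Assumption: a threads release that accepts a per-thread 'stack_size'
# option; older releases will not understand the hash reference below.
my $stack_size = 64 * 4096;    # 256 KB, an arbitrary illustrative value

my $result;
if ($Config{useithreads}) {
    require threads;

    # A hash reference as the first argument carries per-thread options.
    my $thr = threads->create({ stack_size => $stack_size },
                              sub { return 6 * 7 });

    # The system may round the request up; 0 would mean "system default".
    printf "thread stack size: %d bytes\n", $thr->get_stack_size();

    $result = $thr->join();
    print "thread returned $result\n";
}
else {
    print "this perl was built without ithreads\n";
}
```

Setting the size per thread leaves the main interpreter's stack alone, so large non-threaded programs are unaffected.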


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re: Use more threads.
by hv (Parson) on Feb 27, 2006 at 11:27 UTC

    [renodino] has been unable to find a binary edit utility for the Linux platform.

    For setting stacksize, you need the API function setrlimit(2); the manpage refers you also to the bash builtin 'ulimit' and quotactl(2).

    Trying that locally against an example from What perl operations will consume C stack space?:

    zen% ulimit -s 8192
    zen% perl -wle '$n=shift; $_="a" x $n; /(ab*)+/' 10080
    Segmentation fault (core dumped)
    zen% ulimit -s 32768
    zen% perl -wle '$n=shift; $_="a" x $n; /(ab*)+/' 10080
    zen% perl -wle '$n=shift; $_="a" x $n; /(ab*)+/' 32766
    zen% perl -wle '$n=shift; $_="a" x $n; /(ab*)+/' 32767
    Complex regular subexpression recursion limit (32766) exceeded at -e line 1.
    zen%
    .. which gets me to the builtin limit.

    Note that the limit may be capped by root, and that more complex systems may use the quota-based accounting method; but any barriers are there to stop people increasing stack size, so they shouldn't cause a problem for your requirements here.

    HTH,

    Hugo

[reply]
Re: Use more threads.
by zentara (Chancellor) on Feb 27, 2006 at 12:43 UTC
    I saw a post on comp.lang.perl.misc asking why there is a new threads-shared module available separately for perl 5.8. The poster said that he made some improvements, but it was too big to put into the main perl5.8.8 release. That is weird isn't it? It says 'bless' is now supported on shared refs.

    That's a bit beyond me, but you might find it interesting.


    I'm not really a human, but I play one on earth. flash japh
[reply]

      I saw blessing support in new threads::shared yesterday.

      The poster said that he made some improvements, but it was too big to put into the main perl5.8.8 release. That is weird isn't it?

      My reading of that was that the changes were extensive and finished too close to the release of 5.8.8, so that there was not enough time to ensure adequate testing before release. Releasing it to CPAN means that those of us interested in playing with it get to do so now without imposing the associated risks (if any) on all users of 5.8.8.

      This way, dave_the_m potentially enlists a bunch of testers to check the changes out before it gets considered as a candidate for the next release. I think it's a great idea. My only wish is that threads were also available as a separate package.

      It'd be nice if things like the defined-or keyword could be made available in a similar manner. It seems to have been an inordinately long time since I first heard that was mooted for inclusion and it's still not available :(


[reply]
        Indeed, I must say that I am much more interested in the //= operator than in threads. Defined-or is useful even for small, single-threaded applications, such as the ones I write every day. Sure, it's mostly syntax sugar (mostly), but it's very *nice* syntax sugar (and, yeah, there are also those few instances where you really don't want to evaluate the left side twice, but that's more of a special case even than threads).
[reply]
Re: Use more threads.
by renodino (Curate) on Feb 27, 2006 at 16:05 UTC
    renodino has done some testing with home built versions of Perl for Linux...

    Clarification: I didn't build a new Perl on Linux. Using the stock Perl 5.8.6 in FC4, I ran some tests. Linux died at 289 threads. I also ran tests on Solaris 10 (which dies at ~1900 threads, and starts thrashing the swapper around 1300 threads), and OS X 10.3.9, which dies around 450 threads.

    Perhaps as importantly, I found a link that sheds a bit more light on the subject.

    My current approach (which I hope to build/test today) is to add a couple new APIs to threads: set_stack_size() and get_stack_size(). The added code is pretty simple, though it may not be applicable to the root thread (the various editbin/setrlimit/ulimit solutions may address that issue).

    It's important to point out that this issue isn't just about using more threads (tho that's my personal requirement); given the huge default stack size on Win32 and Linux, one of the biggest complaints about threaded Perl apps - their voracious memory appetite - may be addressed by just trimming the stack reserve down to a reasonable/minimal number.

    Update:

    After adding the set/get_stack_size() methods and applying the associated changes to the CreateThread()/pthread_attr_setstacksize() calls, and then calling set_stack_size(65536), I can crank out 1200 threads on Win32 (tho there's definitely some swapping kicking in at around 900 threads).

    Likewise, on Linux FC4, I can get 1000 threads on a fairly small machine (an old 1GHz laptop w/512 meg), tho it starts thrashing at around 1000 threads. And the vsz report from ps shows a vast reduction in memory usage. (Since I can't get more than 120 threads using the original threads on Win32, I can't really make a useful memory usage comparison)

    Note that in both cases, I'm using the stock perl 5.8.6 wo/ any ulimit'ing or editbin'ing.

    I'm going to try it on OS X and Solaris and see what shakes out.

    FWIW: my method for doing this was to copy the threads and threads::shared source directories into their own, and rename everything to "morethreads" package root. The module tests don't seem to pass w/ flying colors, but it may be related to using the unofficial threads::shared 0.95 against perl 5.8.6.

    Update 2:

    After testing on OS X 10.3.9 and Solaris 10, they both seem a bit less sensitive to the stack setting. Both reported ulimit -s == 8192 (ie, 8Meg).

    When I ran a comparison test on OS X between stock threads and my hacked morethreads, the overall performance was about the same, tho ps -o vsz reported about half as much memory being used when I set_stack_size(65536). So I'm assuming something in either the perl build or the OS is throttling the per-thread stack size.

    On Solaris, the test showed an even closer vsz between stock and hacked threads. Stock was always about 15-20 megs higher than hacked, so I'm assuming there's a build or OS limit there as well.

    Following up on my Linux tests, ulimit -s reported 10240. The vsz differences were dramatic: at 200 threads, the stock version reported nearly 2Gig, while the hacked version reported around 125Meg.

[reply]

      The ulimit information is barking up the wrong tree. The posix thread stack size routines are the right way to go. ulimit will limit the maximum size of the stack, not the initial reserve. While limiting the maximum size will place a hard upper limit on the memory footprint, it will do nothing to reduce the lower limit. To do that, you must reduce the stack reserve (as the original post says). You can set the initial reserve at link time with the ld option --stack, which defaults to 2MB in the GNU binutils.

      To modify this in the binary on a *nix box, you can "relink" it:

      $ ld --stack 0x1000 perl -o tperl
      $ nm -s perl | grep stack_reserve
      00200000 A __size_of_stack_reserve__
      $ nm -s tperl | grep stack_reserve
      00001000 A __size_of_stack_reserve__

      The intelligent reader will judge for himself. Without examining the facts fully and fairly, there is no way of knowing whether vox populi is really vox dei, or merely vox asinorum. — Cyrus H. Gordon
[reply]
