Replacing closures (to work around threads crash)

by eyepopslikeamosquito (Canon)
on Nov 22, 2004 at 03:52 UTC (#409473=perlquestion)
eyepopslikeamosquito has asked for the wisdom of the Perl Monks concerning the following question:

After getting the dreaded "Free to wrong pool" crash in a multi-threaded Perl program (in Telnet.pm), I wrote this little program to demonstrate it:

#!/usr/bin/perl -w
# Simulate "Free to wrong pool" crash from Telnet.pm, line 1987
# This program crashes with Perl 5.8.4/5.8.5 (on multi-cpu boxes only).
use strict;
use threads;

sub do_one_thread {
    my $kid = shift;
    warn "kid $kid before local\n";
    for my $j (1..99999) {
        my @warns;
        {
            local $^W = 1;
            local $SIG{"__WARN__"} = sub { push @warns, @_ };
        }
    }
    warn "kid $kid after local, sleeping 1\n";
    sleep(1);
    warn "kid $kid exit\n";
}

sub do_threads {
    my $nthreads = shift;
    my @kids = ();
    for my $i (0..$nthreads-1) {
        my $t = threads->new(\&do_one_thread, $i);
        warn "parent $$: continue\n";
        push(@kids, $t);
    }
    for my $t (@kids) {
        warn "parent $$: waiting for join\n";
        $t->join();
        warn "parent $$: thread exited\n";
    }
}

do_threads(2);

This crash seems related to Perl bug #31851 ("Threading crash with closures"). It seems Perl closures are not currently thread-safe. Though all the CVs are cloned for each thread, they share the same OP tree, and the code that updates the reference count of the OP tree is not thread safe because it's missing locks (OP_REFCNT_LOCK/OP_REFCNT_UNLOCK) around some refcount decrementing (OpREFCNT_dec) and some refcount incrementing (OpREFCNT_inc).

Alas, the fix for #31851 has not yet made it back into the Perl 5.8.x branch. Moreover, I'd like to have this program not crash when running on earlier Perl versions. Accordingly, I'd like to find a way to replace closures with something else that is thread-safe.

I am currently changing, for example:

local $SIG{"__WARN__"} = sub { push @warns, @_ };

to:

local $SIG{"__WARN__"} = eval <<'CLOSURE_HACK';
sub { push @warns, @_ }
CLOSURE_HACK

which seems to get rid of the crashes. Given I don't understand the code I am changing very well, can you see cases where routinely adding a leading eval like this will cause trouble? Is there an alternative way to replace closures with something else?
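One way to see what the leading eval changes is to compare the op trees behind the resulting code references. The sketch below is only an illustration, not part of Telnet.pm: the helper names optree_addr, make_plain and make_evaled are made up for this example, and it assumes the core B module's svref_2object and ROOT report the op tree root as described in its documentation. Closures cloned from one compiled sub { ... } expression are expected to share a root, while each string eval compiles its own.

```perl
use strict;
use warnings;
use B ();

# Address of the root op of the op tree behind a code reference.
sub optree_addr {
    my ($code) = @_;
    return ${ B::svref_2object($code)->ROOT };
}

# Ordinary closure: every call clones the same compiled "sub { ... }".
sub make_plain {
    my @warns;
    return sub { push @warns, @_ };
}

# String-eval variant from the post: every call compiles the text afresh.
sub make_evaled {
    my @warns;
    my $code = eval <<'CLOSURE_HACK';
sub { push @warns, @_ }
CLOSURE_HACK
    die $@ if $@;    # compile errors only show up at run time with string eval
    return $code;
}

my ($p1, $p2) = (make_plain(),  make_plain());
my ($e1, $e2) = (make_evaled(), make_evaled());

printf "plain closures share an op tree:  %s\n",
    optree_addr($p1) == optree_addr($p2) ? "yes" : "no";
printf "evaled closures share an op tree: %s\n",
    optree_addr($e1) == optree_addr($e2) ? "yes" : "no";
```

Two things to keep in mind if the eval is added routinely: a syntax error in the string is only reported at run time through $@, and every pass through the eval pays the compilation cost again.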

Re: Replacing closures (to work around threads crash)
by diotalevi (Canon) on Nov 22, 2004 at 05:27 UTC
    A closure is a hard reference to compiled code with some bound parameters. You could relax all of that and end up with just the name of a function and a list of parameters to call it with. So instead of $lex = "a value"; $sub = sub { $lex }, you would write $sub = [ 'sub_that_does_foo', $lex ]. You might even relax symbolic references for a bit, and then you could still call the function like $sub->(). If you use no strict 'refs', do it in a small block to keep its scope small.
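A minimal sketch of the "function name plus parameter list" idea described above. The names collect_warning and call_spec are hypothetical and exist only for this illustration; the symbolic lookup is confined to one small block, as suggested. Note that such a spec cannot be assigned to $SIG{"__WARN__"} directly, so whoever holds it has to do the dispatch.

```perl
use strict;
use warnings;

# Hypothetical named handler: its "bound" data arrives as explicit arguments.
sub collect_warning {
    my ($store, @messages) = @_;
    push @$store, @messages;
    return scalar @$store;
}

# Invoke a [ name, bound args... ] spec with extra call-time arguments.
sub call_spec {
    my ($spec, @args) = @_;
    my ($name, @bound) = @$spec;
    my $code = do {
        no strict 'refs';    # symbolic lookup, kept to this tiny block
        defined &{"main::$name"} or die "no such function: $name";
        \&{"main::$name"};
    };
    return $code->(@bound, @args);
}

my @warns;
my $handler = [ 'collect_warning', \@warns ];    # replaces sub { push @warns, @_ }

call_spec($handler, "first warning\n");
call_spec($handler, "second warning\n");
print scalar(@warns), " warnings collected\n";
```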
Re: Replacing closures (to work around threads crash)
by ikegami (Pope) on Nov 22, 2004 at 05:31 UTC

    Instead of doing the eval 99999 times, how about

    my $per_thread_closure_maker = eval <<'    CLOSURE_HACK';
        sub {
            my ($warns) = @_;
            return sub { push @$warns, @_ };
        };
    CLOSURE_HACK

    for my $j (1..99999) {
        my @warns;
        {
            local $^W = 1;
            local $SIG{"__WARN__"} = &$per_thread_closure_maker(\@warns);
        }
    }
Re: Replacing closures (to work around threads crash)
by pg (Canon) on Nov 22, 2004 at 05:32 UTC

    I ran your code three times on WinXP, but there was no crash. I am using 5.8.4. What is your platform?

Re: Replacing closures (to work around threads crash)
by castaway (Parson) on Nov 22, 2004 at 05:53 UTC
    This is perl, v5.8.4 built for i686-linux-thread-multi
    Results:
    kid 0 before local
    parent 1145: continue
    kid 1 before local
    parent 1145: continue
    parent 1145: waiting for join
    kid 1 after local, sleeping 1
    kid 0 after local, sleeping 1
    kid 1 exit
    kid 0 exit
    parent 1145: thread exited
    parent 1145: waiting for join
    parent 1145: thread exited
    .. What crash?

    C.

Re: Replacing closures (to work around threads crash)
by eyepopslikeamosquito (Canon) on Nov 22, 2004 at 06:25 UTC

    Update: I've now run the test program above on 4 different multi-cpu boxes and it always crashes there (sometimes I need to up the number of threads from 2). I've also run it on two different single CPU machines and I cannot get it to crash, no matter what I do.

    It crashes for me with both Linux 2.4 (Perl 5.8.5) and Windows XP (ActiveState Perl 5.8.4). These are both hyperthreaded (or multi-cpu) machines. I notice the original bug report #31851 states that it crashes on a "multiprocessor box". Anyone out there running a multiprocessor box? Anyone else got it to crash? I suppose you could try fiddling with the 99999 above and/or the number of threads in this line:

    do_threads(2); # try increasing the number of threads

    to see if it makes any difference. More details of my crashes below.

    On Linux 2.4.18-3smp:

    # perl z.pl
    parent 24192: continue
    kid 0 before local
    parent 24192: continue
    parent 24192: waiting for join
    kid 1 before local
    Memory fault

    # perl -V
    Summary of my perl5 (revision 5 version 8 subversion 5) configuration:
      Platform:
        osname=linux, osvers=2.4.18-3smp, archname=i686-linux-thread-multi
        uname='linux rh73 2.4.18-3smp #1 smp thu apr 18 07:27:31 edt 2002 i686 unknown '
        config_args=''
        hint=recommended, useposix=true, d_sigaction=define
        usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
        useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
        use64bitint=undef use64bitall=undef uselongdouble=undef
        usemymalloc=n, bincompat5005=undef
      Compiler:
        cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -fno-strict-aliasing -pipe -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
        optimize='-O2',
        cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -fno-strict-aliasing -pipe -I/usr/local/include -I/usr/include/gdbm'
        ccversion='', gccversion='2.96 20000731 (Red Hat Linux 7.3 2.96-110)', gccosandvers=''
        intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
        d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
        ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
        alignbytes=4, prototype=define
      Linker and Libraries:
        ld='cc', ldflags =' -L/usr/local/lib'
        libpth=/usr/local/lib /lib /usr/lib
        libs=-lnsl -lndbm -lgdbm -ldl -lm -lcrypt -lutil -lpthread -lc
        perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
        libc=/lib/libc-2.2.5.so, so=so, useshrplib=false, libperl=libperl.a
        gnulibc_version='2.2.5'
      Dynamic Linking:
        dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
        cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

    Characteristics of this binary (from libperl):
      Compile-time options: MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT
      Built under linux
      Compiled at Nov 21 2004 09:48:50
      @INC:
        /home/knob/thperl585/lib/5.8.5/i686-linux-thread-multi
        /home/knob/thperl585/lib/5.8.5
        /home/knob/thperl585/lib/site_perl/5.8.5/i686-linux-thread-multi
        /home/knob/thperl585/lib/site_perl/5.8.5
        /home/knob/thperl585/lib/site_perl
        .

    On Windows XP (Intel hyper-threaded machine):

    C:\TEMP>perl z.pl
    kid 0 before local
    parent 3140: continue
    parent 3140: continue
    parent 3140: waiting for join
    kid 1 before local
    Free to wrong pool 232ae8 not 23d420 at z.pl line 13.

    C:\TEMP>perl -V
    Summary of my perl5 (revision 5 version 8 subversion 4) configuration:
      Platform:
        osname=MSWin32, osvers=4.0, archname=MSWin32-x86-multi-thread
        uname=''
        config_args='undef'
        hint=recommended, useposix=true, d_sigaction=undef
        usethreads=undef use5005threads=undef useithreads=define usemultiplicity=define
        useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
        use64bitint=undef use64bitall=undef uselongdouble=undef
        usemymalloc=n, bincompat5005=undef
      Compiler:
        cc='cl', ccflags ='-nologo -Gf -W3 -MD -Zi -DNDEBUG -O1 -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DNO_HASH_SEED -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -DPERL_MSVCRT_READFIX',
        optimize='-MD -Zi -DNDEBUG -O1',
        cppflags='-DWIN32'
        ccversion='', gccversion='', gccosandvers=''
        intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
        d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=10
        ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='__int64', lseeksize=8
        alignbytes=8, prototype=define
      Linker and Libraries:
        ld='link', ldflags ='-nologo -nodefaultlib -debug -opt:ref,icf -libpath:"C:\perl58\lib\CORE" -machine:x86'
        libpth="C:\Program Files\Microsoft Visual Studio\VC98\lib"
        libs= oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib wsock32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib msvcrt.lib
        perllibs= oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib wsock32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib msvcrt.lib
        libc=msvcrt.lib, so=dll, useshrplib=yes, libperl=perl58.lib
        gnulibc_version='undef'
      Dynamic Linking:
        dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
        cccdlflags=' ', lddlflags='-dll -nologo -nodefaultlib -debug -opt:ref,icf -libpath:"C:\perl58\lib\CORE" -machine:x86'

    Characteristics of this binary (from libperl):
      Compile-time options: MULTIPLICITY USE_ITHREADS USE_LARGE_FILES PERL_IMPLICIT_CONTEXT PERL_IMPLICIT_SYS
      Locally applied patches:
        ActivePerl Build 810
        22751 Update to Test.pm 1.25
        21540 Fix backward-compatibility issues in if.pm
      Built under MSWin32
      Compiled at Jul 30 2004 09:49:05
      @INC:
        C:/perl58/lib
        C:/perl58/site/lib
        .
Re: Replacing closures (to work around threads crash)
by BrowserUk (Pope) on Nov 22, 2004 at 07:28 UTC
    Is there an alternative way to replace closures with something else?

    The trouble with your question, beyond the fact that, like others, I cannot reproduce your problem, is that your example code doesn't show why you are trying to use closures. That is to say, devoid of the original context, your example code doesn't make any sense (to me). It is very hard to recommend an alternative when it's not clear for what purpose you are trying to use them in the first place.

    In your example, you are closing over a local, unshared array. That means each thread will get its own copy of that array. However, it's not clear to me what purpose that array will serve.


    Examine what is said, not who speaks.
    "But you should never overestimate the ingenuity of the sceptics to come up with a counter-argument." -Myles Allen
    "Think for yourself!" - Abigail        "Time is a poor substitute for thought"--theorbtwo         "Efficiency is intelligent laziness." -David Dunham
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
      your example code doesn't show why you are trying to use closures

      I'm not trying to use closures myself. I am using the Net::Telnet module and Telnet.pm uses closures in a number of different places. My main headache is that I don't understand (nor really want to understand ;-) Telnet.pm, I just want it to stop crashing. So I'm looking for a "safe" way of getting rid of the closures without having to go to the bother of actually understanding Telnet.pm.

      BTW, I manufactured the above test program to get a repeatable (yeah, I know, but, hey, it's repeatable for me ;-) and quick crash to make debugging easier. The original program crashed intermittently once an hour or so, which was a real pain to debug -- the clue re closures came from the "Free to wrong memory pool at Telnet.pm line 1987" message that accompanied most (but not all) of the original crashes.

      Update: Thanks to everyone for reporting the lack of crashes. After further testing, it seems the above test program always crashes on multi-cpu machines yet never crashes on single CPU ones (at least that's what I see).

        Can you give a short example actually using Net::Telnet and threads? Do you actually need to run Net::Telnet inside a thread? Maybe you need IO::Select instead?

        C.

        Having now looked at the code in Net::Telnet around line 1987, I do understand your dilemma. I should have looked before, all the clues needed were present in your OP--but I didn't. Sorry.

        That code, with evaled subs forming closures over lexical arrays, combined with signals, pseudo-signals and timeouts (via alarm, which wasn't implemented on Win32 until recently, and which I've never managed to get to do anything sensible), along with localised globals et al, is probably the most thread-unfriendly I have seen. That you are running on an SMP box may well be the straw that broke the camel's back.

        The only suggestion I have is that you try applying the : shared attribute to every closed-over variable/array in the module, and apply lock( @closed_over ); wherever you see a closure being referenced (especially modified).
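A minimal sketch of the ": shared plus lock" mechanics suggested above, assuming a perl built with ithreads. The names record_warning and worker are hypothetical, and this is not code from Net::Telnet. It only shows how a shared array is declared and locked; whether doing this inside the module avoids the crash is untested.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Shared across threads instead of being cloned into each one.
my @closed_over : shared;

# Append messages while holding the lock on the shared array.
sub record_warning {
    my @messages = @_;
    lock(@closed_over);    # released when this sub's scope ends
    push @closed_over, @messages;
}

# Thread body: a named sub with explicit arguments, no closure involved.
sub worker {
    my ($kid) = @_;
    record_warning("kid $kid message $_\n") for 1 .. 100;
}

my @kids = map { threads->create(\&worker, $_) } 0 .. 1;
$_->join for @kids;

print scalar(@closed_over), " messages recorded\n";
```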

        Both of these would become no-ops when the module is used in a non-threaded environment, so they shouldn't affect its operation there.

        I have no way to try this idea out, so it is total speculation as to whether it would make one iota of difference.

        In the absence of an SMP box and 2 or 3 places I could Telnet into, I have to accept (however reluctantly :) castaway's suggestion that maybe a non-threaded solution is the right way to go for this, at least until the experts have applied their tuits to the underlying cause of the problem.


Re: Replacing closures (to work around threads crash)
by rongoral (Beadle) on Nov 22, 2004 at 19:06 UTC
    Hm. Not really sure why I posted that there. I'll take my cookies and go away home. (so embarrassed).
Re: Replacing closures (to work around threads crash)
by Anonymous Monk on Nov 23, 2004 at 18:07 UTC
    People without an actual multiprocessor box can emulate one by running perl under valgrind --tool=addrcheck
