PerlMonks  

Re^4: does threads (still) memleak?

by faxm0dem (Novice)
on Nov 24, 2008 at 13:16 UTC


in reply to Re^3: does threads (still) memleak?
in thread does threads (still) memleak?

That seems a very reasonable suggestion. How would you suggest to recycle threads in practice?

Replies are listed 'Best First'.
Re^5: does threads (still) memleak?
by BrowserUk (Patriarch) on Nov 24, 2008 at 13:55 UTC
    How would you suggest to recycle threads in practice?

    In simple terms, instead of creating a new thread to process each piece of asynchronous work, you start by creating a pool of threads that sit and do nothing until your main thread passes them a piece of work to do. They then process that piece of work and, instead of dying when it is complete, they go back to waiting for another piece of work.

    Here's a simple example:

    #! perl -slw
    use strict;
    use threads;
    use Thread::Queue;

    sub worker {
        my $Q = shift;
        ## Test for defined, so that a work item of '0' or '' doesn't end the loop early
        while( defined( my $workItem = $Q->dequeue ) ) {
            printf "[%d] Processing workiterm '%s'\n", threads->tid, $workItem;
            sleep rand( 2 ); ## process $workItem
        }
    }

    our $WORKERS ||= 10;

    my $Q = new Thread::Queue;
    my @threads = map{ threads->create( \&worker, $Q ) } 1 .. $WORKERS;

    while( <> ) { ## Get workitems from stdin
        chomp;
        $Q->enqueue( $_ ); ## And queue them to the worker pool
    }

    $Q->enqueue( (undef) x $WORKERS ); ## Signal no more work
    $_->join for @threads; ## Wait for them to finish
    exit; ## Done

    You'd use it like this:

    > ls * | perl -s tdemo.pl -WORKERS=5
    [1] Processing workiterm '2of12inf.dic'
    [1] Processing workiterm '2of12inf.txt'
    [1] Processing workiterm '3'
    [4] Processing workiterm '345241'
    ...

    Or like this:

    >tdemo -WORKERS=3 work.dat
    [2] Processing workiterm '00001'
    [1] Processing workiterm '00002'
    [2] Processing workiterm '00003'
    [1] Processing workiterm '00004'
    [2] Processing workiterm '00005'
    [1] Processing workiterm '00006'
    [3] Processing workiterm '00007'
    [2] Processing workiterm '00008'
    ...

    Of course, this doesn't do very much (it just prints and sleeps), but this simple structure can be used to service a huge variety of concurrent programming tasks. Not all: socket servers, in particular, require a somewhat modified approach. Still, a good proportion of concurrency tasks can be handled this way. And as you only create a limited number of threads, you gain performance because you're not constantly discarding old threads only to replace them with near-identical new ones.

    And, the bit that is very relevant in the context of this thread: if there are any small, per-thread memory leaks, they never become an issue, because you are creating so few threads.

    To be able to tailor a reply to your particular situation, you'd have to tell us what it is you are currently doing with those 10,000 threads you are creating :)


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Thank you for your very helpful example.
      To be able to tailor a reply to your particular situation, you'd have to tell us what it is you are currently doing with those 10,000 threads you are creating :)
      Well, in my real world code, I'm only creating new threads (~10) every 15 seconds. Each thread is collecting system data, which is later sent over to a collector. So I'll most certainly work as you suggested, namely recycle the threads.
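A minimal sketch of how that recycling might look for periodic collection, assuming a fixed pool, one queue for tasks and a second queue for results. The names `collect`, `run_cycles` and the source labels are hypothetical stand-ins, not anything from the original code; in real use the interval would be 15 seconds and `collect` would gather the actual system data.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical stand-in for the real data collection.
sub collect {
    my ($source) = @_;
    return "data from $source";
}

# Worker: wait for a task, collect, report the result, go back to waiting.
# An undef task is the signal to finish.
sub worker {
    my ( $tasks, $results ) = @_;
    while ( defined( my $source = $tasks->dequeue ) ) {
        $results->enqueue( threads->tid . "\t" . collect($source) );
    }
}

# Run $cycles collection rounds over @$sources using one fixed pool.
# Returns the gathered "tid<TAB>data" strings.
sub run_cycles {
    my ( $sources, $workers, $cycles, $interval ) = @_;
    my $tasks   = Thread::Queue->new;
    my $results = Thread::Queue->new;
    my @pool = map { threads->create( \&worker, $tasks, $results ) } 1 .. $workers;

    my @gathered;
    for my $cycle ( 1 .. $cycles ) {
        $tasks->enqueue( @$sources );                       # hand out this round's work
        push @gathered, $results->dequeue for @$sources;    # wait for every result
        sleep $interval if $interval && $cycle < $cycles;   # pause between rounds
    }

    $tasks->enqueue( (undef) x $workers );                  # signal no more work
    $_->join for @pool;                                     # wait for the pool to exit
    return @gathered;
}

# Five rounds over four sources with three workers and no pause.
my @out = run_cycles( [qw( cpu mem disk net )], 3, 5, 0 );
my %tids;
$tids{ ( split /\t/ )[0] }++ for @out;
printf "%d results from %d distinct threads\n", scalar @out, scalar keys %tids;
```

However many rounds are run, the number of threads created stays at the pool size, so any small per-thread leak is paid only that many times.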

        One thing. In my example, for the sake of simplicity, I use printf from multiple threads, without any form of locking. I can get away with this, as long as the output is going to the screen, because on my system the OS serialises the output.

        If your system doesn't do this, or (for example) if you want to be able to redirect the output to a file, then you should (I should) be using a semaphore. A simple way to do that is to wrap print/printf something like:

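        use threads;
        use threads::shared; ## needed for the :shared attribute

        my $stdoutSem :shared;

        ## print with a thread id prefix; an optional leading GLOB ref selects the filehandle
        sub tprint {
            lock $stdoutSem;
            my $fh = ref( $_[0] ) eq 'GLOB' ? shift() : \*STDOUT;
            print {$fh} "[@{[ threads->tid ]}]:", @_;
        }

        ## printf equivalent: the thread id fills the leading "[%s]"
        sub tprintf {
            lock $stdoutSem;
            my $fh  = ref( $_[0] ) eq 'GLOB' ? shift() : \*STDOUT;
            my $fmt = shift;
            printf {$fh} "[%s]" . $fmt, threads->tid, @_;
        }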

        Refactor and season to taste.


Re^5: does threads (still) memleak?
by zentara (Archbishop) on Nov 24, 2008 at 15:55 UTC
