PerlMonks  

Threads memory consumption

by mojo2405 (Acolyte)
on Dec 05, 2013 at 06:00 UTC ( #1065707=perlquestion )
mojo2405 has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I am using threads in my code, and somewhere in the Perl documentation I saw that I should use queues for threading if I don't want any memory leaks. Today I have big scripts that can run 24 threads simultaneously, and I can see that after the threads end, the memory is not freed. I read about queues for threads in Perl, but I didn't get the idea, so I need a little help here.

Is the idea to create a pool of, say, 10 threads, and every time there is a job, send the job to an available thread? I need each thread to return something when it ends, which I put into a hash and then use. Can someone help me with that and explain the theory? Thanks! This is my current code:
# Start new threads
my $t1 = threads->new(\&$functions_name1, @parameters1);
push(@threads, $t1);
my $t2 = threads->new(\&$functions_name2, @parameters2);
push(@threads, $t2);

my $index = 0;
my %hash_results = ();
foreach my $thread (@threads) { # wait for all threads until the end and insert results into the hash
    $hash_results{$index++} = $thread->join;
}

Re: Threads memory consumption
by Laurent_R (Parson) on Dec 05, 2013 at 07:12 UTC
    I am not sure what you've read on the subject, but it is not only about memory usage; it is also about efficient use of your CPU power and other resources. It would generally be a bad idea to launch 200 threads simultaneously just because you have 200 computations to make, because the overhead would largely obliterate the advantage of computing in parallel. Usually it is better to have a limited number of threads running at the same time (the number depends on your hardware, especially the number of CPUs or CPU cores) and to launch a new one when another exits. Meanwhile, you need to store the parameters for the threads waiting to be launched in a queue (say, an array of hashes).
      So, can you give me an example of how to do it?
Re: Threads memory consumption
by BrowserUk (Pope) on Dec 05, 2013 at 07:35 UTC

    Press "search". Read, digest, download & try; ask questions


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      That I already did. I would not be asking unless I needed something more specific, suitable for my case. I already searched the forums and still didn't find an answer to my question.

        If you cannot see the answer to your questions in one or more of the first 5 posts that search displays; nothing I can tell you will help.



        Just in case, the first step was implied: "Follow this link". That will take you to a pre-filled super search page. If you click "Search" there, then you'll get specific results, ones you might not have found in your own searches.

        - tye        

Re: Threads memory consumption
by Random_Walk (Parson) on Dec 05, 2013 at 09:15 UTC

    something along these lines may get you started...

    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    my $wq = Thread::Queue->new();   # Build a queue where you put work
    my $thread_limit = 4;            # how many threads do you want?

    my @thr = map {                  # create threads and store the handles in @thr
        threads->create(sub {
            while (my $item = $wq->dequeue()) {   # finish when undef
                # Do something with $item
            }
        });
    } 1 .. $thread_limit;

    while (I get $data to process) { # pseudocode: your own input loop goes here
        $wq->enqueue($data);
    }

    # Tell the threads we are done
    $wq->enqueue(undef) for @thr;

    # Wait for all the threads to finish
    $_->join() for @thr;

    Now go and read the documentation for threads and Thread::Queue. It's pretty good.

    Cheers,
    R.

    Pereant, qui ante nos nostra dixerunt!
      Thanks a lot! Do you think it will solve my memory consumption problem?

        Without seeing more of your code it's impossible to tell.

        I have seen multi threaded code eat all the memory when a large shared hash is first built containing all the work to be done, and the threads are then created. In this way each thread gets a copy of the hash, and your memory usage is terrible. Using a queue you use a lot less memory. If however you have some other code in there leaking memory, then changing the way you call threads is unlikely to solve it.

        Are you using strict and warnings? Do you declare all your variables in the smallest possible scope? Can you create a small test version of your code that shows this memory leaking behaviour?

        Cheers,
        R.

        Pereant, qui ante nos nostra dixerunt!
Re: Threads memory consumption
by Anonymous Monk on Dec 05, 2013 at 13:27 UTC
    The key idea is in Thread::Queue ... the idea that you have only enough (adjustable number of) threads that your system can efficiently handle, that each of them receives work from a queue and sends back to that queue, and otherwise are as independent and autonomous as possible.
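    A minimal sketch of that idea, assuming a trivial stand-in job (doubling a number): a work queue feeds a fixed pool of threads, and a second queue carries results back so the main thread can collect them into a hash, which is what the OP asked for. All names here are illustrative, not from the OP's code.

    ```perl
    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    my $work      = Thread::Queue->new();   # jobs flow in here
    my $results   = Thread::Queue->new();   # workers send answers back here
    my $n_workers = 4;

    my @workers = map {
        threads->create(sub {
            # dequeue() blocks; an undef item tells the worker to exit
            while (defined(my $job = $work->dequeue())) {
                # stand-in for real work: double the value
                $results->enqueue([ $job->{id}, $job->{value} * 2 ]);
            }
        });
    } 1 .. $n_workers;

    # enqueue jobs as plain hashrefs (Thread::Queue makes shared clones)
    $work->enqueue({ id => $_, value => $_ }) for 1 .. 10;

    # one undef per worker, so every thread sees its termination signal
    $work->enqueue(undef) for @workers;
    $_->join() for @workers;

    # drain the results queue into a hash keyed by job id
    my %result_for;
    while (defined(my $r = $results->dequeue_nb())) {
        $result_for{ $r->[0] } = $r->[1];
    }
    print "$_ => $result_for{$_}\n" for sort { $a <=> $b } keys %result_for;
    ```

    Because the workers are joined before the results queue is drained, the non-blocking `dequeue_nb()` is safe here; in a long-running server you would instead read results as they arrive.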
Re: Threads memory consumption
by zentara (Archbishop) on Dec 05, 2013 at 14:29 UTC
    From my experience with threads and memory consumption, you have 2 choices.

    1. Reuse existing threads over and over, passing in new data to them.

    2. Better yet, especially if you are on Linux, use forks instead of threads. The forks will release memory when they exit. The only good reason to incur the overhead associated with threads is if you need to share data between threads in real time. Even then, forks with SysV IPC shared memory work better and faster. Threads are just a lazy, easy way of getting IPC, but that comes at a cost, as you have seen.
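    For what it's worth, a forked child *can* hand data back to its parent. One plain-Perl sketch (no modules; the payload and names are made up for illustration) uses a pipe per child; for structured data you would serialize with something like Storable before writing.

    ```perl
    use strict;
    use warnings;

    # One pipe per child: the child writes its result, the parent reads it.
    my %result_for;
    for my $id (1 .. 3) {
        pipe(my $reader, my $writer) or die "pipe: $!";
        my $pid = fork();
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {                 # child
            close $reader;
            my $answer = $id * 10;       # stand-in for real work
            print {$writer} "$answer\n";
            close $writer;
            exit 0;
        }
        close $writer;                   # parent
        chomp(my $line = <$reader>);
        close $reader;
        waitpid($pid, 0);                # child's memory is released here
        $result_for{$id} = $line;
    }
    print "$_ => $result_for{$_}\n" for sort keys %result_for;
    ```

    Note this sketch collects each child's result before forking the next, so the children run one at a time; a real version would fork them all first and then read, or reach for a module that handles that bookkeeping.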


    I'm not really a human, but I play one on earth.
    Old Perl Programmer Haiku ................... flash japh
      I need to return data from each thread or fork child; it can be a hash, an object, a string, or whatever. This is the reason I'm using threads instead of forks. As far as I know there is no way to return data from a forked child, only from threads... am I correct?
        It is possible to pass results to the parent process, and it is quite easy. You can use Parallel::ForkManager, for example:
        use 5.010;
        use strict;
        use warnings;
        use Parallel::ForkManager;
        use Data::Printer;

        my $pm = Parallel::ForkManager->new(2);

        $pm->run_on_finish(sub {
            # result from the child will be passed as 6th arg to the callback
            my $res = $_[5];
            p $res;
        });

        for (1 .. 3) {
            $pm->start and next;
            # from here until $pm->finish the child process is running
            # do something useful and store the result in $res
            my $res = { aaa => $_ };
            # this will terminate the child and pass $res to the parent process
            $pm->finish(0, $res);
        }
        $pm->wait_all_children;
Re: Threads memory consumption
by sundialsvc4 (Monsignor) on Dec 06, 2013 at 01:10 UTC

    I happen to like to use queues ... there are several to choose from ... knowing that all of them know how to send hashrefs and Perl objects.   Therefore, in a very simple but common scenario, you might have one process that receives incoming connections, builds “request” objects from them, and places these onto a queue.   An arbitrary but adjustable number of workers sit on that queue, retrieve requests from it, and execute those requests ... perhaps by calling some method (say, execute() ...) on the object that it just received.   This method, say, produces a result ... or stores the result in the object itself.   The worker then places the request/response onto a “completed work” queue ... from which it is retrieved and the results sent back to the requesting user.

    This, of course, is essentially the magic that’s used by FastCGI in any web-server on this planet:   your incoming HTTP data is gathered up and queued to a worker, who generates an HTTP response packet and queues it back to the web-server for delivery.   The FastCGI workers typically process hundreds or thousands of requests during their lifetimes.

    The advantage of this sort of general design is that, no matter how “busy” the server gets, the only evidence and the only consequence is that “the queues get rather long.”   The system might be running like a cat on a hot tin roof ... all of the workers being 100% active 100% of the time ... but it will not become congested.   It’ll become just as busy as you’ve allowed it to be, but not one whit more.
