Re^5: ithreads memory leak

by BrowserUk (Patriarch)
on Apr 08, 2015 at 19:45 UTC


in reply to Re^4: ithreads memory leak
in thread ithreads memory leak

I thought that PERL should be doing garbage collection once the threads terminate.

It does. But -- unlike forked processes -- that memory is returned to the process's free memory pool, not to the operating system.

That is, the memory used by an ended thread is freed, and can be reused for the next thread -- that is why, when you only use one thread at a time, the memory used by the process stays steady. The memory allocated from the OS for the first thread can be reused when you start the second thread.

When you run more than one thread concurrently, the process will obviously need more memory; but as old threads end and new ones start, the memory will be recycled for those new threads.
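
By way of illustration (my sketch, not code from this thread), the following loop starts and joins threads one at a time; after the first thread finishes, the process size should stay roughly level, because the freed memory is recycled from the process pool rather than requested from the OS again:

use strict;
use warnings;
use threads;

sub worker {
    my @big = ( 1 ) x 1_000_000;            # allocate something sizeable inside the thread
    return scalar @big;
}

for my $i ( 1 .. 5 ) {
    threads->create( \&worker )->join;      # thread ends; its memory returns to the process pool
    print "thread $i done -- check the process size externally (ps / Task Manager)\n";
}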

In other words, you aren't seeing a memory leak; just memory usage, so stop worrying about it.

Now, if you are running out of memory, then you are either: a) not terminating old threads cleanly; or b) using more threads than you have memory to support.

If the former, there is usually a fairly obvious reason why threads fail to end cleanly; but I'd need to see the real code.

If the latter, it is usually a design flaw. Most modern systems (4GB minimum) can support upwards of 100 concurrent threads -- with care, I have had over 3000 concurrent threads running on my 8GB system -- but just because you can have a large number of threads running concurrently, it rarely if ever makes sense to do so. And a design that requires it is usually the wrong design.

Sometimes, running out of memory with low(ish) numbers of threads indicates that you are inadvertently duplicating large chunks of memory.

And finally, there are steps you can take to reduce the memory usage of your threads by careful consideration of what and when you load modules. The default habit of use-ing every module at the top of the program means that they all get replicated into every thread, regardless of whether those threads need them or not. It's convenient; but it can cause excessive memory use.
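
For example (a sketch of the idea only; List::Util here stands in for whatever heavyweight module you would normally use at the top), you can defer loading a module until the thread that actually needs it is running, so it is not cloned into every thread at spawn time:

use strict;
use warnings;
use threads;

sub needs_module {
    require List::Util;                     # loaded only inside this thread's interpreter
    return List::Util::sum( 1 .. 10 );
}

sub lightweight {
    return 42;                              # this thread never pays for List::Util
}

print threads->create( \&lightweight  )->join, "\n";
print threads->create( \&needs_module )->join, "\n";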

The bottom line is: if you have a genuine problem with the memory usage of your threaded code, and you want to address it, you will have to post the real code, not snippets.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

Replies are listed 'Best First'.
Re^6: ithreads memory leak
by DNAb (Novice) on Apr 08, 2015 at 22:38 UTC

    I can post the code later on tomorrow; that's not really an issue. Eventually I do run out of memory on the system; there is currently 512MB assigned, but I can go to 1GB. Threads are killed using a signal which calls threads->exit(). I'm not really sure this is part of the issue though, since again the issue is present in just that small piece of code, and I don't kill threads there, I just detach them.

    I guess what you are saying is this is basically just how memory management is in PERL? How else would you deal with a long running application using threads not eating up all the memory?

    And, of course, thank you!

      I guess what you are saying is this is basically just how memory management is in PERL?

      It's not: "how memory management is in Perl"; it's: how memory management is!

      In general, it is expensive for a process to request more memory from the OS; so once a process (via your C runtime's memory management functions) has requested memory from the OS, it is reluctant to give it back, because it would be expensive to re-request it from the OS again. Better to simply keep it around until the next time it is needed.

      This is, for the most part, the way all memory managers work.

      How else would you deal with a long running application using threads not eating up all the memory?

      I'll say it again. There is no problem with running many dozens of concurrent threads. With 512MB, based on the numbers you gave above, you should be able to run ~50 threads concurrently without problems.

      And so long as no more than 50 are ever running at any given time, you should be able to start & end thousands of threads without ever breaking that same limit.

      But whether it is a good idea to do so is a different question that can only be answered by seeing your code.

      It is not the number of threads you start, nor how long the process runs for, that is the limiting factor; but rather the number of them that are running at the same time. It's not rocket science to see that the more you run concurrently, the more memory it will use. Use too many at the same time and you'll run out of a finite resource.
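
      One common way to keep under that limit (a sketch of the general pattern, with an assumed pool size; not the poster's design) is a fixed pool of workers pulling jobs from a Thread::Queue, so the number of live threads never exceeds the pool size no matter how many jobs pass through:

      use strict;
      use warnings;
      use threads;
      use Thread::Queue;

      my $POOL_SIZE = 10;                       # assumed limit; tune to your memory budget
      my $jobs      = Thread::Queue->new;

      sub worker {
          # block until a job arrives; an undef job is the signal to finish
          while ( defined( my $job = $jobs->dequeue ) ) {
              print "thread ", threads->tid, " processed job $job\n";
          }
      }

      my @pool = map { threads->create( \&worker ) } 1 .. $POOL_SIZE;

      $jobs->enqueue( $_ ) for 1 .. 1000;       # thousands of jobs, but only 10 threads alive
      $jobs->enqueue( (undef) x $POOL_SIZE );   # one undef per worker => clean exit
      $_->join for @pool;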

      Threads are killed using a signal which calls threads->exit().

      Killing threads is a really bad idea; and IS likely to cause memory leaks.
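
      A safer alternative (again just a sketch of the general idea, not the poster's code) is cooperative shutdown: set a shared flag and let each thread notice it and return normally, so perl can tear the thread down and recycle its memory:

      use strict;
      use warnings;
      use threads;
      use threads::shared;

      my $stop :shared = 0;

      sub worker {
          until ( $stop ) {
              # ... do one small unit of work per pass ...
              sleep 1;
          }
          return;                               # normal return; no threads->exit, no signals
      }

      my $thr = threads->create( \&worker );
      sleep 3;
      { lock $stop; $stop = 1; }                # ask the worker to finish
      $thr->join;                               # reclaim it cleanly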



        I was looking through lists of "memorable nodes" collected on some monks' home pages; that's why I came to such an old discussion.

        I can't quite agree:

        It's not: "how memory management is in Perl"; it's: how memory management is!

        In general, it is expensive for a process to request more memory from the OS; so once a process (via your C runtime's memory management functions) has requested memory from the OS, it is reluctant to give it back

        It looks like memory requested by XS code is eventually returned to the OS (right?). And memory allocated in separate threads is returned upon their completion, too. (That's why, if I want a long-running program to have a sane memory footprint, I'd place the memory-hungry code in a thread. Or even, cf. my last question about child processes and IPC, separate it out completely.)

        But, "pure Perl" mono-thread is, indeed, reluctant to give anything back.

        Here's a demonstration on Win32, Perl 5.20 (your numbers may vary, and sorry for the somewhat barbaric "memory monitoring" code; the Windows-compatible CPAN modules seem to fail with threads):

        use strict;
        use warnings;
        use threads;
        use PDL;
        PDL::no_clone_skip_warning;

        sub mem {
            qx{ typeperf "\\Process(perl)\\Working Set" -sc 1 } =~ /(\d+)\.\d+\"$/m;
            ( my $s = $1 ) =~ s/(\d{1,3}?)(?=(\d{3})+$)/$1,/g;
            printf "%-30s: %12s\n", @_, $s;
        }

        mem 'initially';

        my $p = zeroes 50_000_000;
        mem 'we\'ve made a huge piddle!';
        undef $p;
        mem 'and now it\'s gone';

        sub test_arr {
            my @a = 1 .. ${ \10_000_000 };
            mem 'we\'ve made a huge array!';
            @a = undef;
            mem 'and now it\'s gone';
        }

        async( \&test_arr )->join;
        mem 'and now a thread is gone, too';

        print "\nbut let\'s try it in main thread!\n\n";
        test_arr;
        mem 'finally';

        The output:

        initially                     :   16,846,848
        we've made a huge piddle!     :  418,107,392
        and now it's gone             :   17,321,984
        we've made a huge array!      :  628,912,128
        and now it's gone             :  628,928,512
        and now a thread is gone, too :   20,258,816

        but let's try it in main thread!

        we've made a huge array!      :  625,430,528
        and now it's gone             :  625,430,528
        finally                       :  625,430,528
