Perl Threads and multi-core CPUs

by haidut (Novice)
on Sep 10, 2008 at 02:38 UTC [id://710237]

haidut has asked for the wisdom of the Perl Monks concerning the following question:

OK, another try with <p> tags as suggested :-)

Hi all,

Sorry if this is a dumb/redundant question, but I couldn't find a definitive answer anywhere so I decided to ask for collective wisdom.

I am working on a data classification and machine learning project, so I have a large data set that I need to process. The job will run on a multi-core CPU, and since the data set items are independent, the set can be split into multiple units for processing in order to take advantage of the multiple cores.

Obviously the first things that come to mind are threads and forking. I wrote a version based on forking and it works fine, but it is a RAM hog b/c when you fork, every new process is a copy of the parent, and the parent in my case is quite large b/c it loads an AI model that consumes about 1GB of RAM. So each child becomes a 1GB monster, and I run the risk of either thrashing the swap, which kills performance, or running out of RAM altogether if another process kicks in somehow.

With threads it seems that it would be easier, since threads have access to global variables defined in the parent, so all spawned threads would share the same AI model and I wouldn't have multiple 1GB copies of the parent. In the thread case I obviously have to worry about locking, but that's not an issue as I can implement it. The bigger issue is that it seems that Perl threads live INSIDE the spawning process, so they don't get scheduled on separate CPUs but simply compete for run time within the spawning process. I tried some tests, and indeed on Linux the "top" command shows only one Perl process running on one of the 8 available CPUs even though I have 8 threads running. So with threads I am not achieving any speedup on multi-core CPUs.
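To make the setup concrete, here is a stripped-down sketch of the kind of threaded split I mean (load_data() and classify_item() are made-up stand-ins, not my real code):

    use strict;
    use warnings;
    use threads;

    my @data    = load_data();                 # stand-in for the real data set
    my $workers = 8;
    my $per     = int( @data / $workers ) + 1; # items per chunk

    my @threads;
    while ( my @chunk = splice @data, 0, $per ) {
        my $thr = threads->create(
            sub { return [ map { classify_item($_) } @_ ] },   # each thread handles one chunk
            @chunk,
        );
        push @threads, $thr;
    }

    my @results;
    for my $thr (@threads) {
        my $chunk_results = $thr->join;        # arrayref of this chunk's results
        push @results, @$chunk_results;
    }

    print scalar(@results), " items classified\n";

    sub load_data     { return 1 .. 80 }                     # placeholder data
    sub classify_item { my ($item) = @_; return $item % 2 }  # placeholder classifier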

Does anybody know if Perl supports kernel threads that the OS can then schedule on multiple CPUs? I read the Perl thread tutorial and all it says is that each thread loads a new Perl interpreter. But from what I see that doesn't result in a new runnable object, separate from the spawning process, that can be scheduled to run on a CPU other than the one used by the spawning process. That said, are there any modules on CPAN that provide true parallelization of tasks so that Perl can take advantage of multiple CPUs? Any help is appreciated. Thanks.

Replies are listed 'Best First'.
Re: Perl Threads and multi-core CPUs
by BrowserUk (Patriarch) on Sep 10, 2008 at 03:04 UTC
    The bigger issue is that it seems that Perl threads live INSIDE the spawning process, so they don't get scheduled on separate CPUs but simply compete for run time within the spawning process.

    Wherever did you get that idea from? Because it is totally wrong.

    Perl's threads are underlain by system threads, whether on Win32 or *nix, and as such, get scheduled by the system and are eligible to run on all available processors. (Unless you take extraordinary steps to prevent them from doing so.)

    I tried some tests and indeed on Linux the "top" command shows only one Perl process running on one of the 8 available CPUs even though I have 8 threads running.

    Top is fooling you.
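
    By default top shows the process as a whole; press 'H' (or run top -H) to see per-thread CPU usage. A throwaway sketch to convince yourself (the thread count and the busy-loop are arbitrary):

        use strict;
        use warnings;
        use threads;

        # Spawn a few purely CPU-bound threads; on a multi-core box, overall CPU
        # for this process should climb well past 100%, and 'top -H' will show
        # the individual threads spread across cores.
        my $n = $ARGV[0] || 4;

        my @busy = map {
            threads->create( sub {
                my $x = 0;
                $x += rand() for 1 .. 50_000_000;   # busy work, no I/O, no locks
                return $x;
            } );
        } 1 .. $n;

        $_->join for @busy;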


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Perl Threads and multi-core CPUs
by perrin (Chancellor) on Sep 10, 2008 at 03:21 UTC
    Those forked processes are not really taking up all that memory. The copy-on-write feature of Linux (and most modern OSes) makes them share all of the read-only parts of the parent process, although you can't see this in top. To prove it to yourself, look at the output of free and then spawn a few copies. You'll see you don't actually lose a GB of RAM each time.

    Perl threads typically use more RAM than forked processes. This is because they copy every data structure except the ones you mark as shared and get no copy-on-write help from the OS.
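
    For what it's worth, marking data as shared looks like this (a minimal sketch with made-up names; note that threads::shared only handles scalars and aggregates of scalars directly, not arbitrary nested structures without extra work):

        use strict;
        use warnings;
        use threads;
        use threads::shared;

        # Only data explicitly marked ':shared' is visible to every thread without
        # being duplicated; everything else is cloned per thread.
        my %model :shared;            # one copy, shared by all threads
        my @private = ( 1 .. 1000 );  # each thread gets its own clone of this

        $model{threshold} = 0.5;

        my @t = map {
            threads->create( sub { return $model{threshold} } );
        } 1 .. 4;

        print $_->join, "\n" for @t;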

      This is because they copy every data structure except the ones you mark as shared

      Only if the data structures exist before the threads are spawned. So don't do that. Spawn your threads before you load or create your large data structures. It ain't rocket science.
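
      For example, a minimal sketch of that ordering, using Thread::Queue to hand the workers individual items (the loader and per-item sub are placeholders):

          use strict;
          use warnings;
          use threads;
          use Thread::Queue;

          my $work = Thread::Queue->new;

          # 1. Spawn the workers FIRST, while the process is still small.
          my @workers = map {
              threads->create( sub {
                  while ( defined( my $item = $work->dequeue ) ) {
                      process($item);                     # placeholder per-item work
                  }
              } );
          } 1 .. 4;

          # 2. Only now build or load the large structure, in the main thread only;
          #    the workers never copied it because it did not exist when they started.
          my $big = load_model();                         # placeholder 1GB loader

          # 3. Feed the workers items, not the whole structure.
          $work->enqueue($_) for @{ $big->{items} };
          $work->enqueue( (undef) x @workers );           # one undef per worker = "no more work"
          $_->join for @workers;

          sub process    { my ($item) = @_; print "processed $item\n" }
          sub load_model { return { items => [ 1 .. 20 ] } }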

      And, as has been mentioned before, most forked perl processes do not benefit from COW much either:

      • If you take a reference, the data is modified and therefore copied.
      • If you bless or re-bless the data, magic is added and it gets copied.
      • If you perform math on data that was loaded as text, it gets converted to numeric representation, therefore is modified, and hence copied.
      • Interpolate a numeric scalar into a string, it gets copied.
      • int a real or string value, it gets copied.
      • chomp a scalar, it gets copied.
      • study a scalar, it gets copied.
      • Change a variable from readonly to readwrite, it gets copied.
      • Iterate a hash, or reset the iterator, stuff gets copied.
      • Even just searching a scalar could cause the Boyer-Moore tables to be generated.

      And those are just a few of the apparently read-only operations that will cause COW to trigger. Do anything non-read-only, like adding or deleting an element of a hash or array, and you induce wholesale copying. The same goes for incrementing or decrementing a variable, or modifying it 'in-place' with s///, tr/// or substr.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        If you don't load the data until after the threads are spawned, isn't it the same deal, i.e. you mark it shared or else it's per-thread? I don't see the advantage in waiting, unless you know you only need the data in some of your threads.

        COW works. It's not perfect, and less will be shared over time, but it provides a huge benefit, as anyone running a forking mod_perl or FastCGI server can see. It would be great if threads could take advantage of COW, but at the moment they don't.

      Doesn't refcounting kinda defeat COW?
        Not really. It will erode some of the shared pages over time, but in practice COW saves a lot of memory.
Re: Perl Threads and multi-core CPUs
by misterwhipple (Monk) on Sep 10, 2008 at 02:50 UTC
    I don't have the answer you need, but this might help: You may have better success attracting an answer if you sprinkle a few <p> tags in there. Long, unparagraphed text is difficult for tired old eyes like mine to read. Good luck!

    cat >~/.sig </dev/interesting

Re: Perl Threads and multi-core CPUs
by jbert (Priest) on Sep 10, 2008 at 13:43 UTC
    Obviously the first things that come to mind are threads and forking. I wrote a version based on forking and it works fine but it is a RAM hog b/c when you fork, every new process is a copy of the parent and the parent in my case is quite large b/c it loads an AI model that consumes about 1GB of RAM.

    You might want to check whether each child process really has its own copy of that 1GB of data.

    Have a look at the 'SHR' column in top to see how much memory is shared.

    Otherwise, if the children don't need the AI model, you could try forking them before you load the model.
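
    If the model isn't needed in the children, that order of operations might look like this bare-bones sketch (process_chunk() and load_model() are made-up placeholders):

        use strict;
        use warnings;

        # Fork the workers while the parent is still small ...
        my @pids;
        for my $chunk ( 0 .. 7 ) {
            my $pid = fork;
            die "fork failed: $!" unless defined $pid;
            if ( $pid == 0 ) {                 # child: never sees the model
                process_chunk($chunk);         # placeholder per-chunk work
                exit 0;
            }
            push @pids, $pid;
        }

        # ... and only then load the 1GB model, in the parent alone.
        my $model = load_model();              # placeholder loader

        waitpid $_, 0 for @pids;

        sub process_chunk { my ($n) = @_; print "chunk $n done\n" }
        sub load_model    { return {} }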

Re: Perl Threads and multi-core CPUs
by misterwhipple (Monk) on Sep 10, 2008 at 03:27 UTC
    Some random thoughts:

    (Oops, you already said you're using Linux.) What Perl module/function are you using for fork()ing?

    The forks module seems encouraging on this subject.

    This is outside my experience, but could you accomplish what you need by forking before you generate your AI model, when the parent process is small, then using shared memory to make the model available to the child processes?

    cat >~/.sig </dev/interesting

      The forks module seems encouraging on this subject.

      Can you imagine the time and memory costs of sharing 1GB of data, by converting it to Storable format and then transmitting it to each of your 'forks' via a socket?


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        OK, just to confirm - does the PerlMonks community agree that Perl threads are actually runnable on separate CPUs/cores? Can anybody tell me how to check that separate Perl threads are running on different cores, in case "top is fooling you" as suggested already? I did a test measuring the time it takes for a single process to process all the data, and another test with 8 Perl threads each processing 1/8 of the data set, and the time measurements were almost exactly the same. Am I doing something wrong?
Re: Perl Threads and multi-core CPUs
by Anonymous Monk on Sep 10, 2008 at 02:43 UTC
    With threads it seems that it would be easier since threads have access to global variables defined in the parent, so all spawned threads would share the same AI model and I won't have multiple 1GB copies of the parent
    Threads get copies just like fork.
