PerlMonks  

sharing datas between threads/forks (?)

by Anonymous Monk
on Apr 08, 2008 at 13:23 UTC (#678974=perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I have a program which at startup builds large hashes and arrays of data and then works with them, mainly read-only. Until now we just started the program 4 times on the same computer to use its 4 processor cores. But now the data is about 1.5 GB, so on our 4 GB machines we can only start it 2 times, leaving 2 cores idle.

Of course the better idea is to load the data once and let each instance of the program use it. To do this I spent the last 2 weeks changing the program to use threads and threads::shared, only to find that the result is way too slow. While it does use all 4 cores, the 4 threads running in parallel on 4 cores are slower than the old single-threaded version running on one core :( Somewhere well hidden in a doc here on PerlMonks I read that even shared variables are not really shared but copied, and thus completely useless for me.

So my big question is: are there alternatives for me? As I said, I want to have the data in memory only once, but with 4 processor cores working on it. Sounds easy, but I haven't found anything so far...

In case it matters, I am using Perl 5.10.0 on Linux.

Thanks in advance for your help!

Re: sharing datas between threads/forks (?)
by Corion (Pope) on Apr 08, 2008 at 13:27 UTC

    Load the 1.5Gig data and then fork. That way, each child will have the data. The data will be copied when a child writes to it, so it's best not to write to that data unless it's absolutely necessary.
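    The load-then-fork idea can be sketched roughly as follows. This is only an illustration, not the poster's program: `load_data` and `run_workers` are hypothetical names, and the small hash stands in for the real 1.5 GB structure. Each child inherits the parent's memory and only reads from it. One caveat worth checking on your own workload: Perl can modify a scalar's internals even on a read (reference counts, cached numeric or string conversions), so some pages may still get copied.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the expensive load step; the real program
# would build its large hashes and arrays here, once, in the parent.
sub load_data {
    my %lookup = map { $_ => $_ * 2 } 1 .. 1000;
    return \%lookup;
}

# Fork $workers children after the data is loaded. Each child only reads
# from $data and reports success or failure through its exit status.
sub run_workers {
    my ($data, $workers) = @_;
    my @pids;
    for my $n (1 .. $workers) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: read-only access to the inherited data.
            my $ok = $data->{$n} == $n * 2;
            exit($ok ? 0 : 1);
        }
        push @pids, $pid;
    }

    # Parent: wait for every child and count non-zero exit statuses.
    my $failures = 0;
    for my $pid (@pids) {
        waitpid($pid, 0);
        $failures++ if $? != 0;
    }
    return $failures;
}

my $data     = load_data();
my $failures = run_workers($data, 4);
print "failed workers: $failures\n";
```

    The important detail is the order: load first, fork afterwards, so the children start out sharing the already-built data.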

      Is there some magic going on behind a curtain, such that parent data isn't really copied until the child alters it?
Re: sharing datas between threads/forks (?)
by gamache (Friar) on Apr 08, 2008 at 13:38 UTC
    I want to have the data in memory only once, but having 4 processor cores working with them. Sounds easy, but I haven't found anything so far...

    Getting threads to work with shared read/write memory is surprisingly hard. Ask any C++ programmer. Here's a tutorial on how to do it in Perl; it definitely exceeds the scope of a message board post. I have never used Perl threads myself; back when I thought I needed them, Perl threads weren't stable yet, and I haven't actually needed them since.

    In a nutshell, life gets easier when you can use the data 100% read-only. Then you're not limited to threads as the only "easy" solution. You can just as easily whip up a forking version with parent-child communication or even a network server for the data, to which several lightweight forked processes connect.
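    A forking version with parent-child communication might look something like the sketch below. The names (`parallel_sum`, `@numbers`) and the summing task are made up for illustration; the point is only that each child works on its slice of the read-only data and sends a small result back to the parent over a pipe.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical read-only data, loaded once before forking.
my @numbers = (1 .. 100);

# Split the array across $workers children. Each child sums its slice and
# writes the partial sum to a pipe; the parent adds the partial sums up.
sub parallel_sum {
    my ($data, $workers) = @_;
    my $chunk = int((@$data + $workers - 1) / $workers);
    my @children;

    for my $w (0 .. $workers - 1) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: read its slice, send one line back, then exit.
            close $reader;
            my $start = $w * $chunk;
            my $end   = $start + $chunk - 1;
            $end = $#$data if $end > $#$data;
            my $sum = 0;
            $sum += $data->[$_] for $start .. $end;
            print {$writer} "$sum\n";
            close $writer;
            exit 0;
        }

        # Parent: keep only the read end of this child's pipe.
        close $writer;
        push @children, [$pid, $reader];
    }

    my $total = 0;
    for my $child (@children) {
        my ($pid, $reader) = @$child;
        my $line = <$reader>;
        close $reader;
        waitpid($pid, 0);
        if (defined $line) {
            chomp $line;
            $total += $line;
        }
    }
    return $total;
}

print parallel_sum(\@numbers, 4), "\n";
```

    Only the small results cross the pipe; the large data never has to be serialized, because every child already has it from the fork.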

      Yeah, Perl threads are only useful if you need to share data in real time between threads, and even then they are slower than forking and using shared memory IPC. BUT, when the data shared is minimal, threads are easier to set up and deal with. Pure C threads work so much better; Perl's handling of threads adds a lot of complications. For instance, in C the equivalent of a Perl threads::shared variable is just a global variable, and memory gets returned to the system when a thread is joined. Threads in Perl can give you the wrong impression of how well threads work in C.... see gdk-threads

      I'm not really a human, but I play one on earth. Cogito ergo sum a bum
Re: sharing datas between threads/forks (?)
by perrin (Chancellor) on Apr 08, 2008 at 14:06 UTC
    You probably want to use the load-and-then-fork strategy. If you have too much writing to do for that to work (copy-on-write sharing means a change made in one child is not seen by the others), either use a traditional database or something like BerkeleyDB. BerkeleyDB shares the data in memory and does not require remote calls, so it's faster than any RDBMS and allows read/write on the shared data.
