PerlMonks  

If I am tied to a db and I join a thread, program crashes

by lance0r (Novice)
on Jun 04, 2009 at 05:21 UTC

lance0r has asked for the wisdom of the Perl Monks concerning the following question:

I'm using Kubuntu 8.01 and Perl 5.10.0. The following code produces a segmentation fault and I don't have a clue as to why. I'm trying to run multiple threads on different pieces of the database, which is a hash of hashes. I need to do this in order to use more of my quad-core CPU to make my code run faster. I vaguely think there might be something about MLDBM that is thread intolerant. Any hints would be appreciated. Thanks.
#! /usr/bin/perl -w
use strict;
use threads;
use MLDBM qw(DB_File Storable);
use Fcntl;

my $file = "/home/silly/g/data/100HOHsynTriNormCV.db";
my %syntrihash;
tie %syntrihash,'MLDBM',$file, O_RDONLY or die "tie failed for db $!\n";
my $checking = keys %syntrihash;
print "$checking\n";

my $thr1 = threads->create({'context' => 'array'}, \&subthr1, "test1");
my @return1 = $thr1 -> join();
print "@return1 \n";

sub subthr1{
    my ($message) = @_;
    print "Thread Message is $message\n";
    return (1,2,3);
}

Here is the result of running the program:
silly@bluetit:~/perl/threads$ thread.pl
100
Thread Message is test1
1 2 3
Segmentation fault

Replies are listed 'Best First'.
Re: If I am tied to a db and I join a thread, program crashes
by jethro (Monsignor) on Jun 04, 2009 at 09:47 UTC

    Perl threads will not make your program faster on a CPU with more cores. Someone recently tested it, and the Perl implementation of threads actually makes it worse on multi-core machines in most cases.

    Better to use real processes and, for example, the module MLDBM::Sync to make concurrent access to the database safe.

    Also possible would be using a real separate database engine like MySQL or PostgreSQL. This would already split the task into two processes, and concurrent access would be handled completely by the database.
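
A minimal sketch of the "real processes plus MLDBM::Sync" suggestion, assuming MLDBM::Sync, MLDBM, and DB_File are installed. The helper name `store_from_children` and the record layout are hypothetical and only for illustration. Each child ties the file after the fork, so no process inherits another process's open database handle.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MLDBM::Sync;                  # adds locking around each read and write
use MLDBM qw(DB_File Storable);   # same backend and serializer as the question
use Fcntl qw(:DEFAULT);           # O_CREAT, O_RDWR

# Hypothetical helper: fork $nchildren processes that each store one record,
# then return how many records the parent can read back.
sub store_from_children {
    my ($file, $nchildren) = @_;

    my @pids;
    for my $id (1 .. $nchildren) {
        my $pid = fork();
        die "fork failed: $!\n" unless defined $pid;
        if ($pid == 0) {
            # Child: tie after the fork so every process opens its own handle.
            my %db;
            tie %db, 'MLDBM::Sync', $file, O_CREAT|O_RDWR, 0640
                or die "tie failed in child $id: $!\n";
            $db{"child$id"} = { id => $id, pid => $$ };   # nested value, as in a hash of hashes
            untie %db;
            exit 0;
        }
        push @pids, $pid;
    }
    waitpid($_, 0) for @pids;

    # Parent: open the file only after all children have finished.
    my %db;
    my $sync = tie %db, 'MLDBM::Sync', $file, O_CREAT|O_RDWR, 0640
        or die "tie failed in parent: $!\n";
    $sync->ReadLock;              # hold one read lock across the whole key scan
    my @keys = keys %db;
    $sync->UnLock;
    undef $sync;                  # drop the extra reference before untie
    untie %db;
    return scalar @keys;
}
```

Calling `store_from_children("/tmp/example.db", 4)` (the path is a placeholder) should report one record per child. Single reads and writes are locked implicitly; the explicit `ReadLock`/`UnLock` pair is only there to keep the key scan consistent.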

      Perl threads will not make your program faster on a CPU with more cores. Someone recently tested it, and the Perl implementation of threads actually makes it worse on multi-core machines in most cases.

      As posted, that is nothing but FUD!

      1. Who tested?
      2. What did they test?
      3. How did they test?
      4. "Most cases" of what?

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        The test was done by Marc Lehmann, and he showed his results at the German Perl Workshop this year. Sadly his talk is not available online, and I had to cite from memory when I answered. I have it before me now and can translate some points for you:

        1) Perl's interpreter threads are a sort of backport of ActiveState's implementation of fork through Windows threads. The whole Perl interpreter gets copied with all variables. Every function internally gets an additional parameter that tells Perl where to find the variables (I guess he means for synchronising). This makes Perl slower even if you don't use threads, makes it unstable, and doesn't work well with XS modules. There is no common address space, so you don't get any of the advantages of threads and still have to pay the price of the synchronisation.

        2) Threads don't work well on multi-core systems because every CPU has its own MMU and cache. Because threads share resources, all MMUs and caches have to be synchronized often. For example, if a thread allocates memory, every CPU has to be halted and its state synchronized. Perl's thread implementation doesn't do that (see above), but pays with the additional indirection on every variable access, which costs 15 to 200% compared to a Perl without thread support (even when not using threads).

        3) Marc did tests with a matrix multiplication (selected because it uses a lot of variable sharing). Slowest was the one with 4 interpreter threads on a quad-core machine. 20 times faster was the same 4 interpreter threads on a single core(!). 300 times faster than the interpreter threads was an implementation of cooperative/non-preemptive threads (Coro, written by Marc Lehmann) on a single core.

        To answer your question 4 now: Perl's interpreter threads seem not to work well on multi-core machines in those cases where they actually make extensive use of the sharing of their variables (that is, if Marc's results are not fake, fabricated, or erroneous). Some of his points you can read in his documentation for Coro if you are interested.
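
Point 1 above (every thread gets its own copy of all variables, and sharing has to be requested explicitly) can be seen in a small sketch that uses only the core `threads` and `threads::shared` modules and needs a Perl built with thread support. The names `bump` and `run_demo` are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use threads::shared;

my $private = 0;            # ordinary variable: every thread gets its own copy
my $counter :shared = 0;    # explicitly shared between all threads

sub bump {
    $private++;                         # changes only this thread's copy
    { lock($counter); $counter++; }     # serialized update of the shared one
    return $private;
}

sub run_demo {
    my @workers = map { threads->create(\&bump) } 1 .. 4;
    my @seen    = map { $_->join() } @workers;
    # The parent's $private is untouched; only $counter reflects the threads.
    return ($private, $counter, @seen);
}
```

On its first call, `run_demo()` returns `(0, 4, 1, 1, 1, 1)`: the parent's `$private` is still 0, each thread saw its own copy go from 0 to 1, and only the `:shared` counter accumulated all four increments.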

      Thank you, esteemed jethro, for your insight. I did test 4 threads running the same code at the same time (obviously with no tie to a database), and while memory use went up, the total time was about 1.5 times that of a single run, not the expected 4 times longer I would have had without threads. CPU usage went up to 95%+ instead of the 25% that was used without threads.
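
A rough sketch of how such a comparison could be timed, assuming a Perl built with thread support. `busy_work` is a hypothetical CPU-bound stand-in for the real per-slice code, which was not posted, so the numbers it produces say nothing about the original workload.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Time::HiRes qw(time);

# Hypothetical CPU-bound stand-in for the real per-slice work.
sub busy_work {
    my ($n) = @_;
    my $sum = 0;
    $sum += $_ * $_ for 1 .. $n;
    return $sum;
}

# Wall-clock seconds for $runs calls, one after another.
sub time_sequential {
    my ($runs, $n) = @_;
    my $t0 = time();
    busy_work($n) for 1 .. $runs;
    return time() - $t0;
}

# Wall-clock seconds for $runs calls, each in its own thread.
sub time_threaded {
    my ($runs, $n) = @_;
    my $t0 = time();
    my @workers = map { threads->create(\&busy_work, $n) } 1 .. $runs;
    $_->join() for @workers;
    return time() - $t0;
}
```

For example, `printf "sequential %.2fs, threaded %.2fs\n", time_sequential(4, 5_000_000), time_threaded(4, 5_000_000);` prints both wall-clock times for four runs. Because this workload shares no variables between threads, it is closer to the test described here than to the matrix multiplication benchmark mentioned earlier in the thread.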

      Plus, I am back on track thanks to monk clinton, who told me to tie to the database inside the thread, one tie for each thread. Seems to work so far.
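
A minimal sketch of the "one tie per thread" approach, assuming the same MLDBM setup as the question (DB_File with Storable) and a Perl built with thread support. The names `count_slice` and `run_workers` and the checksum-based split of the keys are hypothetical. The point is that the main thread holds no tied hash when the threads are created, so no database handle gets cloned into a thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use MLDBM qw(DB_File Storable);   # same backend and serializer as the question
use Fcntl;

# Hypothetical worker: opens its own tie, counts the records whose key falls
# into this worker's slice, and unties before the thread ends.
sub count_slice {
    my ($file, $worker, $nworkers) = @_;
    my %db;
    tie %db, 'MLDBM', $file, O_RDONLY, 0640
        or die "tie failed in worker $worker: $!\n";
    my $count = 0;
    while (my ($key, $record) = each %db) {
        # Assign each key to exactly one worker by a simple byte checksum.
        next unless unpack("%32C*", $key) % $nworkers == $worker;
        $count++;    # real code would process $record (a hash ref) here
    }
    untie %db;
    return $count;
}

# Start one thread per slice and add up what each one reports.
sub run_workers {
    my ($file, $nworkers) = @_;
    my @workers = map {
        threads->create({ context => 'scalar' }, \&count_slice, $file, $_, $nworkers)
    } 0 .. $nworkers - 1;
    my $total = 0;
    $total += $_->join() for @workers;
    return $total;
}
```

`run_workers($file, 4)` should return the total number of records in the file, since every key lands in exactly one slice. If the main program also needs the database, it can tie and untie it before the threads are created, or tie it only after they have all been joined.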

      And thanks for the tip on MLDBM::Sync. I am using it without problems so far. lance
        Hello Monks and lance0r, I am facing a problem in a similar situation, but in my case I want to connect to different schemas on the same machine. I need to run Perl reports in parallel against different schemas. Would threads cause a problem there? Initially I was planning to use the built-in fork() function, but then we changed to 'use thread', and now it's giving me a segmentation fault. Can anyone please help me out here?

Node Type: perlquestion [id://768289]
Approved by ikegami
Front-paged by tweetiepooh