PerlMonks  

Thread Design help

by perlCrazy (Monk)
on Sep 08, 2010 at 11:13 UTC (#859301)
perlCrazy has asked for the wisdom of the Perl Monks concerning the following question:

Requirement: I need to connect to multiple databases on remote hosts, fetch data from each one, and write it to a text file.
Approach: I am thinking of using threads: create a pool of 20 threads that connect simultaneously and fetch the data. At present I am stuck, because I can't work out how to get a connection handle for each host/data server from within the threads.
I would be grateful if you could point me in the right direction. Thanks. Example:
    use strict;
    use threads;
    use Data::Dumper;
    use threads::shared;

    #number of service threads to create
    my $threads = 20;
    my %dataEntity = ('TEST' => "usr:passwd", 'TEST' => "usr:passwd" );

    # create a pool of three service threads
    foreach (1.. $threads) {
        threads->create(\&getData);
    }

    ##
    foreach my $dataServer (keys %dataEntity){
        ##not sure how will i use thread here
    }

    sub getData {
        my $self = threads->self;
        my $thread_line;
        ##how
    }

Re: Thread Design help
by Corion (Pope) on Sep 08, 2010 at 12:05 UTC

    I'd use BrowserUk's approach of having a thread pool and feeding jobs to that pool through a queue. Each thread then returns its results to the main program again through a queue.

    You can easily turn your serial program into such a queue-based program once the serial version works. You haven't made clear whether your existing program already works as a single-threaded program, so I can't give you much working code, but given your vague outline above, I'd use something like this:

    #!perl -w
    use strict;
    use threads;
    use Thread::Queue;   # provides the request/response queues
    use Data::Dumper;    # provides Dumper, used in the final loop

    my $THREADS = 20;
    my $request = Thread::Queue->new;
    my $response = Thread::Queue->new;

    my %dataEntity = ( ... );

    # Submit all requests
    for my $dbname (keys %dataEntity) {
        $request->enqueue([ $dbname, $dataEntity{ $dbname } ]);
    };

    # Tell each thread that we're done
    for (1..$THREADS) {
        $request->enqueue( undef );
    };

    # Launch our threads
    for (1..$THREADS) {
        async(\&getData);
    };

    sub getData {
        while (my $job = $request->dequeue()) {
            my ($dbname, $credentials) = @$job;
            # connect to DB, retrieve information
            $response->enqueue( "Results from $dbname" );
        };
        # tell our main thread we're done
        $response->enqueue( undef );
    };

    # Each undef marks one finished thread. Pre-decrement so the loop
    # ends on the last marker instead of blocking on one more dequeue.
    while (my $payload = $response->dequeue or --$THREADS) {
        print Dumper $payload;
    };

    # Reap the finished worker threads
    $_->join for threads->list;
      Thanks for the response.
      When I run the code, I get this error: Invalid value for shared scalar at /opt/perl-5.8.6_1/lib/5.8.6/Thread/Queue.pm line 90, <> line 10.
      Here is my code:
      use strict;
      use threads;
      use Data::Dumper;
      use Thread::Queue;

      my $THREADS = 5;
      my %dataEntity;
      while(<>){
          chomp;
          next if !length($_);
          my ($dsName,$passwd) = split /\|/, $_;
          $dataEntity{$dsName} = $passwd;
      }

      my $request = Thread::Queue->new;
      my $response = Thread::Queue->new;

      # Submit all requests
      for my $dbname (keys %dataEntity) {
          $request->enqueue([$dbname,$dataEntity{$dbname}]);
      };

      # Tell each thread that we're done
      for (1..$THREADS) {
          $request->enqueue(undef);
      };

      # Launch our threads
      for (1..$THREADS) {
          async(\&getData);
      };

      sub getData {
          while (my $job = $request->dequeue()) {
              my ($dbname, $credentials) = @$job;
              #connect to DB, retrieve information
              my $dbh = getConn($dbname,$credentials);
              my %results;
              my $resArrRef = $dbh->selectall_arrayref("select srvname,dbname from syslogins",{ Slice => {} });
              foreach my $row ( @$resArrRef ) {
                  $results{$row->{srvname}} = $row->{dbname};
              }
              $response->enqueue(\%results);
          };
          # tell our main thread we're done
          $response->enqueue( undef );
      };

      while (my $payload = $response->dequeue or $THREADS--) {
          print Dumper $payload;
      }

        You don't show &getConn. How am I supposed to debug your code?

        On the off-chance that I'm psychic, I'll still venture a guess. &getConn caches a database handle. Objects cannot be shared across threads, and &getConn tries to access shared data. You will need to connect to a database from within each thread that wants to access it.
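A minimal sketch of "connect inside each thread", not a tested fix for the code above. It also passes only plain strings through both queues. This assumes that the Thread::Queue shipped with Perl 5.8.6 keeps its items in a shared array, in which case enqueueing an unshared reference such as `[$dbname, ...]` or `\%results` would raise the "Invalid value for shared scalar" error quoted earlier. `fetch_rows` and the `TEST1`/`TEST2` names are hypothetical stand-ins for the real DBI connect and query.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $THREADS  = 2;                     # small pool, for illustration only
my $request  = Thread::Queue->new;
my $response = Thread::Queue->new;

# Hypothetical stand-in for "DBI->connect, then run a query".
# Real code would open its own database handle here, inside the thread.
sub fetch_rows {
    my ($dbname, $credentials) = @_;
    my ($user) = split /:/, $credentials;
    return ( [ $dbname, "db_of_$user" ] );
}

sub getData {
    while (defined(my $job = $request->dequeue)) {
        # Jobs arrive as "name|credentials" strings, not as references.
        my ($dbname, $credentials) = split /\|/, $job, 2;
        for my $row (fetch_rows($dbname, $credentials)) {
            # Send each result row back as a plain string as well.
            $response->enqueue(join '|', @$row);
        }
    }
    $response->enqueue(undef);        # tell the main thread we are done
}

my %dataEntity = (TEST1 => 'usr:passwd', TEST2 => 'usr:passwd');
$request->enqueue("$_|$dataEntity{$_}") for sort keys %dataEntity;
$request->enqueue(undef) for 1 .. $THREADS;

my @workers = map { threads->create(\&getData) } 1 .. $THREADS;

# Collect rows until every worker has sent its end marker.
my @rows;
my $running = $THREADS;
while ($running) {
    my $line = $response->dequeue;
    if (defined $line) { push @rows, $line }
    else               { $running-- }
}
$_->join for @workers;

print "$_\n" for sort @rows;
```

Because no database handle and no reference ever crosses a thread boundary, the sketch should not depend on which Thread::Queue version is installed.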

        If your process is simply collecting information in parallel, maybe just launch n instances of it instead and have them write to either a database or append to a file. Threads and DBI together are a recipe for disaster if you're not careful.
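The "launch n instances and append to a file" idea can be sketched with plain `fork`, as one possible reading of that suggestion. The server names, the `collect` stub, and the temporary output file are assumptions for illustration; a real worker would connect with DBI inside the child process.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);
use IO::Handle;

my @servers = qw(srvA srvB srvC srvD);   # hypothetical server names
my $procs   = 2;                         # number of worker processes
my (undef, $outfile) = tempfile();       # shared output file

# Hypothetical stand-in for "connect to the server and run the query".
sub collect {
    my ($server) = @_;
    return "result from $server";
}

my @pids;
for my $n (0 .. $procs - 1) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: handle every $procs-th server, starting at index $n.
        open my $out, '>>', $outfile or die "open $outfile: $!";
        for my $i (grep { $_ % $procs == $n } 0 .. $#servers) {
            my $line = collect($servers[$i]);
            flock($out, LOCK_EX) or die "flock: $!";
            print {$out} "$line\n";
            $out->flush;                 # write before releasing the lock
            flock($out, LOCK_UN);
        }
        close $out;
        exit 0;
    }
    push @pids, $pid;
}

# Parent: wait for all children, then read the combined output.
waitpid($_, 0) for @pids;

open my $in, '<', $outfile or die "open $outfile: $!";
chomp(my @lines = <$in>);
close $in;
unlink $outfile;

print "$_\n" for sort @lines;
```

Each child has its own memory and would have its own database connection, so nothing needs to be shared except the output file, which the lock protects from interleaved writes.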

        How will the blocks below get executed? This is my first threaded program, and I want to understand it in detail.
        My requirement: I will have thousands of data servers, divided into different groups. If I start this program, will all of those data servers be processed by the 10 or 20 threads?
        Also, this has to keep running at some time interval. Once we enqueue all the servers, how will each thread pick up its data?
        Note: each data server will be on a different host and port, so the connections will be independent. Thanks for the help.
        # Tell each thread that we're done
        for (1..$THREADS) { #####what is this block doing
            $request->enqueue(undef);
        };

        # Launch our threads
        for (1..$THREADS) {
            async(\&getData);
        };
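A small sketch of what those two blocks do, under the assumption that the jobs are plain strings. Every worker loops on `dequeue`, which blocks until an item is available, so whichever thread is idle picks up the next server; a pool of 3 threads here works through 7 jobs, and the same holds for 20 threads and thousands of servers. The `undef` entries are end markers: there is one per worker, queued after the real jobs, and each worker stops when it dequeues one. The server names are hypothetical.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $THREADS  = 3;                     # small pool, for illustration only
my $request  = Thread::Queue->new;
my $response = Thread::Queue->new;

# Queue the real jobs first (hypothetical server names) ...
$request->enqueue($_) for qw(srvA srvB srvC srvD srvE srvF srvG);

# ... then one undef per worker. Each worker removes exactly one undef,
# which ends its while loop and lets the thread return.
$request->enqueue(undef) for 1 .. $THREADS;

sub worker {
    my $tid = threads->tid;
    while (defined(my $server = $request->dequeue)) {
        # A real worker would connect to $server here.
        $response->enqueue("thread $tid handled $server");
    }
    $response->enqueue(undef);        # tell the main thread we are done
}

my @workers = map { threads->create(\&worker) } 1 .. $THREADS;

# Count the end markers, so the loop stops once every worker has finished.
my @results;
my $running = $THREADS;
while ($running) {
    my $line = $response->dequeue;
    if (defined $line) { push @results, $line }
    else               { $running-- }
}
$_->join for @workers;

print "$_\n" for sort @results;
```

Which thread handles which server varies from run to run, but each server is handled exactly once. To repeat the work at an interval, one option is to keep the workers alive and enqueue the server list again on each cycle, sending the end markers only at shutdown.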
Re: Thread Design help
by zentara (Archbishop) on Sep 08, 2010 at 16:58 UTC
Re: Thread Design help
by roboticus (Canon) on Sep 08, 2010 at 18:03 UTC

    perlCrazy:

    You're already getting help with your threading question. So I'll just mention that if your data servers are fast, then querying them in parallel might do you no good, as you may be saturating your network bandwidth with a single dataserver. In that case, I'd just query them serially.

    ...roboticus

      In some cases one data server might take 1-2 hours to respond, depending on the data size. If I run sequentially over thousands of data servers, I might not be able to complete the task.
Re: Thread Design help
by Proclus (Beadle) on Sep 09, 2010 at 23:20 UTC
    Whenever I need parallelism and nonblocking applications, I look at POE. Most likely someone else has already tackled my problem and turned the solution into a great POE module.

    In your case this one may help:
    POE::Component::Pool::DBI

    You might still need a threading solution and it mixes well with POE, at least in my GUI apps.

    Also, I have to say that BrowserUk always manages to turn cumbersome Perl threads into an art form.

Node Type: perlquestion [id://859301]
Approved by GrandFather