Re^7: Thread Design help

by BrowserUk (Patriarch)
on Sep 08, 2010 at 19:59 UTC [id://859407]


in reply to Re^6: Thread Design help
in thread Thread Design help

The combination of this line:

$request->enqueue( [$dbname,$dataEntity{$dbname}] );

and the very down-level version of the threads module you are using is almost certainly the source of your problem.

You might be able to fix it without upgrading by changing the code to be:

    use threads::shared;

    ...

    for my $dbname ( keys %dataEntity ) {
        my @args :shared = ( $dbname, $dataEntity{$dbname} );
        $request->enqueue( \@args );
    }

But I would still suggest upgrading threads, threads::shared and Thread::Queue ASAP.
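A quick way to see which versions are currently installed (a minimal sketch; the string eval keeps it from dying on a perl built without ithreads):

```perl
use strict;
use warnings;

# Look up the installed version of each module named above.
# String-eval the require so a missing module (or a perl built
# without ithreads) reports "not installed" instead of dying.
my %ver;
for my $mod (qw(threads threads::shared Thread::Queue)) {
    $ver{$mod} = eval "require $mod; \$${mod}::VERSION" || 'not installed';
    print "$mod $ver{$mod}\n";
}
```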


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^8: Thread Design help
by perlCrazy (Monk) on Sep 11, 2010 at 09:09 UTC
    Thanks for the reply and help, BrowserUk.
    I need your help and opinion on designing this collector.
    Requirement: I will have many dataservers and hosts, divided into multiple groups (each group will contain dataservers). We need to connect to each dataserver/host, execute some commands/utilities, process the data, bring the data back, and write it into a file.
    ex: DSA1, DSA2 ... DSAN
    1. connect to DSA1, process and get the data, and write it into a file
    2. once we get the data back from DSA1, maybe after 1/2 hour we will need to connect again and get the data back ... this will continue, depending on the interval time.
    The above steps 1 and 2 should run for many servers (maybe 1000s). Do you think a thread queue is the best way to meet this requirement?
    Is the code below the best way to move forward? How will I run the job again for a particular group or DSA on an on-demand basis?
    Thanks a lot for your help in advance. Any suggestions/directions will help.
    #!/usr/bin/perl
    use strict;
    use threads;
    use Data::Dumper;
    use Thread::Queue;

    warn "Using threads $threads::VERSION";
    warn "Using Thread::Queue $Thread::Queue::VERSION";

    my $THREADS = 3;

    my %dataEntity;
    while (<>) {
        chomp;
        next if !length($_);
        my ($dsName, $passwd) = split /\|/, $_;
        $dataEntity{$dsName} = $passwd;
    }

    my $request  = Thread::Queue->new;
    my $response = Thread::Queue->new;

    # Submit all requests
    for my $dbname (keys %dataEntity) {
        $request->enqueue([$dbname, $dataEntity{$dbname}]);
    }

    # Tell each thread that we're done
    for (1 .. $THREADS) {
        $request->enqueue(undef);
    }

    # Launch our threads
    for (1 .. $THREADS) {
        async(\&getData);
    }

    sub getData {
        my $idx = 1;
        while (my $job = $request->dequeue()) {
            my ($dbname, $credentials) = @$job;

            # connect to DB, retrieve information
            # my $dbh = getConn($dbname, $credentials);
            my %results;
            # my $resArrRef = $dbh->selectall_arrayref(
            #     "select srvname,dbname from syslogins", { Slice => {} } );

            # package some dummy results
            my $resArrRef = [
                { srvname => "server:$dbname:" . $idx++, dbname => $dbname },
                { srvname => "server:$dbname:" . $idx++, dbname => $dbname },
                { srvname => "server:$dbname:" . $idx++, dbname => $dbname },
                { srvname => "server:$dbname:" . $idx++, dbname => $dbname },
            ];

            foreach my $row (@$resArrRef) {
                $results{$row->{srvname}} = $row->{dbname};
            }
            $response->enqueue(\%results);
        }

        # tell our main thread we're done
        $response->enqueue(undef);
    }

    while ($THREADS) {
        while (my $payload = $response->dequeue()) {
            print Dumper $payload;
        }
        $THREADS--;
    }

    sub getConn {
        my ($DB, $pwd) = @_;
        my $dbh;    # stub: real code would do $dbh = DBI->connect(...)
        return $dbh;
    }

      You have reposted the code I gave you in 859377. You will need to do some actual programming yourself. This is not a code writing service.

      If you want to request servers repeatedly, put them into the queue repeatedly.
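Re-enqueueing on a timer can be sketched with a next-due table: keep a timestamp per server and re-submit whichever servers have come due. A minimal sketch (plain arrays stand in for the Thread::Queue calls, and the per-server intervals are made-up values):

```perl
use strict;
use warnings;

# Hypothetical per-server polling intervals, in seconds.
my %interval = ( DSA1 => 600, DSA2 => 1800 );

# Next-due timestamps; start with everything due immediately.
my $now = time();
my %next_due = map { $_ => $now } keys %interval;

# One scheduling pass: collect servers whose next-due time has
# arrived, and push each one's next run into the future.
sub due_servers {
    my ($t) = @_;
    my @due;
    for my $ds ( sort keys %next_due ) {
        if ( $t >= $next_due{$ds} ) {
            push @due, $ds;
            $next_due{$ds} = $t + $interval{$ds};
        }
    }
    return @due;    # threaded version: $request->enqueue([...]) each
}

my @first  = due_servers($now);          # both servers due at start
my @second = due_servers($now + 700);    # only DSA1's 600s interval has elapsed
```

In the threaded version, the main loop would sleep briefly, call due_servers(time()), and enqueue the results on $request rather than returning them; on-demand runs are just an extra enqueue outside the timer.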

      1. connect to DSA1, process and get the data, and write it into a file
      • Connect:

        What type of connection? DBI, SSH, tcp?

      • Process:

        Process locally (in your program), or remotely on the server?

        How long will that processing run?

      • Get the data:

        How much data?

      • Write to file:

        Just read & write as is, or read, process locally and write?

      • Connect again may be after 1/2 hour:

        Exactly half an hour? Or as quickly as possible after all other servers have been serviced?

        What determines the frequency of reconnection? How important is the timing? Must it be done to the second, or is 'best endeavours' good enough?

      • should run for many servers ( may be 1000s):

        You don't yet know? How many 1000s?


        Thanks for the quick response. Please see my comments:
        1. Connect: if a dataserver then DBI, if a host then ssh or tcp.
        2. Process: remotely on the server if a host;
        if a database, then execute a few sql queries.
        3. How long will that processing run? >> for a few servers it might take an hour, for a few it will take less.
        4. Get the data: How much data?
        >> the data will vary from dataserver to dataserver, depending on activity on the server. It can be in KB, but not more than 1-2 MB. Since we are planning to run very frequently, we can handle the data properly.
        5. Write to file:
        Just read & write as is, or read, process locally and write?
        >> read, process locally, and then write.
        6. Connect again maybe after 1/2 hour: Exactly half an hour? Or as quickly as possible after all other servers have been serviced?
        >> this we will decide; depending on the nature of the dataserver we can decide the interval time.
        7. What determines the frequency of reconnection? How important is the timing? Must it be done to the second, or is 'best endeavours' good enough?
        >> depends on the interval time, or we can decide the best way. The idea is to collect data every 10 minutes or 30 minutes from each server.
        8. should run for many servers (maybe 1000s):
        You don't yet know? How many 1000s?
        >> This will keep growing in the future. Initially we are targeting 1000.
        Thanks
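On the "1000s of servers" point, a back-of-envelope calculation helps pick the worker-thread count. The numbers below are illustrative assumptions, not measurements from this thread:

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Illustrative assumptions (not measured): how many worker threads
# are needed so that every server gets polled once per interval?
my $servers   = 1000;   # initial fleet size mentioned above
my $poll_secs = 5;      # assumed average connect-and-query time
my $interval  = 600;    # desired polling period: 10 minutes

# Each interval must absorb $servers * $poll_secs seconds of work,
# so divide that by the interval and round up.
my $workers = ceil( $servers * $poll_secs / $interval );
print "need about $workers worker threads\n";
```

Note that the hour-long jobs mentioned in answer 3 would each pin a worker for the whole hour, so those servers are better served by a separate pool with a longer interval.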
