Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

forking and dbi

by dlspinhir (Initiate)
on Dec 12, 2002 at 21:57 UTC ( #219438=perlquestion: print w/ replies, xml ) Need Help??
dlspinhir has asked for the wisdom of the Perl Monks concerning the following question:

I am a network admin looking to store some info about our switches/routers in a postgres db. We have several hundred devices, so I am looking to poll data via snmp, and store it in my db. Since we have so many devices it takes quite a while to poll them all sequentially. For this reason I am attempting to fork off child processes in order to speed things up a bit.

I am having a problem with my database handles being closed before they should be. I get these errors when I run my script:

Database handle destroyed without explicit disconnect. Database handle destroyed without explicit disconnect. DBD::Pg::db do failed: server closed the connection unexpectedly at ./ +vlan.pl line 203.

Here is the main part of the program. I stripped out the sql for the sake of shortening it down a bit.

# DESCRIPTION # This script gets a list of non-access layer, failover switches from +a database.It will # then poll those switches for all active vlans and add them to the da +tabase. # The script will automatically delete any vlans from the db that have + been # removed from our switches, and are not marked with the 'dnd' flag. #=================================================================== # MAIN #------------------- # open db my $parent_dbh = DBI->connect($DATASOURCE, $USER); if (!$parent_dbh) { exit(0); } #--------- #------------------- # get list of switches from db my $switch_ref = $parent_dbh->selectall_arrayref($get_switch); #--------- #------------------- # fork off child processes, poll snmp, update database for (my $count = 0; $count < @$switch_ref; ++$count) { my $switch = $switch_ref->[$count]; my $pid; #------------------- # parent process if ($pid = fork) { push @pids, $pid; } #--------- #------------------- # children processes elsif (defined $pid) { #------------------- # set some variables my $ip = $switch->[0]; my $stp_domain_id = $switch->[1]; my $ro_community = $switch->[2]; my $params = {'community' => $ro_community, 'default_max_repetitions' => '48' }; #--------- #------------------- # walk mib for vlan info from this switch. my @ret = &snmpwalk($ip, $params, "ifalias"); #--------- #------------------- # open db my $child_dbh = DBI->connect($DATASOURCE, $USER); if (!$child_dbh) { exit(0); } #--------- #------------------- # set 'updated' field of the 'vlan' table to false for all # vlans in current domain $child_dbh->do($set_updated); #--------- #------------------- # insert data into vlan table foreach my $val (@ret){ my ($vlan, $name) = split(':', $val, 2); # if entry for this vlan exists, do # SQL UPDATE. else insert new data. my $result_h = $child_dbh->selectrow_array($exists_vlan); if ($result_h > 0){ $child_dbh->do($update_data); } else { $child_dbh->do($insert_data); } } #--------- $child_dbh->disconnect; exit; } #--------- #------------------- # error, the parent is saying this... else { die "Fork failed: $!\n"; } #--------- } #--------- #------------------- # wait for children to finish before proceeding foreach my $pid (@pids){ waitpid($pid,0); } #--------- #------------------- # delete all vlans that have 'dnd'='f' and where 'updated'='f' # this is query that fails because of a closed db handle $parent_dbh->do($delete_vlan); #--------- $parent_dbh->disconnect; exit(0); #=======================================

Comment on forking and dbi
Select or Download Code
Re: forking and dbi
by adrianh (Chancellor) on Dec 12, 2002 at 22:06 UTC

    The short (indeed the long) answer is that you can't share a single database connection between processes. You'll need to open a new $dbh in each fork.


    Update: 14/12/02

    I lied. Oops.

    It turns out that some DBDs do support access to the same $dbh in different processes - so in specific cases you can do this. You just have to be careful to make sure that you do not close the handle in one process if you want to continue using it in another.

    However, if you want to write code that is portable across databases then you should consider avoiding this, because not all DBDs support it.

      Right. Here's a bit of code I used recently to explore these issues. It uses (the very nice) Parallel::ForkManager, although the fundamentals are the same for normal forking. Since each handle lives in its own process, transactions are unique to each one, which is a boon for efficiency and makes it safe to have multiple users writing to the DB:

      #!/usr/bin/perl use strict; use warnings; use Parallel::ForkManager; use constant NUM => shift; # number of children to spawn. my $pfm = new Parallel::ForkManager(NUM); my $id = 0; # a unique id for each process # a sub to be run *from within the parent thread* at child creation. my $init = sub { my ($pid, $ident) = @_; print "++ $ident started, pid: $pid\n"; }; # a sub to be run *from within the parent thread* at child termination +. my $finalize = sub { my ($pid, $exit_code, $ident) = @_; print "-- $ident finalized, pid: $pid\n"; }; # set the subrefs $pfm->run_on_start($init); $pfm->run_on_finish($finalize); # now for the fun bit. for(1..NUM) { # spawn a new process and break to begin the next iteration, as each + # child process will continue the loop but the parent should not. my $pid = $pfm->start(++$id) and next; $main::db = DBI->connect( ... ); # use your specific initializatio +n args # Do your stuff, sit in a loop to service requests, etc. $main::db->commit(); # or rollback, or something depending on condit +ions. $main::db->disconnect(); $pfm->finish(); # exterminate! exterminate! } # make sure we don't leave any zombies roaming about. $pfm->wait_all_children();

      Now if you're running 5.8, you may have the option of sharing the database handle between threads, although I haven't been able to test this yet. Probably though you'd want to take advantage of the stringent separation of data between threads so you'd have similar behavior as when forking.

Re: Forking and DBI
by tadman (Prior) on Dec 12, 2002 at 22:14 UTC
    What you might consider doing is making sure that you aren't forking with any database handles open. Shut them all down, fork, and bring them back as required. For example:
    my $dbh = DBI->connect(...); $dbh->do(...); $dbh->disconnect(); foreach (...) { ... fork ... } $dbh = DBI->connect(...); $dbh->do(...); $dbh->disconnect();
    This would look a lot better if you organized things into functions.

    I think what is occuring is that the forked scripts are trying to close out the DBI connection that they inherit from the parent process. Since you can't close them more than once, you're in trouble just having them around at all when forking.
Re: forking and dbi
by Mr. Muskrat (Abbot) on Dec 12, 2002 at 22:29 UTC

    It appears to me that you are executing parent code in both the parent and the child.

    I stripped your code down considerably and placed some print statements in there... Also added code so that it would work without the dbi stuff.

    my @array = qw(1 2 3); my $switch_ref = \@array; for (my $count = 0; $count < @$switch_ref; ++$count) { my $switch = $switch_ref->[$count]; my $pid; if ($pid = fork) { push @pids, $pid; } elsif (defined $pid) { print "Child: $pid\n"; } else { die "Fork failed: $!\n"; } } foreach my $pid (@pids){ waitpid($pid,0); } print "parent: exiting.\n";

Re: forking and dbi
by ehdonhon (Curate) on Dec 12, 2002 at 23:13 UTC

    That's a lot of forking all at once, especially if you have several hundred objects you are polling. I have a feeling that you are exceeding the max connections to your database, and thus causing this code to execute:

    if (!$child_dbh) { exit(0); }

    That would explain your 'handle destroyed without disconnect messages'. To test this theory, try putting a disconnect command right before that exit and see if you still get the same warnings.

    My first suggestion would be that you look at Parallel::ForkManager. You can then tune your program to only run x processes at a time. It will make your program more scalable and keep it from getting out of control.

    My second suggestion would be to try ditcing the $child_dbh all together. I have a feeling that all of the child processes should be able to share the same $parent_dbh. (I might be mistaken here, give it a try).

      I have a feeling that all of the child processes should be able to share the same $parent_dbh.

      The problem is that when you fork, you duplicate the entire contents of the current process. However, sometimes a process has certain characteristics that are not pure data, but that are tied specifically to the process in which they were initiated. In the case of database handles, the issue is the socket connections established prior to the fork that uniquely relate the process that created them to the database itself. Those sockets can't be treated merely as pure data and hence cloned, because they constitute a relationship between two processes. And that relationship gets broken during the cloning procedure.

      AFAIK the only way to fork and share a database handle is to use interprocess communication, or to have some kind of "smart" forking procedure that reexecutes particular pieces of the code that occur before the fork and which cannot be directly cloned. But that would be hard :^P

        There are several issues with forking and using database handles opened in the parent.

        At its most basic the underlying database client libraries must support this - the Sybase libraries do, but I have heard that the Oracle libraries do not.

        Second, because the client usually maintains state associated with the connection you need to make sure that no operation is pending at the time of the fork. Otherwise you are likely to have a very confused connection.

        Third, DBI maintains some state as well, and this has to be taken into consideration as well.

        I know for a fact that I can fork a perl script that uses Sybase::CTlib and use the connection in the child. I can even fork it multiple times and use the same connection in multiple children, as long as the use is coordinated so that only one child has access to the connection at a time (see for example my Apache::Sybase::ConPool module.)
        I have not tried this with DBI, but I would expect it to work, as long as no open statements exist at the time of the fork.

        Michael

Re: forking and dbi
by mpeppler (Vicar) on Dec 12, 2002 at 23:27 UTC
    In addition to the other comments you may want to look at the InactiveDestroy DBI attribute. You can set it so that a handle created in the parent doesn't get destroyed when the child exits.

    Michael

      Thanks for all of the advise. I tried setting the InactiveDestroy attribute for both the parent and child db handles, and the problem seems to have cleared up. I will definitely look into the Parallel::ForkManage module as well. Just briefly looking at it on cpan, it looks like a pretty cool module.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://219438]
Approved by Mr. Muskrat
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others scrutinizing the Monastery: (8)
As of 2014-12-18 04:32 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (41 votes), past polls