Sharing database handle

by Marcello (Hermit)
on Apr 18, 2002 at 13:45 UTC ( [id://160200] )

Marcello has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I have an application which forks off child processes to do all the work. At the moment, every child process creates its own database connection, does its work and closes the database connection again. This causes a lot of connections to be made to the database.

What I would like to do is to create one database connection, and let all child processes use this same connection. But sometimes the database connection drops, and a new one needs to be created.

Question 1: Is it wise to use the same database handle for all child processes?
Question 2: How do I check if a database handle has been destroyed (e.g. connection lost) without performing a query? I want only the server process to create the database connection and to check whether it is still usable by the child processes, restoring it otherwise.
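
Roughly, the current pattern looks like this (the DSN, credentials, and job list are placeholders):

    use DBI;

    # Current approach: every child opens and closes its own connection.
    foreach my $job (@jobs) {              # @jobs stands in for the real work
        defined(my $pid = fork) or die "fork: $!";
        next if $pid;                      # parent: keep spawning children

        my $dbh = DBI->connect('dbi:mysql:database=test', 'user', 'secret',
                               { RaiseError => 1 });
        # ... SELECTs and INSERTs for $job ...
        $dbh->disconnect;
        exit 0;
    }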

Replies are listed 'Best First'.
Re: Sharing database handle
by Fletch (Bishop) on Apr 18, 2002 at 13:52 UTC

    Unless you're very cozy with the database backend (enough so that you can figure out how to do some sort of locking of whatever communications mechanism it's using to talk to its server), you really don't want to do this. This is why things such as Apache::DBI usually set up persistent handles after the child processes have forked.

    Check out POE::Component::DBIAgent and POE for another possibility.

    As for the second, DBI has a ping method. Check the docs.

Re: Sharing database handle
by perlplexer (Hermit) on Apr 18, 2002 at 13:57 UTC
    "Have I told you... lately..." that (Super Search)++ ?

    Check this out - Sharing a database handle over multiple processes.

    From 'perldoc DBI':

        ping

            $rc = $dbh->ping;

        Attempts to determine, in a reasonably efficient way, if the
        database server is still running and the connection to it is
        still working. Individual drivers should implement this function
        in the most suitable manner for their database engine.
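
    A minimal check-and-reconnect helper built on ping (the DSN and credentials are placeholders):

        use DBI;

        my @connect_args = ('dbi:mysql:database=test', 'user', 'secret',
                            { RaiseError => 1 });
        my $dbh = DBI->connect(@connect_args);

        # Hand out a live handle; if the server dropped us, reconnect first.
        sub get_dbh {
            $dbh = DBI->connect(@connect_args) unless $dbh && $dbh->ping;
            return $dbh;
        }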

    --perlplexer
Re: Sharing database handle
by TheHobbit (Pilgrim) on Apr 18, 2002 at 13:59 UTC

    Hi,
    you do not say so, but I suppose you use DBI. In that case, if your database driver implements it, you could use the ping() method to check the connection.

    What you would like to do is (more or less) the same thing that the Apache::DBI module does. Maybe you could get inspiration from that module (I only use it; I never went into its bowels).

    Cheers


    Leo TheHobbit
    GED/CS d? s-:++ a+ C++ UL+++ P+++>+++++ E+ W++ N+ o K? !w O? M V PS+++
    PE-- Y+ PPG+ t++ 5? X-- R+ tv+ b+++ DI? D G++ e*(++++) h r++ y+++(*)
Re: Sharing database handle
by Elian (Parson) on Apr 18, 2002 at 15:13 UTC
    It's not wise, and it won't always work. Some databases, Oracle specifically though there may be others, will fail horribly if you try this--the data behind the handle is process-specific, and the handles won't work right, if at all, across processes.

      Hi,
      well, this only implies that you'll have to devote a process (either the father or one of the siblings) to handling the handle (bad pun, sorry), and the (other) children will have to talk with that process to use the connection.

      IIRC, that's the way Apache::DBI works.

      Cheers
      Leo TheHobbit
      GED/CS d? s-:++ a+ C++ UL+++ P+++>+++++ E+ W++ N+ o K? !w O? M V PS+++
      PE-- Y+ PPG+ t++ 5? X-- R+ tv+ b+++ DI? D G++ e*(++++) h r++ y+++(*)
        No, Apache::DBI doesn't do any inter-process communication. It's dirt simple. It just opens a separate database handle in each process and keeps it open.
Re: Sharing database handle
by mamboking (Monk) on Apr 18, 2002 at 18:16 UTC
    Instead of sharing a connection you probably want to do connection pooling. You could have your app open up a bunch of connections at application startup then as a child process is spawned it gets a connection from the pool. When it is finished it returns the connection to the pool. If there aren't any available connections then your process can block until one becomes available. Since opening and closing database connections is very costly this should greatly help your performance. Also, by pre-allocating the connections, you always have control over how many connections are out there.
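
    A bare-bones pool might look like the sketch below (placeholders throughout; note the caveat in the follow-up that DBI handles generally can't be passed between processes, so a pool like this has to live in the process that opened the handles):

        use DBI;

        # Pre-open a fixed number of handles at startup.
        my @pool = map {
            DBI->connect('dbi:mysql:database=test', 'user', 'secret',
                         { RaiseError => 1 })
        } 1 .. 5;

        sub checkout { shift @pool or die "no free connections\n" }
        sub checkin  { push @pool, shift }

        # Typical use: borrow a handle, work, give it back.
        my $dbh = checkout();
        $dbh->do('INSERT INTO log (msg) VALUES (?)', undef, 'hello');
        checkin($dbh);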
      I don't think that will actually work. The trouble is, you can't easily pass DBI objects (which involve open sockets and XS code) back and forth between processes.
      Here is a connection pooler, sqlrelay. Good thing is it has interfaces for numerous languages as well (so you can pool what would otherwise be per-language puddles). I haven't gotten around to setting it up yet (tuit deficit), but look forward to doing so.

      --
      perl -pew "s/\b;([mnst])/'$1/g"

Re: Sharing database handle
by gav^ (Curate) on Apr 18, 2002 at 22:28 UTC
    It would have been very handy to know which database you're talking about. With MySQL, connecting is fast; in fact, they list it when talking about optimization.

    You might find it more portable (and perhaps faster) to tweak your database to allow more connections and to do anything suggested to improve performance on that end.

    Another option may be for the child processes not to write to the database at all (especially in the case of logging) and for the parent to dump the data into the db at regular intervals.
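
    A sketch of that idea, with one child reporting over a pipe and the parent flushing in batches (the table, column, and batch size are made up):

        use DBI;

        pipe(my $reader, my $writer) or die "pipe: $!";

        defined(my $pid = fork) or die "fork: $!";
        unless ($pid) {                 # child: report results, never touch the db
            close $reader;
            print {$writer} "one row of data\n";
            exit 0;
        }
        close $writer;

        # parent: the only process holding a db connection
        my $dbh = DBI->connect('dbi:mysql:database=test', 'user', 'secret',
                               { RaiseError => 1 });
        my $sth = $dbh->prepare('INSERT INTO log (msg) VALUES (?)');

        my @batch;
        while (my $line = <$reader>) {
            chomp $line;
            push @batch, $line;
            if (@batch >= 100) {        # flush every 100 rows
                $sth->execute($_) for splice @batch;
            }
        }
        $sth->execute($_) for @batch;   # flush whatever is left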

    Hope this helps...

    gav^

      Another option may be for the child processes not to write to the database at all (especially in the case of logging) and for the parent to dump the data into the db at regular intervals.

      Well, this may be an option, but it's probably not a good idea. You'd hate to have a bunch of inserts or updates queued in the parent and then have that process go down before it can finish the queue, or find out that the connection dropped for whatever reason since the last time it processed a queue.

      Has anyone (esp. Marcello) seen data about how much it costs loadwise to have a bunch of child connections? My own (anecdotal, not systematic) experience has been that the number of connections is less important than what those connections are doing. If the SQL demands of any one process are heavy, having more than a couple going at once will hurt, no matter how you do it. If it's easy stuff, then number of concurrent requests is not much of an issue.

Re: Sharing database handle
by Zapawork (Beadle) on Apr 18, 2002 at 19:58 UTC
    Here is another idea,
    You are already spawning child processes, so why not set up either a bidirectional pipe or a separate IPC channel to allow the children to send requests to the parent. In this fashion the parent could have one main routine to poll the child processes for queries, test the connection ($dbh->ping; DBI is just a great package) and then execute the query and return the information to the child.

    In this model you would seem to accomplish your goal: all of your child processes would communicate through one ODBC connection, you'd have only one connection to the database, and for fun you'd have one place to check for improper data within queries.
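
    A rough sketch of one child talking to the parent's single handle over a socketpair (the DSN, credentials, and the one-line-per-message protocol are invented for illustration):

        use DBI;
        use Socket;
        use IO::Handle;

        socketpair(my $child_end, my $parent_end, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
            or die "socketpair: $!";
        $child_end->autoflush(1);
        $parent_end->autoflush(1);

        defined(my $pid = fork) or die "fork: $!";
        unless ($pid) {                       # child: send SQL, read the answer
            close $parent_end;
            print {$child_end} "SELECT id, name FROM users\n";
            my $answer = <$child_end>;
            print "child got: $answer";
            exit 0;
        }

        close $child_end;
        my @connect_args = ('dbi:mysql:database=test', 'user', 'secret',
                            { RaiseError => 1 });
        my $dbh = DBI->connect(@connect_args);

        while (my $sql = <$parent_end>) {     # parent: the only db talker
            chomp $sql;
            $dbh = DBI->connect(@connect_args) unless $dbh->ping;
            my $rows = $dbh->selectall_arrayref($sql);
            print {$parent_end} join('|', map { join ',', @$_ } @$rows), "\n";
        }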
    Dave -- Saving the world one node at a time
      This is a nifty thought, but I'd worry that it would make the parent/server a bit monstrous and hard to get right. I guess if each child request is read, executed and answered back to the child in one block, it's certainly manageable.

      There still is the empirical question: is it better to serialize all activity through a single, very active connection, as opposed to having a bunch of connections (leaving it to the DBMS to coordinate the actions, which is, in part, what it was written to do)? I don't really know that yet (sorry).

        Sorry about the late response,
        The proof of the pudding, so to speak, for this type of practice goes back to the ODBC models that all of your modern-day websites use.
        Having one central connection allows you to control access while guaranteeing performance. If you need data on a large scale, you usually implement application servers to cache and give performance enhancements.
        Hope this helps
        Dave -- Saving the world one node at a time
Re: Sharing database handle
by elwarren (Priest) on Apr 18, 2002 at 21:02 UTC
    Sounds like it might be a good use for DBD::Proxy.
Re: Sharing database handle
by BUU (Prior) on Apr 18, 2002 at 18:33 UTC
    Is there any way to do something similar to connection pooling just using CGI?
Re: Sharing database handle
by d_i_r_t_y (Monk) on Apr 21, 2002 at 03:10 UTC

    hi marcello,

    I have an application which forks of child processes to do all the work. At the moment, every child process creates its own database connection, does its work and closes the database connection again. This causes a lot of connections to be made to the database.

    how many children are you forking?!
    seriously, we have done the same thing for our biological data since the amount of work preceding each insert/update justifies the parallel model.

    i don't think there is a problem with many db connections as long as the work for each thread justifies the connection time overhead. on mysql, connections are cheap and fast and i wouldn't even think twice about doing it (though concurrency/locking with mysql is more of an issue...). on db2, where a connection costs about 750msec (that's client/server on the same machine!), you would want to have at least 750msec worth of work for each process to justify the connection overhead. that said, divide and conquer works a treat if you can break down your workload into mutually exclusive chunks.

    btw, have to say it's sure satisfying to see those CPUs and disks crying for mercy when you have 8 concurrent processes hammering away inserts to the db...

    matt

      Hi Matt,

      I'm forking about 1 process every second. And indeed, I use DBI to connect to a MySQL database. The child processes only perform simple SELECT and INSERT queries, so no heavy stuff.

      Judging from the reactions above, I guess it's best to keep it the way it is. I was just wondering if I could be more efficient.

      Grtz Marcello
        i'm forking about 1 process every second.

        continuously?! surely not?!!... perhaps you should post the code?

        if divide and conquer is not applicable then i have one other suggestion:

        1. open a local UNIX socket/pipe
        2. fork a predefined set of children (say, 8), have each of the children open a DBI handle to mysql
        3. set up the parent process to loop over your insert data, spooling it to the socket that the children are all listening on; meanwhile each of the children is blocking in the $server->accept() call waiting for the parent to send it data
        4. upon reading a chunk of data, enter the insert routine and loop back to the accept
        5. continue until parent runs out of data

        an example of this approach (a pre-forking server, like apache) is in the perl cookbook in the chapter on sockets or IPC, though i don't have the book in front of me at the moment, so i can't give a specific recipe.
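
        a rough sketch of steps 1-5 above (socket path, table, and data are placeholders):

            use IO::Socket::UNIX;
            use DBI;

            my $path = '/tmp/inserter.sock';            # placeholder path
            unlink $path;
            my $server = IO::Socket::UNIX->new(Local => $path, Listen => 10)
                or die "listen: $!";

            for (1 .. 8) {                              # pre-fork a fixed set of children
                defined(my $pid = fork) or die "fork: $!";
                next if $pid;

                # each child opens its own handle *after* the fork
                my $dbh = DBI->connect('dbi:mysql:database=test', 'user', 'secret',
                                       { RaiseError => 1 });
                my $sth = $dbh->prepare('INSERT INTO stuff (data) VALUES (?)');
                while (my $conn = $server->accept) {    # block until work arrives
                    chomp(my $chunk = <$conn>);
                    $sth->execute($chunk);
                }
                exit 0;                                 # shutdown handling omitted
            }

            # parent: spool the data; each connect hands a chunk to one child
            for my $chunk ('alpha', 'beta', 'gamma') {
                my $c = IO::Socket::UNIX->new(Peer => $path) or die "connect: $!";
                print {$c} "$chunk\n";
                close $c;
            }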

        but you're probably right; unless you're loading >1Gb of data (which does or does not need transformation prior to insert), it's probably fast enough. ;-)

        d_i_r_t_y
