Re: Sharing database handle
by Fletch (Bishop) on Apr 18, 2002 at 13:52 UTC
Unless you're very cozy with the database backend (enough
so that you can figure out how to do some sort of locking
of whatever communications mechanism it's using to talk
to its server), you really don't want to do this. This
is why things such as Apache::DBI usually set
up persistent handles after the child processes have forked.
Check out POE::Component::DBIAgent and POE
for another possibility.
As for the second, DBI has a ping method. Check the docs.
Re: Sharing database handle
by perlplexer (Hermit) on Apr 18, 2002 at 13:57 UTC
ping
$rc = $dbh->ping;
Attempts to determine, in a reasonably efficient way, if
the database server is still running and the connection to
it is still working. Individual drivers should implement
this function in the most suitable manner for their
database engine.
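A sketch of how ping is typically used in a long-running process (the DSN, credentials, and reconnect policy here are illustrative, not from the docs quoted above):

```perl
use strict;
use warnings;
use DBI;

# Illustrative connection details -- substitute your own.
my ($dsn, $user, $pass) = ("dbi:mysql:database=test", "user", "secret");
my $dbh = DBI->connect($dsn, $user, $pass, { RaiseError => 1 });

# Call this instead of using $dbh directly in long-lived code.
sub healthy_dbh {
    unless ($dbh && $dbh->ping) {
        # Server went away (timeout, restart, ...): reconnect.
        $dbh = DBI->connect($dsn, $user, $pass, { RaiseError => 1 });
    }
    return $dbh;
}
```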
--perlplexer
Re: Sharing database handle
by TheHobbit (Pilgrim) on Apr 18, 2002 at 13:59 UTC
Hi,
you do not say so, but I suppose you are using DBI. In that case, if your database driver implements it, you could use the ping() method to check the connection.
What you would like to do is (more or less) the same thing that the Apache::DBI module does. Maybe you could get some inspiration from that module (I only use it; I've never gone into its bowels).
Cheers
Leo TheHobbit
GED/CS d? s-:++ a+ C++ UL+++ P+++>+++++ E+ W++ N+ o K? !w O? M V PS+++
PE-- Y+ PPG+ t++ 5? X-- R+ tv+ b+++ DI? D G++ e*(++++) h r++ y+++(*)
Re: Sharing database handle
by Elian (Parson) on Apr 18, 2002 at 15:13 UTC
It's not wise, and it won't always work. Some databases (Oracle specifically, though there may be others) will fail horribly if you try this: the data behind the handle is process-specific, and handles won't work right, if at all, across a fork.
|
Hi,
well, this <emph>only</emph> implies that you'll have to devote one process (either the parent or one of the children) to handling the handle (bad pun, sorry), and the (other) children will have to talk to that process to use the connection.
IIRC, that's the way Apache::DBI works.
Cheers
Leo TheHobbit
GED/CS d? s-:++ a+ C++ UL+++ P+++>+++++ E+ W++ N+ o K? !w O? M V PS+++
PE-- Y+ PPG+ t++ 5? X-- R+ tv+ b+++ DI? D G++ e*(++++) h r++ y+++(*)
|
No, Apache::DBI doesn't do any inter-process communication. It's dirt simple. It just opens a separate database handle in each process and keeps it open.
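The essence of that approach (a hand-rolled sketch, not Apache::DBI's actual code) is just a per-process cache keyed on the connect arguments; each child builds its own handle after the fork and then reuses it:

```perl
use strict;
use warnings;
use DBI;

my %cache;   # lives in this process only; never shared across a fork

sub cached_connect {
    my ($dsn, $user, $pass) = @_;
    my $key = join "\x01", $dsn, $user, $pass;
    if (my $dbh = $cache{$key}) {
        return $dbh if $dbh->ping;   # still alive? reuse it
    }
    return $cache{$key} = DBI->connect($dsn, $user, $pass,
                                       { RaiseError => 1 });
}
```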
Re: Sharing database handle
by mamboking (Monk) on Apr 18, 2002 at 18:16 UTC
Instead of sharing a connection, you probably want to do connection pooling. You could have your app open up a bunch of connections at application startup; then, as a child process is spawned, it gets a connection from the pool. When it is finished, it returns the connection to the pool. If there aren't any available connections, your process can block until one becomes available. Since opening and closing database connections is very costly, this should greatly help your performance. Also, by pre-allocating the connections, you always have control over how many connections are out there.
|
I don't think that will actually work. The trouble is, you can't easily pass DBI objects (which involve open sockets and XS code) back and forth between processes.
Re: Sharing database handle
by gav^ (Curate) on Apr 18, 2002 at 22:28 UTC
It would have been very handy to know what database you're talking about. With MySQL, connecting is fast; in fact, the manual mentions this when talking about optimization.
You might find it more portable (and perhaps faster) to tweak your database to allow more connections and to do anything suggested to improve performance on that end.
Another option may be for the child processes not to write to the database at all (especially in the case of logging) and for the parent to dump the data into the db at regular intervals.
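A minimal sketch of that division of labour (the table, DSN, and credentials are made up; a real version would batch inserts and handle errors):

```perl
use strict;
use warnings;
use IO::Handle;
use DBI;

pipe(my $reader, my $writer) or die "pipe: $!";
$writer->autoflush(1);

if (my $pid = fork) {
    # Parent: the only process that touches the database.
    close $writer;
    my $dbh = DBI->connect("dbi:mysql:logs", "user", "secret",
                           { RaiseError => 1 });
    my $sth = $dbh->prepare("INSERT INTO log (msg) VALUES (?)");
    while (my $line = <$reader>) {
        chomp $line;
        $sth->execute($line);   # or collect N lines and insert in one go
    }
    waitpid $pid, 0;
} else {
    # Child: just prints log lines up the pipe, no DB connection at all.
    close $reader;
    print {$writer} "child $$ finished its work\n";
    exit 0;
}
```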
Hope this helps...
gav^
|
Another option may be for the child processes not to write to the database at all (especially in the case of logging) and for the parent to dump the data into the db at regular intervals.
Well, this may be an option, but it's probably not a good
idea. You'd hate to have a bunch of inserts or updates
queued in the parent and then have that process
go down before it can finish the queue, or find out that
the connection dropped for whatever reason since the last
time it processed a queue.
Has anyone (esp. Marcello) seen data about how much it costs
loadwise to have a bunch of child connections? My own
(anecdotal, not systematic) experience has been that the
number of connections is less important than what those
connections are doing. If the SQL demands of any one
process are heavy, having more than a couple going at once
will hurt, no matter how you do it. If it's easy stuff, then
number of concurrent requests is not much of an issue.
Re: Sharing database handle
by Zapawork (Beadle) on Apr 18, 2002 at 19:58 UTC
Here is another idea,
You are already spawning child processes, so why not set up either a bidirectional pipe or a separate IPC channel to allow the child to send requests to the parent? In this fashion the parent could have one main routine to poll the child processes for queries, test the connection ($dbh->ping; DBI is just a great package), and then execute and return the information to the child.
In this model you would seem to accomplish your goal: you would allow all of your child processes to communicate through one ODBC connection, and for fun you'd have one place to check for improper data within queries.
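One way to sketch that parent-as-gatekeeper model is with socketpair (the query, table, and line-based wire format here are invented for illustration):

```perl
use strict;
use warnings;
use Socket;
use IO::Handle;
use DBI;

socketpair(my $kid_end, my $dad_end, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
$kid_end->autoflush(1);
$dad_end->autoflush(1);

if (my $pid = fork) {
    # Parent: owns the single DB handle, serves the child's queries.
    close $kid_end;
    my $dbh = DBI->connect("dbi:mysql:test", "user", "secret",
                           { RaiseError => 1 });
    while (my $sql = <$dad_end>) {
        chomp $sql;
        my $rows = $dbh->selectall_arrayref($sql);
        print {$dad_end} join("\t", @$_), "\n" for @$rows;
        print {$dad_end} ".\n";          # end-of-result marker
    }
    waitpid $pid, 0;
} else {
    # Child: sends SQL up, reads tab-separated rows back.
    close $dad_end;
    print {$kid_end} "SELECT id, name FROM users\n";
    while (my $line = <$kid_end>) {
        last if $line eq ".\n";
        # ... process the row ...
    }
    exit 0;
}
```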
Dave -- Saving the world one node at a time
|
This is a nifty thought, but I'd worry that it would make the
parent/server a bit monstrous and hard to get right. I guess
if each child request is read, executed and answered back to
the child in one block, it's certainly manageable.
There still is the empirical question: is it better to
serialize all activity through a single, very active connection, as
opposed to having a bunch of connections (leaving it to the
DBMS to coordinate the actions, which is, in part, what
it was written to do)? I don't really know that yet (sorry).
|
Sorry about the late response,
The proof of the pudding, so to speak, for this type of practice goes back to the ODBC models that all of your modern-day websites use.
Having one central connection allows you to control access while guaranteeing performance. If you need data in large-scale form, you usually implement application servers to cache and give performance enhancements.
Hope this helps
Dave -- Saving the world one node at a time
Re: Sharing database handle
by elwarren (Priest) on Apr 18, 2002 at 21:02 UTC
Sounds like it might be a good use for DBI::Proxy
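For reference, the proxy driver is spelled DBD::Proxy on the client side, with DBI::ProxyServer (the dbiproxy script that ships with DBI) on the server side; a connection looks roughly like this (host, port, and inner DSN are placeholders):

```perl
use strict;
use warnings;
use DBI;

# On the machine next to the database:  dbiproxy --localport 12345
my $dbh = DBI->connect(
    "dbi:Proxy:hostname=dbhost;port=12345;dsn=dbi:mysql:test",
    "user", "secret", { RaiseError => 1 },
);
# From here on, $dbh behaves like a normal DBI handle; every call is
# forwarded over the network to the proxy server's real connection.
```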
Re: Sharing database handle
by BUU (Prior) on Apr 18, 2002 at 18:33 UTC
Is there any way to do something similar to connection pooling just using CGI?
Re: Sharing database handle
by d_i_r_t_y (Monk) on Apr 21, 2002 at 03:10 UTC
hi marcello,
I have an application which forks of child processes to do all the work. At the moment, every child process creates its own database connection, does its work and closes the database connection again. This causes a lot of connections to be made to the database.
how many children are you forking?!
seriously, we have done the same thing for our biological data since the amount of work preceding each insert/update justifies the parallel model.
i don't think there is a problem with many db connections as long as the work for each thread justifies the connection time overhead. on mysql, connections are cheap and fast and i wouldn't even think twice about doing it (though concurrency/locking with mysql is more of an issue...). on db2, where 1 connection costs about 750msec each (that's client/server on the same machine!), you would want to have at least 750msec worth of work for each process to justify the connection overhead. that said, divide and conquer works a treat if you can break down your workload into mutually exclusive chunks.
btw, have to say it's sure satisfying to see those CPUs and disks crying for mercy when you have 8 concurrent processes hammering away inserts to the db...
matt
|
Hi Matt,
I'm forking about 1 process every second. And indeed, I use DBI to connect to a MySQL database. The child processes only perform simple SELECT and INSERT queries, so no heavy stuff.
Judging from the reactions above, I guess it's best to keep it the way it is. I was just wondering if I could be more efficient.
Grtz Marcello
|
i'm forking about 1 process every second.
continuously?! surely not?!!... perhaps you should post the code?
if divide and conquer is not applicable then i have one other suggestion:
- open a local UNIX socket/pipe
- fork a predefined set of children (say, 8), have each of the children open a DBI handle to mysql
- set up the parent process to loop over your insert data, spooling it to the socket that the children are all listening to/blocking on; meanwhile each of the children is blocking in the $server->accept() call waiting for the parent to send it data
- upon reading a chunk of data, enter the insert routine and loop back to the accept
- continue until parent runs out of data
an example of this approach (a pre-forking server, like apache) is in the perl cookbook in the chapter on sockets or IPC, though i don't have the book in front of me at the moment, so i can't give the specific recipe.
but you're probably right; unless you're loading >1Gb of data (which does or does not need transformation prior to insert), it's probably fast enough. ;-)
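the steps above can be sketched roughly like this (socket path, DSN, table, and sample data are placeholders; a production version would need signal handling and error checking):

```perl
use strict;
use warnings;
use IO::Socket::UNIX;
use DBI;

my $path = "/tmp/insert-queue.sock";   # illustrative
unlink $path;
my $server = IO::Socket::UNIX->new(Local => $path, Listen => 8)
    or die "listen: $!";

for (1 .. 8) {                 # pre-fork a fixed set of children
    next if fork;              # parent keeps looping
    # Child: its own handle, opened after the fork.
    my $dbh = DBI->connect("dbi:mysql:test", "user", "secret",
                           { RaiseError => 1 });
    my $sth = $dbh->prepare("INSERT INTO t (data) VALUES (?)");
    while (my $conn = $server->accept) {   # block until work arrives
        while (my $chunk = <$conn>) {
            chomp $chunk;
            $sth->execute($chunk);
        }
    }
    exit 0;
}

# Parent: spool the data; whichever child accepts first gets the chunk.
for my $row (qw(alpha beta gamma)) {       # stand-in for real data
    my $client = IO::Socket::UNIX->new(Peer => $path)
        or die "connect: $!";
    print {$client} "$row\n";
    close $client;
}
```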
d_i_r_t_y