New twist for DBD::mysql

A new DBD::mysql

Patrick Galbraith, senior developer at MySQL AB, has made available a modified version of DBD::mysql. It is not in the CPAN yet, (and I believe it will go there when the maintainer is satisfied with the tests) but it is usable.

What is the fuss about? This version makes a few interesting additions to this popular module:

Support for true prepared statements. Version 4.1 of MySQL introduces prepared statements, with a few additions to the C API. So far, the DBI was emulating prepared statements in MySQL, with the convenience and security of placeholders, but without the efficiency that should come along.
Now you may use the full potential of prepared statements with DBI/DBD::mysql.
To take advantage of this new feature, you have to set an attribute in the database handler, either at connection time or later, when you need it.
```
# either
my $dbh = DBI->connect("dbi:mysql:dbname;mysql_server_prepare=1",
     "user","pass");

# or 
$dbh->{ mysql_server_prepare } = 1;
[download]
```
Alternatively, you can trigger the prepared statement mechanism by setting an environment variable before executing your script:
```
export MYSQL_SERVER_PREPARE=1
[download]
```
Support for placeholders in LIMIT. This was a bug (or a feature, or lack of feature, depending on your angle) that made many people upset. I recall several irate users venting their rage at the mailing list, asking for this feature.
Mostly wanted to create pages of results, now it is available.
```
my $query = qq{SELECT col1, col2, col3 FROM mytable LIMIT ? , ?};
my $sth = $dbh->prepare($query);
my $lines = 20;

for my $page_no (1..10) {
    $sth->execute($page_no, $lines);
    # do something with the page.
}
[download]
```
Support for embedded MySQL server. A new module DBD::mysqlEmb comes with the same distribution, and it lets you use MySQL databases without a server. The client application will include an embedded server library, and you can manipulate a MySQL database without the hassle of installing the server.
It's like DBD::SQLite but it isn't "lite." The only significant limitation is that it should be used by a single user only.

Some documentation is available at this OSCON presentation (PDF).

Installing it

As I mentioned before, it is not on CPAN yet. You must get it from the author's CVS and fiddle a little bit to get it up and running.

The main problem is that this new version requires the client library that comes with MySQL 4.1, and if you are already using MySQL 3.23/4.0 in your box, you should install the 4.1 version, and since it is not advisable to trust new software before testing it, you shold install it without removing the working server. This is not very difficult, but not easy for the average Joe either. Therefore, some caution should be necessary. Moreover, you may need to compile the server source rather than using the binaries if your system is using old libraries (at least for the embedded server this is what I had to do).

benchmarking

This change is not easy to measure up, because the benefits of a prepared statement may vary significantly depending on the query complexity. Nonetheless, I made some measurements, and even with a simple query, the difference is visible. (Please refer to Speeding up the DBI for a detailed explanation of the profiling methods used here)

$dbh1->{ mysql_server_prepare } = 0;  # emulated prepared statement
$dbh2->{ mysql_server_prepare } = 1;  # real prepared statement

my $query = qq{select book_id from books where author_id = ?  };

use DBI::Profile;
$dbh1->{Profile} = DBI::Profile->new;
$dbh2->{Profile} = DBI::Profile->new;
$dbh1->{Profile} = 4;
$dbh2->{Profile} = 4;

my $sth1 = $dbh1->prepare($query);
my $sth2 = $dbh2->prepare($query);

timethese ( 5000,
    {
        emulated => sub {
            for (1 .. 15) {
                $sth1->execute($_);
                my $rows = $sth1->fetchall_arrayref();
            }
        },
        prepared => sub {
            for (1 .. 15) {
                $sth2->execute($_);
                my $rows = $sth2->fetchall_arrayref();
            }
        }
   }
);
print "emulated\n", $dbh1->{Profile}->format;
print "prepared\n", $dbh2->{Profile}->format;
$dbh1->{Profile} =0;
$dbh2->{Profile} =0;
[download]

The benchmarking code will simply prepare two qwueries, one without using the new feature and one taking advantage of it.

Benchmark: timing 5000 iterations of emulated, prepared...
  emulated: 24 wallclock secs ( 6.88 usr +  2.52 sys =  9.40 CPU)
  prepared: 20 wallclock secs ( 6.84 usr +  2.31 sys =  9.25 CPU)

The result shows that there is a (little) convenience with the true prepared statements. The real difference, though, comes when we analyze the profiling output.

emulated
DBI::Profile: 22.655414s (150003 calls) benchdbd.pl @ 2004-09-14 10:31:07
'FETCH'             => 0.000006s
'STORE'             => 0.000237s
'execute'           => 20.919974s / 75000 = 0.000279s avg 
'fetchall_arrayref' => 1.735029s / 75000 = 0.000023s avg 
'prepare'           => 0.000168s

prepared
DBI::Profile: 19.629611s (150003 calls) benchdbd.pl @ 2004-09-14 10:31:07
'FETCH'             => 0.000007s
'STORE'             => 0.000103s
'execute'           => 17.339136s / 75000 = 0.000231s avg 
'fetchall_arrayref' => 2.289909s / 75000 = 0.000031s avg 
'prepare'           => 0.000456s

Notice first that the 'prepare' method takes longer when using the real prepared statements. This is because the emulated 'prepare' is just a way of passing parameters, and it does not call the dataabse server at all. The difference is all in the 'execute' method, where the emulated version takes 15% longer than the new implementation.

Update Sep 14, 2004 at 11:23 GMT+1
I also noticed that fetchall_arrayref is taking longer but I forgot to mention it here. (Thanks to demerphq for reminding me).
I assume (but this is really a wild guess, mind you) that this is due to the larger data structure needed for prepared statement that is being released at this moment.
Anyway, hopefully the author is going to look at this page and if an explanation exists it will come up eventually.

 _  _ _  _  
(_|| | |(_|><
 _|

Comment on New twist for DBD::mysql Select or Download Code

Replies are listed 'Best First'.
Re: New twist for DBD::mysql by tantarbobus (Hermit) on Sep 14, 2004 at 19:44 UTC
Patricks modifications will not go to CPAN right now; I want to have an alpha/beta period first, and then I will put it up on CPAN -- There has jsut been too much code churn and not enough testing to put it directly on CPAN. And I am still not all too happy with the way the embedded driver is being handled. The embedded driver compiles into the DBD::mysql namespace which means that you can only have one of the regular or embedded driver installed -- talk about suck. So I am trying to decide on how to handle this: maybe DBD::mysql::embeded, DBD::mysqlembed (and then how to acutually do it cleanly)? Of course you still will probably not be able to use both in the same application. The embedded stuff is probably not big enough of issue to hold up the alpha, though. That being said, there will be a bugfix version going up to CPAN shortly which should fix a few makefile issues with Mac OSX, Red Hat 9 with default perl, and a fix to column info when using later version of mysql. And there is also 3.0 version of DBD::mysql is in the works, and it is very close to going up for an Alpha/Beta cycle. These are the changes that I was working on while Patrick was adding prepared statement support (So this version still needs to have his latest prepare() patch merged in) As for LIMIT ?,?. The reason why it was not supported since 2.9002 is that it allowed for sql injection attacks, and it is not trivial to fix, in fact, I just scanned over Patrick's code and found a bug in the LIMIT handling code (grrr, I really hate this LIMIT stuff). I am of 1/2 a mind to make it optional because all of the complaints (and vitriol, and invective) that I have recieved about this one change -- What do you think of `$dbh->{allow_sql_injection_attacks}` ;) Of course, this whole issue should go away when you are using prepared_statements, and I'd rather just say that you have to use prepared statements before you get the LIMIT ?,? stuff (Assuming that the sever gets support for LIMIT ?,? with prepared statements). And the truth is, you could always use LIMIT ?; you just had to call bind_param() once to set the types before you started executing.	[reply] [d/l]
Re^2: New twist for DBD::mysql by Anonymous Monk on Sep 15, 2004 at 04:21 UTC
Its more than ok to put up alphas/betas on cpan for testing, so as long you name them properly (underscore or letter in name, as in 1.01_2, or 1.01a).	[reply]
Re: New twist for DBD::mysql by demerphq (Chancellor) on Sep 14, 2004 at 11:05 UTC
It looks like the fetchall_arrayref time has gone up quite a bit... Any idea why? Is it just spurious? --- demerphq _{First they ignore you, then they laugh at you, then they fight you, then you win. -- Gandhi Flux8}	[reply]
Re: New twist for DBD::mysql by dragonchild (Archbishop) on Sep 14, 2004 at 12:39 UTC
Excellent! ++ for the news. I'm looking forward to true preprared queries. Now, will this replace the current DBD::mysql or will it be an option when you install DBD::mysql, auto-detecting which version of the client library you have and asking only if you have both? I would seriously hope that this distribution doesn't fork based on client library ... ------ We are the carpenters and bricklayers of the Information Age. Then there are Damian modules.... sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon.* - flyingmoose I shouldn't have to say this, but any code, unless otherwise stated, is untested	[reply]
Re^2: New twist for DBD::mysql by tantarbobus (Hermit) on Sep 14, 2004 at 19:49 UTC
DBD::mysql will support server-side, regualar prepare, embedded, and non-embedded all in one distribution. Athough the embedded non-embedded pair might be "forked" within the same distribution so that they can be in different namespaces allowing you to install both on the same system.	[reply]
Re: New twist for DBD::mysql by bsdz (Friar) on Sep 14, 2004 at 11:12 UTC
Brilliant news! I already use MySQL 4.1 for its subquery functionality. These additional features for the Perl client will be of great use especially for online reporting systems where a user would like to limit the number of rows returned. And wow! An embedded MySQL server client - that sounds very interesting. I'm off to checkout the CVS now for testing :-)	[reply]

Back to Meditations