
A new DBD::mysql

Patrick Galbraith, senior developer at MySQL AB, has made available a modified version of DBD::mysql. It is not on CPAN yet (I believe it will go there once the maintainer is satisfied with the tests), but it is usable.

What is the fuss about? This version makes a few interesting additions to this popular module:

Support for true prepared statements. Version 4.1 of MySQL introduces prepared statements, with a few additions to the C API. Until now, DBD::mysql was emulating prepared statements, with the convenience and security of placeholders, but without the efficiency that should come along with them.
Now you may use the full potential of prepared statements with DBI/DBD::mysql.
To take advantage of this new feature, you have to set an attribute in the database handle, either at connection time or later, when you need it.
```
# either
my $dbh = DBI->connect("dbi:mysql:dbname;mysql_server_prepare=1",
     "user","pass");

# or 
$dbh->{ mysql_server_prepare } = 1;
```
Alternatively, you can trigger the prepared statement mechanism by setting an environment variable before executing your script:
```
export MYSQL_SERVER_PREPARE=1
```
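If you want the choice to be explicit in the script rather than in the environment, the DSN can be assembled by a small helper. This is only a sketch: `build_dsn` and the `USE_SERVER_PREPARE` switch are names I made up for illustration, not part of DBD::mysql. The DSN format is the one shown above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DBD::mysql): builds a DSN string and
# appends mysql_server_prepare=1 only when the caller asks for it.
sub build_dsn {
    my (%opt) = @_;
    my $dsn = "dbi:mysql:$opt{database}";
    $dsn .= ";mysql_server_prepare=1" if $opt{server_prepare};
    return $dsn;
}

# USE_SERVER_PREPARE is an invented, script-level switch; it is unrelated
# to the MYSQL_SERVER_PREPARE variable that the driver itself reads.
my $dsn = build_dsn(
    database       => 'dbname',
    server_prepare => $ENV{USE_SERVER_PREPARE} ? 1 : 0,
);
print "$dsn\n";

# The resulting string would then be passed to DBI->connect($dsn, "user", "pass").
```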
Support for placeholders in LIMIT. This was a bug (or a feature, or lack of feature, depending on your angle) that made many people upset. I recall several irate users venting their rage on the mailing list, asking for this feature.
It was wanted mostly for paging through results, and now it is available.
```
my $query = qq{SELECT col1, col2, col3 FROM mytable LIMIT ? , ?};
my $sth = $dbh->prepare($query);
my $lines = 20;

for my $page_no (1..10) {
    # LIMIT takes (offset, row count), so convert the page number to an offset
    $sth->execute(($page_no - 1) * $lines, $lines);
    # do something with the page.
}
```
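Keep in mind that the two placeholders in `LIMIT ?, ?` are an offset and a row count, not a page number and a page size. A minimal sketch of the conversion, with a hypothetical `page_limits` helper (my own name, nothing from the driver) and no database needed to try it:

```perl
use strict;
use warnings;

# Hypothetical helper: converts a 1-based page number into the
# (offset, row_count) pair that "LIMIT ?, ?" expects.
sub page_limits {
    my ($page_no, $lines) = @_;
    die "page numbers start at 1\n" if $page_no < 1;
    return ( ($page_no - 1) * $lines, $lines );
}

for my $page_no (1 .. 3) {
    my ($offset, $count) = page_limits($page_no, 20);
    print "page $page_no: LIMIT $offset, $count\n";
    # With a real statement handle: $sth->execute($offset, $count);
}
```

With 20 lines per page, the first three pages start at offsets 0, 20 and 40.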
Support for embedded MySQL server. A new module DBD::mysqlEmb comes with the same distribution, and it lets you use MySQL databases without a server. The client application will include an embedded server library, and you can manipulate a MySQL database without the hassle of installing the server.
It's like DBD::SQLite but it isn't "lite." The only significant limitation is that it should be used by a single user only.

Some documentation is available at this OSCON presentation (PDF).

Installing it

As I mentioned before, it is not on CPAN yet. You must get it from the author's CVS and fiddle a little bit to get it up and running.

The main problem is that this new version requires the client library that comes with MySQL 4.1. If you are already using MySQL 3.23 or 4.0 on your box, you have to install version 4.1 as well, and since it is not advisable to trust new software before testing it, you should install it without removing the working server. This is not very difficult, but it is not easy for the average Joe either, so some caution is in order. Moreover, you may need to compile the server from source rather than use the binaries if your system has old libraries (at least for the embedded server, this is what I had to do).
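With an old and a new server living side by side, a script may end up talking to either of them. A rough sketch of a guard, assuming you obtain the version string yourself (for instance with `$dbh->selectrow_array('SELECT VERSION()')`); the `supports_server_prepare` helper is hypothetical and only compares version numbers:

```perl
use strict;
use warnings;

# Hypothetical helper: true when a MySQL version string such as
# "4.1.4-gamma" is at least 4.1, the first version with prepared statements.
sub supports_server_prepare {
    my ($version) = @_;
    my ($major, $minor) = $version =~ /^(\d+)\.(\d+)/
        or return 0;    # unparsable string: assume no support
    return ( $major > 4 || ($major == 4 && $minor >= 1) ) ? 1 : 0;
}

for my $v ('3.23.58', '4.0.20', '4.1.4-gamma') {
    printf "%-12s %s\n", $v,
        supports_server_prepare($v) ? 'server-side prepare' : 'emulation only';
}
```

The result could then decide whether to set `mysql_server_prepare` on the handle.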

Benchmarking

The effect of this change is not easy to measure, because the benefits of a prepared statement may vary significantly depending on the query complexity. Nonetheless, I made some measurements, and even with a simple query, the difference is visible. (Please refer to Speeding up the DBI for a detailed explanation of the profiling methods used here.)

```
# $dbh1 and $dbh2 are two handles already connected to the same database
$dbh1->{ mysql_server_prepare } = 0;  # emulated prepared statement
$dbh2->{ mysql_server_prepare } = 1;  # real prepared statement

my $query = qq{select book_id from books where author_id = ?  };

use Benchmark qw(timethese);
use DBI::Profile;
$dbh1->{Profile} = DBI::Profile->new;
$dbh2->{Profile} = DBI::Profile->new;
$dbh1->{Profile} = 4;
$dbh2->{Profile} = 4;

my $sth1 = $dbh1->prepare($query);
my $sth2 = $dbh2->prepare($query);

timethese ( 5000,
    {
        emulated => sub {
            for (1 .. 15) {
                $sth1->execute($_);
                my $rows = $sth1->fetchall_arrayref();
            }
        },
        prepared => sub {
            for (1 .. 15) {
                $sth2->execute($_);
                my $rows = $sth2->fetchall_arrayref();
            }
        }
    }
);
print "emulated\n", $dbh1->{Profile}->format;
print "prepared\n", $dbh2->{Profile}->format;
$dbh1->{Profile} = 0;
$dbh2->{Profile} = 0;
```

The benchmarking code simply prepares two queries, one without using the new feature and one taking advantage of it.

```
Benchmark: timing 5000 iterations of emulated, prepared...
  emulated: 24 wallclock secs ( 6.88 usr +  2.52 sys =  9.40 CPU)
  prepared: 20 wallclock secs ( 6.84 usr +  2.31 sys =  9.25 CPU)
```

The result shows that there is a (small) advantage with the true prepared statements. The real difference, though, shows up when we analyze the profiling output.

```
emulated
DBI::Profile: 22.655414s (150003 calls) benchdbd.pl @ 2004-09-14 10:31:07
'FETCH'             => 0.000006s
'STORE'             => 0.000237s
'execute'           => 20.919974s / 75000 = 0.000279s avg
'fetchall_arrayref' => 1.735029s / 75000 = 0.000023s avg
'prepare'           => 0.000168s

prepared
DBI::Profile: 19.629611s (150003 calls) benchdbd.pl @ 2004-09-14 10:31:07
'FETCH'             => 0.000007s
'STORE'             => 0.000103s
'execute'           => 17.339136s / 75000 = 0.000231s avg
'fetchall_arrayref' => 2.289909s / 75000 = 0.000031s avg
'prepare'           => 0.000456s
```

Notice first that the 'prepare' method takes longer when using the real prepared statements. This is because the emulated 'prepare' is just a way of passing parameters, and it does not call the database server at all. The difference is all in the 'execute' method, where the emulated version takes about 20% longer than the new implementation (20.92s against 17.34s in total).
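The percentage depends on which of the two figures you take as the baseline. A small sketch of the arithmetic, using the total 'execute' times from the profiles above; `pct_longer` and `pct_saved` are throwaway helper names of mine:

```perl
use strict;
use warnings;

# Total 'execute' times copied from the two profiles (seconds).
my %execute = ( emulated => 20.919974, prepared => 17.339136 );

# How much longer the slow version takes, relative to the fast one.
sub pct_longer {
    my ($slow, $fast) = @_;
    return 100 * ($slow - $fast) / $fast;
}

# How much time the fast version saves, relative to the slow one.
sub pct_saved {
    my ($slow, $fast) = @_;
    return 100 * ($slow - $fast) / $slow;
}

printf "emulated takes %.1f%% longer\n",
    pct_longer(@execute{qw(emulated prepared)});
printf "prepared saves %.1f%% of the time\n",
    pct_saved(@execute{qw(emulated prepared)});
```

Measured against the prepared version, the emulated 'execute' is roughly a fifth slower; measured against the emulated version, the saving is a bit more than a sixth.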

Update Sep 14, 2004 at 11:23 GMT+1
I also noticed that fetchall_arrayref is taking longer, but I forgot to mention it here. (Thanks to demerphq for reminding me.)
I assume (but this is really a wild guess, mind you) that this is due to the larger data structure needed for a prepared statement, which is being released at that moment.
Anyway, hopefully the author is going to look at this page, and if an explanation exists, it will come up eventually.

 _  _ _  _  
(_|| | |(_|><
 _|

In reply to New twist for DBD::mysql by gmax
