H. Merijn Brand, author of DBD::Unify, gave a talk at the March
Amsterdam.pm meeting that I found interesting as well as somewhat
relevant to recent discussions. The following is a summary with
some examples (full examples might be available on the Amsterdam.pm
homepage in the future). This is being reprinted with permission.
In fact, I only wrote half the post.
The most efficient way to retrieve data is with bindcolumns.
Benchmarks with both Unify and Oracle showed hashref to be the least
efficient form of retrieving data, followed by array and arrayref.
my $sth = $sth->prepare ("select field from table");
$sth->execute;
$href = $sth->fetchrow_hashref; # slowest
@arr = $sth->fetchrow_array;
$aref = $sth->fetchrow_arrayref;
my $field;
$sth->bind_columns (\$field);
$sth->fetch; # fastest
This efficiency increase comes from
within DBI itself, as was demonstrated by verbose output from DBD::Unify,
which received the same request regardless of the fetching method.
Selecting only the fields you want is more efficient.
In tests, fetching 1 field was much faster than fetching all 8. With
Unify and bindcolumns, 8 fields were retrieved at 7737.02 records per
second, whereas 1 field came back at 13980.40 records per second. For
oracle the numbers were 54249.67 and 138082.63.
Also...
Merijn also discussed his his wrapper module which serves to hide the
database handle which is not needed for read-only database access.
The database handle is only truly needed for commit and rollback.
Disconnect can be ignored without warnings by either an explicit undef
of all statement handles, or implicitly by putting statement handles in
a limited scope. Unsurprisingly, the module also simplifies column
binding through the use of prepex, a function which prepares, binds
and executes queries, so one can do:
my $sth = prepex (
"select field1, field2",
\($field1, $field2),
"from table");
while ($sth->fetch) {
:
}
Selected benchmarks
UNIFY
============== Benchmarks (2500 records)
Benchmark: running array, arrayref, bindcol, hashref, each for at leas
+t 30 CPU seconds...
hashref: 41 wallclock secs (39.26 usr + 0.01 sys = 39.27 CPU) @ 1.
+81/s (n= 71)
array: 35 wallclock secs (33.68 usr + 0.01 sys = 33.69 CPU) @ 2.
+29/s (n= 77) 26.52 %
arrayref: 34 wallclock secs (33.31 usr + 0.01 sys = 33.32 CPU) @ 2.
+49/s (n= 83) 8.73 % 37.57 %
bindcol: 34 wallclock secs (33.28 usr + 0.01 sys = 33.29 CPU) @ 2.
+58/s (n= 86) 3.61 % 42.54 %
============== Benchmarks (~ 200 000 records) [8 fields]
Benchmark: timing 5 iterations of array, arrayref, bindcol, hashref...
hashref: 188 wallclock secs (183.68 usr + 0.47 sys = 184.15 CPU) 5
+155.63 rec/s
array: 140 wallclock secs (133.01 usr + 0.53 sys = 133.54 CPU) 7
+109.56 rec/s 37.90 %
arrayref: 127 wallclock secs (124.30 usr + 0.44 sys = 124.74 CPU) 7
+611.11 rec/s 7.05 % 47.63 %
bindcol: 125 wallclock secs (122.22 usr + 0.49 sys = 122.71 CPU) 7
+737.02 rec/s 1.65 % 50.07 %
============== Benchmarks (~ 200 000 records) [1 field]
Benchmark: timing 5 iterations of array, arrayref, bindcol, hashref...
hashref: 113 wallclock secs (108.99 usr + 0.48 sys = 109.47 CPU) 8
+672.79 rec/s
array: 76 wallclock secs ( 73.85 usr + 0.47 sys = 74.32 CPU) 12
+774.60 rec/s 47.30 %
arrayref: 73 wallclock secs ( 70.89 usr + 0.42 sys = 71.31 CPU) 13
+313.80 rec/s 4.22 % 53.51 %
bindcol: 69 wallclock secs ( 67.44 usr + 0.47 sys = 67.91 CPU) 13
+980.40 rec/s 5.01 % 61.20 %