fetchall_arrayref with slice and row count

by gmax (Abbot)
on Jan 31, 2006 at 17:54 UTC

The DBI docs say that fetchall_arrayref can get records in batches, as a compromise between fetching one-by-one and all-at-once.

The example from the manual goes like this.

my $rows = []; # cache for batches of rows
while (my $row = (
        shift(@$rows) ||   # get row from cache, or reload cache:
        shift(@{ $rows = $sth->fetchall_arrayref(undef, 10_000) || [] })
    ))
{
    ...
}

This idiom makes sense, because it can fetch records at high speed, as shown in Speeding up the DBI.

A few days ago, brother bradcathey asked for my help with a case similar to the idiom in the manual, with the addition of a slice, so that fetchall_arrayref produces an AoH instead of an AoA. This case led to an error, as shown in the following code.

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use DBI;

my $driver = shift || 'mysql';
my $db     = 'test';
my $user   = undef;
my $pass   = undef;
my $dbh;

$driver = lc $driver;
if ($driver eq 'mysql') {
    $dbh = DBI->connect('DBI:mysql:' . $db
                . ";mysql_read_default_file=$ENV{HOME}/.my.cnf",
                $user, $pass, {RaiseError => 1})
        or die "Can't connect: $DBI::errstr\n";
    printf "DBI: %s - DBD::mysql: %s\n",
        $DBI::VERSION, $DBD::mysql::VERSION;
}
elsif ($driver eq 'sqlite') {
    $db = 'test.db';
    if (-f $db) {
        unlink $db;
    }
    $dbh = DBI->connect('DBI:SQLite:' . $db, $user, $pass, {RaiseError => 1})
        or die "Can't connect: $DBI::errstr\n";
    printf "DBI: %s - DBD::SQLite: %s\n",
        $DBI::VERSION, $DBD::SQLite::VERSION;
}
else {
    die "driver $driver not supported in this test\n";
}

my $max_fields  = 3;
my $field_size  = 9;
my $max_records = 10;
my $max_commit  = 5;
my $text        = 'abcdefghi';
my $inserted    = 0;

my $create_query = qq{CREATE TABLE testdbi ( }
    . (join q{,}, map { qq{id$_ char($field_size) not null} } (1 .. $max_fields))
    . qq{, primary key (id1))};

{
    local $dbh->{PrintError} = 0;
    eval { $dbh->do(qq{DROP TABLE testdbi}) };
}

$dbh->do(qq{begin});
$dbh->do($create_query);

my $sth = $dbh->prepare(qq{INSERT INTO testdbi VALUES (}
    . (join q{,}, map {q{?}} (1 .. $max_fields)) . q{)});

for (1 .. $max_records) {
    $inserted += $sth->execute(map {$text++} (1 .. $max_fields));
    if (($inserted % $max_commit) == 0) {
        $dbh->do(qq{commit});
        print "$inserted\n";
        $dbh->do(qq{begin});
    }
}
$dbh->do(qq{commit});
print "inserted $inserted records\n";

my $max_rows = 3;   # must leave a remainder for the error
                    # multiples of 2 and 5 work

print "without slice\n";
my $rowcache = [];
my $select_query = qq{SELECT id1, id2 FROM testdbi ORDER BY id1};
$sth = $dbh->prepare($select_query);
$sth->execute;

my $count = 0;
while (my $aref = (
        shift(@{ $rowcache })
        ||
        shift(@{ $rowcache = $sth->fetchall_arrayref(undef, $max_rows) || [] })
    ))
{
    my $rec = Data::Dumper->Dump([$aref], ['rec']);
    $rec =~ s/\s+/ /g;
    $count++;
    print "$count $rec \n";
};

#---- HERE STARTS THE FAILING CODE ------
print "with slice\n";
$rowcache = [];
$sth = $dbh->prepare($select_query);
$sth->execute;

$count = 0;
while (my $aref = (
        shift(@{ $rowcache })
        ||
        shift(@{ $rowcache = $sth->fetchall_arrayref({}, $max_rows) || [] })
    ))
{
    my $rec = Data::Dumper->Dump([$aref], ['rec']);
    $rec =~ s/\s+/ /g;
    $count++;
    print "$count $rec \n";
};

__END__
$ perl test_dbi.pl mysql
DBI: 1.49 - DBD::mysql: 3.0002
5
10
inserted 10 records
without slice
1 $rec = [ 'abcdefghi', 'abcdefghj' ];
2 $rec = [ 'abcdefghl', 'abcdefghm' ];
3 $rec = [ 'abcdefgho', 'abcdefghp' ];
4 $rec = [ 'abcdefghr', 'abcdefghs' ];
5 $rec = [ 'abcdefghu', 'abcdefghv' ];
6 $rec = [ 'abcdefghx', 'abcdefghy' ];
7 $rec = [ 'abcdefgia', 'abcdefgib' ];
8 $rec = [ 'abcdefgid', 'abcdefgie' ];
9 $rec = [ 'abcdefgig', 'abcdefgih' ];
10 $rec = [ 'abcdefgij', 'abcdefgik' ];
with slice
1 $rec = { 'id1' => 'abcdefghi', 'id2' => 'abcdefghj' };
2 $rec = { 'id1' => 'abcdefghl', 'id2' => 'abcdefghm' };
3 $rec = { 'id1' => 'abcdefgho', 'id2' => 'abcdefghp' };
4 $rec = { 'id1' => 'abcdefghr', 'id2' => 'abcdefghs' };
5 $rec = { 'id1' => 'abcdefghu', 'id2' => 'abcdefghv' };
6 $rec = { 'id1' => 'abcdefghx', 'id2' => 'abcdefghy' };
7 $rec = { 'id1' => 'abcdefgia', 'id2' => 'abcdefgib' };
8 $rec = { 'id1' => 'abcdefgid', 'id2' => 'abcdefgie' };
9 $rec = { 'id1' => 'abcdefgig', 'id2' => 'abcdefgih' };
10 $rec = { 'id1' => 'abcdefgij', 'id2' => 'abcdefgik' };
DBD::mysql::st fetchall_arrayref failed: fetch() without execute() at test_dbi.pl line 106.
DBD::mysql::st fetchall_arrayref failed: fetch() without execute() at test_dbi.pl line 106.

$ perl test_dbi.pl sqlite
DBI: 1.49 - DBD::SQLite: 1.09
5
10
inserted 10 records
without slice
1 $rec = [ 'abcdefghi', 'abcdefghj' ];
2 $rec = [ 'abcdefghl', 'abcdefghm' ];
3 $rec = [ 'abcdefgho', 'abcdefghp' ];
4 $rec = [ 'abcdefghr', 'abcdefghs' ];
5 $rec = [ 'abcdefghu', 'abcdefghv' ];
6 $rec = [ 'abcdefghx', 'abcdefghy' ];
7 $rec = [ 'abcdefgia', 'abcdefgib' ];
8 $rec = [ 'abcdefgid', 'abcdefgie' ];
9 $rec = [ 'abcdefgig', 'abcdefgih' ];
10 $rec = [ 'abcdefgij', 'abcdefgik' ];
with slice
1 $rec = { 'id1' => 'abcdefghi', 'id2' => 'abcdefghj' };
2 $rec = { 'id1' => 'abcdefghl', 'id2' => 'abcdefghm' };
3 $rec = { 'id1' => 'abcdefgho', 'id2' => 'abcdefghp' };
4 $rec = { 'id1' => 'abcdefghr', 'id2' => 'abcdefghs' };
5 $rec = { 'id1' => 'abcdefghu', 'id2' => 'abcdefghv' };
6 $rec = { 'id1' => 'abcdefghx', 'id2' => 'abcdefghy' };
7 $rec = { 'id1' => 'abcdefgia', 'id2' => 'abcdefgib' };
8 $rec = { 'id1' => 'abcdefgid', 'id2' => 'abcdefgie' };
9 $rec = { 'id1' => 'abcdefgig', 'id2' => 'abcdefgih' };
10 $rec = { 'id1' => 'abcdefgij', 'id2' => 'abcdefgik' };

As you can see from the sample output, the example fails when using both a slice and a row count with DBD::mysql, but it does not fail with DBD::SQLite. However, with $max_records raised to 1,000, the script fails with SQLite2 too (segmentation fault).

Initially I thought there might be a bug in DBD::mysql, but this latter fact made me suspect that using fetchall_arrayref with a slice and a row count together was never an intended behavior. Why? Because the row count device is meant for speed, while the slice (which yields a hashref per record) is meant for clarity, and it carries a speed penalty that sits badly with the row count option.
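
To put rough numbers behind that intuition, the two batched variants can be timed against the testdbi table built by the script above. The following is only a sketch: it assumes the $dbh handle and table from that script, and a much larger $max_records so the timings are meaningful; no figures are claimed here.

use Benchmark qw(cmpthese);

# Compare batched fetching without a slice (array refs) and with a
# hashref slice, both guarded by $sth->{Active}.
my $sql        = q{SELECT id1, id2 FROM testdbi};
my $batch_size = 1_000;

cmpthese(-2, {
    array_batches => sub {
        my $sth = $dbh->prepare($sql);
        $sth->execute;
        while ($sth->{Active}) {
            my $batch = $sth->fetchall_arrayref(undef, $batch_size) or last;
            last unless @$batch;
            # each row is an array ref: $_->[0], $_->[1]
        }
    },
    hashref_batches => sub {
        my $sth = $dbh->prepare($sql);
        $sth->execute;
        while ($sth->{Active}) {
            my $batch = $sth->fetchall_arrayref({}, $batch_size) or last;
            last unless @$batch;
            # each row is a hash ref: $_->{id1}, $_->{id2}
        }
    },
});

cmpthese reports relative rates only, which is enough to see how much the per-row hashrefs cost compared to plain arrays.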

Thus, I advised bradcathey to use the recipe described at DBI recipes ("binding a hash"), i.e. using bind_col with a static hash. This way, there can be clarity without too much of a speed penalty.
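
For completeness, the bound-hash recipe looks roughly like this (a minimal sketch, reusing $dbh and the testdbi table from the script above): a single hash is allocated once, its values are bound to the statement columns, and every fetch simply refreshes them in place.

my %row;
my $sth = $dbh->prepare(q{SELECT id1, id2 FROM testdbi ORDER BY id1});
$sth->execute;

# bind each column to the hash entry named after it (lowercased)
$sth->bind_columns( \( @row{ @{ $sth->{NAME_lc} } } ) );

while ($sth->fetch) {
    print "$row{id1} - $row{id2}\n";
}

Because fetch only copies the new column values into the already-bound scalars, you keep the $row{column} readability of a hashref slice without building a fresh anonymous hash for every record.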

So, what do you think? Is it a bug, or wrong usage?

 _  _ _  _  
(_|| | |(_|><
 _|   

Replies are listed 'Best First'.
Re: fetchall_arrayref with slice and row count
by blokhead (Monsignor) on Jan 31, 2006 at 19:04 UTC
    I found the same thing to be true in DBI::fetchall_arrayref() error. I guess you could say that the general consensus was that it was a bug. I think it would be nice to have the best of both worlds (slice + limit).

    blokhead

      Not a bug. So wrong usage.
      while ( $sth->{Active}
              and $rowcache = $sth->fetchall_arrayref({}, $max_rows) ) {
          ...
      }
      See the DBI documentation: the last possible fetch in a fetchall will call $sth->finish.
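
      Putting that suggestion together with the cached-batch idiom from the root node, a guarded version of the failing loop could look roughly like this (a sketch only; $rowcache, $sth, $max_rows and $count are the variables from the test script above):

      $count = 0;
      while (my $href = (
              shift(@$rowcache)
              ||
              shift(@{ $rowcache = ( $sth->{Active}
                                     && $sth->fetchall_arrayref({}, $max_rows) )
                                   || [] })
          ))
      {
          $count++;
          print "$count $href->{id1} $href->{id2}\n";
      }

      With the $sth->{Active} guard, fetchall_arrayref is never called again once the driver has finished the statement, which is exactly the call that produced "fetch() without execute()" under DBD::mysql.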
