Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

Re: DBD::Sqlite queries slow - and gives wrong results

by vagabonding electron (Deacon)
on Dec 17, 2017 at 15:02 UTC ( #1205738=note: print w/replies, xml ) Need Help??


in reply to DBD::Sqlite queries slow - and gives wrong results

Just my two cents - this is rather a question than an answer :-)

I played with this query using the list of the highly esteemed monks, inserting each line 10_000 times in the sqlite table employees and using the distinct values of the column "Level" for the table groups. The database is 626 MB. (Update: created index on Level in both tables, without index the script below is slow. ) The query below was executed by SQLiteStudio 3.1.1 on Windows 10 (32 bit) in 0.174 - 0.180 sec. It does show the list as output (1000 per page), however since there are 40_000 lines in it, it takes extra time wenn you switch to the next page of the output. If you change the output preferences to put 40_000 lines on the same page it takes 0.231 sec then. The following script:

#!/perl use strict; use warnings FATAL => qw(all); use DBI; use File::Spec; use Time::HiRes qw (time); my $MEMORY = 1; my $start = time; my $PATH = "C:/TMP/___TMP/_TRASH"; my $dbfile = File::Spec->catdir($PATH, 'Big_DB.db'); my $dbh; if (1 == $MEMORY) { $dbh = DBI->connect('dbi:SQLite:dbname=:memory:',"","",{RaiseError + => 1}) or die "Couldn't connect to database: " . DBI->errstr; } else { $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","",{RaiseError +=> 1}) or die "Couldn't connect to database: " . DBI->errstr; } print "Time till connect: ", time() - $start, $/; $start = time; if (1 == $MEMORY) { $dbh->sqlite_backup_from_file($dbfile); } print "Time till load: ", time() - $start, $/; $start = time; # Marker. my $sth = $dbh->prepare("select Name, Writeups from employees e, group +s g where e.Level = g.Level and g.Level = 'Pope (28)' order by e.Writ +eups desc "); print "Time till prepare: ", time() - $start, $/; $start = time; $sth->execute(); print "Time till execute: ", time() - $start, $/; $start = time; my $answer = 0; while ( my ($name, $writeups) = $sth->fetchrow_array ) { $answer += 1; # print join(';', ($name, $writeups)), $/; } print "Answer: $answer\n"; print "Time after loop: ", time() - $start, $/; $start = time; $dbh->disconnect;

...produced the following output (run many times, the output was similar):

Time till connect: 0.0150408744812012 Time till load: 4.3910698890686 Time till prepare: 0.000697851181030273 Time till execute: 0.110021829605103 Answer: 40000 Time after loop: 0.16477108001709

What I noticed then is that you use selectall_arrayref in the original script. If I change the above script after # Marker to:

my $sql = "select Name, Writeups from employees e, groups g where e.Le +vel = g.Level and g.Level = 'Pope (28)' order by e.Writeups desc "; my $answer = 0; for my $emp (@{$dbh->selectall_arrayref($sql, {Slice => {}})}) { $answer += 1; } print "Answer: $answer\n"; print "Time after loop: ", time() - $start, $/; $start = time; $dbh->disconnect;

... then the output was as follows (again, run many times with similar output):

Time till connect: 0.0147781372070313 Time till load: 4.42248010635376 Answer: 40000 Time after loop: 0.415067911148071

The time needed is of the same order of magnitude though SQLiteStudio does seem to be a bit faster. Now the next question: 0.11 + 0.16 sec is somehow shorter than 0.41 sec. I tried to benchmark this.

#!/perl use strict; use warnings FATAL => qw(all); use DBI; use File::Spec; use Benchmark qw(:all) ; my $PATH = "C:/TMP/___TMP/_TRASH"; my $dbfile = File::Spec->catdir($PATH, 'Big_DB.db'); my $MEMORY = 1; my $dbh; if (1 == $MEMORY) { $dbh = DBI->connect('dbi:SQLite:dbname=:memory:',"","",{RaiseError + => 1}) or die "Couldn't connect to database: " . DBI->errstr; } else { $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","",{RaiseError +=> 1}) or die "Couldn't connect to database: " . DBI->errstr; } if (1 == $MEMORY) { $dbh->sqlite_backup_from_file($dbfile); } cmpthese(-3, { 'PrepExec' => sub { my $sth = $dbh->prepare("select Name, Writeups from employees +e, groups g where e.Level = g.Level and g.Level = 'Pope (28)' order b +y e.Writeups desc "); $sth->execute(); my $answer = 0; while ( my ($name, $writeups) = $sth->fetchrow_array ) { $answer += 1; # print join(';', ($name, $writeups)), $/; } }, 'selectall' => sub { my $sql = "select Name, Writeups from employees e, groups g wh +ere e.Level = g.Level and g.Level = 'Pope (28)' order by e.Writeups d +esc "; my $answer = 0; for my $emp (@{$dbh->selectall_arrayref($sql, {Slice => {}})}) + { $answer += 1; } }, }); $dbh->disconnect;

... and the output is

Rate selectall PrepExec selectall 2.88/s -- -26% PrepExec 3.90/s 35% --

for the in-memory database and

Rate selectall PrepExec selectall 2.69/s -- -25% PrepExec 3.57/s 33% --

for the on-disk database. Is it just me or is it a reasonable assumption that the prepare - execute sequence can be faster that selectall_arrayref in some cases? Btw. I did not see a notable difference in quering the database from memory vs. from disk in my tests. I did not get false results in these tests comparing the output of SQLiteStudio and Perl either.

Thanks for reading this.

Disclaimer: I am an amateur, not a professional programmer (though it does look anyway :-) ).

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1205738]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others contemplating the Monastery: (2)
As of 2021-02-27 04:24 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?