Beefy Boxes and Bandwidth Generously Provided by pair Networks Cowboy Neal with Hat
XP is just a number
 
PerlMonks  

Re^4: SQLite Slow insert

by Anonymous Monk
on Feb 19, 2013 at 10:50 UTC ( #1019519=note: print w/ replies, xml ) Need Help??


in reply to Re^3: SQLite Slow insert
in thread SQLite Slow insert

There were a few issues in your code. I fixed them below.

my $sth = $dbh->prepare("INSERT INTO HLDdata (Data) VALUES (?)"); my $count = $dbh->prepare( "select count(*) from HLDdata where Data = +?"); foreach $txt (@TXT) { print "$txt\n"; open(IN,"$txt") || warn("cant open $txt"); while(<IN>){ $line = $_; $count->execute($line); my ($data) = $count->fetchrow(); print "$data \n"; if ($data != 0){ print "record exist not adding\n"; Logit("record exist not adding"); next; } my ($Data) = $line; $sth->execute($Data); $line =~s/\|/\t/g; # print "$line\n"; print OUT $line; } }

Yes, you're probably correct that the checking is taking the most of the time. This is because you're missing an index on HLDdata (data) -- sqlite has to scan the whole table every time you check for a row's existence. You can create the missing index with CREATE UNIQUE INDEX HLDdata_uniq ON HLDdata (Data);

There's a trade-off: you'll now use double the disk space. If $Data tends to be big (above a hundred bytes or so), you are better off storing a hash (sha1, md5, or similar) of $Data in the table in a second column and checking for its existence instead. (SQLite does not appear to have support for functional indices or a hashing function, so you'll have to do it on the Perl side.)


Comment on Re^4: SQLite Slow insert
Select or Download Code
Re^5: SQLite Slow insert
by mmittiga17 (Scribe) on Feb 21, 2013 at 20:16 UTC

    Thank you for your help, it got me sorted. I appreciate the help and have learned from it. Cheers!

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1019519]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (6)
As of 2014-04-20 12:14 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    April first is:







    Results (485 votes), past polls